Обсуждение: Need help with performance tuning pg12 on linux

Поиск
Список
Период
Сортировка

Need help with performance tuning pg12 on linux

От
"Wilson, Maria Louise (LARC-E301)[RSES]"
Дата:

Hello folks!

 

I am having a complex query slowing over time increasing in duration.  If anyone has a few cycles that they could lend a hand or just point me in the right direction with this – I would surely appreciate it!  Fairly beefy Linux server with Postgres 12 (latest) – this particular query has been getting slower over time & seemingly slowing everything else down.  The server is dedicated entirely to this particular database.  Let me know if I can provide any additional information!!  Thanks in advance!

 

Here’s my background – Linux RHEL 8 – PostgreSQL 12.17.  –

MemTotal:       263216840 kB

MemFree:         3728224 kB

MemAvailable:   197186864 kB

Buffers:            6704 kB

Cached:         204995024 kB

SwapCached:        19244 kB

 

free -m

              total        used        free      shared  buff/cache   available

Mem:         257047       51860        3722       10718      201464      192644

Swap:          4095         855        3240

 

Here are a few of the settings in our postgres server:

max_connections = 300                   # (change requires restart)

shared_buffers = 10GB

temp_buffers = 24MB

work_mem = 2GB

maintenance_work_mem = 1GB

 

most everything else is set to the default.

 

The query is complex with several joins:

 

SELECT anon_1.granule_collection_id AS anon_1_granule_collection_id, anon_1.granule_create_date AS anon_1_granule_create_date, anon_1.granule_delete_date AS anon_1_granule_delete_date, ST_AsGeoJSON(anon_1.granule_geography) AS anon_1_granule_geography, ST_AsGeoJSON(anon_1.granule_geometry) AS anon_1_granule_geometry, anon_1.granule_is_active AS anon_1_granule_is_active, anon_1.granule_properties AS anon_1_granule_properties, anon_1.granule_update_date AS anon_1_granule_update_date, anon_1.granule_uuid AS anon_1_granule_uuid, anon_1.granule_visibility_last_update_date AS anon_1_granule_visibility_last_update_date, anon_1.granule_visibility_id AS anon_1_granule_visibility_id, collection_1.id AS collection_1_id, collection_1.entry_id AS collection_1_entry_id, collection_1.short_name AS collection_1_short_name, collection_1.version AS collection_1_version, file_1.id AS file_1_id, file_1.location AS file_1_location, file_1.md5 AS file_1_md5, file_1.name AS file_1_name, file_1.size AS file_1_size, file_1.type AS file_1_type, visibility_1.id AS visibility_1_id, visibility_1.name AS visibility_1_name, visibility_1.value AS visibility_1_value

        FROM (SELECT granule.collection_id AS granule_collection_id, granule.create_date AS granule_create_date, granule.delete_date AS granule_delete_date, granule.geography AS granule_geography, granule.geometry AS granule_geometry, granule.is_active AS granule_is_active, granule.properties AS granule_properties, granule.update_date AS granule_update_date, granule.uuid AS granule_uuid, granule.visibility_last_update_date AS granule_visibility_last_update_date, granule.visibility_id AS granule_visibility_id

        FROM granule JOIN collection ON collection.id = granule.collection_id

        WHERE granule.is_active = true AND (collection.entry_id LIKE 'AJAX_CO2_CH4_1' OR collection.entry_id LIKE 'AJAX_O3_1' OR collection.entry_id LIKE 'AJAX_CH2O_1' OR collection.entry_id LIKE 'AJAX_MMS_1') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, beginning_date_time}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, start_date}') > '2015-10-06T23:59:59+00:00') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, end_date_time}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, end_date}') < '2015-10-09T00:00:00+00:00') ORDER BY granule.uuid

         LIMIT 26) AS anon_1 LEFT OUTER JOIN collection AS collection_1 ON collection_1.id = anon_1.granule_collection_id LEFT OUTER JOIN (granule_file AS granule_file_1 JOIN file AS file_1 ON file_1.id = granule_file_1.file_id) ON anon_1.granule_uuid = granule_file_1.granule_uuid LEFT OUTER JOIN visibility AS visibility_1 ON visibility_1.id = anon_1.granule_visibility_id ORDER BY anon_1.granule_uuid

 

Here’s the explain:

 

 Sort  (cost=10914809.92..10914810.27 rows=141 width=996)

   Sort Key: granule.uuid

   ->  Hash Left Join  (cost=740539.73..10914804.89 rows=141 width=996)

         Hash Cond: (granule.visibility_id = visibility_1.id)

         ->  Hash Right Join  (cost=740537.56..10914731.81 rows=141 width=1725)

               Hash Cond: (granule_file_1.granule_uuid = granule.uuid)

               ->  Hash Join  (cost=644236.90..10734681.93 rows=22332751 width=223)

                     Hash Cond: (file_1.id = granule_file_1.file_id)

                     ->  Seq Scan on file file_1  (cost=0.00..9205050.88 rows=22068888 width=207)

                     ->  Hash  (cost=365077.51..365077.51 rows=22332751 width=20)

                           ->  Seq Scan on granule_file granule_file_1  (cost=0.00..365077.51 rows=22332751 width=20)

               ->  Hash  (cost=96300.33..96300.33 rows=26 width=1518)

                     ->  Nested Loop Left Join  (cost=96092.55..96300.33 rows=26 width=1518)

                           ->  Limit  (cost=96092.27..96092.33 rows=26 width=1462)

                                 ->  Sort  (cost=96092.27..96100.47 rows=3282 width=1462)

                                       Sort Key: granule.uuid

                                       ->  Nested Loop  (cost=0.56..95998.73 rows=3282 width=1462)

                                             ->  Seq Scan on collection  (cost=0.00..3366.24 rows=1 width=4)

                                                   Filter: (((entry_id)::text ~~ 'AJAX_CO2_CH4_1'::text) OR ((entry_id)::text ~~ 'AJAX_O3_1'::text) OR ((entry_id)::text ~~ 'AJAX_CH2O_1'::text) OR ((entry_id)::text ~~ 'AJAX_MMS_1'::text))

                                             ->  Index Scan using ix_granule_collection_id on granule  (cost=0.56..92445.36 rows=18713 width=1462)

                                                   Index Cond: (collection_id = collection.id)

                                                   Filter: (is_active AND (((properties #>> '{temporal_extent,range_date_times,0,beginning_date_time}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,single_d

ate_times,0}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,start_date}'::text[]) > '2015-10-06T23:59:59+00:00'::text)) AND (((properties #>> '{temporal_extent,range_date_times,0,end_

date_time}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,end_date}'::text[])

 < '2015-10-09T00:00:00+00:00'::text)))

                           ->  Index Scan using collection_pkey on collection collection_1  (cost=0.28..7.99 rows=1 width=56)

                                 Index Cond: (id = granule.collection_id)

         ->  Hash  (cost=1.52..1.52 rows=52 width=16)

               ->  Seq Scan on visibility visibility_1  (cost=0.00..1.52 rows=52 width=16)

 

 

Heres a bit about the tables –

 

Granule

Collection

Granule_file

Visibility

 

Granule:

public | granule                            | table | ims_api_writer | 36 GB   | 

 

ims_api=# \d+ granule

                                                       Table "public.granule"

           Column            |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description 

-----------------------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------

 collection_id               | integer                     |           | not null |         | plain    |              | 

 create_date                 | timestamp without time zone |           | not null |         | plain    |              | 

 delete_date                 | timestamp without time zone |           |          |         | plain    |              | 

 geometry                    | geometry(Geometry,4326)     |           |          |         | main     |              | 

 is_active                   | boolean                     |           |          |         | plain    |              | 

 properties                  | jsonb                       |           |          |         | extended |              | 

 update_date                 | timestamp without time zone |           | not null |         | plain    |              | 

 uuid                        | uuid                        |           | not null |         | plain    |              | 

 visibility_id               | integer                     |           | not null |         | plain    |              | 

 geography                   | geography(Geometry,4326)    |           |          |         | main     |              | 

 visibility_last_update_date | timestamp without time zone |           |          |         | plain    |              | 

Indexes:

    "granule_pkey" PRIMARY KEY, btree (uuid)

    "granule_is_active_idx" btree (is_active)

    "granule_properties_producer_id_idx" btree ((properties ->> 'producer_granule_id'::text))

    "granule_update_date_idx" btree (update_date)

    "idx_granule_geometry" gist (geometry)

    "ix_granule_collection_id" btree (collection_id)

Foreign-key constraints:

    "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

Referenced by:

    TABLE "granule_file" CONSTRAINT "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

Triggers:

    granule_temporal_range_trigger AFTER INSERT OR DELETE OR UPDATE ON granule FOR EACH ROW EXECUTE FUNCTION sync_granule_temporal_range()

Access method: heap

 

Collection:

public | collection                         | table | ims_api_writer | 39 MB   | 

 

ims_api=# \d collection

                                                 Table "public.collection"

            Column            |            Type             | Collation | Nullable |                Default                 

------------------------------+-----------------------------+-----------+----------+----------------------------------------

 id                           | integer                     |           | not null | nextval('collection_id_seq'::regclass)

 access_constraints           | text                        |           |          | 

 additional_attributes        | jsonb                       |           |          | 

 ancillary_keywords           | character varying(160)[]    |           |          | 

 create_date                  | timestamp without time zone |           | not null | 

 dataset_language             | character varying(80)[]     |           |          | 

 dataset_progress             | text                        |           |          | 

 data_resolutions             | jsonb                       |           |          | 

 dataset_citation             | jsonb                       |           |          | 

 delete_date                  | timestamp without time zone |           |          | 

 distribution                 | jsonb                       |           |          | 

 doi                          | character varying(220)      |           |          | 

 entry_id                     | character varying(80)       |           | not null | 

 entry_title                  | character varying(1030)     |           |          | 

 geometry                     | geometry(Geometry,4326)     |           |          | 

 is_active                    | boolean                     |           | not null | 

 iso_topic_categories         | character varying[]         |           |          | 

 last_update_date             | timestamp without time zone |           | not null | 

 locations                    | jsonb                       |           |          | 

 long_name                    | character varying(1024)     |           |          | 

 metadata_associations        | jsonb                       |           |          | 

 metadata_dates               | jsonb                       |           |          | 

 personnel                    | jsonb                       |           |          | 

 platforms                    | jsonb                       |           |          | 

 processing_level_id          | integer                     |           |          | 

 product_flag                 | text                        |           |          | 

 project_id                   | integer                     |           |          | 

 properties                   | jsonb                       |           |          | 

 quality                      | jsonb                       |           |          | 

 references                   | character varying(12000)[]  |           |          | 

 related_urls                 | jsonb                       |           |          | 

 summary                      | jsonb                       |           |          | 

 short_name                   | character varying(80)       |           |          | 

 temporal_extents             | jsonb                       |           |          | 

 version                      | character varying(80)       |           |          | 

 use_constraints              | jsonb                       |           |          | 

 version_description          | text                        |           |          | 

 visibility_id                | integer                     |           | not null | 

 world_date                   | timestamp without time zone |           |          | 

 tiling_identification_system | jsonb                       |           |          | 

 collection_data_type         | text                        |           |          | 

 standard_product             | boolean                     |           | not null | false

Indexes:

    "collection_pkey" PRIMARY KEY, btree (id)

    "collection_entry_id_key" UNIQUE CONSTRAINT, btree (entry_id)

    "idx_collection_geometry" gist (geometry)

Foreign-key constraints:

    "collection_processing_level_id_fkey" FOREIGN KEY (processing_level_id) REFERENCES processing_level(id)

    "collection_project_id_fkey" FOREIGN KEY (project_id) REFERENCES project(id)

    "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

Referenced by:

    TABLE "collection_organization" CONSTRAINT "collection_organization_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "collection_science_keyword" CONSTRAINT "collection_science_keyword_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "collection_spatial_processing_hint" CONSTRAINT "collection_spatial_processing_hint_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "granule" CONSTRAINT "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

 

 

Granule_file:

 public | granule_file                       | table | ims_api_writer | 1108 MB | 

 

\d granule_file

               Table "public.granule_file"

    Column    |  Type   | Collation | Nullable | Default 

--------------+---------+-----------+----------+---------

 granule_uuid | uuid    |           |          | 

 file_id      | integer |           |          | 

Foreign-key constraints:

    "granule_file_file_id_fkey" FOREIGN KEY (file_id) REFERENCES file(id)

    "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

 

 

Visibility:

public | visibility                         | table | ims_api_writer | 40 kB   | 

 

\d visibility

                                   Table "public.visibility"

 Column |         Type          | Collation | Nullable |                Default                 

--------+-----------------------+-----------+----------+----------------------------------------

 id     | integer               |           | not null | nextval('visibility_id_seq'::regclass)

 name   | character varying(80) |           | not null | 

 value  | integer               |           | not null | 

Indexes:

    "visibility_pkey" PRIMARY KEY, btree (id)

    "visibility_name_key" UNIQUE CONSTRAINT, btree (name)

    "visibility_value_key" UNIQUE CONSTRAINT, btree (value)

Referenced by:

    TABLE "collection" CONSTRAINT "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

    TABLE "granule" CONSTRAINT "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

 

 

 

 

Thanks for the help!

 

Maria Wilson

Nasa/Langley Research Center

Hampton, Virginia USA

m.l.wilson@nasa.gov

 

Re: Need help with performance tuning pg12 on linux

От
Frits Hoogland
Дата:
Hi Maria, could you please run explain analyse for the problem query?
The ‘analyze’ addition will track actual spent time and show statistics to validate the planner’s assumptions.

Frits Hoogland




On 27 Dec 2023, at 16:38, Wilson, Maria Louise (LARC-E301)[RSES] <m.l.wilson@nasa.gov> wrote:

Hello folks!
 
I am having a complex query slowing over time increasing in duration.  If anyone has a few cycles that they could lend a hand or just point me in the right direction with this – I would surely appreciate it!  Fairly beefy Linux server with Postgres 12 (latest) – this particular query has been getting slower over time & seemingly slowing everything else down.  The server is dedicated entirely to this particular database.  Let me know if I can provide any additional information!!  Thanks in advance!
 
Here’s my background – Linux RHEL 8 – PostgreSQL 12.17.  – 
MemTotal:       263216840 kB
MemFree:         3728224 kB
MemAvailable:   197186864 kB
Buffers:            6704 kB
Cached:         204995024 kB
SwapCached:        19244 kB
 
free -m
              total        used        free      shared  buff/cache   available
Mem:         257047       51860        3722       10718      201464      192644
Swap:          4095         855        3240
 
Here are a few of the settings in our postgres server:
max_connections = 300                   # (change requires restart)
shared_buffers = 10GB
temp_buffers = 24MB
work_mem = 2GB
maintenance_work_mem = 1GB
 
most everything else is set to the default.
 
The query is complex with several joins:
 
SELECT anon_1.granule_collection_id AS anon_1_granule_collection_id, anon_1.granule_create_date AS anon_1_granule_create_date, anon_1.granule_delete_date AS anon_1_granule_delete_date, ST_AsGeoJSON(anon_1.granule_geography) AS anon_1_granule_geography, ST_AsGeoJSON(anon_1.granule_geometry) AS anon_1_granule_geometry, anon_1.granule_is_active AS anon_1_granule_is_active, anon_1.granule_properties AS anon_1_granule_properties, anon_1.granule_update_date AS anon_1_granule_update_date, anon_1.granule_uuid AS anon_1_granule_uuid, anon_1.granule_visibility_last_update_date AS anon_1_granule_visibility_last_update_date, anon_1.granule_visibility_id AS anon_1_granule_visibility_id, collection_1.id AS collection_1_id, collection_1.entry_id AS collection_1_entry_id, collection_1.short_name AS collection_1_short_name, collection_1.version AS collection_1_version, file_1.id AS file_1_id, file_1.location AS file_1_location, file_1.md5 AS file_1_md5, file_1.name AS file_1_name, file_1.size AS file_1_size, file_1.type AS file_1_type, visibility_1.id AS visibility_1_id, visibility_1.name AS visibility_1_name, visibility_1.value AS visibility_1_value
        FROM (SELECT granule.collection_id AS granule_collection_id, granule.create_date AS granule_create_date, granule.delete_date AS granule_delete_date, granule.geography AS granule_geography, granule.geometry AS granule_geometry, granule.is_active AS granule_is_active, granule.properties AS granule_properties, granule.update_date AS granule_update_date, granule.uuid AS granule_uuid, granule.visibility_last_update_date AS granule_visibility_last_update_date, granule.visibility_id AS granule_visibility_id
        FROM granule JOIN collection ON collection.id = granule.collection_id
        WHERE granule.is_active = true AND (collection.entry_id LIKE 'AJAX_CO2_CH4_1' OR collection.entry_id LIKE 'AJAX_O3_1' OR collection.entry_id LIKE 'AJAX_CH2O_1' OR collection.entry_id LIKE 'AJAX_MMS_1') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, beginning_date_time}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, start_date}') > '2015-10-06T23:59:59+00:00') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, end_date_time}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, end_date}') < '2015-10-09T00:00:00+00:00') ORDER BY granule.uuid
         LIMIT 26) AS anon_1 LEFT OUTER JOIN collection AS collection_1 ON collection_1.id = anon_1.granule_collection_id LEFT OUTER JOIN (granule_file AS granule_file_1 JOIN file AS file_1 ON file_1.id = granule_file_1.file_id) ON anon_1.granule_uuid = granule_file_1.granule_uuid LEFT OUTER JOIN visibility AS visibility_1 ON visibility_1.id = anon_1.granule_visibility_id ORDER BY anon_1.granule_uuid
 
Here’s the explain:
 
 Sort  (cost=10914809.92..10914810.27 rows=141 width=996)
   Sort Key: granule.uuid
   ->  Hash Left Join  (cost=740539.73..10914804.89 rows=141 width=996)
         Hash Cond: (granule.visibility_id = visibility_1.id)
         ->  Hash Right Join  (cost=740537.56..10914731.81 rows=141 width=1725)
               Hash Cond: (granule_file_1.granule_uuid = granule.uuid)
               ->  Hash Join  (cost=644236.90..10734681.93 rows=22332751 width=223)
                     Hash Cond: (file_1.id = granule_file_1.file_id)
                     ->  Seq Scan on file file_1  (cost=0.00..9205050.88 rows=22068888 width=207)
                     ->  Hash  (cost=365077.51..365077.51 rows=22332751 width=20)
                           ->  Seq Scan on granule_file granule_file_1  (cost=0.00..365077.51 rows=22332751 width=20)
               ->  Hash  (cost=96300.33..96300.33 rows=26 width=1518)
                     ->  Nested Loop Left Join  (cost=96092.55..96300.33 rows=26 width=1518)
                           ->  Limit  (cost=96092.27..96092.33 rows=26 width=1462)
                                 ->  Sort  (cost=96092.27..96100.47 rows=3282 width=1462)
                                       Sort Key: granule.uuid
                                       ->  Nested Loop  (cost=0.56..95998.73 rows=3282 width=1462)
                                             ->  Seq Scan on collection  (cost=0.00..3366.24 rows=1 width=4)
                                                   Filter: (((entry_id)::text ~~ 'AJAX_CO2_CH4_1'::text) OR ((entry_id)::text ~~ 'AJAX_O3_1'::text) OR ((entry_id)::text ~~ 'AJAX_CH2O_1'::text) OR ((entry_id)::text ~~ 'AJAX_MMS_1'::text))
                                             ->  Index Scan using ix_granule_collection_id on granule  (cost=0.56..92445.36 rows=18713 width=1462)
                                                   Index Cond: (collection_id = collection.id)
                                                   Filter: (is_active AND (((properties #>> '{temporal_extent,range_date_times,0,beginning_date_time}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,single_d
ate_times,0}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,start_date}'::text[]) > '2015-10-06T23:59:59+00:00'::text)) AND (((properties #>> '{temporal_extent,range_date_times,0,end_
date_time}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,end_date}'::text[])
 < '2015-10-09T00:00:00+00:00'::text)))
                           ->  Index Scan using collection_pkey on collection collection_1  (cost=0.28..7.99 rows=1 width=56)
                                 Index Cond: (id = granule.collection_id)
         ->  Hash  (cost=1.52..1.52 rows=52 width=16)
               ->  Seq Scan on visibility visibility_1  (cost=0.00..1.52 rows=52 width=16)
 
 
Heres a bit about the tables – 
 
Granule
Collection
Granule_file
Visibility
 
Granule:
public | granule                            | table | ims_api_writer | 36 GB   | 
 
ims_api=# \d+ granule
                                                       Table "public.granule"
           Column            |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description 
-----------------------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 collection_id               | integer                     |           | not null |         | plain    |              | 
 create_date                 | timestamp without time zone |           | not null |         | plain    |              | 
 delete_date                 | timestamp without time zone |           |          |         | plain    |              | 
 geometry                    | geometry(Geometry,4326)     |           |          |         | main     |              | 
 is_active                   | boolean                     |           |          |         | plain    |              | 
 properties                  | jsonb                       |           |          |         | extended |              | 
 update_date                 | timestamp without time zone |           | not null |         | plain    |              | 
 uuid                        | uuid                        |           | not null |         | plain    |              | 
 visibility_id               | integer                     |           | not null |         | plain    |              | 
 geography                   | geography(Geometry,4326)    |           |          |         | main     |              | 
 visibility_last_update_date | timestamp without time zone |           |          |         | plain    |              | 
Indexes:
    "granule_pkey" PRIMARY KEY, btree (uuid)
    "granule_is_active_idx" btree (is_active)
    "granule_properties_producer_id_idx" btree ((properties ->> 'producer_granule_id'::text))
    "granule_update_date_idx" btree (update_date)
    "idx_granule_geometry" gist (geometry)
    "ix_granule_collection_id" btree (collection_id)
Foreign-key constraints:
    "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)
Referenced by:
    TABLE "granule_file" CONSTRAINT "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)
    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)
Triggers:
    granule_temporal_range_trigger AFTER INSERT OR DELETE OR UPDATE ON granule FOR EACH ROW EXECUTE FUNCTION sync_granule_temporal_range()
Access method: heap
 
Collection:
public | collection                         | table | ims_api_writer | 39 MB   | 
 
ims_api=# \d collection
                                                 Table "public.collection"
            Column            |            Type             | Collation | Nullable |                Default                 
------------------------------+-----------------------------+-----------+----------+----------------------------------------
 id                           | integer                     |           | not null | nextval('collection_id_seq'::regclass)
 access_constraints           | text                        |           |          | 
 additional_attributes        | jsonb                       |           |          | 
 ancillary_keywords           | character varying(160)[]    |           |          | 
 create_date                  | timestamp without time zone |           | not null | 
 dataset_language             | character varying(80)[]     |           |          | 
 dataset_progress             | text                        |           |          | 
 data_resolutions             | jsonb                       |           |          | 
 dataset_citation             | jsonb                       |           |          | 
 delete_date                  | timestamp without time zone |           |          | 
 distribution                 | jsonb                       |           |          | 
 doi                          | character varying(220)      |           |          | 
 entry_id                     | character varying(80)       |           | not null | 
 entry_title                  | character varying(1030)     |           |          | 
 geometry                     | geometry(Geometry,4326)     |           |          | 
 is_active                    | boolean                     |           | not null | 
 iso_topic_categories         | character varying[]         |           |          | 
 last_update_date             | timestamp without time zone |           | not null | 
 locations                    | jsonb                       |           |          | 
 long_name                    | character varying(1024)     |           |          | 
 metadata_associations        | jsonb                       |           |          | 
 metadata_dates               | jsonb                       |           |          | 
 personnel                    | jsonb                       |           |          | 
 platforms                    | jsonb                       |           |          | 
 processing_level_id          | integer                     |           |          | 
 product_flag                 | text                        |           |          | 
 project_id                   | integer                     |           |          | 
 properties                   | jsonb                       |           |          | 
 quality                      | jsonb                       |           |          | 
 references                   | character varying(12000)[]  |           |          | 
 related_urls                 | jsonb                       |           |          | 
 summary                      | jsonb                       |           |          | 
 short_name                   | character varying(80)       |           |          | 
 temporal_extents             | jsonb                       |           |          | 
 version                      | character varying(80)       |           |          | 
 use_constraints              | jsonb                       |           |          | 
 version_description          | text                        |           |          | 
 visibility_id                | integer                     |           | not null | 
 world_date                   | timestamp without time zone |           |          | 
 tiling_identification_system | jsonb                       |           |          | 
 collection_data_type         | text                        |           |          | 
 standard_product             | boolean                     |           | not null | false
Indexes:
    "collection_pkey" PRIMARY KEY, btree (id)
    "collection_entry_id_key" UNIQUE CONSTRAINT, btree (entry_id)
    "idx_collection_geometry" gist (geometry)
Foreign-key constraints:
    "collection_processing_level_id_fkey" FOREIGN KEY (processing_level_id) REFERENCES processing_level(id)
    "collection_project_id_fkey" FOREIGN KEY (project_id) REFERENCES project(id)
    "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)
Referenced by:
    TABLE "collection_organization" CONSTRAINT "collection_organization_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    TABLE "collection_science_keyword" CONSTRAINT "collection_science_keyword_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    TABLE "collection_spatial_processing_hint" CONSTRAINT "collection_spatial_processing_hint_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    TABLE "granule" CONSTRAINT "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
 
 
Granule_file:
 public | granule_file                       | table | ims_api_writer | 1108 MB | 
 
\d granule_file
               Table "public.granule_file"
    Column    |  Type   | Collation | Nullable | Default 
--------------+---------+-----------+----------+---------
 granule_uuid | uuid    |           |          | 
 file_id      | integer |           |          | 
Foreign-key constraints:
    "granule_file_file_id_fkey" FOREIGN KEY (file_id) REFERENCES file(id)
    "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)
 
 
Visibility:
public | visibility                         | table | ims_api_writer | 40 kB   | 
 
\d visibility
                                   Table "public.visibility"
 Column |         Type          | Collation | Nullable |                Default                 
--------+-----------------------+-----------+----------+----------------------------------------
 id     | integer               |           | not null | nextval('visibility_id_seq'::regclass)
 name   | character varying(80) |           | not null | 
 value  | integer               |           | not null | 
Indexes:
    "visibility_pkey" PRIMARY KEY, btree (id)
    "visibility_name_key" UNIQUE CONSTRAINT, btree (name)
    "visibility_value_key" UNIQUE CONSTRAINT, btree (value)
Referenced by:
    TABLE "collection" CONSTRAINT "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)
    TABLE "granule" CONSTRAINT "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)
 
 
 
 
Thanks for the help!
 
Maria Wilson
Nasa/Langley Research Center
Hampton, Virginia USA

Re: [EXTERNAL] Re: Need help with performance tuning pg12 on linux

От
"Wilson, Maria Louise (LARC-E301)[RSES]"
Дата:

Thanks for the reply!!  Scroll down a bit – the explain is just a bit further down in the email!

Maria

 

From: Frits Hoogland <frits.hoogland@gmail.com>
Date: Wednesday, December 27, 2023 at 10:50 AM
To: "Wilson, Maria Louise (LARC-E301)[RSES]" <m.l.wilson@nasa.gov>
Cc: "pgsql-performance@lists.postgresql.org" <pgsql-performance@lists.postgresql.org>
Subject: [EXTERNAL] Re: Need help with performance tuning pg12 on linux

 

CAUTION: This email originated from outside of NASA.  Please take care when clicking links or opening attachments.  Use the "Report Message" button to report suspicious messages to the NASA SOC.



Hi Maria, could you please run explain analyse for the problem query?

The ‘analyze’ addition will track actual spent time and show statistics to validate the planner’s assumptions.

 

Frits Hoogland

 

 



On 27 Dec 2023, at 16:38, Wilson, Maria Louise (LARC-E301)[RSES] <m.l.wilson@nasa.gov> wrote:

 

Hello folks!

 

I am having a complex query slowing over time increasing in duration.  If anyone has a few cycles that they could lend a hand or just point me in the right direction with this – I would surely appreciate it!  Fairly beefy Linux server with Postgres 12 (latest) – this particular query has been getting slower over time & seemingly slowing everything else down.  The server is dedicated entirely to this particular database.  Let me know if I can provide any additional information!!  Thanks in advance!

 

Here’s my background – Linux RHEL 8 – PostgreSQL 12.17.  – 

MemTotal:       263216840 kB

MemFree:         3728224 kB

MemAvailable:   197186864 kB

Buffers:            6704 kB

Cached:         204995024 kB

SwapCached:        19244 kB

 

free -m

              total        used        free      shared  buff/cache   available

Mem:         257047       51860        3722       10718      201464      192644

Swap:          4095         855        3240

 

Here are a few of the settings in our postgres server:

max_connections = 300                   # (change requires restart)

shared_buffers = 10GB

temp_buffers = 24MB

work_mem = 2GB

maintenance_work_mem = 1GB

 

most everything else is set to the default.

 

The query is complex with several joins:

 

SELECT anon_1.granule_collection_id AS anon_1_granule_collection_id, anon_1.granule_create_date AS anon_1_granule_create_date, anon_1.granule_delete_date AS anon_1_granule_delete_date, ST_AsGeoJSON(anon_1.granule_geography) AS anon_1_granule_geography, ST_AsGeoJSON(anon_1.granule_geometry) AS anon_1_granule_geometry, anon_1.granule_is_active AS anon_1_granule_is_active, anon_1.granule_properties AS anon_1_granule_properties, anon_1.granule_update_date AS anon_1_granule_update_date, anon_1.granule_uuid AS anon_1_granule_uuid, anon_1.granule_visibility_last_update_date AS anon_1_granule_visibility_last_update_date, anon_1.granule_visibility_id AS anon_1_granule_visibility_id, collection_1.id AS collection_1_id, collection_1.entry_id AS collection_1_entry_id, collection_1.short_name AS collection_1_short_name, collection_1.version AS collection_1_version, file_1.id AS file_1_id, file_1.location AS file_1_location, file_1.md5 AS file_1_md5, file_1.name AS file_1_name, file_1.size AS file_1_size, file_1.type AS file_1_type, visibility_1.id AS visibility_1_id, visibility_1.name AS visibility_1_name, visibility_1.value AS visibility_1_value

        FROM (SELECT granule.collection_id AS granule_collection_id, granule.create_date AS granule_create_date, granule.delete_date AS granule_delete_date, granule.geography AS granule_geography, granule.geometry AS granule_geometry, granule.is_active AS granule_is_active, granule.properties AS granule_properties, granule.update_date AS granule_update_date, granule.uuid AS granule_uuid, granule.visibility_last_update_date AS granule_visibility_last_update_date, granule.visibility_id AS granule_visibility_id

        FROM granule JOIN collection ON collection.id = granule.collection_id

        WHERE granule.is_active = true AND (collection.entry_id LIKE 'AJAX_CO2_CH4_1' OR collection.entry_id LIKE 'AJAX_O3_1' OR collection.entry_id LIKE 'AJAX_CH2O_1' OR collection.entry_id LIKE 'AJAX_MMS_1') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, beginning_date_time}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, start_date}') > '2015-10-06T23:59:59+00:00') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, end_date_time}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, end_date}') < '2015-10-09T00:00:00+00:00') ORDER BY granule.uuid

         LIMIT 26) AS anon_1 LEFT OUTER JOIN collection AS collection_1 ON collection_1.id = anon_1.granule_collection_id LEFT OUTER JOIN (granule_file AS granule_file_1 JOIN file AS file_1 ON file_1.id = granule_file_1.file_id) ON anon_1.granule_uuid = granule_file_1.granule_uuid LEFT OUTER JOIN visibility AS visibility_1 ON visibility_1.id = anon_1.granule_visibility_id ORDER BY anon_1.granule_uuid

 

Here’s the explain:

 

 Sort  (cost=10914809.92..10914810.27 rows=141 width=996)

   Sort Key: granule.uuid

   ->  Hash Left Join  (cost=740539.73..10914804.89 rows=141 width=996)

         Hash Cond: (granule.visibility_id = visibility_1.id)

         ->  Hash Right Join  (cost=740537.56..10914731.81 rows=141 width=1725)

               Hash Cond: (granule_file_1.granule_uuid = granule.uuid)

               ->  Hash Join  (cost=644236.90..10734681.93 rows=22332751 width=223)

                     Hash Cond: (file_1.id = granule_file_1.file_id)

                     ->  Seq Scan on file file_1  (cost=0.00..9205050.88 rows=22068888 width=207)

                     ->  Hash  (cost=365077.51..365077.51 rows=22332751 width=20)

                           ->  Seq Scan on granule_file granule_file_1  (cost=0.00..365077.51 rows=22332751 width=20)

               ->  Hash  (cost=96300.33..96300.33 rows=26 width=1518)

                     ->  Nested Loop Left Join  (cost=96092.55..96300.33 rows=26 width=1518)

                           ->  Limit  (cost=96092.27..96092.33 rows=26 width=1462)

                                 ->  Sort  (cost=96092.27..96100.47 rows=3282 width=1462)

                                       Sort Key: granule.uuid

                                       ->  Nested Loop  (cost=0.56..95998.73 rows=3282 width=1462)

                                             ->  Seq Scan on collection  (cost=0.00..3366.24 rows=1 width=4)

                                                   Filter: (((entry_id)::text ~~ 'AJAX_CO2_CH4_1'::text) OR ((entry_id)::text ~~ 'AJAX_O3_1'::text) OR ((entry_id)::text ~~ 'AJAX_CH2O_1'::text) OR ((entry_id)::text ~~ 'AJAX_MMS_1'::text))

                                             ->  Index Scan using ix_granule_collection_id on granule  (cost=0.56..92445.36 rows=18713 width=1462)

                                                   Index Cond: (collection_id = collection.id)

                                                   Filter: (is_active AND (((properties #>> '{temporal_extent,range_date_times,0,beginning_date_time}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,single_d

ate_times,0}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,start_date}'::text[]) > '2015-10-06T23:59:59+00:00'::text)) AND (((properties #>> '{temporal_extent,range_date_times,0,end_

date_time}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,end_date}'::text[])

 < '2015-10-09T00:00:00+00:00'::text)))

                           ->  Index Scan using collection_pkey on collection collection_1  (cost=0.28..7.99 rows=1 width=56)

                                 Index Cond: (id = granule.collection_id)

         ->  Hash  (cost=1.52..1.52 rows=52 width=16)

               ->  Seq Scan on visibility visibility_1  (cost=0.00..1.52 rows=52 width=16)

 

 

Heres a bit about the tables – 

 

Granule

Collection

Granule_file

Visibility

 

Granule:

public | granule                            | table | ims_api_writer | 36 GB   | 

 

ims_api=# \d+ granule

                                                       Table "public.granule"

           Column            |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description 

-----------------------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------

 collection_id               | integer                     |           | not null |         | plain    |              | 

 create_date                 | timestamp without time zone |           | not null |         | plain    |              | 

 delete_date                 | timestamp without time zone |           |          |         | plain    |              | 

 geometry                    | geometry(Geometry,4326)     |           |          |         | main     |              | 

 is_active                   | boolean                     |           |          |         | plain    |              | 

 properties                  | jsonb                       |           |          |         | extended |              | 

 update_date                 | timestamp without time zone |           | not null |         | plain    |              | 

 uuid                        | uuid                        |           | not null |         | plain    |              | 

 visibility_id               | integer                     |           | not null |         | plain    |              | 

 geography                   | geography(Geometry,4326)    |           |          |         | main     |              | 

 visibility_last_update_date | timestamp without time zone |           |          |         | plain    |              | 

Indexes:

    "granule_pkey" PRIMARY KEY, btree (uuid)

    "granule_is_active_idx" btree (is_active)

    "granule_properties_producer_id_idx" btree ((properties ->> 'producer_granule_id'::text))

    "granule_update_date_idx" btree (update_date)

    "idx_granule_geometry" gist (geometry)

    "ix_granule_collection_id" btree (collection_id)

Foreign-key constraints:

    "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

Referenced by:

    TABLE "granule_file" CONSTRAINT "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

Triggers:

    granule_temporal_range_trigger AFTER INSERT OR DELETE OR UPDATE ON granule FOR EACH ROW EXECUTE FUNCTION sync_granule_temporal_range()

Access method: heap

 

Collection:

public | collection                         | table | ims_api_writer | 39 MB   | 

 

ims_api=# \d collection

                                                 Table "public.collection"

            Column            |            Type             | Collation | Nullable |                Default                 

------------------------------+-----------------------------+-----------+----------+----------------------------------------

 id                           | integer                     |           | not null | nextval('collection_id_seq'::regclass)

 access_constraints           | text                        |           |          | 

 additional_attributes        | jsonb                       |           |          | 

 ancillary_keywords           | character varying(160)[]    |           |          | 

 create_date                  | timestamp without time zone |           | not null | 

 dataset_language             | character varying(80)[]     |           |          | 

 dataset_progress             | text                        |           |          | 

 data_resolutions             | jsonb                       |           |          | 

 dataset_citation             | jsonb                       |           |          | 

 delete_date                  | timestamp without time zone |           |          | 

 distribution                 | jsonb                       |           |          | 

 doi                          | character varying(220)      |           |          | 

 entry_id                     | character varying(80)       |           | not null | 

 entry_title                  | character varying(1030)     |           |          | 

 geometry                     | geometry(Geometry,4326)     |           |          | 

 is_active                    | boolean                     |           | not null | 

 iso_topic_categories         | character varying[]         |           |          | 

 last_update_date             | timestamp without time zone |           | not null | 

 locations                    | jsonb                       |           |          | 

 long_name                    | character varying(1024)     |           |          | 

 metadata_associations        | jsonb                       |           |          | 

 metadata_dates               | jsonb                       |           |          | 

 personnel                    | jsonb                       |           |          | 

 platforms                    | jsonb                       |           |          | 

 processing_level_id          | integer                     |           |          | 

 product_flag                 | text                        |           |          | 

 project_id                   | integer                     |           |          | 

 properties                   | jsonb                       |           |          | 

 quality                      | jsonb                       |           |          | 

 references                   | character varying(12000)[]  |           |          | 

 related_urls                 | jsonb                       |           |          | 

 summary                      | jsonb                       |           |          | 

 short_name                   | character varying(80)       |           |          | 

 temporal_extents             | jsonb                       |           |          | 

 version                      | character varying(80)       |           |          | 

 use_constraints              | jsonb                       |           |          | 

 version_description          | text                        |           |          | 

 visibility_id                | integer                     |           | not null | 

 world_date                   | timestamp without time zone |           |          | 

 tiling_identification_system | jsonb                       |           |          | 

 collection_data_type         | text                        |           |          | 

 standard_product             | boolean                     |           | not null | false

Indexes:

    "collection_pkey" PRIMARY KEY, btree (id)

    "collection_entry_id_key" UNIQUE CONSTRAINT, btree (entry_id)

    "idx_collection_geometry" gist (geometry)

Foreign-key constraints:

    "collection_processing_level_id_fkey" FOREIGN KEY (processing_level_id) REFERENCES processing_level(id)

    "collection_project_id_fkey" FOREIGN KEY (project_id) REFERENCES project(id)

    "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

Referenced by:

    TABLE "collection_organization" CONSTRAINT "collection_organization_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "collection_science_keyword" CONSTRAINT "collection_science_keyword_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "collection_spatial_processing_hint" CONSTRAINT "collection_spatial_processing_hint_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "granule" CONSTRAINT "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

 

 

Granule_file:

 public | granule_file                       | table | ims_api_writer | 1108 MB | 

 

\d granule_file

               Table "public.granule_file"

    Column    |  Type   | Collation | Nullable | Default 

--------------+---------+-----------+----------+---------

 granule_uuid | uuid    |           |          | 

 file_id      | integer |           |          | 

Foreign-key constraints:

    "granule_file_file_id_fkey" FOREIGN KEY (file_id) REFERENCES file(id)

    "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

 

 

Visibility:

public | visibility                         | table | ims_api_writer | 40 kB   | 

 

\d visibility

                                   Table "public.visibility"

 Column |         Type          | Collation | Nullable |                Default                 

--------+-----------------------+-----------+----------+----------------------------------------

 id     | integer               |           | not null | nextval('visibility_id_seq'::regclass)

 name   | character varying(80) |           | not null | 

 value  | integer               |           | not null | 

Indexes:

    "visibility_pkey" PRIMARY KEY, btree (id)

    "visibility_name_key" UNIQUE CONSTRAINT, btree (name)

    "visibility_value_key" UNIQUE CONSTRAINT, btree (value)

Referenced by:

    TABLE "collection" CONSTRAINT "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

    TABLE "granule" CONSTRAINT "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

 

 

 

 

Thanks for the help!

 

Maria Wilson

Nasa/Langley Research Center

Hampton, Virginia USA

 

Re: [EXTERNAL] Need help with performance tuning pg12 on linux

От
Frits Hoogland
Дата:
Yes, there is an explain, but that is an explain that is run without ‘analyze’ added to it.
This means the query is parsed and planned, and the resulting parse tree with planner assumptions is shown.

If you add ‘analyze to ‘explain’, the actual query is run and timed, and statistics about actual execution are shown.

Frits Hoogland




On 27 Dec 2023, at 17:01, Wilson, Maria Louise (LARC-E301)[RSES] <m.l.wilson@nasa.gov> wrote:

Thanks for the reply!!  Scroll down a bit – the explain is just a bit further down in the email!
Maria
 
From: Frits Hoogland <frits.hoogland@gmail.com>
Date: Wednesday, December 27, 2023 at 10:50 AM
To: "Wilson, Maria Louise (LARC-E301)[RSES]" <m.l.wilson@nasa.gov>
Cc: "pgsql-performance@lists.postgresql.org" <pgsql-performance@lists.postgresql.org>
Subject: [EXTERNAL] Re: Need help with performance tuning pg12 on linux
 
CAUTION: This email originated from outside of NASA.  Please take care when clicking links or opening attachments.  Use the "Report Message" button to report suspicious messages to the NASA SOC.



Hi Maria, could you please run explain analyse for the problem query?
The ‘analyze’ addition will track actual spent time and show statistics to validate the planner’s assumptions.
 
Frits Hoogland
 

 



On 27 Dec 2023, at 16:38, Wilson, Maria Louise (LARC-E301)[RSES] <m.l.wilson@nasa.gov> wrote:
 
Hello folks!
 
I am having a complex query slowing over time increasing in duration.  If anyone has a few cycles that they could lend a hand or just point me in the right direction with this – I would surely appreciate it!  Fairly beefy Linux server with Postgres 12 (latest) – this particular query has been getting slower over time & seemingly slowing everything else down.  The server is dedicated entirely to this particular database.  Let me know if I can provide any additional information!!  Thanks in advance!
 
Here’s my background – Linux RHEL 8 – PostgreSQL 12.17.  – 
MemTotal:       263216840 kB
MemFree:         3728224 kB
MemAvailable:   197186864 kB
Buffers:            6704 kB
Cached:         204995024 kB
SwapCached:        19244 kB
 
free -m
              total        used        free      shared  buff/cache   available
Mem:         257047       51860        3722       10718      201464      192644
Swap:          4095         855        3240
 
Here are a few of the settings in our postgres server:
max_connections = 300                   # (change requires restart)
shared_buffers = 10GB
temp_buffers = 24MB
work_mem = 2GB
maintenance_work_mem = 1GB
 
most everything else is set to the default.
 
The query is complex with several joins:
 
SELECT anon_1.granule_collection_id AS anon_1_granule_collection_id, anon_1.granule_create_date AS anon_1_granule_create_date, anon_1.granule_delete_date AS anon_1_granule_delete_date, ST_AsGeoJSON(anon_1.granule_geography) AS anon_1_granule_geography, ST_AsGeoJSON(anon_1.granule_geometry) AS anon_1_granule_geometry, anon_1.granule_is_active AS anon_1_granule_is_active, anon_1.granule_properties AS anon_1_granule_properties, anon_1.granule_update_date AS anon_1_granule_update_date, anon_1.granule_uuid AS anon_1_granule_uuid, anon_1.granule_visibility_last_update_date AS anon_1_granule_visibility_last_update_date, anon_1.granule_visibility_id AS anon_1_granule_visibility_id, collection_1.id AS collection_1_id, collection_1.entry_id AS collection_1_entry_id, collection_1.short_name AS collection_1_short_name, collection_1.version AS collection_1_version, file_1.id AS file_1_id, file_1.location AS file_1_location, file_1.md5 AS file_1_md5, file_1.name AS file_1_name, file_1.size AS file_1_size, file_1.type AS file_1_type, visibility_1.id AS visibility_1_id, visibility_1.name AS visibility_1_name, visibility_1.value AS visibility_1_value
        FROM (SELECT granule.collection_id AS granule_collection_id, granule.create_date AS granule_create_date, granule.delete_date AS granule_delete_date, granule.geography AS granule_geography, granule.geometry AS granule_geometry, granule.is_active AS granule_is_active, granule.properties AS granule_properties, granule.update_date AS granule_update_date, granule.uuid AS granule_uuid, granule.visibility_last_update_date AS granule_visibility_last_update_date, granule.visibility_id AS granule_visibility_id
        FROM granule JOIN collection ON collection.id = granule.collection_id
        WHERE granule.is_active = true AND (collection.entry_id LIKE 'AJAX_CO2_CH4_1' OR collection.entry_id LIKE 'AJAX_O3_1' OR collection.entry_id LIKE 'AJAX_CH2O_1' OR collection.entry_id LIKE 'AJAX_MMS_1') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, beginning_date_time}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, start_date}') > '2015-10-06T23:59:59+00:00') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, end_date_time}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, end_date}') < '2015-10-09T00:00:00+00:00') ORDER BY granule.uuid
         LIMIT 26) AS anon_1 LEFT OUTER JOIN collection AS collection_1 ON collection_1.id = anon_1.granule_collection_id LEFT OUTER JOIN (granule_file AS granule_file_1 JOIN file AS file_1 ON file_1.id = granule_file_1.file_id) ON anon_1.granule_uuid = granule_file_1.granule_uuid LEFT OUTER JOIN visibility AS visibility_1 ON visibility_1.id = anon_1.granule_visibility_id ORDER BY anon_1.granule_uuid
 
Here’s the explain:
 
 Sort  (cost=10914809.92..10914810.27 rows=141 width=996)
   Sort Key: granule.uuid
   ->  Hash Left Join  (cost=740539.73..10914804.89 rows=141 width=996)
         Hash Cond: (granule.visibility_id = visibility_1.id)
         ->  Hash Right Join  (cost=740537.56..10914731.81 rows=141 width=1725)
               Hash Cond: (granule_file_1.granule_uuid = granule.uuid)
               ->  Hash Join  (cost=644236.90..10734681.93 rows=22332751 width=223)
                     Hash Cond: (file_1.id = granule_file_1.file_id)
                     ->  Seq Scan on file file_1  (cost=0.00..9205050.88 rows=22068888 width=207)
                     ->  Hash  (cost=365077.51..365077.51 rows=22332751 width=20)
                           ->  Seq Scan on granule_file granule_file_1  (cost=0.00..365077.51 rows=22332751 width=20)
               ->  Hash  (cost=96300.33..96300.33 rows=26 width=1518)
                     ->  Nested Loop Left Join  (cost=96092.55..96300.33 rows=26 width=1518)
                           ->  Limit  (cost=96092.27..96092.33 rows=26 width=1462)
                                 ->  Sort  (cost=96092.27..96100.47 rows=3282 width=1462)
                                       Sort Key: granule.uuid
                                       ->  Nested Loop  (cost=0.56..95998.73 rows=3282 width=1462)
                                             ->  Seq Scan on collection  (cost=0.00..3366.24 rows=1 width=4)
                                                   Filter: (((entry_id)::text ~~ 'AJAX_CO2_CH4_1'::text) OR ((entry_id)::text ~~ 'AJAX_O3_1'::text) OR ((entry_id)::text ~~ 'AJAX_CH2O_1'::text) OR ((entry_id)::text ~~ 'AJAX_MMS_1'::text))
                                             ->  Index Scan using ix_granule_collection_id on granule  (cost=0.56..92445.36 rows=18713 width=1462)
                                                   Index Cond: (collection_id = collection.id)
                                                   Filter: (is_active AND (((properties #>> '{temporal_extent,range_date_times,0,beginning_date_time}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,single_d
ate_times,0}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,start_date}'::text[]) > '2015-10-06T23:59:59+00:00'::text)) AND (((properties #>> '{temporal_extent,range_date_times,0,end_
date_time}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,end_date}'::text[])
 < '2015-10-09T00:00:00+00:00'::text)))
                           ->  Index Scan using collection_pkey on collection collection_1  (cost=0.28..7.99 rows=1 width=56)
                                 Index Cond: (id = granule.collection_id)
         ->  Hash  (cost=1.52..1.52 rows=52 width=16)
               ->  Seq Scan on visibility visibility_1  (cost=0.00..1.52 rows=52 width=16)
 
 
Heres a bit about the tables – 
 
Granule
Collection
Granule_file
Visibility
 
Granule:
public | granule                            | table | ims_api_writer | 36 GB   | 
 
ims_api=# \d+ granule
                                                       Table "public.granule"
           Column            |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description 
-----------------------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 collection_id               | integer                     |           | not null |         | plain    |              | 
 create_date                 | timestamp without time zone |           | not null |         | plain    |              | 
 delete_date                 | timestamp without time zone |           |          |         | plain    |              | 
 geometry                    | geometry(Geometry,4326)     |           |          |         | main     |              | 
 is_active                   | boolean                     |           |          |         | plain    |              | 
 properties                  | jsonb                       |           |          |         | extended |              | 
 update_date                 | timestamp without time zone |           | not null |         | plain    |              | 
 uuid                        | uuid                        |           | not null |         | plain    |              | 
 visibility_id               | integer                     |           | not null |         | plain    |              | 
 geography                   | geography(Geometry,4326)    |           |          |         | main     |              | 
 visibility_last_update_date | timestamp without time zone |           |          |         | plain    |              | 
Indexes:
    "granule_pkey" PRIMARY KEY, btree (uuid)
    "granule_is_active_idx" btree (is_active)
    "granule_properties_producer_id_idx" btree ((properties ->> 'producer_granule_id'::text))
    "granule_update_date_idx" btree (update_date)
    "idx_granule_geometry" gist (geometry)
    "ix_granule_collection_id" btree (collection_id)
Foreign-key constraints:
    "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)
Referenced by:
    TABLE "granule_file" CONSTRAINT "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)
    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)
Triggers:
    granule_temporal_range_trigger AFTER INSERT OR DELETE OR UPDATE ON granule FOR EACH ROW EXECUTE FUNCTION sync_granule_temporal_range()
Access method: heap
 
Collection:
public | collection                         | table | ims_api_writer | 39 MB   | 
 
ims_api=# \d collection
                                                 Table "public.collection"
            Column            |            Type             | Collation | Nullable |                Default                 
------------------------------+-----------------------------+-----------+----------+----------------------------------------
 id                           | integer                     |           | not null | nextval('collection_id_seq'::regclass)
 access_constraints           | text                        |           |          | 
 additional_attributes        | jsonb                       |           |          | 
 ancillary_keywords           | character varying(160)[]    |           |          | 
 create_date                  | timestamp without time zone |           | not null | 
 dataset_language             | character varying(80)[]     |           |          | 
 dataset_progress             | text                        |           |          | 
 data_resolutions             | jsonb                       |           |          | 
 dataset_citation             | jsonb                       |           |          | 
 delete_date                  | timestamp without time zone |           |          | 
 distribution                 | jsonb                       |           |          | 
 doi                          | character varying(220)      |           |          | 
 entry_id                     | character varying(80)       |           | not null | 
 entry_title                  | character varying(1030)     |           |          | 
 geometry                     | geometry(Geometry,4326)     |           |          | 
 is_active                    | boolean                     |           | not null | 
 iso_topic_categories         | character varying[]         |           |          | 
 last_update_date             | timestamp without time zone |           | not null | 
 locations                    | jsonb                       |           |          | 
 long_name                    | character varying(1024)     |           |          | 
 metadata_associations        | jsonb                       |           |          | 
 metadata_dates               | jsonb                       |           |          | 
 personnel                    | jsonb                       |           |          | 
 platforms                    | jsonb                       |           |          | 
 processing_level_id          | integer                     |           |          | 
 product_flag                 | text                        |           |          | 
 project_id                   | integer                     |           |          | 
 properties                   | jsonb                       |           |          | 
 quality                      | jsonb                       |           |          | 
 references                   | character varying(12000)[]  |           |          | 
 related_urls                 | jsonb                       |           |          | 
 summary                      | jsonb                       |           |          | 
 short_name                   | character varying(80)       |           |          | 
 temporal_extents             | jsonb                       |           |          | 
 version                      | character varying(80)       |           |          | 
 use_constraints              | jsonb                       |           |          | 
 version_description          | text                        |           |          | 
 visibility_id                | integer                     |           | not null | 
 world_date                   | timestamp without time zone |           |          | 
 tiling_identification_system | jsonb                       |           |          | 
 collection_data_type         | text                        |           |          | 
 standard_product             | boolean                     |           | not null | false
Indexes:
    "collection_pkey" PRIMARY KEY, btree (id)
    "collection_entry_id_key" UNIQUE CONSTRAINT, btree (entry_id)
    "idx_collection_geometry" gist (geometry)
Foreign-key constraints:
    "collection_processing_level_id_fkey" FOREIGN KEY (processing_level_id) REFERENCES processing_level(id)
    "collection_project_id_fkey" FOREIGN KEY (project_id) REFERENCES project(id)
    "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)
Referenced by:
    TABLE "collection_organization" CONSTRAINT "collection_organization_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    TABLE "collection_science_keyword" CONSTRAINT "collection_science_keyword_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    TABLE "collection_spatial_processing_hint" CONSTRAINT "collection_spatial_processing_hint_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    TABLE "granule" CONSTRAINT "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)
 
 
Granule_file:
 public | granule_file                       | table | ims_api_writer | 1108 MB | 
 
\d granule_file
               Table "public.granule_file"
    Column    |  Type   | Collation | Nullable | Default 
--------------+---------+-----------+----------+---------
 granule_uuid | uuid    |           |          | 
 file_id      | integer |           |          | 
Foreign-key constraints:
    "granule_file_file_id_fkey" FOREIGN KEY (file_id) REFERENCES file(id)
    "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)
 
 
Visibility:
public | visibility                         | table | ims_api_writer | 40 kB   | 
 
\d visibility
                                   Table "public.visibility"
 Column |         Type          | Collation | Nullable |                Default                 
--------+-----------------------+-----------+----------+----------------------------------------
 id     | integer               |           | not null | nextval('visibility_id_seq'::regclass)
 name   | character varying(80) |           | not null | 
 value  | integer               |           | not null | 
Indexes:
    "visibility_pkey" PRIMARY KEY, btree (id)
    "visibility_name_key" UNIQUE CONSTRAINT, btree (name)
    "visibility_value_key" UNIQUE CONSTRAINT, btree (value)
Referenced by:
    TABLE "collection" CONSTRAINT "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)
    TABLE "granule" CONSTRAINT "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)
 
 
 
 
Thanks for the help!
 
Maria Wilson
Nasa/Langley Research Center
Hampton, Virginia USA

Re: [EXTERNAL] Need help with performance tuning pg12 on linux

От
"Wilson, Maria Louise (LARC-E301)[RSES]"
Дата:

Thanks!!  See below:

 

explain (analyze, buffers)

 

 Sort  (cost=10914852.09..10914852.45 rows=141 width=996) (actual time=46036.486..46036.540 rows=4 loops=1)

   Sort Key: granule.uuid

   Sort Method: quicksort  Memory: 32kB

   Buffers: shared hit=784286 read=8346129

   ->  Hash Left Join  (cost=740575.40..10914847.06 rows=141 width=996) (actual time=46036.366..46036.457 rows=4 loops=1)

         Hash Cond: (granule.visibility_id = visibility_1.id)

         Buffers: shared hit=784283 read=8346129

         ->  Hash Right Join  (cost=740573.23..10914773.99 rows=141 width=1725) (actual time=46036.148..46036.208 rows=4 loops=1)

               Hash Cond: (granule_file_1.granule_uuid = granule.uuid)

               Buffers: shared hit=784282 read=8346129

               ->  Hash Join  (cost=644250.54..10734700.30 rows=22333224 width=223) (actual time=7864.023..44546.392 rows=22325462 loops=1)

                     Hash Cond: (file_1.id = granule_file_1.file_id)

                     Buffers: shared hit=780882 read=8345236

                     ->  Seq Scan on file file_1  (cost=0.00..9205050.88 rows=22068888 width=207) (actual time=402.706..25222.525 rows=22057988 loops=1)

                           Buffers: shared hit=639126 read=8345236

                     ->  Hash  (cost=365085.24..365085.24 rows=22333224 width=20) (actual time=7288.228..7288.235 rows=22325462 loops=1)

                           Buckets: 33554432  Batches: 1  Memory Usage: 1391822kB

                           Buffers: shared hit=141753

                           ->  Seq Scan on granule_file granule_file_1  (cost=0.00..365085.24 rows=22333224 width=20) (actual time=0.030..2151.380 rows=22325462 loops=1)

                                 Buffers: shared hit=141753

               ->  Hash  (cost=96322.36..96322.36 rows=26 width=1518) (actual time=16.631..16.672 rows=4 loops=1)

                     Buckets: 1024  Batches: 1  Memory Usage: 13kB

                     Buffers: shared hit=3400 read=893

                     ->  Nested Loop Left Join  (cost=96114.18..96322.36 rows=26 width=1518) (actual time=16.605..16.658 rows=4 loops=1)

                           Buffers: shared hit=3400 read=893

                           ->  Limit  (cost=96113.90..96113.97 rows=26 width=1462) (actual time=16.585..16.621 rows=4 loops=1)

                                 Buffers: shared hit=3388 read=893

                                 ->  Sort  (cost=96113.90..96126.16 rows=4902 width=1462) (actual time=16.583..16.610 rows=4 loops=1)

                                       Sort Key: granule.uuid

                                       Sort Method: quicksort  Memory: 32kB

                                       Buffers: shared hit=3388 read=893

                                       ->  Nested Loop  (cost=0.56..95974.19 rows=4902 width=1462) (actual time=3.805..16.585 rows=4 loops=1)

                                             Buffers: shared hit=3388 read=893

                                             ->  Seq Scan on collection  (cost=0.00..3341.70 rows=1 width=4) (actual time=0.670..5.734 rows=4 loops=1)

                                                   Filter: (((entry_id)::text ~~ 'AJAX_CO2_CH4_1'::text) OR ((entry_id)::text ~~ 'AJAX_O3_1'::text) OR ((entry_id)::text ~~ 'AJAX_CH2O_1'::text) OR ((entry_id)::text ~~ 'AJAX_MMS_1'::text))

                                                   Rows Removed by Filter: 2481

                                                   Buffers: shared hit=3292

                                             ->  Index Scan using ix_granule_collection_id on granule  (cost=0.56..92445.36 rows=18713 width=1462) (actual time=1.342..2.705 rows=1 loops=4)

                                                   Index Cond: (collection_id = collection.id)

                                                   Filter: (is_active AND (((properties #>> '{temporal_extent,range_date_times,0,beginning_date_time}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,single_d

ate_times,0}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,start_date}'::text[]) > '2015-10-06T23:59:59+00:00'::text)) AND (((properties #>> '{temporal_extent,range_date_times,0,end_

date_time}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,end_date}'::text[])

 < '2015-10-09T00:00:00+00:00'::text)))

                                                   Rows Removed by Filter: 243

                                                   Buffers: shared hit=96 read=893

                           ->  Index Scan using collection_pkey on collection collection_1  (cost=0.28..8.00 rows=1 width=56) (actual time=0.005..0.005 rows=1 loops=4)

                                 Index Cond: (id = granule.collection_id)

                                 Buffers: shared hit=12

         ->  Hash  (cost=1.52..1.52 rows=52 width=16) (actual time=0.038..0.038 rows=52 loops=1)

               Buckets: 1024  Batches: 1  Memory Usage: 11kB

               Buffers: shared hit=1

               ->  Seq Scan on visibility visibility_1  (cost=0.00..1.52 rows=52 width=16) (actual time=0.019..0.022 rows=52 loops=1)

                     Buffers: shared hit=1

 Planning Time: 3.510 ms

 Execution Time: 46084.516 ms

(52 rows)

 

 

 

From: Frits Hoogland <frits.hoogland@gmail.com>
Date: Wednesday, December 27, 2023 at 11:07 AM
To: "Wilson, Maria Louise (LARC-E301)[RSES]" <m.l.wilson@nasa.gov>
Cc: "pgsql-performance@lists.postgresql.org" <pgsql-performance@lists.postgresql.org>
Subject: Re: [EXTERNAL] Need help with performance tuning pg12 on linux

 

CAUTION: This email originated from outside of NASA.  Please take care when clicking links or opening attachments.  Use the "Report Message" button to report suspicious messages to the NASA SOC.



Yes, there is an explain, but that is an explain that is run without ‘analyze’ added to it.

This means the query is parsed and planned, and the resulting parse tree with planner assumptions is shown.

 

If you add ‘analyze to ‘explain’, the actual query is run and timed, and statistics about actual execution are shown.

 

Frits Hoogland

 

 



On 27 Dec 2023, at 17:01, Wilson, Maria Louise (LARC-E301)[RSES] <m.l.wilson@nasa.gov> wrote:

 

Thanks for the reply!!  Scroll down a bit – the explain is just a bit further down in the email!

Maria

 

From: Frits Hoogland <frits.hoogland@gmail.com>
Date: Wednesday, December 27, 2023 at 10:50 AM
To: "Wilson, Maria Louise (LARC-E301)[RSES]" <m.l.wilson@nasa.gov>
Cc: "pgsql-performance@lists.postgresql.org" <pgsql-performance@lists.postgresql.org>
Subject: [EXTERNAL] Re: Need help with performance tuning pg12 on linux

 

CAUTION: This email originated from outside of NASA.  Please take care when clicking links or opening attachments.  Use the "Report Message" button to report suspicious messages to the NASA SOC.




Hi Maria, could you please run explain analyse for the problem query?

The ‘analyze’ addition will track actual spent time and show statistics to validate the planner’s assumptions.

 

Frits Hoogland

 

 




On 27 Dec 2023, at 16:38, Wilson, Maria Louise (LARC-E301)[RSES] <m.l.wilson@nasa.gov> wrote:

 

Hello folks!

 

I am having a complex query slowing over time increasing in duration.  If anyone has a few cycles that they could lend a hand or just point me in the right direction with this – I would surely appreciate it!  Fairly beefy Linux server with Postgres 12 (latest) – this particular query has been getting slower over time & seemingly slowing everything else down.  The server is dedicated entirely to this particular database.  Let me know if I can provide any additional information!!  Thanks in advance!

 

Here’s my background – Linux RHEL 8 – PostgreSQL 12.17.  – 

MemTotal:       263216840 kB

MemFree:         3728224 kB

MemAvailable:   197186864 kB

Buffers:            6704 kB

Cached:         204995024 kB

SwapCached:        19244 kB

 

free -m

              total        used        free      shared  buff/cache   available

Mem:         257047       51860        3722       10718      201464      192644

Swap:          4095         855        3240

 

Here are a few of the settings in our postgres server:

max_connections = 300                   # (change requires restart)

shared_buffers = 10GB

temp_buffers = 24MB

work_mem = 2GB

maintenance_work_mem = 1GB

 

most everything else is set to the default.

 

The query is complex with several joins:

 

SELECT anon_1.granule_collection_id AS anon_1_granule_collection_id, anon_1.granule_create_date AS anon_1_granule_create_date, anon_1.granule_delete_date AS anon_1_granule_delete_date, ST_AsGeoJSON(anon_1.granule_geography) AS anon_1_granule_geography, ST_AsGeoJSON(anon_1.granule_geometry) AS anon_1_granule_geometry, anon_1.granule_is_active AS anon_1_granule_is_active, anon_1.granule_properties AS anon_1_granule_properties, anon_1.granule_update_date AS anon_1_granule_update_date, anon_1.granule_uuid AS anon_1_granule_uuid, anon_1.granule_visibility_last_update_date AS anon_1_granule_visibility_last_update_date, anon_1.granule_visibility_id AS anon_1_granule_visibility_id, collection_1.id AS collection_1_id, collection_1.entry_id AS collection_1_entry_id, collection_1.short_name AS collection_1_short_name, collection_1.version AS collection_1_version, file_1.id AS file_1_id, file_1.location AS file_1_location, file_1.md5 AS file_1_md5, file_1.name AS file_1_name, file_1.size AS file_1_size, file_1.type AS file_1_type, visibility_1.id AS visibility_1_id, visibility_1.name AS visibility_1_name, visibility_1.value AS visibility_1_value

        FROM (SELECT granule.collection_id AS granule_collection_id, granule.create_date AS granule_create_date, granule.delete_date AS granule_delete_date, granule.geography AS granule_geography, granule.geometry AS granule_geometry, granule.is_active AS granule_is_active, granule.properties AS granule_properties, granule.update_date AS granule_update_date, granule.uuid AS granule_uuid, granule.visibility_last_update_date AS granule_visibility_last_update_date, granule.visibility_id AS granule_visibility_id

        FROM granule JOIN collection ON collection.id = granule.collection_id

        WHERE granule.is_active = true AND (collection.entry_id LIKE 'AJAX_CO2_CH4_1' OR collection.entry_id LIKE 'AJAX_O3_1' OR collection.entry_id LIKE 'AJAX_CH2O_1' OR collection.entry_id LIKE 'AJAX_MMS_1') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, beginning_date_time}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, start_date}') > '2015-10-06T23:59:59+00:00') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, end_date_time}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, end_date}') < '2015-10-09T00:00:00+00:00') ORDER BY granule.uuid

         LIMIT 26) AS anon_1 LEFT OUTER JOIN collection AS collection_1 ON collection_1.id = anon_1.granule_collection_id LEFT OUTER JOIN (granule_file AS granule_file_1 JOIN file AS file_1 ON file_1.id = granule_file_1.file_id) ON anon_1.granule_uuid = granule_file_1.granule_uuid LEFT OUTER JOIN visibility AS visibility_1 ON visibility_1.id = anon_1.granule_visibility_id ORDER BY anon_1.granule_uuid

 

Here’s the explain:

 

 Sort  (cost=10914809.92..10914810.27 rows=141 width=996)

   Sort Key: granule.uuid

   ->  Hash Left Join  (cost=740539.73..10914804.89 rows=141 width=996)

         Hash Cond: (granule.visibility_id = visibility_1.id)

         ->  Hash Right Join  (cost=740537.56..10914731.81 rows=141 width=1725)

               Hash Cond: (granule_file_1.granule_uuid = granule.uuid)

               ->  Hash Join  (cost=644236.90..10734681.93 rows=22332751 width=223)

                     Hash Cond: (file_1.id = granule_file_1.file_id)

                     ->  Seq Scan on file file_1  (cost=0.00..9205050.88 rows=22068888 width=207)

                     ->  Hash  (cost=365077.51..365077.51 rows=22332751 width=20)

                           ->  Seq Scan on granule_file granule_file_1  (cost=0.00..365077.51 rows=22332751 width=20)

               ->  Hash  (cost=96300.33..96300.33 rows=26 width=1518)

                     ->  Nested Loop Left Join  (cost=96092.55..96300.33 rows=26 width=1518)

                           ->  Limit  (cost=96092.27..96092.33 rows=26 width=1462)

                                 ->  Sort  (cost=96092.27..96100.47 rows=3282 width=1462)

                                       Sort Key: granule.uuid

                                       ->  Nested Loop  (cost=0.56..95998.73 rows=3282 width=1462)

                                             ->  Seq Scan on collection  (cost=0.00..3366.24 rows=1 width=4)

                                                   Filter: (((entry_id)::text ~~ 'AJAX_CO2_CH4_1'::text) OR ((entry_id)::text ~~ 'AJAX_O3_1'::text) OR ((entry_id)::text ~~ 'AJAX_CH2O_1'::text) OR ((entry_id)::text ~~ 'AJAX_MMS_1'::text))

                                             ->  Index Scan using ix_granule_collection_id on granule  (cost=0.56..92445.36 rows=18713 width=1462)

                                                   Index Cond: (collection_id = collection.id)

                                                   Filter: (is_active AND (((properties #>> '{temporal_extent,range_date_times,0,beginning_date_time}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,single_d

ate_times,0}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,start_date}'::text[]) > '2015-10-06T23:59:59+00:00'::text)) AND (((properties #>> '{temporal_extent,range_date_times,0,end_

date_time}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,end_date}'::text[])

 < '2015-10-09T00:00:00+00:00'::text)))

                           ->  Index Scan using collection_pkey on collection collection_1  (cost=0.28..7.99 rows=1 width=56)

                                 Index Cond: (id = granule.collection_id)

         ->  Hash  (cost=1.52..1.52 rows=52 width=16)

               ->  Seq Scan on visibility visibility_1  (cost=0.00..1.52 rows=52 width=16)

 

 

Heres a bit about the tables – 

 

Granule

Collection

Granule_file

Visibility

 

Granule:

public | granule                            | table | ims_api_writer | 36 GB   | 

 

ims_api=# \d+ granule

                                                       Table "public.granule"

           Column            |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description 

-----------------------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------

 collection_id               | integer                     |           | not null |         | plain    |              | 

 create_date                 | timestamp without time zone |           | not null |         | plain    |              | 

 delete_date                 | timestamp without time zone |           |          |         | plain    |              | 

 geometry                    | geometry(Geometry,4326)     |           |          |         | main     |              | 

 is_active                   | boolean                     |           |          |         | plain    |              | 

 properties                  | jsonb                       |           |          |         | extended |              | 

 update_date                 | timestamp without time zone |           | not null |         | plain    |              | 

 uuid                        | uuid                        |           | not null |         | plain    |              | 

 visibility_id               | integer                     |           | not null |         | plain    |              | 

 geography                   | geography(Geometry,4326)    |           |          |         | main     |              | 

 visibility_last_update_date | timestamp without time zone |           |          |         | plain    |              | 

Indexes:

    "granule_pkey" PRIMARY KEY, btree (uuid)

    "granule_is_active_idx" btree (is_active)

    "granule_properties_producer_id_idx" btree ((properties ->> 'producer_granule_id'::text))

    "granule_update_date_idx" btree (update_date)

    "idx_granule_geometry" gist (geometry)

    "ix_granule_collection_id" btree (collection_id)

Foreign-key constraints:

    "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

Referenced by:

    TABLE "granule_file" CONSTRAINT "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

Triggers:

    granule_temporal_range_trigger AFTER INSERT OR DELETE OR UPDATE ON granule FOR EACH ROW EXECUTE FUNCTION sync_granule_temporal_range()

Access method: heap

 

Collection:

public | collection                         | table | ims_api_writer | 39 MB   | 

 

ims_api=# \d collection

                                                 Table "public.collection"

            Column            |            Type             | Collation | Nullable |                Default                 

------------------------------+-----------------------------+-----------+----------+----------------------------------------

 id                           | integer                     |           | not null | nextval('collection_id_seq'::regclass)

 access_constraints           | text                        |           |          | 

 additional_attributes        | jsonb                       |           |          | 

 ancillary_keywords           | character varying(160)[]    |           |          | 

 create_date                  | timestamp without time zone |           | not null | 

 dataset_language             | character varying(80)[]     |           |          | 

 dataset_progress             | text                        |           |          | 

 data_resolutions             | jsonb                       |           |          | 

 dataset_citation             | jsonb                       |           |          | 

 delete_date                  | timestamp without time zone |           |          | 

 distribution                 | jsonb                       |           |          | 

 doi                          | character varying(220)      |           |          | 

 entry_id                     | character varying(80)       |           | not null | 

 entry_title                  | character varying(1030)     |           |          | 

 geometry                     | geometry(Geometry,4326)     |           |          | 

 is_active                    | boolean                     |           | not null | 

 iso_topic_categories         | character varying[]         |           |          | 

 last_update_date             | timestamp without time zone |           | not null | 

 locations                    | jsonb                       |           |          | 

 long_name                    | character varying(1024)     |           |          | 

 metadata_associations        | jsonb                       |           |          | 

 metadata_dates               | jsonb                       |           |          | 

 personnel                    | jsonb                       |           |          | 

 platforms                    | jsonb                       |           |          | 

 processing_level_id          | integer                     |           |          | 

 product_flag                 | text                        |           |          | 

 project_id                   | integer                     |           |          | 

 properties                   | jsonb                       |           |          | 

 quality                      | jsonb                       |           |          | 

 references                   | character varying(12000)[]  |           |          | 

 related_urls                 | jsonb                       |           |          | 

 summary                      | jsonb                       |           |          | 

 short_name                   | character varying(80)       |           |          | 

 temporal_extents             | jsonb                       |           |          | 

 version                      | character varying(80)       |           |          | 

 use_constraints              | jsonb                       |           |          | 

 version_description          | text                        |           |          | 

 visibility_id                | integer                     |           | not null | 

 world_date                   | timestamp without time zone |           |          | 

 tiling_identification_system | jsonb                       |           |          | 

 collection_data_type         | text                        |           |          | 

 standard_product             | boolean                     |           | not null | false

Indexes:

    "collection_pkey" PRIMARY KEY, btree (id)

    "collection_entry_id_key" UNIQUE CONSTRAINT, btree (entry_id)

    "idx_collection_geometry" gist (geometry)

Foreign-key constraints:

    "collection_processing_level_id_fkey" FOREIGN KEY (processing_level_id) REFERENCES processing_level(id)

    "collection_project_id_fkey" FOREIGN KEY (project_id) REFERENCES project(id)

    "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

Referenced by:

    TABLE "collection_organization" CONSTRAINT "collection_organization_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "collection_science_keyword" CONSTRAINT "collection_science_keyword_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "collection_spatial_processing_hint" CONSTRAINT "collection_spatial_processing_hint_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "granule" CONSTRAINT "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

    TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

 

 

Granule_file:

 public | granule_file                       | table | ims_api_writer | 1108 MB | 

 

\d granule_file

               Table "public.granule_file"

    Column    |  Type   | Collation | Nullable | Default 

--------------+---------+-----------+----------+---------

 granule_uuid | uuid    |           |          | 

 file_id      | integer |           |          | 

Foreign-key constraints:

    "granule_file_file_id_fkey" FOREIGN KEY (file_id) REFERENCES file(id)

    "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

 

 

Visibility:

public | visibility                         | table | ims_api_writer | 40 kB   | 

 

\d visibility

                                   Table "public.visibility"

 Column |         Type          | Collation | Nullable |                Default                 

--------+-----------------------+-----------+----------+----------------------------------------

 id     | integer               |           | not null | nextval('visibility_id_seq'::regclass)

 name   | character varying(80) |           | not null | 

 value  | integer               |           | not null | 

Indexes:

    "visibility_pkey" PRIMARY KEY, btree (id)

    "visibility_name_key" UNIQUE CONSTRAINT, btree (name)

    "visibility_value_key" UNIQUE CONSTRAINT, btree (value)

Referenced by:

    TABLE "collection" CONSTRAINT "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

    TABLE "granule" CONSTRAINT "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

 

 

 

 

Thanks for the help!

 

Maria Wilson

Nasa/Langley Research Center

Hampton, Virginia USA

 

Re: [EXTERNAL] Need help with performance tuning pg12 on linux

От
Matheus de Oliveira
Дата:
               ->  Hash Join  (cost=644250.54..10734700.30 rows=22333224 width=223) (actual time=7864.023..44546.392 rows=22325462 loops=1)
                     Hash Cond: (file_1.id = granule_file_1.file_id)
                     Buffers: shared hit=780882 read=8345236
                     ->  Seq Scan on file file_1  (cost=0.00..9205050.88 rows=22068888 width=207) (actual time=402.706..25222.525 rows=22057988 loops=1)
                           Buffers: shared hit=639126 read=8345236
                     ->  Hash  (cost=365085.24..365085.24 rows=22333224 width=20) (actual time=7288.228..7288.235 rows=22325462 loops=1)
                           Buckets: 33554432  Batches: 1  Memory Usage: 1391822kB
                           Buffers: shared hit=141753
                           ->  Seq Scan on granule_file granule_file_1  (cost=0.00..365085.24 rows=22333224 width=20) (actual time=0.030..2151.380 rows=22325462 loops=1)
                                 Buffers: shared hit=141753


This part above is the most expensive so far, and taking a look at your `granule_file` table on the first message, it has no indexes nor constraints, which certainly looks like a mistake. I'd start optimizing this, you could add an index on it, but seems that you need a primary key on both columns of this (junction?) table:

    ALTER TABLE granule_file ADD PRIMARY KEY (granule_uuid, file_id);

There are certainly more things to optimize on this query, but I prefer doing one thing at a time. Could you try with the PK and send the EXPLAIN ANALYZE of the query again after that?

Best regards,
Matheus de Oliveira

Re: [EXTERNAL] Need help with performance tuning pg12 on linux

От
"Wilson, Maria Louise (LARC-E301)[RSES]"
Дата:

Thanks for the reply!!  Having some issues due to nulls….  Any other thoughts?

 

i=# ALTER TABLE granule_file ADD PRIMARY KEY (granule_uuid, file_id);

ERROR:  column "granule_uuid" contains null values

 

 

From: Matheus de Oliveira <matioli.matheus@gmail.com>
Date: Wednesday, December 27, 2023 at 11:36 AM
To: "Wilson, Maria Louise (LARC-E301)[RSES]" <m.l.wilson@nasa.gov>
Cc: Frits Hoogland <frits.hoogland@gmail.com>, "pgsql-performance@lists.postgresql.org" <pgsql-performance@lists.postgresql.org>
Subject: Re: [EXTERNAL] Need help with performance tuning pg12 on linux

 

CAUTION: This email originated from outside of NASA.  Please take care when clicking links or opening attachments.  Use the "Report Message" button to report suspicious messages to the NASA SOC.



               ->  Hash Join  (cost=644250.54..10734700.30 rows=22333224 width=223) (actual time=7864.023..44546.392 rows=22325462 loops=1)
                     Hash Cond: (file_1.id = granule_file_1.file_id)
                     Buffers: shared hit=780882 read=8345236
                     ->  Seq Scan on file file_1  (cost=0.00..9205050.88 rows=22068888 width=207) (actual time=402.706..25222.525 rows=22057988 loops=1)
                           Buffers: shared hit=639126 read=8345236
                     ->  Hash  (cost=365085.24..365085.24 rows=22333224 width=20) (actual time=7288.228..7288.235 rows=22325462 loops=1)
                           Buckets: 33554432  Batches: 1  Memory Usage: 1391822kB
                           Buffers: shared hit=141753
                           ->  Seq Scan on granule_file granule_file_1  (cost=0.00..365085.24 rows=22333224 width=20) (actual time=0.030..2151.380 rows=22325462 loops=1)
                                 Buffers: shared hit=141753

 

This part above is the most expensive so far, and taking a look at your `granule_file` table on the first message, it has no indexes nor constraints, which certainly looks like a mistake. I'd start optimizing this, you could add an index on it, but seems that you need a primary key on both columns of this (junction?) table:

    ALTER TABLE granule_file ADD PRIMARY KEY (granule_uuid, file_id);

 

There are certainly more things to optimize on this query, but I prefer doing one thing at a time. Could you try with the PK and send the EXPLAIN ANALYZE of the query again after that?

 

Best regards,

Matheus de Oliveira

Re: [EXTERNAL] Need help with performance tuning pg12 on linux

От
Ranier Vilela
Дата:
Em qua., 27 de dez. de 2023 às 14:11, Wilson, Maria Louise (LARC-E301)[RSES] <m.l.wilson@nasa.gov> escreveu:

Thanks for the reply!!  Having some issues due to nulls….  Any other thoughts?

 

i=# ALTER TABLE granule_file ADD PRIMARY KEY (granule_uuid, file_id);

ERROR:  column "granule_uuid" contains null values

Well, uuid is a bad datatype for primary keys.
If possible in the long run, consider replacing them with bigint.

Can you try a index:
CREATE INDEX granule_file_file_id_key ON granule_file USING btree(file_id);

Although granule_file has an index as a foreign key, it seems to me that it is not being considered.

My 2cents.

Best regards,
Ranier Vilela

Re: [EXTERNAL] Need help with performance tuning pg12 on linux

От
"Wilson, Maria Louise (LARC-E301)[RSES]"
Дата:

Here is the before analyze :

 

Sort  (cost=3208325.03..3208325.33 rows=117 width=997) (actual time=56683.259..56683.307 rows=4 loops=1)

   Sort Key: granule.uuid

   Sort Method: quicksort  Memory: 32kB

   Buffers: shared hit=40 read=795724, temp read=630171 written=630171

   ->  Hash Left Join  (cost=1844145.13..3208321.02 rows=117 width=997) (actual time=56683.080..56683.184 rows=4 loops=1)

         Hash Cond: (granule.visibility_id = visibility_1.id)

         Buffers: shared hit=37 read=795724, temp read=630171 written=630171

         ->  Hash Right Join  (cost=1844142.96..3208260.02 rows=117 width=1678) (actual time=56682.840..56682.891 rows=4 loops=1)

               Hash Cond: (granule_file_1.granule_uuid = granule.uuid)

               Buffers: shared hit=36 read=795724, temp read=630171 written=630171

               ->  Hash Join  (cost=1752547.97..3034700.90 rows=21856786 width=224) (actual time=21966.799..55011.964 rows=21855206 loops=1)

                     Hash Cond: (granule_file_1.file_id = file_1.id)

                     Buffers: shared hit=2 read=794153, temp read=630171 written=630171

                     ->  Seq Scan on granule_file granule_file_1  (cost=0.00..357270.86 rows=21856786 width=20) (actual time=0.334..3267.188 rows=21855206 loops=1)

                           Buffers: shared read=138703

                     ->  Hash  (cost=871329.32..871329.32 rows=21587732 width=208) (actual time=13425.791..13425.795 rows=21587732 loops=1)

                           Buckets: 8388608  Batches: 8  Memory Usage: 710896kB

                           Buffers: shared hit=2 read=655450, temp written=537221

                           ->  Seq Scan on file file_1  (cost=0.00..871329.32 rows=21587732 width=208) (actual time=0.277..5520.726 rows=21587732 loops=1)

                                 Buffers: shared hit=2 read=655450

               ->  Hash  (cost=91594.67..91594.67 rows=26 width=1470) (actual time=189.702..189.736 rows=4 loops=1)

                     Buckets: 1024  Batches: 1  Memory Usage: 13kB

                     Buffers: shared hit=34 read=1571

                     ->  Nested Loop Left Join  (cost=91434.88..91594.67 rows=26 width=1470) (actual time=189.653..189.704 rows=4 loops=1)

                           Buffers: shared hit=34 read=1571

                           ->  Limit  (cost=91434.60..91434.67 rows=26 width=1414) (actual time=189.444..189.473 rows=4 loops=1)

                                 Buffers: shared hit=23 read=1570

                                 ->  Sort  (cost=91434.60..91446.86 rows=4903 width=1414) (actual time=189.441..189.462 rows=4 loops=1)

                                       Sort Key: granule.uuid

                                       Sort Method: quicksort  Memory: 32kB

                                       Buffers: shared hit=23 read=1570

                                       ->  Nested Loop  (cost=0.56..91294.86 rows=4903 width=1414) (actual time=22.534..189.403 rows=4 loops=1)

                                             Buffers: shared hit=23 read=1570

                                             ->  Seq Scan on collection  (cost=0.00..653.62 rows=1 width=4) (actual time=3.706..14.783 rows=4 loops=1)

                                                   Filter: (((entry_id)::text ~~ 'AJAX_CO2_CH4_1'::text) OR ((entry_id)::text ~~ 'AJAX_O3_1'::text) OR ((entry_id)::text ~~ 'AJAX_CH2O_1'::text) OR ((entry_id)::text ~~ 'AJAX_MMS_1'::text))

                                                   Rows Removed by Filter: 2477

                                                   Buffers: shared hit=2 read=602

                                             ->  Index Scan using ix_granule_collection_id on granule  (cost=0.56..90455.52 rows=18572 width=1414) (actual time=21.662..43.645 rows=1 loops=4)

                                                   Index Cond: (collection_id = collection.id)

                                                   Filter: (is_active AND (((properties #>> '{temporal_extent,range_date_times,0,beginning_date_time}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}

'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,start_date}'::text[]) > '2015-10-06T23:59:59+00:00'::text)) AND (((properties #>> '{temporal_extent,range_date_times,0,end_date_time}'::text[]) < '

2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,end_date}'::text[]) < '2015-10-09T00:00:00+00:00'::text

)))

                                                   Rows Removed by Filter: 243

                                                   Buffers: shared hit=21 read=968

                           ->  Index Scan using collection_pkey on collection collection_1  (cost=0.28..6.14 rows=1 width=56) (actual time=0.052..0.052 rows=1 loops=4)

                                 Index Cond: (id = granule.collection_id)

                                 Buffers: shared hit=11 read=1

         ->  Hash  (cost=1.52..1.52 rows=52 width=16) (actual time=0.054..0.054 rows=52 loops=1)

               Buckets: 1024  Batches: 1  Memory Usage: 11kB

               Buffers: shared hit=1

               ->  Seq Scan on visibility visibility_1  (cost=0.00..1.52 rows=52 width=16) (actual time=0.032..0.036 rows=52 loops=1)

                     Buffers: shared hit=1

 Planning Time: 14.580 ms

 Execution Time: 56764.259 ms

(52 rows)

 

 

Then added the index:

CREATE INDEX granule_file_file_id_key ON granule_file USING btree(file_id);

CREATE INDEX

 

vacuum (analyze, verbose) granule_file;

 

 

& heres the new analyze:

 

 Sort  (cost=3208262.52..3208262.79 rows=105 width=997) (actual time=64720.414..64720.435 rows=4 loops=1)

   Sort Key: granule.uuid

   Sort Method: quicksort  Memory: 32kB

   Buffers: shared hit=140349 read=655418, temp read=630171 written=630171

   ->  Hash Left Join  (cost=1844145.13..3208259.00 rows=105 width=997) (actual time=64720.258..64720.325 rows=4 loops=1)

         Hash Cond: (granule.visibility_id = visibility_1.id)

         Buffers: shared hit=140346 read=655418, temp read=630171 written=630171

         ->  Hash Right Join  (cost=1844142.96..3208204.03 rows=105 width=1678) (actual time=64720.083..64720.105 rows=4 loops=1)

               Hash Cond: (granule_file_1.granule_uuid = granule.uuid)

               Buffers: shared hit=140345 read=655418, temp read=630171 written=630171

               ->  Hash Join  (cost=1752547.97..3034652.34 rows=21854840 width=224) (actual time=11945.807..63203.012 rows=21855206 loops=1)

                     Hash Cond: (granule_file_1.file_id = file_1.id)

                     Buffers: shared hit=138740 read=655418, temp read=630171 written=630171

                     ->  Seq Scan on granule_file granule_file_1  (cost=0.00..357251.40 rows=21854840 width=20) (actual time=0.017..3103.893 rows=21855206 loops=1)

                           Buffers: shared hit=138703

                     ->  Hash  (cost=871329.32..871329.32 rows=21587732 width=208) (actual time=11891.143..11891.146 rows=21587732 loops=1)

                           Buckets: 8388608  Batches: 8  Memory Usage: 710896kB

                           Buffers: shared hit=34 read=655418, temp written=537221

                           ->  Seq Scan on file file_1  (cost=0.00..871329.32 rows=21587732 width=208) (actual time=0.081..3996.438 rows=21587732 loops=1)

                                 Buffers: shared hit=34 read=655418

               ->  Hash  (cost=91594.67..91594.67 rows=26 width=1470) (actual time=19.728..19.740 rows=4 loops=1)

                     Buckets: 1024  Batches: 1  Memory Usage: 13kB

                     Buffers: shared hit=1605

                     ->  Nested Loop Left Join  (cost=91434.88..91594.67 rows=26 width=1470) (actual time=19.684..19.708 rows=4 loops=1)

                           Buffers: shared hit=1605

                           ->  Limit  (cost=91434.60..91434.67 rows=26 width=1414) (actual time=19.650..19.660 rows=4 loops=1)

                                 Buffers: shared hit=1593

                                 ->  Sort  (cost=91434.60..91446.86 rows=4903 width=1414) (actual time=19.648..19.656 rows=4 loops=1)

                                       Sort Key: granule.uuid

                                       Sort Method: quicksort  Memory: 32kB

                                       Buffers: shared hit=1593

                                       ->  Nested Loop  (cost=0.56..91294.86 rows=4903 width=1414) (actual time=2.765..19.609 rows=4 loops=1)

                                             Buffers: shared hit=1593

                                             ->  Seq Scan on collection  (cost=0.00..653.62 rows=1 width=4) (actual time=1.789..8.057 rows=4 loops=1)

                                                   Filter: (((entry_id)::text ~~ 'AJAX_CO2_CH4_1'::text) OR ((entry_id)::text ~~ 'AJAX_O3_1'::text) OR ((entry_id)::text ~~ 'AJAX_CH2O_1'::text) OR ((entry_id)::text ~~ 'AJAX_MMS_1'::text))

                                                   Rows Removed by Filter: 2477

                                                   Buffers: shared hit=604

                                             ->  Index Scan using ix_granule_collection_id on granule  (cost=0.56..90455.52 rows=18572 width=1414) (actual time=1.311..2.881 rows=1 loops=4)

                                                   Index Cond: (collection_id = collection.id)

                                                   Filter: (is_active AND (((properties #>> '{temporal_extent,range_date_times,0,beginning_date_time}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}

'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,start_date}'::text[]) > '2015-10-06T23:59:59+00:00'::text)) AND (((properties #>> '{temporal_extent,range_date_times,0,end_date_time}'::text[]) < '

2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,end_date}'::text[]) < '2015-10-09T00:00:00+00:00'::text

)))

                                                   Rows Removed by Filter: 243

                                                   Buffers: shared hit=989

                           ->  Index Scan using collection_pkey on collection collection_1  (cost=0.28..6.14 rows=1 width=56) (actual time=0.008..0.008 rows=1 loops=4)

                                 Index Cond: (id = granule.collection_id)

                                 Buffers: shared hit=12

         ->  Hash  (cost=1.52..1.52 rows=52 width=16) (actual time=0.045..0.045 rows=52 loops=1)

               Buckets: 1024  Batches: 1  Memory Usage: 11kB

               Buffers: shared hit=1

               ->  Seq Scan on visibility visibility_1  (cost=0.00..1.52 rows=52 width=16) (actual time=0.026..0.029 rows=52 loops=1)

                     Buffers: shared hit=1

 Planning Time: 7.354 ms

 Execution Time: 64789.927 ms

(52 rows)

 

 

From: Ranier Vilela <ranier.vf@gmail.com>
Date: Wednesday, December 27, 2023 at 12:23 PM
To: "Wilson, Maria Louise (LARC-E301)[RSES]" <m.l.wilson@nasa.gov>
Cc: Matheus de Oliveira <matioli.matheus@gmail.com>, Frits Hoogland <frits.hoogland@gmail.com>, "pgsql-performance@lists.postgresql.org" <pgsql-performance@lists.postgresql.org>
Subject: Re: [EXTERNAL] Need help with performance tuning pg12 on linux

 

CAUTION: This email originated from outside of NASA.  Please take care when clicking links or opening attachments.  Use the "Report Message" button to report suspicious messages to the NASA SOC.



Em qua., 27 de dez. de 2023 às 14:11, Wilson, Maria Louise (LARC-E301)[RSES] <m.l.wilson@nasa.gov> escreveu:

Thanks for the reply!!  Having some issues due to nulls….  Any other thoughts?

 

i=# ALTER TABLE granule_file ADD PRIMARY KEY (granule_uuid, file_id);

ERROR:  column "granule_uuid" contains null values

Well, uuid is a bad datatype for primary keys.

If possible in the long run, consider replacing them with bigint.

 

Can you try a index:

CREATE INDEX granule_file_file_id_key ON granule_file USING btree(file_id);

 

Although granule_file has an index as a foreign key, it seems to me that it is not being considered.

 

My 2cents.

 

Best regards,

Ranier Vilela

Re: [EXTERNAL] Need help with performance tuning pg12 on linux

От
Matheus de Oliveira
Дата:


On Wed, Dec 27, 2023 at 2:11 PM Wilson, Maria Louise (LARC-E301)[RSES] <m.l.wilson@nasa.gov> wrote:

Thanks for the reply!!  Having some issues due to nulls….  Any other thoughts?

 

i=# ALTER TABLE granule_file ADD PRIMARY KEY (granule_uuid, file_id);

ERROR:  column "granule_uuid" contains null values

 


Seems like an odd design for your table. Check if those rows with null value make any sense for your design.

In any case, for performance, you can try a plain index:

    CREATE INDEX ON granule_file (granule_uuid);

Since you are filtering for granule_uuid first, an index starting with this column seems to make more sense (that is why I made the PK starting with it before).

A composite index is not really necessary, but could help if you get an index-only scan, if you wanna try:

    CREATE INDEX ON granule_file (granule_uuid, file_id);

Best regards,
--
Matheus de Oliveira


Re: [EXTERNAL] Need help with performance tuning pg12 on linux

От
Matheus de Oliveira
Дата:

On Wed, Dec 27, 2023 at 2:23 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
...

Although granule_file has an index as a foreign key, it seems to me that it is not being considered.

You seem to be mistaken here, a foreign key does not automatically create an index on the columns, you need to do it by yourself you really want that.

Best regards,
--
Matheus de Oliveira