Re: [EXTERNAL] Need help with performance tuning pg12 on linux

Поиск
Список
Период
Сортировка
От Wilson, Maria Louise (LARC-E301)[RSES]
Тема Re: [EXTERNAL] Need help with performance tuning pg12 on linux
Дата
Msg-id 2C02F3A1-4125-47D8-8E9D-ED078AB006F7@nasa.gov
обсуждение исходный текст
Ответ на Re: [EXTERNAL] Need help with performance tuning pg12 on linux  (Matheus de Oliveira <matioli.matheus@gmail.com>)
Ответы Re: [EXTERNAL] Need help with performance tuning pg12 on linux  (Ranier Vilela <ranier.vf@gmail.com>)
Re: [EXTERNAL] Need help with performance tuning pg12 on linux  (Matheus de Oliveira <matioli.matheus@gmail.com>)
Список pgsql-performance

Thanks for the reply!!  Having some issues due to nulls….  Any other thoughts?

 

i=# ALTER TABLE granule_file ADD PRIMARY KEY (granule_uuid, file_id);

ERROR:  column "granule_uuid" contains null values

 

 

From: Matheus de Oliveira <matioli.matheus@gmail.com>
Date: Wednesday, December 27, 2023 at 11:36 AM
To: "Wilson, Maria Louise (LARC-E301)[RSES]" <m.l.wilson@nasa.gov>
Cc: Frits Hoogland <frits.hoogland@gmail.com>, "pgsql-performance@lists.postgresql.org" <pgsql-performance@lists.postgresql.org>
Subject: Re: [EXTERNAL] Need help with performance tuning pg12 on linux

 

CAUTION: This email originated from outside of NASA.  Please take care when clicking links or opening attachments.  Use the "Report Message" button to report suspicious messages to the NASA SOC.



               ->  Hash Join  (cost=644250.54..10734700.30 rows=22333224 width=223) (actual time=7864.023..44546.392 rows=22325462 loops=1)
                     Hash Cond: (file_1.id = granule_file_1.file_id)
                     Buffers: shared hit=780882 read=8345236
                     ->  Seq Scan on file file_1  (cost=0.00..9205050.88 rows=22068888 width=207) (actual time=402.706..25222.525 rows=22057988 loops=1)
                           Buffers: shared hit=639126 read=8345236
                     ->  Hash  (cost=365085.24..365085.24 rows=22333224 width=20) (actual time=7288.228..7288.235 rows=22325462 loops=1)
                           Buckets: 33554432  Batches: 1  Memory Usage: 1391822kB
                           Buffers: shared hit=141753
                           ->  Seq Scan on granule_file granule_file_1  (cost=0.00..365085.24 rows=22333224 width=20) (actual time=0.030..2151.380 rows=22325462 loops=1)
                                 Buffers: shared hit=141753

 

This part above is the most expensive so far, and taking a look at your `granule_file` table on the first message, it has no indexes nor constraints, which certainly looks like a mistake. I'd start optimizing this, you could add an index on it, but seems that you need a primary key on both columns of this (junction?) table:

    ALTER TABLE granule_file ADD PRIMARY KEY (granule_uuid, file_id);

 

There are certainly more things to optimize on this query, but I prefer doing one thing at a time. Could you try with the PK and send the EXPLAIN ANALYZE of the query again after that?

 

Best regards,

Matheus de Oliveira

В списке pgsql-performance по дате отправления:

Предыдущее
От: Matheus de Oliveira
Дата:
Сообщение: Re: [EXTERNAL] Need help with performance tuning pg12 on linux
Следующее
От: Ranier Vilela
Дата:
Сообщение: Re: [EXTERNAL] Need help with performance tuning pg12 on linux