RE: Query Performance after pg_restore

Поиск
Список
Период
Сортировка
От Murthy Nunna
Тема RE: Query Performance after pg_restore
Дата
Msg-id DM8PR09MB6677B1633B7059F3B4C44593B899A@DM8PR09MB6677.namprd09.prod.outlook.com
обсуждение исходный текст
Ответ на Re: Query Performance after pg_restore  (Laurenz Albe <laurenz.albe@cybertec.at>)
Список pgsql-admin
Thank you all for responding to my performance issue.
With great respect to all, based on the responses I received so far, it looks like the responses are going on a tangent
(exceptfor Tom Lane!)
 

As soon as I say query performance it is easy to jump into explain-plan land. No, this is not that! I don't run "select
count(*)of tables" every day. I ran this only to show the performance issue and only for comparison purposes.
 

Now that my test is completed, I would like to summarize it here:
Note: This particular issue will go unnoticed and a non-issue on small databases. I am dealing with 20TB database here.
So,it shows.
 

1) Run select count(*) of all tables in the database. Ran for about 4 hours.
2) Perform pg_dump followed by pg_restore (database is 20TB in size. Not too many dead tuples in the original database,
solet us not go there.)
 
3) Run select count(*) of all tables in the database. Ran for 32 hours! Yikes.
4) Run analyze (analyze verbose) on the database
5) Run select count(*) of all tables in the database. Ran for 32 hours! Yikes!!
6) Run vacuumdb -a -z
7) Run select count(*) of all tables in the database. Ran for 1 hour 40 minutes. Excellent!

Tom - Thank you for the hint about "hint bits". I also reviewed" https://wiki.postgresql.org/wiki/Hint_Bits" which says
"Aplain SELECT, count(*), or VACUUM on the entire table will check every tuple for visibility and set its hint bits."
Mypoint is, select count(*) is NOT setting whatever it needs to. If it did, step 5) above should have completed lot
quicker.IMHO, "vacuumdb" is fixing this. May be ANALYZE is setting these bits on the sample tuples it looks at. But
thereis no option to run full analyze on all tuples.
 

If I don't run vacuumdb, I have a hunch that the performance will improve over the time as the users or autovacuum hits
thetuples over time and does the necessary magic on the hint bits.
 

IMHO, a prudent fix would be to incorporate the fix in ANALYZE. The way it is, the fix happened to be a side benefit of
vacuumdbas it scans every page. At the least, I feel, PG_RESTORE document should be changed. Currently it says " Once
restored,it is wise to run ANALYZE on each restored table so the optimizer has useful statistics; see Section 25.1.3
andSection 25.1.6 for more information.". ANALYZE should be replaced with "VACUUM".
 

Thanks,
Murthy


-----Original Message-----
From: Laurenz Albe <laurenz.albe@cybertec.at> 
Sent: Sunday, December 24, 2023 7:06 PM
To: Murthy Nunna <mnunna@fnal.gov>; pgsql-admin@postgresql.org
Subject: Re: Query Performance after pg_restore

On Sun, 2023-12-24 at 15:53 +0000, Murthy Nunna wrote:
> I did pg_dump of a ~20TB database followed by pg_restore. I find simple queries like select
> count(*) running slow. I did a select count(*) on all tables before pg_dump which took ~4 hours.
> After pg_restore, same thing took 32 hours.

SELECT count(*) is always slow.

> By the way, there is no change in postgres versions. It is 14.4 before and after pg_restore.

You should compare the execution plans of the queries on the old and the new server.

Besides, if you are running lots of count(*) queries, you are doing something wrong:

https://urldefense.proofpoint.com/v2/url?u=https-3A__www.cybertec-2Dpostgresql.com_en_pagination-2Dproblem-2Dtotal-2Dresult-2Dcount_-23total-2Dcount&d=DwIFaQ&c=gRgGjJ3BkIsb5y6s49QqsA&r=0wrsmPzpZSao0v32yCcG2Q&m=sHvfp1ZxGpYQrS61b2zVlZ3xkkLw3twcNEUvtXlG-TMSgRbIHTCvYREx111tm8ws&s=WEmJg5zO2-3jDk1DMz28MdxVYpXJ6cygSzz-q3DRPN4&e=


Yours,
Laurenz Albe

В списке pgsql-admin по дате отправления:

Предыдущее
От: Laurenz Albe
Дата:
Сообщение: Re: Query Performance after pg_restore
Следующее
От: M Sarwar
Дата:
Сообщение: Re: Login script