Обсуждение: Query Performance after pg_restore

Поиск
Список
Период
Сортировка

Query Performance after pg_restore

От
Murthy Nunna
Дата:

Hi,

 

I did pg_dump of a ~20TB database followed by pg_restore. I find simple queries like select count(*) running slow. I did a select count(*) on all tables before pg_dump which took ~4 hours. After pg_restore, same thing took 32 hours.

 

I ran “analyze verbose ;” on the database, but it did not help. It ran quick in 1.5 hours, but as you are aware it does sampling.

 

I am now running “vacuumdb -p <port-number> -a -z -j 10 -v”, but I do not know if it is going to help.

 

I am not sure if above vacuumdb helps, but if it helps what is it vacuuming in a pristine database? If it is doing some special “analyze”, what is it? Is there a way to run “full” analyze without sampling and without vacuuming?

 

By the way, there is no change in postgres versions. It is 14.4 before and after pg_restore.

 

Thanks!

 

 

 

Re: Query Performance after pg_restore

От
Tom Lane
Дата:
Murthy Nunna <mnunna@fnal.gov> writes:
> I did pg_dump of a ~20TB database followed by pg_restore. I find simple queries like select count(*) running slow. I
dida select count(*) on all tables before pg_dump which took ~4 hours. After pg_restore, same thing took 32 hours. 

My bet is that that was setting commit hint bits, and hence incurring
a lot of writes.  If the data is reasonably stable that's a one-time
expense.

            regards, tom lane



RE: Query Performance after pg_restore

От
Murthy Nunna
Дата:
Thanks, Tom.

I am wondering if there is way to obtain same performance as before. Like restoring performance statistics?!
Simple analyze is not helping. And Vacuum is a overkill where there are no dead tuples.

I am not setting any hint bits... I can verify this in new and old if you can provide a query or some method to verify
this...I still have original copy of same database on a different server for comparison. 



-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Sunday, December 24, 2023 10:12 AM
To: Murthy Nunna <mnunna@fnal.gov>
Cc: pgsql-admin@postgresql.org
Subject: Re: Query Performance after pg_restore

Murthy Nunna <mnunna@fnal.gov> writes:
> I did pg_dump of a ~20TB database followed by pg_restore. I find simple queries like select count(*) running slow. I
dida select count(*) on all tables before pg_dump which took ~4 hours. After pg_restore, same thing took 32 hours. 

My bet is that that was setting commit hint bits, and hence incurring a lot of writes.  If the data is reasonably
stablethat's a one-time expense. 

            regards, tom lane



Re: Query Performance after pg_restore

От
Ron Johnson
Дата:
I'd run EXPLAIN <query> on each server.

On Sun, Dec 24, 2023 at 12:00 PM Murthy Nunna <mnunna@fnal.gov> wrote:
Thanks, Tom.

I am wondering if there is way to obtain same performance as before. Like restoring performance statistics?!
Simple analyze is not helping. And Vacuum is a overkill where there are no dead tuples.

I am not setting any hint bits... I can verify this in new and old if you can provide a query or some method to verify this... I still have original copy of same database on a different server for comparison.



-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Sunday, December 24, 2023 10:12 AM
To: Murthy Nunna <mnunna@fnal.gov>
Cc: pgsql-admin@postgresql.org
Subject: Re: Query Performance after pg_restore

Murthy Nunna <mnunna@fnal.gov> writes:
> I did pg_dump of a ~20TB database followed by pg_restore. I find simple queries like select count(*) running slow. I did a select count(*) on all tables before pg_dump which took ~4 hours. After pg_restore, same thing took 32 hours.

My bet is that that was setting commit hint bits, and hence incurring a lot of writes.  If the data is reasonably stable that's a one-time expense.

                        regards, tom lane


Re: Query Performance after pg_restore

От
Laurenz Albe
Дата:
On Sun, 2023-12-24 at 15:53 +0000, Murthy Nunna wrote:
> I did pg_dump of a ~20TB database followed by pg_restore. I find simple queries like select
> count(*) running slow. I did a select count(*) on all tables before pg_dump which took ~4 hours.
> After pg_restore, same thing took 32 hours.

SELECT count(*) is always slow.

> By the way, there is no change in postgres versions. It is 14.4 before and after pg_restore.

You should compare the execution plans of the queries on the old and the new server.

Besides, if you are running lots of count(*) queries, you are doing something wrong:
https://www.cybertec-postgresql.com/en/pagination-problem-total-result-count/#total-count

Yours,
Laurenz Albe



RE: Query Performance after pg_restore

От
Murthy Nunna
Дата:
Thank you all for responding to my performance issue.
With great respect to all, based on the responses I received so far, it looks like the responses are going on a tangent
(exceptfor Tom Lane!)
 

As soon as I say query performance it is easy to jump into explain-plan land. No, this is not that! I don't run "select
count(*)of tables" every day. I ran this only to show the performance issue and only for comparison purposes.
 

Now that my test is completed, I would like to summarize it here:
Note: This particular issue will go unnoticed and a non-issue on small databases. I am dealing with 20TB database here.
So,it shows.
 

1) Run select count(*) of all tables in the database. Ran for about 4 hours.
2) Perform pg_dump followed by pg_restore (database is 20TB in size. Not too many dead tuples in the original database,
solet us not go there.)
 
3) Run select count(*) of all tables in the database. Ran for 32 hours! Yikes.
4) Run analyze (analyze verbose) on the database
5) Run select count(*) of all tables in the database. Ran for 32 hours! Yikes!!
6) Run vacuumdb -a -z
7) Run select count(*) of all tables in the database. Ran for 1 hour 40 minutes. Excellent!

Tom - Thank you for the hint about "hint bits". I also reviewed" https://wiki.postgresql.org/wiki/Hint_Bits" which says
"Aplain SELECT, count(*), or VACUUM on the entire table will check every tuple for visibility and set its hint bits."
Mypoint is, select count(*) is NOT setting whatever it needs to. If it did, step 5) above should have completed lot
quicker.IMHO, "vacuumdb" is fixing this. May be ANALYZE is setting these bits on the sample tuples it looks at. But
thereis no option to run full analyze on all tuples.
 

If I don't run vacuumdb, I have a hunch that the performance will improve over the time as the users or autovacuum hits
thetuples over time and does the necessary magic on the hint bits.
 

IMHO, a prudent fix would be to incorporate the fix in ANALYZE. The way it is, the fix happened to be a side benefit of
vacuumdbas it scans every page. At the least, I feel, PG_RESTORE document should be changed. Currently it says " Once
restored,it is wise to run ANALYZE on each restored table so the optimizer has useful statistics; see Section 25.1.3
andSection 25.1.6 for more information.". ANALYZE should be replaced with "VACUUM".
 

Thanks,
Murthy


-----Original Message-----
From: Laurenz Albe <laurenz.albe@cybertec.at> 
Sent: Sunday, December 24, 2023 7:06 PM
To: Murthy Nunna <mnunna@fnal.gov>; pgsql-admin@postgresql.org
Subject: Re: Query Performance after pg_restore

On Sun, 2023-12-24 at 15:53 +0000, Murthy Nunna wrote:
> I did pg_dump of a ~20TB database followed by pg_restore. I find simple queries like select
> count(*) running slow. I did a select count(*) on all tables before pg_dump which took ~4 hours.
> After pg_restore, same thing took 32 hours.

SELECT count(*) is always slow.

> By the way, there is no change in postgres versions. It is 14.4 before and after pg_restore.

You should compare the execution plans of the queries on the old and the new server.

Besides, if you are running lots of count(*) queries, you are doing something wrong:

https://urldefense.proofpoint.com/v2/url?u=https-3A__www.cybertec-2Dpostgresql.com_en_pagination-2Dproblem-2Dtotal-2Dresult-2Dcount_-23total-2Dcount&d=DwIFaQ&c=gRgGjJ3BkIsb5y6s49QqsA&r=0wrsmPzpZSao0v32yCcG2Q&m=sHvfp1ZxGpYQrS61b2zVlZ3xkkLw3twcNEUvtXlG-TMSgRbIHTCvYREx111tm8ws&s=WEmJg5zO2-3jDk1DMz28MdxVYpXJ6cygSzz-q3DRPN4&e=


Yours,
Laurenz Albe