Thread: Slow query in table where many rows were deleted. VACUUM FULL fixes it


Slow query in table where many rows were deleted. VACUUM FULL fixes it

From: Pavlos Kallis
Date:
Hi all,

I have the following table:

CREATE TABLE IF NOT EXISTS public.shortened_url
(
    id character varying(12) COLLATE pg_catalog."default" NOT NULL,
    created_at timestamp without time zone,
    expires_at timestamp without time zone,
    url text COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT shortened_url_pkey PRIMARY KEY (id)
)

The table contains only the following index on PRIMARY KEY:

CREATE UNIQUE INDEX IF NOT EXISTS shortened_url_pkey
    ON public.shortened_url USING btree
    (id COLLATE pg_catalog."default" ASC NULLS LAST)
    TABLESPACE pg_default;

This table has approximately 5 million rows of expired URLs (expires_at < now()) and 5 thousand rows of non-expired URLs (expires_at > now()).

I deleted all expired URLs with this query:

DELETE FROM shortened_url WHERE expires_at < now();

Then, I tried to query the table for expired URLs:

SELECT * FROM shortened_url WHERE expires_at < now();

This query was very slow. It took around 1-2 minutes to run, even though it had to fetch only 5000 rows (the non-expired URLs, since the others were deleted).

After that, I ran VACUUM ANALYZE and REINDEX on the table.
The query was still slow.

Finally, I ran VACUUM FULL and re-executed the query. Only then, it started running fast (1-2 seconds).

Have you observed similar behavior with VACUUM ANALYZE / VACUUM FULL, and why can this happen?
Is this because data is compacted after VACUUM FULL and sequential disk reads are faster? 
Shouldn't VACUUM ANALYZE reclaim the disk space and make the query run fast?
Is this because RDS might do some magic? Is it something I am missing?

Additional details
PostgreSQL version: 14.7 on db.t3.micro RDS
PG configuration: Default of RDS

Kind Regards,
Pavlos

Re: Slow query in table where many rows were deleted. VACUUM FULL fixes it

From: Laurenz Albe
Date:
On Tue, 2024-01-30 at 11:40 +0200, Pavlos Kallis wrote:
> I have the following table:
>
> CREATE TABLE IF NOT EXISTS public.shortened_url
> (
>     id character varying(12) COLLATE pg_catalog."default" NOT NULL,
>     created_at timestamp without time zone,
>     expires_at timestamp without time zone,
>     url text COLLATE pg_catalog."default" NOT NULL,
>     CONSTRAINT shortened_url_pkey PRIMARY KEY (id)
> )
>
> The table contains only the following index on PRIMARY KEY:
>
> CREATE UNIQUE INDEX IF NOT EXISTS shortened_url_pkey
>     ON public.shortened_url USING btree
>     (id COLLATE pg_catalog."default" ASC NULLS LAST)
>     TABLESPACE pg_default;
>
> This table has approximately 5 million rows of expired URLs (expires_at < now()) and 5 thousand rows of
> non-expired URLs (expires_at > now()).
>
> I deleted all expired URLs with this query:
>
> DELETE FROM shortened_url WHERE expires_at < now();
>
> Then, I tried to query the table for expired URLs:
>
> SELECT * FROM shortened_url WHERE expires_at < now();
>
> This query was very slow. It took around 1-2 minutes to run, even though it had to fetch only 5000 rows
> (the non-expired URLs, since the others were deleted).
>
> After that, I ran VACUUM ANALYZE and REINDEX on the table.
> The query was still slow.
>
> Finally, I ran VACUUM FULL and re-executed the query. Only then, it started running fast (1-2 seconds).
>
> Have you observed similar behavior with VACUUM ANALYZE / VACUUM FULL, and why can this happen?
> Is this because data is compacted after VACUUM FULL and sequential disk reads are faster? 
> Shouldn't VACUUM ANALYZE reclaim the disk space and make the query run fast?
> Is this because RDS might do some magic? Is it something I am missing?

There are too many unknowns here.  Please enable "track_io_timing" and send us
the output of EXPLAIN (ANALYZE, BUFFERS) for the slow statements.
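
Something like this, as a minimal sketch (in version 14, SET track_io_timing requires superuser, so on RDS
you may have to enable it through the DB parameter group instead):

SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM shortened_url WHERE expires_at < now();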

One theory could be that there was a long running transaction or something else
that prevented VACUUM from cleaning up.  For that, the output of
"VACUUM (VERBOSE) shortened_url" would be interesting.

> Additional details
> PostgreSQL version: 14.7 on db.t3.micro RDS
> PG configuration: Default of RDS

We can only speak about real PostgreSQL...

Yours,
Laurenz Albe



Re: Slow query in table where many rows were deleted. VACUUM FULL fixes it

From: Philip Semanchuk
Date:

> On Jan 30, 2024, at 4:40 AM, Pavlos Kallis <pkallis@yourhero.com> wrote:
>
> Shouldn't VACUUM ANALYZE reclaim the disk space?

Hi Pavlos,
The short answer to this is “no”. That’s an important difference between VACUUM (also known as “plain” VACUUM) and
VACUUM FULL. In some special cases plain VACUUM can reclaim disk space, but I think both the circumstances under which
it can do so and the amount it can reclaim are pretty limited. An oversimplified but "mostly correct" way to think about
it is that plain VACUUM can't reclaim disk space, whereas VACUUM FULL can.

This is covered in the fourth paragraph of the documentation for the VACUUM command --
https://www.postgresql.org/docs/current/sql-vacuum.html

So in your case those 5m rows that you deleted were probably still clogging up your table until you ran VACUUM FULL.
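
One way to check whether dead rows are still lingering in the table is the statistics view (a sketch; the
counts it reports are estimates):

SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'shortened_url';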


Hope this helps
Philip


Re: Slow query in table where many rows were deleted. VACUUM FULL fixes it

From: David Rowley
Date:
On Wed, 31 Jan 2024 at 09:09, Philip Semanchuk
<philip@americanefficient.com> wrote:
> So in your case those 5m rows that you deleted were probably still clogging up your table until you ran VACUUM FULL.

It seems more likely to me that the VACUUM removed the rows and just
left empty pages in the table.  Since there's no index on expires_at,
the only way to answer that query is to Seq Scan and Seq Scan will
need to process those empty pages.  While that processing is very fast
if the page's item pointers array is empty, it could still be slow if
the page needs to be read from disk. Laurenz's request for the explain
(analyze, buffers) output with track_io_timing on will help confirm
this.

If it is just reading empty pages that's causing this issue then
adding that missing index would improve the situation after running
just plain VACUUM each time there's a bulk delete.
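
For example (the index name here is just illustrative):

CREATE INDEX shortened_url_expires_at_idx
    ON public.shortened_url (expires_at);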

David



Re: Slow query in table where many rows were deleted. VACUUM FULL fixes it

From: Divya Sharma
Date:
Hi Pavlos

Here is my understanding of why the query was not fast after the VACUUM ANALYZE. This is possibly what happened:

  1. The relation has 5 million expired URLs and 5 thousand non-expired URLs
  2. Assuming that the table only has 5 million and 5 thousand tuples, deleting the expired ones will trigger an autovacuum: “If the number of tuples obsoleted since the last VACUUM exceeds the “vacuum threshold”, the table is vacuumed” - https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM (a worked example follows after this list). As the analyze threshold will also be exceeded, autovacuum would have run an ANALYZE alongside.
  3. The status of this autovacuum (whether it is running or blocked) could have been checked in pg_stat_activity.
  4. Note that autovacuum does not trigger to clean up dead tuples if it is disabled for the relation (or in postgresql.conf). However, once transaction IDs approach the autovacuum_freeze_max_age threshold, autovacuum triggers to freeze transaction IDs even if it is disabled.
  5. As you stated, it's a t3.micro instance; these have limited resources, so the autovacuum may have been running slowly (again, this can be checked in pg_stat_activity).
  6. Given that you manually ran VACUUM ANALYZE and it did not make the query faster, the cause could be internal fragmentation. You are right: VACUUM does not release the space back to the operating system in most cases. This excerpt from the documentation clarifies it:
    “The standard form of VACUUM removes dead row versions in tables and indexes and marks the space available for future reuse. However, it will not return the space to the operating system, except in the special case where one or more pages at the end of a table become entirely free and an exclusive table lock can be easily obtained. In contrast, VACUUM FULL actively compacts tables by writing a complete new version of the table file with no dead space. This minimizes the size of the table, but can take a long time. It also requires extra disk space for the new copy of the table, until the operation completes.” https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY
  7. This basically means that once you ran VACUUM FULL, it might have shrunk the table quite significantly, which made the query much faster.
  8. You could have compared the size of the table before and after the VACUUM FULL to understand this better (see the sketch below).
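
To make the threshold in point 2 concrete: with the default autovacuum settings (autovacuum_vacuum_threshold
= 50, autovacuum_vacuum_scale_factor = 0.2), the vacuum threshold for this table would be about

    50 + 0.2 * 5,005,000 ≈ 1,001,050 dead tuples,

so deleting 5 million rows comfortably exceeds it and an autovacuum should have been triggered.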

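For point 8, a minimal way to take that measurement (pg_table_size includes TOAST data but not indexes; run
it before and after the VACUUM FULL and compare):

SELECT pg_size_pretty(pg_table_size('public.shortened_url'));
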
Just a few suggestions for doing bulk removal of data:
  • It would be worth looking into pg_repack for such bulk deletes rather than VACUUM FULL, as the former does not take an exclusive lock for the entire duration of the operation - https://reorg.github.io/pg_repack/ . However, you will still need double the space of the table, as it also recreates the table.
  • Another way of doing bulk removal of data would be a CTAS (https://www.postgresql.org/docs/14/sql-createtableas.html) into a new table containing only the live data (in your case the 5 thousand tuples), followed by dropping the old table (which means no dead tuples). You might need a trigger in the interim to make sure that rows changed during the switchover are carried over to the new table.
  • You might want to look into partitioning and drop the partitions once the URLs in a particular partition are no longer needed (like URLs older than 6 months); a sketch follows below.
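
A rough sketch of the partitioned layout (the names are illustrative, and note that a primary key on a
partitioned table must include the partition key, so the unique constraint on id would need rethinking):

CREATE TABLE shortened_url_part (
    id character varying(12) NOT NULL,
    created_at timestamp without time zone,
    expires_at timestamp without time zone,
    url text NOT NULL
) PARTITION BY RANGE (expires_at);

CREATE TABLE shortened_url_2024h1 PARTITION OF shortened_url_part
    FOR VALUES FROM ('2024-01-01') TO ('2024-07-01');

-- Dropping a partition removes its rows instantly and leaves no dead tuples behind:
DROP TABLE shortened_url_2024h1;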


Kind Regards
Divya Sharma

