On Tue, 2024-01-30 at 11:40 +0200, Pavlos Kallis wrote:
> I have the following table:
>
> CREATE TABLE IF NOT EXISTS public.shortened_url
> (
> id character varying(12) COLLATE pg_catalog."default" NOT NULL,
> created_at timestamp without time zone,
> expires_at timestamp without time zone,
> url text COLLATE pg_catalog."default" NOT NULL,
> CONSTRAINT shortened_url_pkey PRIMARY KEY (id)
> )
>
> The table contains only the following index on PRIMARY KEY:
>
> CREATE UNIQUE INDEX IF NOT EXISTS shortened_url_pkey
> ON public.shortened_url USING btree
> (id COLLATE pg_catalog."default" ASC NULLS LAST)
> TABLESPACE pg_default;
>
> This table has approximately 5 million rows of expired URLs (expires_at < now()), and 5 thousand rows of non-expired
URLs(expires_at > now())
>
> I deleted all expired URLs with this query:
>
> DELETE FROM shortened_url WHERE expires_at < now().
>
> Then, I tried to query the table for expired URLs:
>
> SELECT * FROM shortened_url WHERE expires_at < now();
>
> This query was very slow. It took around 1-2 minutes to run, while it had to fetch only 5000 rows (the non-expired
URLs,since the other ones were deleted).
>
> After that, I tried to run VACUUM ANALYZE and REINDEX to the table.
> The query was still slow.
>
> Finally, I ran VACUUM FULL and re-executed the query. Only then, it started running fast (1-2 seconds).
>
> Do you have observed a similar behavior with VACUUM ANALYZE / VACUUM FULL and why this can happen?
> Is this because data is compacted after VACUUM FULL and sequential disk reads are faster?
> Shouldn't VACUUM ANALYZE reclaim the disk space and make the query run fast?
> Is this because RDS might do some magic? Is it something I am missing?
There are too many unknowns here. Please enable "track_io_timing" and send us
the output of EXPLAIN (ANALYZE, BUFFERS) for the slow statements.
One theory could be that there was a long running transaction or something else
that prevented VACUUM from cleaning up. For that, the output of
"VACUUM (VERBOSE) shortened_url" would be interesting.
> Additional details
> PostgreSQL version: 14.7 on db.t3.micro RDS
> PG configuration: Default of RDS
We can only speak about real PostgreSQL...
Yours,
Laurenz Albe