CTE with JOIN of two tables is much faster than a regular query
От | |
---|---|
Тема | CTE with JOIN of two tables is much faster than a regular query |
Дата | |
Msg-id | 004b01d436d6$ff96c840$fec458c0$@gmail.com обсуждение исходный текст |
Ответы |
Re: CTE with JOIN of two tables is much faster than a regular query
(Andreas Kretschmer <andreas@a-kretschmer.de>)
Re: CTE with JOIN of two tables is much faster than a regular query (Stephen Frost <sfrost@snowman.net>) Re: CTE with JOIN of two tables is much faster than a regular query (Ravi Krishna <sravikrishna@aol.com>) (was: CTE with JOIN of two tables is much faster than a regularquery) (Albrecht Dreß <albrecht.dress@arcor.de>) |
Список | pgsql-general |
Running PostgreSQL 9.5 on Windows.
The CTE mentioned below completes the query in 4.5 seconds while the regular query takes 66 seconds. I read from EXPLAIN ANALYSE that the regular query starts with a full table scan over “Doc” while the CTE joins the two tables first and applies the filter condition in the 2nd step.
I believe that some rows in “Doc” which are not referenced by “F” contain a large amount of data in the field “szText” and this will slow down the ILIKE operator.
What can I do to improve the performance of the regular query without using a CTE?
This is a much simplified extract from a larger application:
CREATE TABLE Doc (
oID UUID NOT NULL PRIMARY KEY,
uDocID UUID NOT NULL UNIQUE,
szText TEXT
);
CREATE TABLE F (
oID UUID NOT NULL PRIMARY KEY,
uDocRef UUID,
CONSTRAINT F_fkey1 FOREIGN KEY (uDocRef) REFERENCES Doc (uDocID)
);
-- just in case …
ALTER TABLE Doc ALTER uDocID SET STATISTICS 10000;
ALTER TABLE Doc ALTER szText SET STATISTICS 10000;
VACUUM ANALYSE Doc;
SELECT COUNT(*) FROM Doc;
=> 125946 records
ALTER TABLE F ALTER uDocRef SET STATISTICS 10000;
VACUUM ANALYSE F;
SELECT COUNT(*) FROM F;
=> 32605 records
Result with CTE:
EXPLAIN ANALYSE
WITH a AS (
SELECT F.oID, Doc.szText
FROM F
JOIN Doc ON F.uDocRef = Doc.udocid
)
SELECT *
FROM a
WHERE szText ILIKE '%480GB%';
"CTE Scan on a (cost=9463.42..10197.03 rows=52 width=48) (actual time=478.770..4551.613 rows=10 loops=1)"
" Filter: (sztext ~~* '%480GB%'::text)"
" Rows Removed by Filter: 32595"
" CTE a"
" -> Hash Join (cost=973.61..9463.42 rows=32605 width=359) (actual time=36.998..100.337 rows=32605 loops=1)"
" Hash Cond: (doc.udocid = f.udocref)"
" -> Seq Scan on doc (cost=0.00..7691.46 rows=125946 width=359) (actual time=0.008..18.269 rows=125946 loops=1)"
" -> Hash (cost=566.05..566.05 rows=32605 width=32) (actual time=35.825..35.825 rows=32605 loops=1)"
" Buckets: 32768 Batches: 1 Memory Usage: 2294kB"
" -> Seq Scan on f (cost=0.00..566.05 rows=32605 width=32) (actual time=0.005..14.677 rows=32605 loops=1)"
"Planning time: 4.689 ms"
"Execution time: 4554.893 ms"
Result with regular query:
EXPLAIN ANALYSE
SELECT F.oID, Doc.szText
FROM F
JOIN Doc ON F.uDocRef = Doc.udocid
WHERE szText ILIKE '%480GB%';
"Hash Join (cost=8006.56..8694.93 rows=5 width=359) (actual time=66500.415..66506.978 rows=10 loops=1)"
" Hash Cond: (f.udocref = doc.udocid)"
" -> Seq Scan on f (cost=0.00..566.05 rows=32605 width=32) (actual time=0.002..3.143 rows=32605 loops=1)"
" -> Hash (cost=8006.32..8006.32 rows=19 width=359) (actual time=66500.023..66500.023 rows=16 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 19kB"
" -> Seq Scan on doc (cost=0.00..8006.32 rows=19 width=359) (actual time=8864.720..66499.991 rows=16 loops=1)"
" Filter: (sztext ~~* '%480GB%'::text)"
" Rows Removed by Filter: 125930"
"Planning time: 263.542 ms"
"Execution time: 66507.003 ms"
В списке pgsql-general по дате отправления:
Следующее
От: Andreas KretschmerДата:
Сообщение: Re: CTE with JOIN of two tables is much faster than a regular query