Re: BUG #14107: Major query planner bug regarding subqueries and indices
От | Mathias Kunter |
---|---|
Тема | Re: BUG #14107: Major query planner bug regarding subqueries and indices |
Дата | |
Msg-id | 69e376c7-6c2c-c96d-2ad6-3734affb59b7@gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #14107: Major query planner bug regarding subqueries and indices (David Rowley <david.rowley@2ndquadrant.com>) |
Ответы |
Re: BUG #14107: Major query planner bug regarding subqueries
and indices
(Victor Yegorov <vyegorov@gmail.com>)
|
Список | pgsql-bugs |
> Thanks for the report, but I think your subject line is a little > exaggerated. Sorry for that :-) > I'd not actually thought about expanding this to IN and NOT IN, but > likely it would be possible, providing NULLs could be handled > correctly too. Was this the optimisation you think is missing? No - let's consider a real-world example where the problem becomes more evident. Assume we have the following two tables: CREATE TABLE book (id SERIAL NOT NULL, name VARCHAR, authorId SERIAL, CONSTRAINT book_pkey PRIMARY KEY (id)); CREATE TABLE author (id SERIAL NOT NULL, surname VARCHAR, CONSTRAINT author_pkey PRIMARY KEY (id)); All relevant columns are indexed. The book table contains 1.3 M rows. Now let's say we want to find books by either name or by author name. A simple problem, actually. The following query is used (which could of course also be rewritten to use a JOIN instead, but that's not the point here). We find out that it has an unexpected and unacceptably long execution time: EXPLAIN SELECT * FROM book WHERE name = 'Harry Potter' OR authorId IN (SELECT id FROM author WHERE surname = 'Rowling'); QUERY PLAN ------------------------------------------------------------------------------------------- Seq Scan on book (cost=13.68..26792.88 rows=576709 width=40) Filter: (((name)::text = 'Harry Potter'::text) OR (hashed SubPlan 1)) SubPlan 1 -> Bitmap Heap Scan on author (cost=4.20..13.67 rows=6 width=4) Recheck Cond: ((surname)::text = 'Rowling'::text) -> Bitmap Index Scan on author_surname_index (cost=0.00..4.20 rows=6 width=0) Index Cond: ((surname)::text = 'Rowling'::text) In contrast to that, consider the query plan for the following logically equivalent query, which executes in just a few milliseconds (!): EXPLAIN SELECT * FROM book WHERE name = 'Harry Potter' UNION SELECT * FROM book WHERE authorId IN (SELECT id FROM author WHERE surname = 'Rowling'); QUERY PLAN ------------------------------------------------------------------------------------------------------------ HashAggregate (cost=454.87..455.99 rows=112 width=29) Group Key: book.id, book.name, book.authorid -> Append (cost=0.43..454.03 rows=112 width=29) -> Index Scan using book_name_index on book (cost=0.43..16.48 rows=3 width=29) Index Cond: ((name)::text = 'Harry Potter'::text) -> Nested Loop (cost=4.63..436.43 rows=109 width=29) -> Bitmap Heap Scan on author (cost=4.20..13.67 rows=6 width=4) Recheck Cond: ((surname)::text = 'Rowling'::text) -> Bitmap Index Scan on author_surname_index (cost=0.00..4.20 rows=6 width=0) Index Cond: ((surname)::text = 'Rowling'::text) -> Index Scan using book_authorid_index on book book_1 (cost=0.43..70.28 rows=18 width=29) Index Cond: (authorid = author.id)
В списке pgsql-bugs по дате отправления:
Предыдущее
От: David RowleyДата:
Сообщение: Re: BUG #14107: Major query planner bug regarding subqueries and indices
Следующее
От: Mathias KunterДата:
Сообщение: Re: BUG #14107: Major query planner bug regarding subqueries and indices