Re: BUG #7627: Bad query plan when using LIMIT

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #7627: Bad query plan when using LIMIT
Дата
Msg-id 2841.1351574832@sss.pgh.pa.us
обсуждение исходный текст
Ответ на BUG #7627: Bad query plan when using LIMIT  (edward@clericare.com)
Ответы Re: BUG #7627: Bad query plan when using LIMIT  (Simon Riggs <simon@2ndQuadrant.com>)
Список pgsql-bugs
edward@clericare.com writes:
> The following two queries differ only by adding "LIMIT 1", but the one with
> the limit gets radically worse performance. I've done VACUUM FULL, VACUUM
> ANALYZE, and REINDEX DATABASE and there are no modifications since.

> EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
> ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
> DESC;

> EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
> ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
> DESC LIMIT 1;

I think what's happening there is that the planner supposes that an
indexscan in "tree_high DESC" order will find rows matching the IN
condition uniformly distributed in the scan order --- but, because of
the construction of the IN clause, they're actually going to be
pessimally located at the very end of that scan order.  So it ends up
forming all of the nestloop result, when it had expected to have to
compute only 1/595 of it.  We've discussed dialing down the planner's
optimism about limit plans to not assume perfect independence of filter
conditions, but I don't think anyone would advocate for having it assume
the worst possible case, which is what you've got here unfortunately.

I can't help thinking that there's a way to express this problem without
the peculiar self-join, but I'm too tired to think of a good one right
now.  The best I can do is a window function:

select last_value(id) over (order by tree_high), last_value(...) ...
from (select * from commits order by tree_high limit 605) ss
where tree_other is null;

but it'd be pretty tedious to write out the last_value construct for
each column you want, and anyway this seems less than elegant even
aside from that objection.

            regards, tom lane

В списке pgsql-bugs по дате отправления:

Предыдущее
От: edward@clericare.com
Дата:
Сообщение: BUG #7627: Bad query plan when using LIMIT
Следующее
От: Tianyin Xu
Дата:
Сообщение: Re: BUG #7624: Misleading Log Message & Inconsistent Configuration Design