Re: strange query plan with LIMIT

Поиск
Список
Период
Сортировка
От anthony.shipman@symstream.com
Тема Re: strange query plan with LIMIT
Дата
Msg-id 201106081508.05668.anthony.shipman@symstream.com
обсуждение исходный текст
Ответ на Re: strange query plan with LIMIT  (tv@fuzzy.cz)
Ответы Re: strange query plan with LIMIT  (tv@fuzzy.cz)
Список pgsql-performance
On Wednesday 08 June 2011 02:40, tv@fuzzy.cz wrote:
> Hi,
>
> why are you reposting this? Pavel Stehule already recommended you to run
> ANALYZE on the tdiag table - have you done that? What was the effect?

The mailing list system hiccupped and I ended up with two posts.

VACUUM ANALYZE was done, more than once.
Setting the statistics value on the diag_id column to 1000 seemed to only make
the query a bit slower.

>
> The stats are off - e.g. the bitmap scan says
>
>    ->  Bitmap Heap Scan on tdiag  (cost=25763.48..638085.13 rows=1141019
> width=114) (actual time=43.232..322.441 rows=86530 loops=1)
>
> so it expects to get 1141019 rows but it gets 86530, i.e. about 7% of the
> expected number. That might be enough to cause bad plan choice and thus
> performance issues.

What seems odd to me is that the only difference between the two is the limit
clause:

select * from tdiag where (create_time >= '2011-06-03
09:49:04.000000+0' and create_time < '2011-06-06 09:59:04.000000+0') order by
diag_id limit 1;

select * from tdiag where (create_time >= '2011-06-03
09:49:04.000000+0' and create_time < '2011-06-06 09:59:04.000000+0') order by
diag_id;

and yet the plan completely changes.

I think that I have to force the evaluation order to get a reliably fast
result:

begin; create temporary table tt on commit drop as
select diag_id from tdiag where create_time >= '2011-06-03 09:49:04.000000+0'
     and create_time < '2011-06-06 09:59:04.000000+0';
select * from tdiag where diag_id in (select * from tt)
     order by diag_id limit 10; commit;

                                                                QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3566.24..3566.27 rows=10 width=112) (actual
time=1800.699..1800.736 rows=10 loops=1)
   ->  Sort  (cost=3566.24..3566.74 rows=200 width=112) (actual
time=1800.694..1800.708 rows=10 loops=1)
         Sort Key: tdiag.diag_id
         Sort Method:  top-N heapsort  Memory: 18kB
         ->  Nested Loop  (cost=1360.00..3561.92 rows=200 width=112) (actual
time=269.087..1608.324 rows=86530 loops=1)
               ->  HashAggregate  (cost=1360.00..1362.00 rows=200 width=4)
(actual time=269.052..416.898 rows=86530 loops=1)
                     ->  Seq Scan on tt  (cost=0.00..1156.00 rows=81600
width=4) (actual time=0.020..120.323 rows=86530 loops=1)
               ->  Index Scan using tdiag_pkey on tdiag  (cost=0.00..10.99
rows=1 width=112) (actual time=0.006..0.008 rows=1 loops=86530)
                     Index Cond: (tdiag.diag_id = tt.diag_id)
 Total runtime: 1801.290 ms

>
> And yet another recommendation - the sort is performed on disk, so give it
> more work_mem and it should be much faster (should change from "merge
> sort" to "quick sort"). Try something like work_mem=20MB and see if it
> does the trick.

This certainly speeds up the sorting.

>
> regards
> Tomas

--
Anthony Shipman                 | What most people think about
Anthony.Shipman@symstream.com   | most things is mostly wrong.

В списке pgsql-performance по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: 100% CPU Utilization when we run queries.
Следующее
От: Claudio Freire
Дата:
Сообщение: Re: strange query plan with LIMIT