Re: Weird behaviour of the planner

Поиск
Список
Период
Сортировка
От Guillaume Lelarge
Тема Re: Weird behaviour of the planner
Дата
Msg-id CAECtzeWmeB-w3pmwDML4fAg7F_S1Qp3crKcfydDbQSrDq9Z5=Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Weird behaviour of the planner  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-general
2018-08-01 16:59 GMT+02:00 David Rowley <david.rowley@2ndquadrant.com>:
On 2 August 2018 at 02:48, Guillaume Lelarge <guillaume@lelarge.info> wrote:
> EXPLAIN (ANALYZE) SELECT DISTINCT * FROM gleu2;
>
>                                                        QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------
>  Unique  (cost=12005.97..12049.20 rows=1 width=1430) (actual
> time=20055.294..20323.348 rows=1 loops=1)
>    ->  Sort  (cost=12005.97..12006.30 rows=132 width=1430) (actual
> time=20055.290..20105.738 rows=60000 loops=1)
>          Sort Key: (... 130 columns ...)
>          Sort Method: external sort  Disk: 84464kB
>          ->  Seq Scan on gleu2  (cost=0.00..12001.32 rows=132 width=1430)
> (actual time=0.109..114.142 rows=60000 loops=1)
>  Planning time: 10.012 ms
>  Execution time: 20337.854 ms
> (7 rows)
>
> That looks quite good. The total cost is 12049, so I expect this plan to
> have the smaller cost as it's the choosen plan. Now, I'm disabling Sort, and
> here is what I get:
>
> SET enable_sort TO off;
> EXPLAIN (ANALYZE) SELECT DISTINCT * FROM gleu2;
>
>                                                        QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=12044.22..12044.23 rows=1 width=1430) (actual
> time=508.342..508.343 rows=1 loops=1)
>    Group Key: (... 130 columns ...)
>    ->  Seq Scan on gleu2  (cost=0.00..12001.32 rows=132 width=1430) (actual
> time=0.036..57.088 rows=60000 loops=1)
>  Planning time: 1.335 ms
>  Execution time: 508.529 ms
> (5 rows)

When the planner adds a new path it compares the cost not exactly, but
with a 'fuzz factor'.  It's very likely that the hashagg path did not
make it as it was not fuzzily any cheaper than the unique path. By
default, this fuzz factor is 1%.


That may definitely be it. There's not much of a difference in the total costs.

It seems in your case the costs don't quite match reality which is
quite likely due to the poor row estimates on "gleu2".  Has that table
been analyzed recently? or is there some reason that auto-vacuum is
not getting to it?


This is a small test case of a much bigger query joining a large number of tables, materialized views, views (calling functions), etc. The actual plan contains 84 nodes (32 scans, lots of joins, and a few other nodes). The mis-estimate is to follow what the big query gives me.

There's a bit more reading of what I'm talking about in
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/util/pathnode.c#L141


I'm gonna read that. Thank you.


--
Guillaume.

В списке pgsql-general по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: Weird behaviour of the planner
Следующее
От: Tom Lane
Дата:
Сообщение: Re: ALTER ROLE SET search_path produced by pg_dumpall gives : ERROR: syntax error at or near "$" .