Re: Planning aggregates which require sorted or distinct input

From: Tom Lane
Subject: Re: Planning aggregates which require sorted or distinct input
Msg-id: 17421.1169230347@sss.pgh.pa.us
In reply to: Planning aggregates which require sorted or distinct input  (Gavin Sherry <swm@alcove.com.au>)
Responses: Re: Planning aggregates which require sorted or distinct  (Gavin Sherry <swm@alcove.com.au>)
List: pgsql-hackers

Gavin Sherry <swm@alcove.com.au> writes:
> What we want to do is have a kind of 'sub plan' for each aggregate. In
> effect, the plan might start looking like a directed graph.  Here is part
> of the plan as a directed graph.

>                        GroupAggregate
>               /-----------------^---------------...
>               |                 |
>               |                 |
>               ^                 |
>               |               Unique
>               |                 ^
>               |                 |
>             Sort               Sort
>           (saledate)    (saledate,prodid)
>               ^                 ^
>               |                 |
>               -------------- Fan Out ------------...
>                                 ^
>                                 |
>                                Scan

> This idea was presented by Brian Hagenbuch at Greenplum. He calls it a
> 'Fan Out' plan. It is trivial to rejoin the data because all data input to
> the aggregates is sorted by the same primary key.

Er, what primary key would that be exactly?  And even if you had a key,
I wouldn't call joining on it trivial; I'd call it expensive ...

Still, it looks better than your "pipeline" idea which is even more full
of handwaving --- the problem with that one is that you're either
duplicating the earlier aggregates' results a lot of times, or you've
got different numbers of rows for different columns at various steps of
the pipeline.

I'd stick with the fanout idea but work on some way to keep related rows
together that doesn't depend on untenable assumptions like having a
primary key.

When I've thought about this in the past, I had in mind leaving the plan
structure pretty much as it is, but making the planner concern itself
with the properties of individual aggregates more than it does now ---
e.g., mark DISTINCT aggregates as to whether they should use sorting or
hashing, or mark that they can assume pre-sorted input.  Perhaps this
is another way of describing what you call a fan-out plan.
        regards, tom lane
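
The per-aggregate marking described in the last paragraph can be illustrated with a small sketch. This is only an illustration in Python, not PostgreSQL planner code: the `DistinctStrategy` enum and the `count_distinct` and `group_aggregate` helpers are hypothetical names, and the query shape (a `count(*)` plus a `count(DISTINCT prodid)` grouped by `saledate`) is an assumption based on the column names in the quoted diagram.

```python
from enum import Enum
from itertools import groupby


class DistinctStrategy(Enum):
    """Hypothetical per-aggregate marking chosen at plan time."""
    SORT = "sort"            # aggregate sorts its own input
    HASH = "hash"            # aggregate de-duplicates with a hash table
    PRESORTED = "presorted"  # input already arrives in the needed order


def count_distinct(values, strategy):
    """Count distinct values in one group using the marked strategy."""
    if strategy is DistinctStrategy.HASH:
        return len(set(values))
    if strategy is DistinctStrategy.SORT:
        values = sorted(values)
    # SORT and PRESORTED: equal values are adjacent, so one pass suffices.
    return sum(1 for _ in groupby(values))


def group_aggregate(rows, strategy):
    """rows: (saledate, prodid) tuples, already sorted by saledate.

    For PRESORTED the rows must also be sorted by prodid within each
    saledate. Returns {saledate: (count(*), count(DISTINCT prodid))}.
    """
    result = {}
    for saledate, group in groupby(rows, key=lambda row: row[0]):
        prodids = [prodid for _, prodid in group]
        result[saledate] = (len(prodids), count_distinct(prodids, strategy))
    return result
```

In this sketch the plan keeps a single grouped scan, and only the way each DISTINCT aggregate removes duplicates changes. When the input is sorted on (saledate, prodid), all three strategies return the same result, which is what would let a planner pick the cheapest one per aggregate.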