Re: Combining Aggregates

From David Rowley
Subject Re: Combining Aggregates
Date
Msg-id CAHoyFK8oR5AMFqUpFWhEo87MCTwRSsvjiKVBgnvq1-0bEM+C2g@mail.gmail.com
In reply to Re: Combining Aggregates  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: Combining Aggregates  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On 6 March 2015 at 19:01, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> Postgres-XC solved this question by creating a plan with two Agg/Group nodes, one for combining the transition results and one for creating the distributed transition results (one per distributed run per group).
>
> So the Agg/Group node for combining the results had as many Agg/Group nodes below it as there are distributed/parallel runs.
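
As a rough illustration of the two-node plan described above, here is a minimal Python sketch. It is not Postgres-XC or PostgreSQL code; it assumes an avg() aggregate whose transition state is a (sum, count) pair, and the names `transition`, `combine`, `finalize`, `partial_aggregate` and `combine_aggregate` are hypothetical. The lower node builds one transition state per group per run, and the upper node merges the states from every run before finalizing.

```python
from collections import defaultdict

# Hypothetical avg() aggregate: the transition state is a (sum, count) pair.

def transition(state, value):
    """Fold one input value into a partial state."""
    total, count = state
    return (total + value, count + 1)

def combine(state_a, state_b):
    """Merge two partial states produced by different runs."""
    return (state_a[0] + state_b[0], state_a[1] + state_b[1])

def finalize(state):
    """Turn the merged state into the final aggregate value."""
    total, count = state
    return total / count if count else None

def partial_aggregate(rows):
    """Lower Agg/Group node: one transition state per group, for one run."""
    states = defaultdict(lambda: (0, 0))
    for key, value in rows:
        states[key] = transition(states[key], value)
    return dict(states)

def combine_aggregate(partials):
    """Upper Agg/Group node: merge the states of all runs, then finalize."""
    merged = {}
    for states in partials:
        for key, state in states.items():
            merged[key] = combine(merged.get(key, (0, 0)), state)
    return {key: finalize(state) for key, state in merged.items()}

# Three made-up distributed/parallel runs over (group, value) rows.
runs = [
    [("a", 1), ("b", 10)],
    [("a", 3), ("b", 20), ("b", 30)],
    [("a", 5)],
]
partials = [partial_aggregate(rows) for rows in runs]
result = combine_aggregate(partials)
print(result)
```

Note that `combine_aggregate` accepts any number of partial results, so in this sketch the combining step itself does not need to know the number of runs in advance.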

This sounds like the planner must be forcing the executor to run the plan on a fixed number of worker processes.

I really hoped that we could, one day, have a load monitor process that decides the best number of threads on which to execute a parallel plan. Otherwise, how would we decide how many worker processes to allocate to a plan? Surely there must be times when utilising only half of the processors for a query would be better than trying to use all of them and having many more context switches to perform.

Probably the harder part of dynamically deciding the number of workers would be the costing. A plan might execute fastest with 32 workers, but if it were given only 2 workers, the query might execute better as a non-parallel plan.
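
To make the costing concern concrete, here is a toy Python sketch. It is not PostgreSQL's cost model: the formula and every constant are invented for illustration, chosen so that the parallel plan does more total work than the serial one and pays a small overhead per worker.

```python
# All numbers below are hypothetical cost units, not planner estimates.
SERIAL_WORK = 1000.0         # cost of the non-parallel plan
PARALLEL_WORK = 2048.0       # total work of the parallel plan (assumed higher)
FIXED_OVERHEAD = 20.0        # assumed one-off cost of setting up parallelism
PER_WORKER_OVERHEAD = 2.0    # assumed startup/combine cost per worker

def parallel_cost(workers):
    """Toy cost of the parallel plan when run with `workers` workers."""
    return (FIXED_OVERHEAD
            + PARALLEL_WORK / workers
            + PER_WORKER_OVERHEAD * workers)

def cheapest_plan(workers_granted):
    """Pick the cheaper plan given the workers actually available."""
    cost = parallel_cost(workers_granted)
    if cost < SERIAL_WORK:
        return ("parallel", cost)
    return ("serial", SERIAL_WORK)

# Worker count that minimises the toy parallel cost, searching 1..64.
best_workers = min(range(1, 65), key=parallel_cost)

print(best_workers)
print(cheapest_plan(32))
print(cheapest_plan(2))
```

With these made-up constants the parallel plan is cheapest at 32 workers, yet with only 2 workers it costs more than the serial plan, so a plan costed for one worker count can be the wrong choice when a different count is granted at execution time.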
 
> But XC chose this way to reduce the code footprint. In Postgres, we can have different nodes for combining and transitioning, as you have specified above. Aggregation is not pathified in the current planner, hence XC took the approach of pushing the Agg nodes down the plan tree when distributed/parallel execution was possible. If we can get aggregation pathified, we can take a path-based approach, which might give a better judgement of whether or not to distribute the aggregates themselves.
>
> Looking at Postgres-XC might be useful to get ideas. I can help you there.
 

 Regards

David Rowley
