Re: [HACKERS] Partition-wise aggregation/grouping

Поиск

Список

Период

Сортировка

От	Jeevan Chalke
Тема	Re: [HACKERS] Partition-wise aggregation/grouping
Дата	22 марта 2018 г. 16:15:56
Msg-id	CAM2+6=U9CM1oodmcH_GCvcW4ujrWYWoX6T_aeBzWEgomrULcXQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: [HACKERS] Partition-wise aggregation/grouping (Robert Haas <robertmhaas@gmail.com>)
Ответы	Re: [HACKERS] Partition-wise aggregation/grouping (Robert Haas <robertmhaas@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

On Thu, Mar 22, 2018 at 3:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 21, 2018 at 11:33 AM, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:
> Let me try to explain this:
> 1. GROUPING_CAN_PARTITIONWISE_AGG
> 2. extra->is_partial_aggregation
> 3. extra->perform_partial_partitionwise_aggregation

Please find attached an incremental patch that attempts to refactor
this logic into a simpler form. What I've done is merged all three of
the above Booleans into a single state variable called 'patype', which
can be one of PARTITIONWISE_AGGREGATE_NONE,
PARTITIONWISE_AGGREGATE_FULL, and PARTITIONWISE_AGGREGATE_PARTIAL.
When create_ordinary_grouping_paths() is called, extra.patype is the
value for the parent relation; that function computes a new value and
passes it down to create_partitionwise_grouping_paths(), which inserts
into the new 'extra' structure for the child.

Essentially, in your system, extra->is_partial_aggregation and
extra->perform_partial_partitionwise_aggregation both corresponded to
whether or not patype == PARTITIONWISE_AGGREGATE_PARTIAL, but the
former indicated whether the *parent* was doing partition-wise
aggregation (and thus we needed to generate only partial paths) while
the latter indicated whether the *current* relation was doing
partition-wise aggregation (and thus we needed to force creation of
partially_grouped_rel). This took me a long time to understand
because of the way the fields were named; they didn't indicate that
one was for the parent and one for the current relation. Meanwhile,
GROUPING_CAN_PARTITIONWISE_AGG indicated whether partition-wise
aggregate should be tried at all for the current relation; there was
no analogous indicator for the parent relation because we can't be
processing a child at all if the parent didn't decide to do
partition-wise aggregation. So to know what was happening for the
current relation you had to look at GROUPING_CAN_PARTITIONWISE_AGG +
extra->perform_partial_partitionwise_aggregation, and to know what was
happening for the parent relation you just looked at
extra->is_partial_aggregation. With this proposed refactoring patch,
there's just one patype value at each level, which at least to me
seems simpler. I tried to improve the comments somewhat, too.

Leeks cleaner now. Thanks for refactoring it.

I have merged these changes and flatten all previuos changes into the main patch.

You have some long lines that it would be good to break, like this:

child_extra.targetList = (List *) adjust_appendrel_attrs(root,

(Node *) extra->targetList,
nappinfos,
appinfos);

If you put a newline after (List *), the formatting will come out
nicer -- it will fit within 80 columns. Please go through the patches
and make these kinds of changes for lines over 80 columns where
possible.

OK. Done.

I was under impression that it is not good to split the first line. What if in split version, while applying any new patch or in Merge process something gets inserted between them? That will result into something unexpected. But I am not sure if that's ever a possibility.

I guess we'd better move the GROUPING_CAN_* constants to a header
file, if they're going to be exposed through GroupPathExtraData. That
can go in some refactoring patch.

Yep. Moved that into 0003 where we create GroupPathExtraData.

Is there a good reason not to use input_rel->relids as the input to
fetch_upper_rel() in all cases, rather than just at subordinate
levels?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company