Re: Parallel Aggregates for string_agg and array_agg

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Parallel Aggregates for string_agg and array_agg
Дата
Msg-id 24abe018-b3c6-2117-b5b9-39e660db00bd@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: Parallel Aggregates for string_agg and array_agg  (David Rowley <david.rowley@2ndquadrant.com>)
Ответы Re: Parallel Aggregates for string_agg and array_agg  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers

On 03/05/2018 04:51 AM, David Rowley wrote:
> On 5 March 2018 at 04:54, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> 1) There seems to be forgotten declaration of initArrayResultInternal in
>> arrayfuncs.c. I suppose you've renamed it to initArrayResultWithSize and
>> moved it to a header file, and forgotten to remove this bit.
> 
> Oops. Must be left over from trying things another way. Removed.
> 
>> 2) bytea_string_agg_deserialize has this comment:
>>
>>  /*
>>   * data: technically we could reuse the buf.data buffer, but that is
>>   * slightly larger than we need due to the extra 4 bytes for the cursor
>>   */
>>
>> I find the argument "it has 4 extra bytes, so we'll allocate new chunk"
>> somewhat weak. We do allocate the memory in 2^n chunks anyway, so the
>> new chunk is likely to be much larger anyway. I'm not saying we need to
>> reuse the buffer, IMHO the savings will be non-measurable.
> 
> Agreed. I've removed that part of the comment.
> 
>> 3) string_agg_transfn and bytea_string_agg_transfn say this"
>>
>>  /*
>>   * We always append the delimiter text before the value text, even
>>   * with the first aggregated item.  The reason for this is that if
>>   * this state needs to be combined with another state using the
>>   * aggregate's combine function, then we need to have the delimiter
>>   * for the first aggregated item.  The first delimiter will be
>>   * stripped off in the final function anyway.  We use a little cheat
>>   * here and store the position of the actual first value (after the
>>   * delimiter) using the StringInfo's cursor variable.  This relies on
>>   * the cursor not being used for any other purpose.
>>   */
>>
>> How does this make the code any simpler, compared to not adding the
>> delimiter (as before) and adding it when combining the values (at which
>> point you should know when it's actually needed)?
> 
> The problem is that if we don't record the first delimiter then we
> won't know what it is when it comes to combining the states.
> 
> Consider the following slightly backward-looking case;
> 
> select string_agg(',', x::text) from generate_Series(1,10)x;
>       string_agg
> ----------------------
>  ,2,3,4,5,6,7,8,9,10,
> 
> Here the delimiter is the number, not the ','. Of course, the
> delimiter for the first aggregated result is not shown.  The previous
> implementation of the transfn for this aggregate just threw away the
> first delimiter, but that's no good for parallel aggregates as the
> transfn may be getting called in a parallel worker, in which case
> we'll need that delimiter in the combine function to properly join to
> the other partial aggregated results matching the same group.
> 

Hmmm, you're right I haven't considered the delimiter might be variable.
But isn't just yet another hint that while StringInfo was suitable for
aggregate state of non-parallel string_agg, it may not be for the
parallel version?

What if the parallel string_agg instead used something like this:

struct StringAggState
{
    char           *delimiter;    /* first delimiter */
    StringInfoData  str;          /* partial aggregate state */
} StringAggState;

I think that would make the code cleaner, compared to using the cursor
field for that purpose. But maybe it'd make it not possible to share
code with the non-parallel aggregate, not sure.

I always considered the cursor field to be meant for scanning through
StringInfo, so using it for this purpose seems a bit like a misuse.

> Probably I could improve the comment a bit. I understand that having a
> variable delimiter is not commonplace in the normal world. I certainly
> had never considered it before working on this.  I scratched my head a
> bit when doing this and thought about inventing a new trans type, but
> I decided that the most efficient design for an aggregate state was to
> store the delimiter and data all in one buffer and have a pointer to
> the start of the actual data... StringInfo has exactly what's required
> if you use the cursor as the pointer, so I didn't go and reinvent the
> wheel.
> 

I wonder why the variable delimiter is not mentioned in the docs ... (at
least I haven't found anything mentioning the behavior).

> I've now changed the comment to read:
> 
> /*
> * It's important that we store the delimiter text for all aggregated
> * items, even the first one, which at first thought you might think
> * could just be discarded.  The reason for this is that if this
> * function is being called from a parallel worker, then we'll need
> * the first delimiter in order to properly combine the partially
> * aggregated state with the states coming from other workers.  In the
> * final output, the first delimiter will be stripped off of the final
> * aggregate state, but in order to know where the actual first data
> * is we must store the position of the first data value somewhere.
> * Conveniently, StringInfo has a cursor property which can be used
> * to serve our needs here.
> */
> 
> I've also updated an incorrect Assert in array_agg_array_serialize.
> 
> Please find attached the updated patch.
> 

Seems fine to me. I plan to do a bit more testing/review next week, but
I plan to move it to RFC after that (unless I run into something during
the review, but I don't expect that).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Intermittent pg_ctl failures on Windows
Следующее
От: Tom Lane
Дата:
Сообщение: Re: VACUUM FULL vs dropped columns