Re: POC: GROUP BY optimization

Поиск
Список
Период
Сортировка
On 5/29/24 19:53, Alexander Korotkov wrote:
> Hi, Andrei!
> 
> Thank you for your feedback.
> 
> On Wed, May 29, 2024 at 11:08 AM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> On 5/27/24 19:41, Alexander Korotkov wrote:
>>> Any thoughts?
>> About 0001:
>> Having overviewed it, I don't see any issues (but I'm the author),
>> except grammatical ones - but I'm not a native to judge it.
>> Also, the sentence 'turning GROUP BY clauses  into pathkeys' is unclear
>> to me. It may be better to write something like:  'building pathkeys by
>> the list of grouping clauses'.
> 
> OK, thank you.  I'll run once again for the grammar issues.
> 
>> 0002:
>> The part under USE_ASSERT_CHECKING looks good to me. But the code in
>> group_keys_reorder_by_pathkeys looks suspicious: of course, we do some
>> doubtful work without any possible way to reproduce, but if we envision
>> some duplicated elements in the group_clauses, we should avoid usage of
>> the list_concat_unique_ptr.
> 
> As I understand Tom, there is a risk that clauses list may contain
> multiple instances of equivalent SortGroupClause, not duplicate
> pointers.
Maybe. I just implemented the worst-case scenario with the intention of 
maximum safety. So, it's up to you.
> 
>> What's more, why do you not exit from
>> foreach_ptr immediately after SortGroupClause has been found? I think
>> the new_group_clauses should be consistent with the new_group_pathkeys.
> 
> I wanted this to be consistent with preprocess_groupclause(), where
> duplicate SortGroupClause'es are grouped together.  Otherwise, we
> could just delete redundant SortGroupClause'es.
Hm, preprocess_groupclause is called before the standard_qp_callback 
forms the 'canonical form' of group_pathkeys and such behaviour needed. 
But the code that chooses the reordering strategy uses already processed 
grouping clauses, where we don't have duplicates in the first 
num_groupby_pathkeys elements, do we?
>> 0004:
>> I was also thinking about reintroducing the preprocess_groupclause
>> because with the re-arrangement of GROUP-BY clauses according to
>> incoming pathkeys, it doesn't make sense to have a user-defined order—at
>> least while cost_sort doesn't differ costs for alternative column orderings.
>> So, I'm okay with the code. But why don't you use the same approach with
>> foreach_ptr as before?
> 
> I restored the function as it was before 0452b461bc with minimal edits
> to support the incremental sort.  I think it would be more valuable to
> keep the difference with pg16 code small rather than refactor to
> simplify existing code.
Got it

-- 
regards,
Andrei Lepikhov
Postgres Professional




В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: Avoid an odd undefined behavior with memcmp (src/bin/pg_rewind/pg_rewind.c)
Следующее
От: Peter Smith
Дата:
Сообщение: Re: 001_rep_changes.pl fails due to publisher stuck on shutdown