Re: POC: GROUP BY optimization

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: POC: GROUP BY optimization
Дата
Msg-id CAPpHfdtAOT8ZN8FjeP2eBRrCnW8BKb0V0PFzfSuChdNVk84QQw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: POC: GROUP BY optimization  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Ответы Re: POC: GROUP BY optimization
Список pgsql-hackers
Hi!

On Thu, May 30, 2024 at 7:22 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 5/29/24 19:53, Alexander Korotkov wrote:
> > Thank you for your feedback.
> >
> > On Wed, May 29, 2024 at 11:08 AM Andrei Lepikhov
> > <a.lepikhov@postgrespro.ru> wrote:
> >> On 5/27/24 19:41, Alexander Korotkov wrote:
> >>> Any thoughts?
> >> About 0001:
> >> Having overviewed it, I don't see any issues (but I'm the author),
> >> except grammatical ones - but I'm not a native to judge it.
> >> Also, the sentence 'turning GROUP BY clauses  into pathkeys' is unclear
> >> to me. It may be better to write something like:  'building pathkeys by
> >> the list of grouping clauses'.
> >
> > OK, thank you.  I'll run once again for the grammar issues.

I've revised some grammar including the sentence you've proposed.

> >> 0002:
> >> The part under USE_ASSERT_CHECKING looks good to me. But the code in
> >> group_keys_reorder_by_pathkeys looks suspicious: of course, we do some
> >> doubtful work without any possible way to reproduce, but if we envision
> >> some duplicated elements in the group_clauses, we should avoid usage of
> >> the list_concat_unique_ptr.
> >
> > As I understand Tom, there is a risk that clauses list may contain
> > multiple instances of equivalent SortGroupClause, not duplicate
> > pointers.
> Maybe. I just implemented the worst-case scenario with the intention of
> maximum safety. So, it's up to you.
> >
> >> What's more, why do you not exit from
> >> foreach_ptr immediately after SortGroupClause has been found? I think
> >> the new_group_clauses should be consistent with the new_group_pathkeys.
> >
> > I wanted this to be consistent with preprocess_groupclause(), where
> > duplicate SortGroupClause'es are grouped together.  Otherwise, we
> > could just delete redundant SortGroupClause'es.
> Hm, preprocess_groupclause is called before the standard_qp_callback
> forms the 'canonical form' of group_pathkeys and such behaviour needed.
> But the code that chooses the reordering strategy uses already processed
> grouping clauses, where we don't have duplicates in the first
> num_groupby_pathkeys elements, do we?

Yep, it seems that we work with group clauses which were processed by
standard_qp_callback().  In turn standard_qp_callback() called
make_pathkeys_for_sortclauses_extended() with remove_redundant ==
true.  So, there shouldn't be duplicates.  And it's unclear what
should we be protected from.

I leaved 0002 with just asserts.  And extracted questionable changes into 0005.

------
Regards,
Alexander Korotkov
Supabase

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Christensen
Дата:
Сообщение: Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)
Следующее
От: David Rowley
Дата:
Сообщение: Re: Add memory context type to pg_backend_memory_contexts view