Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)

From: Andres Freund
Subject: Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)
Msg-id: 20180330020527.hllsqse2vwqldfa5@alap3.anarazel.de
In reply to: Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)  (David Rowley <david.rowley@2ndquadrant.com>)
List: pgsql-hackers

Hi,

I've only skimmed the thread, looking at the patch on its own.


On 2018-01-04 17:50:48 -0500, Tom Lane wrote:
> diff --git a/src/backend/optimizer/plan/plaindex ...dd11e72 .
> --- a/src/backend/optimizer/plan/planunionor.c
> +++ b/src/backend/optimizer/plan/planunionor.c
> @@ -0,0 +1,667 @@
> +/*-------------------------------------------------------------------------
> + *
> + * planunionor.c
> + *      Consider whether join OR clauses can be converted to UNION queries.
> + *
> + * The current implementation of the UNION step is to de-duplicate using
> + * row CTIDs.

Could we skip using the ctid if there's a DISTINCT (or something to that
effect) above? There's no need to preserve identical rows if they're
going to be removed anyway.
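
To make the point above concrete, here is a toy sketch in plain C (not PostgreSQL code; ToyRow, dedup_by_ctid and distinct_vals are hypothetical names invented for illustration). It assumes each UNION arm's output is an array of (ctid, value) pairs, and shows that a value-level DISTINCT applied on top yields the same result whether or not the rows were first de-duplicated by ctid.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy row: "ctid" stands in for physical row identity, "val" for the output column. */
typedef struct ToyRow
{
    int ctid;
    int val;
} ToyRow;

static int
cmp_ctid(const void *a, const void *b)
{
    const ToyRow *ra = a;
    const ToyRow *rb = b;

    return (ra->ctid > rb->ctid) - (ra->ctid < rb->ctid);
}

static int
cmp_int(const void *a, const void *b)
{
    int ia = *(const int *) a;
    int ib = *(const int *) b;

    return (ia > ib) - (ia < ib);
}

/* Sort rows by ctid and keep one row per ctid; returns the new row count. */
size_t
dedup_by_ctid(ToyRow *rows, size_t n)
{
    size_t out = 0;

    qsort(rows, n, sizeof(ToyRow), cmp_ctid);
    for (size_t i = 0; i < n; i++)
    {
        if (out == 0 || rows[out - 1].ctid != rows[i].ctid)
            rows[out++] = rows[i];
    }
    return out;
}

/* Write the sorted distinct values of rows[] into vals[]; returns their count. */
size_t
distinct_vals(const ToyRow *rows, size_t n, int *vals)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++)
        vals[i] = rows[i].val;
    qsort(vals, n, sizeof(int), cmp_int);
    for (size_t i = 0; i < n; i++)
    {
        if (out == 0 || vals[out - 1] != vals[i])
            vals[out++] = vals[i];
    }
    return out;
}
```

Calling distinct_vals directly on the concatenated arms, or calling it after dedup_by_ctid, produces the same value set; the ctid step only matters when duplicate values must survive to the final output.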


> A big limitation is that this only works on plain relations,
> + * and not for instance on foreign tables.  Another problem is that we can
> + * only de-duplicate by sort/unique, not hashing; but that could be fixed
> + * if we write a hash opclass for TID.

I wonder if an alternative could be some sort of rowid that we invent.
It'd not be that hard to introduce an executor node (or do it in
projection) that simply counts rows and returns that count as a
column. Together with e.g. range table id that'd be unique. But for that
we would need to guarantee that foreign tables / subqueries /
... returned the same result in two scans.  We could do so by pushing
the data gathering into a CTE, but that'd make this exercise moot.
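
A minimal sketch of the rowid idea above, in plain C rather than executor code. SyntheticRowId, RowCounterState and the three functions are hypothetical names, not existing PostgreSQL types or APIs; the sketch assumes the identifier is simply the pair (range table id, running row count).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical synthetic row identifier; not an existing PostgreSQL type. */
typedef struct SyntheticRowId
{
    uint32_t rtindex;   /* range table id of the scanned relation */
    uint64_t rownum;    /* 1-based position of the row within the scan */
} SyntheticRowId;

/* Hypothetical per-scan state for a row-counting node. */
typedef struct RowCounterState
{
    uint32_t rtindex;
    uint64_t next_rownum;
} RowCounterState;

/* Start (or restart) counting for a scan of the given range table entry. */
void
row_counter_init(RowCounterState *state, uint32_t rtindex)
{
    state->rtindex = rtindex;
    state->next_rownum = 1;
}

/* Tag the next row returned by the underlying scan. */
SyntheticRowId
row_counter_next(RowCounterState *state)
{
    SyntheticRowId id;

    id.rtindex = state->rtindex;
    id.rownum = state->next_rownum++;
    return id;
}

/* Two ids match only if both the relation and the row position match. */
int
synthetic_rowid_equal(SyntheticRowId a, SyntheticRowId b)
{
    return a.rtindex == b.rtindex && a.rownum == b.rownum;
}
```

Two scans of the same range table entry hand out matching ids only for rows at the same position, which is why the scheme depends on both scans returning the same rows in the same order.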

Why can't we ask at least FDWs to return something ctid-like?


> + * To allow join removal to happen, we can't reference the CTID column
> + * of an otherwise-removable relation.

A brief hint as to why wouldn't hurt.

> +/*
> + * Is query as a whole safe to apply union OR transformation to?
> + * This checks relatively-expensive conditions that we don't want to
> + * worry about until we've found a candidate OR clause.
> + */
> +static bool
> +is_query_safe_for_union_or_transform(PlannerInfo *root)
> +{
> +    Query       *parse = root->parse;
> +    Relids        allbaserels;
> +    ListCell   *lc;
> +    int            relid;
> +
> +    /*
> +     * Must not have any volatile functions in FROM or WHERE (see notes at
> +     * head of file).
> +     */
> +    if (contain_volatile_functions((Node *) parse->jointree))
> +        return false;

Hm, are there any SRFs that could be in relevant places? I think we
reject them everywhere where they'd be problematic (as the targetlist is
processed above)?


Do you have any plans for this patch at this moment?

Greetings,

Andres Freund

