Обсуждение: Text-any concatenation volatility acting as optimization barrier

Поиск
Список
Период
Сортировка

Text-any concatenation volatility acting as optimization barrier

От
Marti Raudsepp
Дата:
Hi list,

Andrew Dunstan reported an awkward-seeming case on IRC where shifting
around a concatenation expression in a view made the planner choose a
good or a bad execution plan.

Simplified, it boils down to this:

db=# create table foo(i int);
db=# explain verbose select i from (select i, i::text || 'x' as asd
from foo) as subq;
Seq Scan on public.foo  (cost=0.00..34.00 rows=2400 width=4) Output: foo.i

db=# explain verbose select i from (select i, i || 'x'::text as asd
from foo) as subq;
Subquery Scan on subq  (cost=0.00..76.00 rows=2400 width=4) Output: subq.i ->  Seq Scan on public.foo
(cost=0.00..52.00rows=2400 width=4)       Output: foo.i, ((foo.i)::text || 'x'::text)
 

Case #1 uses the normal textcat(text, text) operator by automatically
coercing 'x' as text.
However, case #2 uses the anytextcat(anynonarray, text), which is
marked as volatile thus acts as an optimization barrier. Later, the
anytextcat SQL function is inlined and the EXPLAIN VERBOSE output has
no trace of what happened.

Is this something we can, or want, to fix?

One way would be doing preprocess_expression() before
pull_up_subqueries() so function inlining happens earlier, but I can't
imagine what unintended consequences that might have.

Another option would be creating explicit immutable  text || foo
operators for common types, but that sounds pretty hacky.

Regards,
Marti


Re: Text-any concatenation volatility acting as optimization barrier

От
Andrew Dunstan
Дата:

On 02/07/2012 03:18 PM, Marti Raudsepp wrote:
> Hi list,
>
> Andrew Dunstan reported an awkward-seeming case on IRC where shifting
> around a concatenation expression in a view made the planner choose a
> good or a bad execution plan.
>
> Simplified, it boils down to this:
>
> db=# create table foo(i int);
> db=# explain verbose select i from (select i, i::text || 'x' as asd
> from foo) as subq;
> Seq Scan on public.foo  (cost=0.00..34.00 rows=2400 width=4)
>    Output: foo.i
>
> db=# explain verbose select i from (select i, i || 'x'::text as asd
> from foo) as subq;
> Subquery Scan on subq  (cost=0.00..76.00 rows=2400 width=4)
>    Output: subq.i
>    ->   Seq Scan on public.foo  (cost=0.00..52.00 rows=2400 width=4)
>          Output: foo.i, ((foo.i)::text || 'x'::text)
>
> Case #1 uses the normal textcat(text, text) operator by automatically
> coercing 'x' as text.
> However, case #2 uses the anytextcat(anynonarray, text), which is
> marked as volatile thus acts as an optimization barrier. Later, the
> anytextcat SQL function is inlined and the EXPLAIN VERBOSE output has
> no trace of what happened.
>
> Is this something we can, or want, to fix?
>
> One way would be doing preprocess_expression() before
> pull_up_subqueries() so function inlining happens earlier, but I can't
> imagine what unintended consequences that might have.
>
> Another option would be creating explicit immutable  text || foo
> operators for common types, but that sounds pretty hacky.
>




It gets worse if you replace the expression with a call to a (non-sql) 
function returning text, which was in fact the original use case. Then 
you're pretty  much hosed.

cheers

andrew


Re: Text-any concatenation volatility acting as optimization barrier

От
Marti Raudsepp
Дата:
On Tue, Feb 7, 2012 at 22:31, Andrew Dunstan <andrew@dunslane.net> wrote:
> It gets worse if you replace the expression with a call to a (non-sql)
> function returning text, which was in fact the original use case. Then
> you're pretty  much hosed.

Oh, if it's a non-SQL function then marking it as STABLE/IMMUTABLE
should solve it -- did you try that?

Regards,
Marti


Re: Text-any concatenation volatility acting as optimization barrier

От
Andrew Dunstan
Дата:

On 02/07/2012 03:36 PM, Marti Raudsepp wrote:
> On Tue, Feb 7, 2012 at 22:31, Andrew Dunstan<andrew@dunslane.net>  wrote:
>> It gets worse if you replace the expression with a call to a (non-sql)
>> function returning text, which was in fact the original use case. Then
>> you're pretty  much hosed.
> Oh, if it's a non-SQL function then marking it as STABLE/IMMUTABLE
> should solve it -- did you try that?
>


Hmm, your're right. I could have sworn I tried that. That still leaves 
the oddity you reported, though.

cheers

andrew


Re: Text-any concatenation volatility acting as optimization barrier

От
Tom Lane
Дата:
Marti Raudsepp <marti@juffo.org> writes:
> Case #1 uses the normal textcat(text, text) operator by automatically
> coercing 'x' as text.
> However, case #2 uses the anytextcat(anynonarray, text), which is
> marked as volatile thus acts as an optimization barrier.

Hmm ... since those operators were invented (in 8.3), we have adopted a
policy that I/O functions are presumed to be no worse than stable:
http://archives.postgresql.org/pgsql-committers/2010-07/msg00307.php
ISTM that would justify relabeling anytextcat/textanycat as stable,
which should fix this.

> One way would be doing preprocess_expression() before
> pull_up_subqueries() so function inlining happens earlier, but I can't
> imagine what unintended consequences that might have.

I'm pretty sure that would be a bad idea.
        regards, tom lane


Re: Text-any concatenation volatility acting as optimization barrier

От
Marti Raudsepp
Дата:
On Wed, Feb 8, 2012 at 06:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Marti Raudsepp <marti@juffo.org> writes:
>> Case #1 uses the normal textcat(text, text) operator by automatically
>> coercing 'x' as text.
>> However, case #2 uses the anytextcat(anynonarray, text), which is
>> marked as volatile thus acts as an optimization barrier.
>
> Hmm ... since those operators were invented (in 8.3), we have adopted a
> policy that I/O functions are presumed to be no worse than stable:
> http://archives.postgresql.org/pgsql-committers/2010-07/msg00307.php
> ISTM that would justify relabeling anytextcat/textanycat as stable,
> which should fix this.

Yes, we should definitely take advantage of that.

I scanned through all of pg_proc, there are 4 functions like this that
can be changed: textanycat, anytextcat, quote_literal and
quote_nullable. All of these have SQL wrappers to cast their argument
to ::text.

 quote_literal  | select pg_catalog.quote_literal($1::pg_catalog.text)
 quote_nullable | select pg_catalog.quote_nullable($1::pg_catalog.text)
 textanycat     | select $1 || $2::pg_catalog.text
 anytextcat     | select $1::pg_catalog.text || $2

Patch attached (in git am format). Passes all regression tests (except
'json' which fails on my machine even on git master).

No documentation changes necessary AFAICT.

Regards,
Marti

Вложения

Re: Text-any concatenation volatility acting as optimization barrier

От
Robert Haas
Дата:
On Wed, Feb 8, 2012 at 4:53 AM, Marti Raudsepp <marti@juffo.org> wrote:
> Patch attached (in git am format). Passes all regression tests (except
> 'json' which fails on my machine even on git master).

Can you provide the diffs from the json test on your machine?  I don't
see any build-farm failures off-hand...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Text-any concatenation volatility acting as optimization barrier

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Feb 8, 2012 at 4:53 AM, Marti Raudsepp <marti@juffo.org> wrote:
>> Patch attached (in git am format). Passes all regression tests (except
>> 'json' which fails on my machine even on git master).

> Can you provide the diffs from the json test on your machine?  I don't
> see any build-farm failures off-hand...

I'm seeing diffs too after applying Marti's patch: instead of "z", "b",
etc, the field labels in the json values look like "f1", "f2", etc in
the output of queries such as

SELECT row_to_json(q) 
FROM (SELECT $$a$$ || x AS b,         y AS c,         ARRAY[ROW(x.*,ARRAY[1,2,3]),              ROW(y.*,ARRAY[4,5,6])]
ASz      FROM generate_series(1,2) x,           generate_series(4,5) y) q;
 

I believe what is happening is that now that the planner can flatten the
sub-select, the RowExprs are getting expanded differently, and that ties
into the "when do we lose column names" business that Andrew has been
on about.

However, I was not seeing that before applying the patch, so maybe Marti
has another issue too.

I am going to go ahead and commit the patch with the altered json
results, because IMO it is mere accident that these regression cases
were coming out with "nice" field labels anyway.  When and if Andrew
gets the RowExpr cases fixed properly, that will show up as these cases
going back to nicer-looking output.
        regards, tom lane


Re: Text-any concatenation volatility acting as optimization barrier

От
Marti Raudsepp
Дата:
On Wed, Feb 8, 2012 at 18:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Feb 8, 2012 at 4:53 AM, Marti Raudsepp <marti@juffo.org> wrote:
>>> Patch attached (in git am format). Passes all regression tests (except
>>> 'json' which fails on my machine even on git master).
>
>> Can you provide the diffs from the json test on your machine?  I don't
>> see any build-farm failures off-hand...
>
> I'm seeing diffs too after applying Marti's patch: instead of "z", "b",
> etc, the field labels in the json values look like "f1", "f2", etc in
> the output of queries such as

Sorry, my bad. I guess I got the two versions (patched and unpatched)
mixed up. Indeed this difference disappears when I remove my patch.
I'll have to be more careful when checking regression diffs in the
future. :)

Regards,
Marti


Re: Text-any concatenation volatility acting as optimization barrier

От
Andrew Dunstan
Дата:

On 02/08/2012 11:20 AM, Tom Lane wrote:
> I am going to go ahead and commit the patch with the altered json 
> results, because IMO it is mere accident that these regression cases 
> were coming out with "nice" field labels anyway. When and if Andrew 
> gets the RowExpr cases fixed properly, that will show up as these 
> cases going back to nicer-looking output.

I take it this is your way of making me hurry up with that work :-)

cheers

andrew


Re: Text-any concatenation volatility acting as optimization barrier

От
Tom Lane
Дата:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 02/08/2012 11:20 AM, Tom Lane wrote:
>> I am going to go ahead and commit the patch with the altered json 
>> results, because IMO it is mere accident that these regression cases 
>> were coming out with "nice" field labels anyway. When and if Andrew 
>> gets the RowExpr cases fixed properly, that will show up as these 
>> cases going back to nicer-looking output.

> I take it this is your way of making me hurry up with that work :-)

Well, right now the behavior is more consistent than it was before;
we would surely have gotten questions about it as it was.  Whether
you find time to improve it or not isn't my concern at the moment.

BTW, the order of the items in the json output doesn't match the column
order of the sub-select, with or without the patch.  This seems ... odd.
Is it intentional?
        regards, tom lane