Обсуждение: Avoid excessive inlining?

Поиск

Список

Период

Сортировка

Avoid excessive inlining?

От

"Joel Jacobson"

Дата:

19 декабря 2020 г., 08:59:21

Is there a way to avoid excessive inlining when writing pure SQL functions, without having to use PL/pgSQL?

The JOIN LATERAL and Nested Subqueries versions run much slower than the PL/pgSQL version:

Execution Times:

JOIN LATERAL: 12198.010 ms

Nested Subqueries: 12250.077 ms

PL/pgSQL: 312.493 ms

The three functions below are equivalent, they all compute the Eastern date for a given year.

CREATE OR REPLACE FUNCTION easter_lateral(year integer)

RETURNS DATE

LANGUAGE sql

AS $$

SELECT make_date(year, easter_month, easter_day)

FROM (VALUES (year % 19, year / 100)) AS Q1(g,c)

JOIN LATERAL (VALUES ((c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30)) AS Q2(h) ON TRUE

JOIN LATERAL (VALUES (h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)))) AS Q3(i) ON TRUE

JOIN LATERAL (VALUES ((year + year/4 + i + 2 - c + c/4) % 7)) AS Q4(j) ON TRUE

JOIN LATERAL (VALUES (i - j)) AS Q5(p) ON TRUE

JOIN LATERAL (VALUES (3 + (p + 26)/30, 1 + (p + 27 + (p + 6)/40) % 31)) AS Q6(easter_month, easter_day) ON TRUE

$$;

CREATE OR REPLACE FUNCTION easter_nested_subqueries(year integer)

RETURNS DATE

LANGUAGE sql

AS $$

SELECT make_date(year, easter_month, easter_day)

FROM (

SELECT *,

3 + (p + 26)/30 AS easter_month,

1 + (p + 27 + (p + 6)/40) % 31 AS easter_day

FROM (

SELECT *,

i - j AS p

FROM (

SELECT *,

(year + year/4 + i + 2 - c + c/4) % 7 AS j

FROM (

SELECT *,

h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i

FROM (

SELECT *,

(c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h

FROM (

SELECT

year % 19 AS g,

year / 100 AS c

) AS Q1

) AS Q2

) AS Q3

) AS Q4

) AS Q5

) AS Q6

$$;

CREATE OR REPLACE FUNCTION easter_plpgsql(year integer)

RETURNS date

LANGUAGE plpgsql

AS $$

-- Based on: https://github.com/christopherthompson81/pgsql_holidays/blob/master/utils/easter.pgsql

DECLARE

g CONSTANT integer := year % 19;

c CONSTANT integer := year / 100;

h CONSTANT integer := (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30;

i CONSTANT integer := h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11));

j CONSTANT integer := (year + year/4 + i + 2 - c + c/4) % 7;

p CONSTANT integer := i - j;

BEGIN

RETURN make_date(

year,

3 + (p + 26)/30,

1 + (p + 27 + (p + 6)/40) % 31

);

END;

$$;

joel=# EXPLAIN ANALYZE SELECT MAX(easter) FROM (SELECT easter_lateral(year) AS easter FROM generate_series(1,100000) AS year) AS x;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------

Aggregate (cost=27250.00..27250.01 rows=1 width=4) (actual time=12195.974..12195.974 rows=1 loops=1)

-> Function Scan on generate_series year (cost=0.00..26000.00 rows=100000 width=4) (actual time=15.840..12167.758 rows=100000 loops=1)

Planning Time: 0.262 ms

Execution Time: 12198.010 ms

(4 rows)

joel=# EXPLAIN ANALYZE SELECT MAX(easter) FROM (SELECT easter_nested_subqueries(year) AS easter FROM generate_series(1,100000) AS year) AS x;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------

Aggregate (cost=27250.00..27250.01 rows=1 width=4) (actual time=12248.316..12248.317 rows=1 loops=1)

-> Function Scan on generate_series year (cost=0.00..26000.00 rows=100000 width=4) (actual time=17.707..12219.500 rows=100000 loops=1)

Planning Time: 0.277 ms

Execution Time: 12250.077 ms

(4 rows)

joel=# EXPLAIN ANALYZE SELECT MAX(easter) FROM (SELECT easter_plpgsql(year) AS easter FROM generate_series(1,100000) AS year) AS x;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------

Aggregate (cost=27250.00..27250.01 rows=1 width=4) (actual time=311.107..311.108 rows=1 loops=1)

-> Function Scan on generate_series year (cost=0.00..26000.00 rows=100000 width=4) (actual time=12.369..296.221 rows=100000 loops=1)

Planning Time: 0.058 ms

Execution Time: 312.493 ms

(4 rows)

If we look at the plan for the lateral and subqueries versions, we can see how the inlining expands to huge expressions.

Could this be the reason they run so much slower than the PL/pgSQL version?

PREPARE q_lateral AS SELECT make_date($1, easter_month, easter_day)

FROM (VALUES ($1 % 19, $1 / 100)) AS Q1(g,c)

JOIN LATERAL (VALUES ((c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30)) AS

Q2(h) ON TRUE

JOIN LATERAL (VALUES (h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 -

g)/11)))) AS Q3(i) ON TRUE

JOIN LATERAL (VALUES (($1 + $1/4 + i + 2 - c + c/4) % 7)) AS Q4(j) ON TRUE

JOIN LATERAL (VALUES (i - j)) AS Q5(p) ON TRUE

JOIN LATERAL (VALUES (3 + (p + 26)/30, 1 + (p + 27 + (p + 6)/40) % 31))

AS Q6(easter_month, easter_day) ON TRUE;

SET plan_cache_mode = 'force_generic_plan';

EXPLAIN (ANALYZE, VERBOSE) EXECUTE q_lateral(2020);

Result (cost=0.00..1.14 rows=1 width=4) (actual time=0.038..0.038 rows=1 loops=1)

Output: make_date($1, (3 + (((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 26) / 30)), (1 + ((((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 27) + (((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 6) / 40)) % 31)))

Planning Time: 4.144 ms

Execution Time: 0.220 ms

PREPARE q_subqueries AS SELECT make_date($1, easter_month, easter_day)

FROM (

SELECT *,

3 + (p + 26)/30 AS easter_month,

1 + (p + 27 + (p + 6)/40) % 31 AS easter_day

FROM (

SELECT *,

i - j AS p

FROM (

SELECT *,

($1 + $1/4 + i + 2 - c + c/4) % 7 AS j

FROM (

SELECT *,

h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i

FROM (

SELECT *,

(c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h

FROM (

SELECT

$1 % 19 AS g,

$1 / 100 AS c

) AS Q1

) AS Q2

) AS Q3

) AS Q4

) AS Q5

) AS Q6

;

SET plan_cache_mode = 'force_generic_plan';

EXPLAIN (ANALYZE, VERBOSE) EXECUTE q_subqueries(2020);

Result (cost=0.00..1.14 rows=1 width=4) (actual time=0.043..0.044 rows=1 loops=1)

Planning Time: 2.443 ms

Execution Time: 0.235 ms

Thanks Andreas Karlsson for teaching me the PREPARE + SET plan_cache_mode = 'force_generic_plan'; + EXPLAIN trick, very useful.

Best regards,

Joel

Re: Avoid excessive inlining?

От

Philip Semanchuk

Дата:

21 декабря 2020 г., 19:45:41

> On Dec 19, 2020, at 12:59 AM, Joel Jacobson <joel@compiler.org> wrote:
>
> Is there a way to avoid excessive inlining when writing pure SQL functions, without having to use PL/pgSQL?

Hi Joel,
The rules for inlining are here:
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions

According to those rules, if you declared your SQL function as VOLATILE, then Postgres wouldn’t inline it. From your
question,I’m not sure if you want to have the same function inlined sometimes and not others. I can’t think of a way to
dothat offhand. 

Hope this helps,
Philip

Re: Avoid excessive inlining?

От

Laurenz Albe

Дата:

22 декабря 2020 г., 16:40:20

On Mon, 2020-12-21 at 11:45 -0500, Philip Semanchuk wrote:
> > On Dec 19, 2020, at 12:59 AM, Joel Jacobson <joel@compiler.org> wrote:
> > Is there a way to avoid excessive inlining when writing pure SQL functions, without having to use PL/pgSQL?
> 
> The rules for inlining are here:
> https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
> 
> According to those rules, if you declared your SQL function as VOLATILE, then Postgres wouldn’t
>  inline it. From your question, I’m not sure if you want to have the same function inlined
>  sometimes and not others. I can’t think of a way to do that offhand.

Where do you see that?  As far as I know, VOLATILE is the best choice if you
want the function to be inlined.

I would say that the simplest way to prevent a function from being inlined
is to set a parameter on it:

  ALTER FUNCTION f() SET enable_seqscan = on;

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com

Re: Avoid excessive inlining?

От

Philip Semanchuk

Дата:

22 декабря 2020 г., 17:57:19


> On Dec 22, 2020, at 8:40 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Mon, 2020-12-21 at 11:45 -0500, Philip Semanchuk wrote:
>>> On Dec 19, 2020, at 12:59 AM, Joel Jacobson <joel@compiler.org> wrote:
>>> Is there a way to avoid excessive inlining when writing pure SQL functions, without having to use PL/pgSQL?
>>
>> The rules for inlining are here:
>> https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
>>
>> According to those rules, if you declared your SQL function as VOLATILE, then Postgres wouldn’t
>> inline it. From your question, I’m not sure if you want to have the same function inlined
>> sometimes and not others. I can’t think of a way to do that offhand.
>
> Where do you see that?  As far as I know, VOLATILE is the best choice if you
> want the function to be inlined.

Ugh, you’re absolutely right, and I’m sorry for spreading misinformation. That’s what I get from quoting from memory
ratherthan reading the link that I posted.  


>
> I would say that the simplest way to prevent a function from being inlined
> is to set a parameter on it:
>
>  ALTER FUNCTION f() SET enable_seqscan = on;

I appreciate the correction and education.

Cheers
Philip

Re: Avoid excessive inlining?

От

"Joel Jacobson"

Дата:

22 декабря 2020 г., 19:10:35

On Tue, Dec 22, 2020, at 14:40, Laurenz Albe wrote:

>I would say that the simplest way to prevent a function from being inlined

>is to set a parameter on it:

> ALTER FUNCTION f() SET enable_seqscan = on;

Thanks, very useful, didn't know about that trick.

I think I was a bit unclear about my problem, and might have used the wrong terminology.

In my LATERAL query, there are calculations in a certain order.

For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.

However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded expression, which doesn't look very efficient.

Ideas?

PREPARE q_lateral AS SELECT make_date($1, easter_month, easter_day)

FROM (VALUES ($1 % 19, $1 / 100)) AS Q1(g,c)

JOIN LATERAL (VALUES ((c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30)) AS

Q2(h) ON TRUE

JOIN LATERAL (VALUES (h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 -

g)/11)))) AS Q3(i) ON TRUE

JOIN LATERAL (VALUES (($1 + $1/4 + i + 2 - c + c/4) % 7)) AS Q4(j) ON TRUE

JOIN LATERAL (VALUES (i - j)) AS Q5(p) ON TRUE

JOIN LATERAL (VALUES (3 + (p + 26)/30, 1 + (p + 27 + (p + 6)/40) % 31))

AS Q6(easter_month, easter_day) ON TRUE;

SET plan_cache_mode = 'force_generic_plan';

EXPLAIN (ANALYZE, VERBOSE) EXECUTE q_lateral(2020);

Result (cost=0.00..1.14 rows=1 width=4) (actual time=0.038..0.038 rows=1 loops=1)

Planning Time: 4.144 ms

Execution Time: 0.220 ms

Re: Avoid excessive inlining?

От

Tom Lane

Дата:

22 декабря 2020 г., 19:32:36

"Joel Jacobson" <joel@compiler.org> writes:
> I think I was a bit unclear about my problem, and might have used the wrong terminology.
> In my LATERAL query, there are calculations in a certain order.
> For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
> However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded
expression,which doesn't look very efficient. 

Yeah, this isn't really about function inlining, it's about subquery
flattening (which is similar in some ways, but not the same thing).

Unfortunately, subquery flattening happens early enough in the planner
that there's no chance of making any useful cost comparisons to decide
whether to do it or not.  So we just do it unconditionally.  I'm
not really sure that failing to do it would provide a better outcome
in this situation anyway --- sure, you'd save a few scalar calculations,
but the overhead of running additional plan nodes could outweigh that.

The long and the short of it is that SQL isn't terribly well suited to
execute a fundamentally stepwise, imperative algorithm like this one.
Rather than hacking up cute tricks with LATERAL, you should just use
a language that *is* well suited.  That's why we provide PLs.

FWIW, another trick for inserting optimization fences is WITH.
So you could do something like

WITH Q1(g,c) AS MATERIALIZED
  (SELECT year % 19, year / 100),
Q2(h) AS MATERIALIZED
  (SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),
...
SELECT make_date(year, easter_month, easter_day) FROM Q6;

But I'd bet lunch that that won't be faster for this example,
because there's a lot of overhead in CTEs.

            regards, tom lane

Re: Avoid excessive inlining?

От

"Joel Jacobson"

Дата:

22 декабря 2020 г., 20:44:57

Thanks Tom,

this was exactly what I needed to hear.

I guess I recently have become too fond of all the nice new "recent" advanced SQL features,

such as LATERAL and MATERIALIZED CTEs, now in my possession since I now only code on hobby

projects, after all the years stuck in an old PostgreSQL version in my previous job,

to realise that such SQL features are not always a good fit for the job at all times.

I rewrote all the slow pure SQL code in PL/pgSQL and got as 568% speed-up in the CBOR to JSON converter I'm working on. Thanks also for giving me inspiration on the wording for my own commit message:

https://github.com/truthly/pg-cbor/commit/7ea7640f699cdf271ffa9cfb5d059f8141f541ed

Author: Joel Jakobsson <joel@compiler.org>

Date: Tue Dec 22 18:21:47 2020 +0100

Optimize by rewriting declarative pure SQL code into imperative PL/pgSQL

PostgreSQL's SQL language isn't terribly well suited to execute

a fundamentally stepwise, imperative algorithm like CBOR.

Rather than hacking up cute tricks with LATERAL, we should just use

a language that *is* well suited, a PL, like PL/pgSQL.

-- Pure SQL (before):

select * from pg_stat_xact_user_functions ;

schemaname | funcname | calls | total_time | self_time

------------+------------+-------+------------+-----------

cbor | next_item | 14 | 48.91024 | 38.964918

cbor | next_array | 1 | 7.297435 | 1.816102

cbor | next_map | 2 | 40.844352 | 7.8957

cbor | to_jsonb | 1 | 50.222183 | 1.311943

-- PL/pgSQL (after):

select * from pg_stat_xact_user_functions ;

schemaname | funcname | calls | total_time | self_time

------------+--------------+-------+------------+-----------

cbor | next_item | 14 | 8.021371 | 3.358271

cbor | next_array | 1 | 0.565398 | 0.353071

cbor | next_map | 2 | 5.607702 | 1.324057

cbor | to_jsonb | 1 | 8.823691 | 0.80232

FUNCTIONS/major_type_0.sql | 23 +++++++++++++++++++++++

FUNCTIONS/major_type_1.sql | 23 +++++++++++++++++++++++

FUNCTIONS/major_type_2.sql | 23 +++++++++++++++++++++++

FUNCTIONS/major_type_3.sql | 23 +++++++++++++++++++++++

FUNCTIONS/major_type_4.sql | 23 +++++++++++++++++++++++

FUNCTIONS/major_type_5.sql | 23 +++++++++++++++++++++++

FUNCTIONS/major_type_6.sql | 40 ++++++++++++++++++++++++++++++++++++++++

FUNCTIONS/major_type_7.sql | 43 +++++++++++++++++++++++++++++++++++++++++++

FUNCTIONS/next_item.sql | 109 ++++++++++++++++++++++++++++++++++++-------------------------------------------------------------------------

Makefile | 8 ++++++++

expected/rfc7049_appendix_a.out | 52 +++++++++++++++++++++++++++++++++++++++-------------

11 files changed, 304 insertions(+), 86 deletions(-)

Best regards,

Joel

On Tue, Dec 22, 2020, at 17:32, Tom Lane wrote:

"Joel Jacobson" <joel@compiler.org> writes:

> I think I was a bit unclear about my problem, and might have used the wrong terminology.

> In my LATERAL query, there are calculations in a certain order.

> For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.

> However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded expression, which doesn't look very efficient.

Yeah, this isn't really about function inlining, it's about subquery

flattening (which is similar in some ways, but not the same thing).

Unfortunately, subquery flattening happens early enough in the planner

that there's no chance of making any useful cost comparisons to decide

whether to do it or not. So we just do it unconditionally. I'm

not really sure that failing to do it would provide a better outcome

in this situation anyway --- sure, you'd save a few scalar calculations,

but the overhead of running additional plan nodes could outweigh that.

The long and the short of it is that SQL isn't terribly well suited to

execute a fundamentally stepwise, imperative algorithm like this one.

Rather than hacking up cute tricks with LATERAL, you should just use

a language that *is* well suited. That's why we provide PLs.

FWIW, another trick for inserting optimization fences is WITH.

So you could do something like

WITH Q1(g,c) AS MATERIALIZED

(SELECT year % 19, year / 100),

Q2(h) AS MATERIALIZED

(SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),

...

SELECT make_date(year, easter_month, easter_day) FROM Q6;

But I'd bet lunch that that won't be faster for this example,

because there's a lot of overhead in CTEs.

regards, tom lane

Kind regards,

Joel

Re: Avoid excessive inlining?

От

"Joel Jacobson"

Дата:

02 января 2021 г., 22:06:54

I found yet another trick, which actually seems to be slightly faster than the plpgsql version.

The trick is to use VIEW ... WITH (security_barrier) to tell the optimizer it shouldn’t flatten the subqueries.

CREATE TABLE eastern (year integer);

INSERT INTO eastern (year) SELECT generate_series(1,100000);

CREATE VIEW v0 WITH (security_barrier) AS (

SELECT

year,

year % 19 AS g,

year / 100 AS c

FROM eastern

);

CREATE VIEW v1 WITH (security_barrier) AS (

SELECT

year,

(c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h

FROM v0

);

CREATE VIEW v2 WITH (security_barrier) AS (

SELECT

year,

h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i

FROM v1

);

CREATE VIEW v3 WITH (security_barrier) AS (

SELECT

year,

(year + year/4 + i + 2 - c + c/4) % 7 AS j

FROM v2

);

CREATE VIEW v4 WITH (security_barrier) AS (

SELECT

year,

i - j AS p

FROM v3

);

CREATE VIEW v5 WITH (security_barrier) AS (

SELECT

year,

3 + (p + 26)/30 AS easter_month,

1 + (p + 27 + (p + 6)/40) % 31 AS easter_day

FROM v4

);

EXPLAIN ANALYZE VERBOSE

SELECT make_date(year, easter_month, easter_day) FROM v5;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------

Subquery Scan on v5 (cost=0.00..17943.00 rows=100000 width=4) (actual time=0.025..213.996 rows=100000 loops=1)

Output: make_date(v5.year, v5.easter_month, v5.easter_day)

-> Subquery Scan on v4 (cost=0.00..16693.00 rows=100000 width=12) (actual time=0.024..191.448 rows=100000 loops=1)

Output: v4.year, (3 + ((v4.p + 26) / 30)), (1 + (((v4.p + 27) + ((v4.p + 6) / 40)) % 31))

-> Subquery Scan on v3 (cost=0.00..13443.00 rows=100000 width=8) (actual time=0.022..155.690 rows=100000 loops=1)

Output: v3.year, (v3.i - v3.j)

-> Subquery Scan on v2 (cost=0.00..12193.00 rows=100000 width=12) (actual time=0.021..135.485 rows=100000 loops=1)

Output: v2.year, v2.i, ((((((v2.year + (v2.year / 4)) + v2.i) + 2) - v2.c) + (v2.c / 4)) % 7)

-> Subquery Scan on v1 (cost=0.00..9193.00 rows=100000 width=12) (actual time=0.019..97.935 rows=100000 loops=1)

Output: v1.year, v1.c, (v1.h - ((v1.h / 28) * (1 - (((v1.h / 28) * (29 / (v1.h + 1))) * ((21 - v1.g) / 11)))))

-> Subquery Scan on v0 (cost=0.00..5443.00 rows=100000 width=16) (actual time=0.017..57.988 rows=100000 loops=1)

Output: v0.year, v0.g, v0.c, (((((v0.c - (v0.c / 4)) - (((8 * v0.c) + 13) / 25)) + (19 * v0.g)) + 15) % 30)

-> Seq Scan on public.eastern (cost=0.00..1943.00 rows=100000 width=12) (actual time=0.015..23.908 rows=100000 loops=1)

Output: eastern.year, (eastern.year % 19), (eastern.year / 100)

Planning Time: 0.274 ms

Execution Time: 220.698 ms

(16 rows)

EXPLAIN ANALYZE VERBOSE

SELECT easter_plpgsql(year) FROM eastern;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------

Seq Scan on public.eastern (cost=0.00..26443.00 rows=100000 width=4) (actual time=0.077..301.519 rows=100000 loops=1)

Output: easter_plpgsql(year)

Planning Time: 0.049 ms

Execution Time: 309.119 ms

(4 rows)

On Tue, Dec 22, 2020, at 17:32, Tom Lane wrote:

"Joel Jacobson" <joel@compiler.org> writes:
> I think I was a bit unclear about my problem, and might have used the wrong terminology.
> In my LATERAL query, there are calculations in a certain order.
> For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
> However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded expression, which doesn't look very efficient.

Yeah, this isn't really about function inlining, it's about subquery
flattening (which is similar in some ways, but not the same thing).

Unfortunately, subquery flattening happens early enough in the planner
that there's no chance of making any useful cost comparisons to decide
whether to do it or not. So we just do it unconditionally. I'm
not really sure that failing to do it would provide a better outcome
in this situation anyway --- sure, you'd save a few scalar calculations,
but the overhead of running additional plan nodes could outweigh that.

The long and the short of it is that SQL isn't terribly well suited to
execute a fundamentally stepwise, imperative algorithm like this one.
Rather than hacking up cute tricks with LATERAL, you should just use
a language that *is* well suited. That's why we provide PLs.

FWIW, another trick for inserting optimization fences is WITH.
So you could do something like

WITH Q1(g,c) AS MATERIALIZED
(SELECT year % 19, year / 100),
Q2(h) AS MATERIALIZED
(SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),
...
SELECT make_date(year, easter_month, easter_day) FROM Q6;

But I'd bet lunch that that won't be faster for this example,
because there's a lot of overhead in CTEs.

regards, tom lane

Kind regards,

Joel

Re: Avoid excessive inlining?

От

Pavel Stehule

Дата:

02 января 2021 г., 22:37:13

so 2. 1. 2021 v 20:07 odesílatel Joel Jacobson <joel@compiler.org> napsal:

I found yet another trick, which actually seems to be slightly faster than the plpgsql version.

What version of Postgres do you use?

plpgsql in Postgres 13 is significantly faster than on older versions

Regards

Pavel

The trick is to use VIEW ... WITH (security_barrier) to tell the optimizer it shouldn’t flatten the subqueries.

CREATE TABLE eastern (year integer);

INSERT INTO eastern (year) SELECT generate_series(1,100000);

CREATE VIEW v0 WITH (security_barrier) AS (
SELECT
    year,
    year % 19 AS g,
    year / 100 AS c
FROM eastern
);
CREATE VIEW v1 WITH (security_barrier) AS (
SELECT
    year,
    g,
    c,
    (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h
FROM v0
);
CREATE VIEW v2 WITH (security_barrier) AS (
SELECT
    year,
    c,
    h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i
FROM v1
);
CREATE VIEW v3 WITH (security_barrier) AS (
SELECT
    year,
    i,
    (year + year/4 + i + 2 - c + c/4) % 7 AS j
FROM v2
);
CREATE VIEW v4 WITH (security_barrier) AS (
SELECT
    year,
    i - j AS p
FROM v3
);
CREATE VIEW v5 WITH (security_barrier) AS (
SELECT
    year,
    3 + (p + 26)/30 AS easter_month,
    1 + (p + 27 + (p + 6)/40) % 31 AS easter_day
FROM v4
);

EXPLAIN ANALYZE VERBOSE
SELECT make_date(year, easter_month, easter_day) FROM v5;

                                                                        QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on v5 (cost=0.00..17943.00 rows=100000 width=4) (actual time=0.025..213.996 rows=100000 loops=1)
   Output: make_date(v5.year, v5.easter_month, v5.easter_day)
   -> Subquery Scan on v4 (cost=0.00..16693.00 rows=100000 width=12) (actual time=0.024..191.448 rows=100000 loops=1)
         Output: v4.year, (3 + ((v4.p + 26) / 30)), (1 + (((v4.p + 27) + ((v4.p + 6) / 40)) % 31))
         -> Subquery Scan on v3 (cost=0.00..13443.00 rows=100000 width=8) (actual time=0.022..155.690 rows=100000 loops=1)
               Output: v3.year, (v3.i - v3.j)
               -> Subquery Scan on v2 (cost=0.00..12193.00 rows=100000 width=12) (actual time=0.021..135.485 rows=100000 loops=1)
                     Output: v2.year, v2.i, ((((((v2.year + (v2.year / 4)) + v2.i) + 2) - v2.c) + (v2.c / 4)) % 7)
                     -> Subquery Scan on v1 (cost=0.00..9193.00 rows=100000 width=12) (actual time=0.019..97.935 rows=100000 loops=1)
                           Output: v1.year, v1.c, (v1.h - ((v1.h / 28) * (1 - (((v1.h / 28) * (29 / (v1.h + 1))) * ((21 - v1.g) / 11)))))
                           -> Subquery Scan on v0 (cost=0.00..5443.00 rows=100000 width=16) (actual time=0.017..57.988 rows=100000 loops=1)
                                 Output: v0.year, v0.g, v0.c, (((((v0.c - (v0.c / 4)) - (((8 * v0.c) + 13) / 25)) + (19 * v0.g)) + 15) % 30)
                                 -> Seq Scan on public.eastern (cost=0.00..1943.00 rows=100000 width=12) (actual time=0.015..23.908 rows=100000 loops=1)
                                       Output: eastern.year, (eastern.year % 19), (eastern.year / 100)
Planning Time: 0.274 ms
Execution Time: 220.698 ms
(16 rows)

EXPLAIN ANALYZE VERBOSE
SELECT easter_plpgsql(year) FROM eastern;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.eastern (cost=0.00..26443.00 rows=100000 width=4) (actual time=0.077..301.519 rows=100000 loops=1)
   Output: easter_plpgsql(year)
Planning Time: 0.049 ms
Execution Time: 309.119 ms
(4 rows)

On Tue, Dec 22, 2020, at 17:32, Tom Lane wrote:
"Joel Jacobson" <joel@compiler.org> writes:
> I think I was a bit unclear about my problem, and might have used the wrong terminology.
> In my LATERAL query, there are calculations in a certain order.
> For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
> However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded expression, which doesn't look very efficient.

Yeah, this isn't really about function inlining, it's about subquery
flattening (which is similar in some ways, but not the same thing).

Unfortunately, subquery flattening happens early enough in the planner
that there's no chance of making any useful cost comparisons to decide
whether to do it or not. So we just do it unconditionally. I'm
not really sure that failing to do it would provide a better outcome
in this situation anyway --- sure, you'd save a few scalar calculations,
but the overhead of running additional plan nodes could outweigh that.

The long and the short of it is that SQL isn't terribly well suited to
execute a fundamentally stepwise, imperative algorithm like this one.
Rather than hacking up cute tricks with LATERAL, you should just use
a language that *is* well suited. That's why we provide PLs.

FWIW, another trick for inserting optimization fences is WITH.
So you could do something like

WITH Q1(g,c) AS MATERIALIZED
(SELECT year % 19, year / 100),
Q2(h) AS MATERIALIZED
(SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),
...
SELECT make_date(year, easter_month, easter_day) FROM Q6;

But I'd bet lunch that that won't be faster for this example,
because there's a lot of overhead in CTEs.

regards, tom lane

Kind regards,

Joel

Re: Avoid excessive inlining?

От

"Joel Jacobson"

Дата:

02 января 2021 г., 23:02:38

13.1

On Sat, Jan 2, 2021, at 20:37, Pavel Stehule wrote:

Hi

so 2. 1. 2021 v 20:07 odesílatel Joel Jacobson <joel@compiler.org> napsal:

I found yet another trick, which actually seems to be slightly faster than the plpgsql version.

What version of Postgres do you use?

plpgsql in Postgres 13 is significantly faster than on older versions

Regards

Pavel

Re: Avoid excessive inlining?

От

Pavel Stehule

Дата:

03 января 2021 г., 08:57:50

so 2. 1. 2021 v 21:02 odesílatel Joel Jacobson <joel@compiler.org> napsal:

13.1

ok there is plpgsql most fast

you can check if there are not some hidden implicit casting in your code.

Regards

Pavel

On Sat, Jan 2, 2021, at 20:37, Pavel Stehule wrote:
Hi

so 2. 1. 2021 v 20:07 odesílatel Joel Jacobson <joel@compiler.org> napsal:

I found yet another trick, which actually seems to be slightly faster than the plpgsql version.

What version of Postgres do you use?

plpgsql in Postgres 13 is significantly faster than on older versions

Regards

Pavel

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Avoid excessive inlining?