Обсуждение: Avoid excessive inlining?

Поиск
Список
Период
Сортировка

Avoid excessive inlining?

От
"Joel Jacobson"
Дата:
Is there a way to avoid excessive inlining when writing pure SQL functions, without having to use PL/pgSQL?

The JOIN LATERAL and Nested Subqueries versions run much slower than the PL/pgSQL version:

Execution Times:
JOIN LATERAL: 12198.010 ms
Nested Subqueries: 12250.077 ms
PL/pgSQL: 312.493 ms

The three functions below are equivalent, they all compute the Eastern date for a given year.

CREATE OR REPLACE FUNCTION easter_lateral(year integer)
RETURNS DATE
LANGUAGE sql
AS $$
SELECT make_date(year, easter_month, easter_day)
FROM (VALUES (year % 19, year / 100)) AS Q1(g,c)
JOIN LATERAL (VALUES ((c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30)) AS Q2(h) ON TRUE
JOIN LATERAL (VALUES (h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)))) AS Q3(i) ON TRUE
JOIN LATERAL (VALUES ((year + year/4 + i + 2 - c + c/4) % 7)) AS Q4(j) ON TRUE
JOIN LATERAL (VALUES (i - j)) AS Q5(p) ON TRUE
JOIN LATERAL (VALUES (3 + (p + 26)/30, 1 + (p + 27 + (p + 6)/40) % 31)) AS Q6(easter_month, easter_day) ON TRUE
$$;

CREATE OR REPLACE FUNCTION easter_nested_subqueries(year integer)
RETURNS DATE
LANGUAGE sql
AS $$
SELECT make_date(year, easter_month, easter_day)
FROM (
  SELECT *,
    3 + (p + 26)/30 AS easter_month,
    1 + (p + 27 + (p + 6)/40) % 31 AS easter_day
  FROM (
    SELECT *,
      i - j AS p
    FROM (
      SELECT *,
      (year + year/4 + i + 2 - c + c/4) % 7 AS j
      FROM (
        SELECT *,
          h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i
        FROM (
          SELECT *,
            (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h
          FROM (
            SELECT
              year % 19 AS g,
              year / 100 AS c
          ) AS Q1
        ) AS Q2
      ) AS Q3
    ) AS Q4
  ) AS Q5
) AS Q6
$$;

CREATE OR REPLACE FUNCTION easter_plpgsql(year integer)
RETURNS date
LANGUAGE plpgsql
AS $$
-- Based on: https://github.com/christopherthompson81/pgsql_holidays/blob/master/utils/easter.pgsql
DECLARE
g CONSTANT integer := year % 19;
c CONSTANT integer := year / 100;
h CONSTANT integer := (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30;
i CONSTANT integer := h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11));
j CONSTANT integer := (year + year/4 + i + 2 - c + c/4) % 7;
p CONSTANT integer := i - j;
BEGIN
RETURN make_date(
  year,
  3 + (p + 26)/30,
  1 + (p + 27 + (p + 6)/40) % 31
);
END;
$$;

joel=# EXPLAIN ANALYZE SELECT MAX(easter) FROM (SELECT easter_lateral(year) AS easter FROM generate_series(1,100000) AS year) AS x;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=27250.00..27250.01 rows=1 width=4) (actual time=12195.974..12195.974 rows=1 loops=1)
   ->  Function Scan on generate_series year  (cost=0.00..26000.00 rows=100000 width=4) (actual time=15.840..12167.758 rows=100000 loops=1)
Planning Time: 0.262 ms
Execution Time: 12198.010 ms
(4 rows)

joel=# EXPLAIN ANALYZE SELECT MAX(easter) FROM (SELECT easter_nested_subqueries(year) AS easter FROM generate_series(1,100000) AS year) AS x;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=27250.00..27250.01 rows=1 width=4) (actual time=12248.316..12248.317 rows=1 loops=1)
   ->  Function Scan on generate_series year  (cost=0.00..26000.00 rows=100000 width=4) (actual time=17.707..12219.500 rows=100000 loops=1)
Planning Time: 0.277 ms
Execution Time: 12250.077 ms
(4 rows)

joel=# EXPLAIN ANALYZE SELECT MAX(easter) FROM (SELECT easter_plpgsql(year) AS easter FROM generate_series(1,100000) AS year) AS x;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=27250.00..27250.01 rows=1 width=4) (actual time=311.107..311.108 rows=1 loops=1)
   ->  Function Scan on generate_series year  (cost=0.00..26000.00 rows=100000 width=4) (actual time=12.369..296.221 rows=100000 loops=1)
Planning Time: 0.058 ms
Execution Time: 312.493 ms
(4 rows)

If we look at the plan for the lateral and subqueries versions, we can see how the inlining expands to huge expressions.

Could this be the reason they run so much slower than the PL/pgSQL version?

PREPARE q_lateral AS SELECT make_date($1, easter_month, easter_day)
FROM (VALUES ($1 % 19, $1 / 100)) AS Q1(g,c)
JOIN LATERAL (VALUES ((c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30)) AS
Q2(h) ON TRUE
JOIN LATERAL (VALUES (h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 -
g)/11)))) AS Q3(i) ON TRUE
JOIN LATERAL (VALUES (($1 + $1/4 + i + 2 - c + c/4) % 7)) AS Q4(j) ON TRUE
JOIN LATERAL (VALUES (i - j)) AS Q5(p) ON TRUE
JOIN LATERAL (VALUES (3 + (p + 26)/30, 1 + (p + 27 + (p + 6)/40) % 31))
AS Q6(easter_month, easter_day) ON TRUE;

SET plan_cache_mode = 'force_generic_plan';

EXPLAIN (ANALYZE, VERBOSE) EXECUTE q_lateral(2020);

Result  (cost=0.00..1.14 rows=1 width=4) (actual time=0.038..0.038 rows=1 loops=1)
  Output: make_date($1, (3 + (((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 26) / 30)), (1 + ((((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 27) + (((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 6) / 40)) % 31)))
Planning Time: 4.144 ms
Execution Time: 0.220 ms

PREPARE q_subqueries AS SELECT make_date($1, easter_month, easter_day)
FROM (
  SELECT *,
    3 + (p + 26)/30 AS easter_month,
    1 + (p + 27 + (p + 6)/40) % 31 AS easter_day
  FROM (
    SELECT *,
      i - j AS p
    FROM (
      SELECT *,
      ($1 + $1/4 + i + 2 - c + c/4) % 7 AS j
      FROM (
        SELECT *,
          h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i
        FROM (
          SELECT *,
            (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h
          FROM (
            SELECT
              $1 % 19 AS g,
              $1 / 100 AS c
          ) AS Q1
        ) AS Q2
      ) AS Q3
    ) AS Q4
  ) AS Q5
) AS Q6
;

SET plan_cache_mode = 'force_generic_plan';

EXPLAIN (ANALYZE, VERBOSE) EXECUTE q_subqueries(2020);

Result  (cost=0.00..1.14 rows=1 width=4) (actual time=0.043..0.044 rows=1 loops=1)
  Output: make_date($1, (3 + (((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 26) / 30)), (1 + ((((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 27) + (((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 6) / 40)) % 31)))
Planning Time: 2.443 ms
Execution Time: 0.235 ms

Thanks Andreas Karlsson for teaching me the PREPARE + SET plan_cache_mode = 'force_generic_plan'; + EXPLAIN trick, very useful.

Best regards,

Joel

Re: Avoid excessive inlining?

От
Philip Semanchuk
Дата:

> On Dec 19, 2020, at 12:59 AM, Joel Jacobson <joel@compiler.org> wrote:
>
> Is there a way to avoid excessive inlining when writing pure SQL functions, without having to use PL/pgSQL?

Hi Joel,
The rules for inlining are here:
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions

According to those rules, if you declared your SQL function as VOLATILE, then Postgres wouldn’t inline it. From your
question,I’m not sure if you want to have the same function inlined sometimes and not others. I can’t think of a way to
dothat offhand. 

Hope this helps,
Philip




Re: Avoid excessive inlining?

От
Laurenz Albe
Дата:
On Mon, 2020-12-21 at 11:45 -0500, Philip Semanchuk wrote:
> > On Dec 19, 2020, at 12:59 AM, Joel Jacobson <joel@compiler.org> wrote:
> > Is there a way to avoid excessive inlining when writing pure SQL functions, without having to use PL/pgSQL?
> 
> The rules for inlining are here:
> https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
> 
> According to those rules, if you declared your SQL function as VOLATILE, then Postgres wouldn’t
>  inline it. From your question, I’m not sure if you want to have the same function inlined
>  sometimes and not others. I can’t think of a way to do that offhand.

Where do you see that?  As far as I know, VOLATILE is the best choice if you
want the function to be inlined.

I would say that the simplest way to prevent a function from being inlined
is to set a parameter on it:

  ALTER FUNCTION f() SET enable_seqscan = on;

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Avoid excessive inlining?

От
Philip Semanchuk
Дата:

> On Dec 22, 2020, at 8:40 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Mon, 2020-12-21 at 11:45 -0500, Philip Semanchuk wrote:
>>> On Dec 19, 2020, at 12:59 AM, Joel Jacobson <joel@compiler.org> wrote:
>>> Is there a way to avoid excessive inlining when writing pure SQL functions, without having to use PL/pgSQL?
>>
>> The rules for inlining are here:
>> https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
>>
>> According to those rules, if you declared your SQL function as VOLATILE, then Postgres wouldn’t
>> inline it. From your question, I’m not sure if you want to have the same function inlined
>> sometimes and not others. I can’t think of a way to do that offhand.
>
> Where do you see that?  As far as I know, VOLATILE is the best choice if you
> want the function to be inlined.

Ugh, you’re absolutely right, and I’m sorry for spreading misinformation. That’s what I get from quoting from memory
ratherthan reading the link that I posted.  


>
> I would say that the simplest way to prevent a function from being inlined
> is to set a parameter on it:
>
>  ALTER FUNCTION f() SET enable_seqscan = on;

I appreciate the correction and education.

Cheers
Philip


Re: Avoid excessive inlining?

От
"Joel Jacobson"
Дата:
On Tue, Dec 22, 2020, at 14:40, Laurenz Albe wrote:
>I would say that the simplest way to prevent a function from being inlined
>is to set a parameter on it:
>  ALTER FUNCTION f() SET enable_seqscan = on;

Thanks, very useful, didn't know about that trick.

I think I was a bit unclear about my problem, and might have used the wrong terminology.

In my LATERAL query, there are calculations in a certain order.
For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded expression, which doesn't look very efficient.

Ideas?

PREPARE q_lateral AS SELECT make_date($1, easter_month, easter_day)
FROM (VALUES ($1 % 19, $1 / 100)) AS Q1(g,c)
JOIN LATERAL (VALUES ((c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30)) AS
Q2(h) ON TRUE
JOIN LATERAL (VALUES (h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 -
g)/11)))) AS Q3(i) ON TRUE
JOIN LATERAL (VALUES (($1 + $1/4 + i + 2 - c + c/4) % 7)) AS Q4(j) ON TRUE
JOIN LATERAL (VALUES (i - j)) AS Q5(p) ON TRUE
JOIN LATERAL (VALUES (3 + (p + 26)/30, 1 + (p + 27 + (p + 6)/40) % 31))
AS Q6(easter_month, easter_day) ON TRUE;

SET plan_cache_mode = 'force_generic_plan';

EXPLAIN (ANALYZE, VERBOSE) EXECUTE q_lateral(2020);

Result  (cost=0.00..1.14 rows=1 width=4) (actual time=0.038..0.038 rows=1 loops=1)
  Output: make_date($1, (3 + (((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 26) / 30)), (1 + ((((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 27) + (((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11))))) - (((((($1 + ($1 / 4)) + ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) - (((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (1 - ((((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) / 28) * (29 / ((((((($1 / 100) - (($1 / 100) / 4)) - (((8 * ($1 / 100)) + 13) / 25)) + (19 * ($1 % 19))) + 15) % 30) + 1))) * ((21 - ($1 % 19)) / 11)))))) + 2) - ($1 / 100)) + (($1 / 100) / 4)) % 7)) + 6) / 40)) % 31)))
Planning Time: 4.144 ms
Execution Time: 0.220 ms


Re: Avoid excessive inlining?

От
Tom Lane
Дата:
"Joel Jacobson" <joel@compiler.org> writes:
> I think I was a bit unclear about my problem, and might have used the wrong terminology.
> In my LATERAL query, there are calculations in a certain order.
> For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
> However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded
expression,which doesn't look very efficient. 

Yeah, this isn't really about function inlining, it's about subquery
flattening (which is similar in some ways, but not the same thing).

Unfortunately, subquery flattening happens early enough in the planner
that there's no chance of making any useful cost comparisons to decide
whether to do it or not.  So we just do it unconditionally.  I'm
not really sure that failing to do it would provide a better outcome
in this situation anyway --- sure, you'd save a few scalar calculations,
but the overhead of running additional plan nodes could outweigh that.

The long and the short of it is that SQL isn't terribly well suited to
execute a fundamentally stepwise, imperative algorithm like this one.
Rather than hacking up cute tricks with LATERAL, you should just use
a language that *is* well suited.  That's why we provide PLs.

FWIW, another trick for inserting optimization fences is WITH.
So you could do something like

WITH Q1(g,c) AS MATERIALIZED
  (SELECT year % 19, year / 100),
Q2(h) AS MATERIALIZED
  (SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),
...
SELECT make_date(year, easter_month, easter_day) FROM Q6;

But I'd bet lunch that that won't be faster for this example,
because there's a lot of overhead in CTEs.

            regards, tom lane



Re: Avoid excessive inlining?

От
"Joel Jacobson"
Дата:
Thanks Tom,

this was exactly what I needed to hear.

I guess I recently have become too fond of all the nice new "recent" advanced SQL features,
such as LATERAL and MATERIALIZED CTEs, now in my possession since I now only code on hobby
projects, after all the years stuck in an old PostgreSQL version in my previous job,
to realise that such SQL features are not always a good fit for the job at all times.

I rewrote all the slow pure SQL code in PL/pgSQL and got as 568% speed-up in the CBOR to JSON converter I'm working on. Thanks also for giving me inspiration on the wording for my own commit message:

Date:   Tue Dec 22 18:21:47 2020 +0100

    Optimize by rewriting declarative pure SQL code into imperative PL/pgSQL

    PostgreSQL's SQL language isn't terribly well suited to execute
    a fundamentally stepwise, imperative algorithm like CBOR.

    Rather than hacking up cute tricks with LATERAL, we should just use
    a language that *is* well suited, a PL, like PL/pgSQL.

    -- Pure SQL (before):
    select * from pg_stat_xact_user_functions ;
     schemaname |  funcname  | calls | total_time | self_time
    ------------+------------+-------+------------+-----------
     cbor       | next_item  |    14 |   48.91024 | 38.964918
     cbor       | next_array |     1 |   7.297435 |  1.816102
     cbor       | next_map   |     2 |  40.844352 |    7.8957
     cbor       | to_jsonb   |     1 |  50.222183 |  1.311943

    -- PL/pgSQL (after):
    select * from pg_stat_xact_user_functions ;
     schemaname |   funcname   | calls | total_time | self_time
    ------------+--------------+-------+------------+-----------
     cbor       | next_item    |    14 |   8.021371 |  3.358271
     cbor       | next_array   |     1 |   0.565398 |  0.353071
     cbor       | next_map     |     2 |   5.607702 |  1.324057
     cbor       | to_jsonb     |     1 |   8.823691 |   0.80232

FUNCTIONS/major_type_0.sql      |  23 +++++++++++++++++++++++
FUNCTIONS/major_type_1.sql      |  23 +++++++++++++++++++++++
FUNCTIONS/major_type_2.sql      |  23 +++++++++++++++++++++++
FUNCTIONS/major_type_3.sql      |  23 +++++++++++++++++++++++
FUNCTIONS/major_type_4.sql      |  23 +++++++++++++++++++++++
FUNCTIONS/major_type_5.sql      |  23 +++++++++++++++++++++++
FUNCTIONS/major_type_6.sql      |  40 ++++++++++++++++++++++++++++++++++++++++
FUNCTIONS/major_type_7.sql      |  43 +++++++++++++++++++++++++++++++++++++++++++
FUNCTIONS/next_item.sql         | 109 ++++++++++++++++++++++++++++++++++++-------------------------------------------------------------------------
Makefile                        |   8 ++++++++
expected/rfc7049_appendix_a.out |  52 +++++++++++++++++++++++++++++++++++++++-------------
11 files changed, 304 insertions(+), 86 deletions(-)

Best regards,

Joel

On Tue, Dec 22, 2020, at 17:32, Tom Lane wrote:
"Joel Jacobson" <joel@compiler.org> writes:
> I think I was a bit unclear about my problem, and might have used the wrong terminology.
> In my LATERAL query, there are calculations in a certain order.
> For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
> However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded expression, which doesn't look very efficient.

Yeah, this isn't really about function inlining, it's about subquery
flattening (which is similar in some ways, but not the same thing).

Unfortunately, subquery flattening happens early enough in the planner
that there's no chance of making any useful cost comparisons to decide
whether to do it or not.  So we just do it unconditionally.  I'm
not really sure that failing to do it would provide a better outcome
in this situation anyway --- sure, you'd save a few scalar calculations,
but the overhead of running additional plan nodes could outweigh that.

The long and the short of it is that SQL isn't terribly well suited to
execute a fundamentally stepwise, imperative algorithm like this one.
Rather than hacking up cute tricks with LATERAL, you should just use
a language that *is* well suited.  That's why we provide PLs.

FWIW, another trick for inserting optimization fences is WITH.
So you could do something like

WITH Q1(g,c) AS MATERIALIZED
  (SELECT year % 19, year / 100),
Q2(h) AS MATERIALIZED
  (SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),
...
SELECT make_date(year, easter_month, easter_day) FROM Q6;

But I'd bet lunch that that won't be faster for this example,
because there's a lot of overhead in CTEs.

regards, tom lane


Kind regards,

Joel

Re: Avoid excessive inlining?

От
"Joel Jacobson"
Дата:
I found yet another trick, which actually seems to be slightly faster than the plpgsql version.

The trick is to use VIEW ... WITH (security_barrier) to tell the optimizer it shouldn’t flatten the subqueries.

CREATE TABLE eastern (year integer);

INSERT INTO eastern (year) SELECT generate_series(1,100000);

CREATE VIEW v0 WITH (security_barrier) AS (
  SELECT
    year,
    year % 19 AS g,
    year / 100 AS c
  FROM eastern
);
CREATE VIEW v1 WITH (security_barrier) AS (
  SELECT
    year,
    g,
    c,
    (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h
  FROM v0
);
CREATE VIEW v2 WITH (security_barrier) AS (
  SELECT
    year,
    c,
    h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i
  FROM v1
);
CREATE VIEW v3 WITH (security_barrier) AS (
  SELECT
    year,
    i,
    (year + year/4 + i + 2 - c + c/4) % 7 AS j
  FROM v2
);
CREATE VIEW v4 WITH (security_barrier) AS (
  SELECT
    year,
    i - j AS p
  FROM v3
);
CREATE VIEW v5 WITH (security_barrier) AS (
  SELECT
    year,
    3 + (p + 26)/30 AS easter_month,
    1 + (p + 27 + (p + 6)/40) % 31 AS easter_day
  FROM v4
);

EXPLAIN ANALYZE VERBOSE
SELECT make_date(year, easter_month, easter_day) FROM v5;

                                                                        QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on v5  (cost=0.00..17943.00 rows=100000 width=4) (actual time=0.025..213.996 rows=100000 loops=1)
   Output: make_date(v5.year, v5.easter_month, v5.easter_day)
   ->  Subquery Scan on v4  (cost=0.00..16693.00 rows=100000 width=12) (actual time=0.024..191.448 rows=100000 loops=1)
         Output: v4.year, (3 + ((v4.p + 26) / 30)), (1 + (((v4.p + 27) + ((v4.p + 6) / 40)) % 31))
         ->  Subquery Scan on v3  (cost=0.00..13443.00 rows=100000 width=8) (actual time=0.022..155.690 rows=100000 loops=1)
               Output: v3.year, (v3.i - v3.j)
               ->  Subquery Scan on v2  (cost=0.00..12193.00 rows=100000 width=12) (actual time=0.021..135.485 rows=100000 loops=1)
                     Output: v2.year, v2.i, ((((((v2.year + (v2.year / 4)) + v2.i) + 2) - v2.c) + (v2.c / 4)) % 7)
                     ->  Subquery Scan on v1  (cost=0.00..9193.00 rows=100000 width=12) (actual time=0.019..97.935 rows=100000 loops=1)
                           Output: v1.year, v1.c, (v1.h - ((v1.h / 28) * (1 - (((v1.h / 28) * (29 / (v1.h + 1))) * ((21 - v1.g) / 11)))))
                           ->  Subquery Scan on v0  (cost=0.00..5443.00 rows=100000 width=16) (actual time=0.017..57.988 rows=100000 loops=1)
                                 Output: v0.year, v0.g, v0.c, (((((v0.c - (v0.c / 4)) - (((8 * v0.c) + 13) / 25)) + (19 * v0.g)) + 15) % 30)
                                 ->  Seq Scan on public.eastern  (cost=0.00..1943.00 rows=100000 width=12) (actual time=0.015..23.908 rows=100000 loops=1)
                                       Output: eastern.year, (eastern.year % 19), (eastern.year / 100)
Planning Time: 0.274 ms
Execution Time: 220.698 ms
(16 rows)

EXPLAIN ANALYZE VERBOSE
SELECT easter_plpgsql(year) FROM eastern;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.eastern  (cost=0.00..26443.00 rows=100000 width=4) (actual time=0.077..301.519 rows=100000 loops=1)
   Output: easter_plpgsql(year)
Planning Time: 0.049 ms
Execution Time: 309.119 ms
(4 rows)


On Tue, Dec 22, 2020, at 17:32, Tom Lane wrote:
"Joel Jacobson" <joel@compiler.org> writes:
> I think I was a bit unclear about my problem, and might have used the wrong terminology.
> In my LATERAL query, there are calculations in a certain order.
> For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
> However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded expression, which doesn't look very efficient.

Yeah, this isn't really about function inlining, it's about subquery
flattening (which is similar in some ways, but not the same thing).

Unfortunately, subquery flattening happens early enough in the planner
that there's no chance of making any useful cost comparisons to decide
whether to do it or not.  So we just do it unconditionally.  I'm
not really sure that failing to do it would provide a better outcome
in this situation anyway --- sure, you'd save a few scalar calculations,
but the overhead of running additional plan nodes could outweigh that.

The long and the short of it is that SQL isn't terribly well suited to
execute a fundamentally stepwise, imperative algorithm like this one.
Rather than hacking up cute tricks with LATERAL, you should just use
a language that *is* well suited.  That's why we provide PLs.

FWIW, another trick for inserting optimization fences is WITH.
So you could do something like

WITH Q1(g,c) AS MATERIALIZED
  (SELECT year % 19, year / 100),
Q2(h) AS MATERIALIZED
  (SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),
...
SELECT make_date(year, easter_month, easter_day) FROM Q6;

But I'd bet lunch that that won't be faster for this example,
because there's a lot of overhead in CTEs.

regards, tom lane


Kind regards,

Joel

Re: Avoid excessive inlining?

От
Pavel Stehule
Дата:
Hi

so 2. 1. 2021 v 20:07 odesílatel Joel Jacobson <joel@compiler.org> napsal:
I found yet another trick, which actually seems to be slightly faster than the plpgsql version.

What version of Postgres do you use?

plpgsql in Postgres 13 is significantly faster than on older versions

Regards

Pavel


The trick is to use VIEW ... WITH (security_barrier) to tell the optimizer it shouldn’t flatten the subqueries.

CREATE TABLE eastern (year integer);

INSERT INTO eastern (year) SELECT generate_series(1,100000);

CREATE VIEW v0 WITH (security_barrier) AS (
  SELECT
    year,
    year % 19 AS g,
    year / 100 AS c
  FROM eastern
);
CREATE VIEW v1 WITH (security_barrier) AS (
  SELECT
    year,
    g,
    c,
    (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h
  FROM v0
);
CREATE VIEW v2 WITH (security_barrier) AS (
  SELECT
    year,
    c,
    h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i
  FROM v1
);
CREATE VIEW v3 WITH (security_barrier) AS (
  SELECT
    year,
    i,
    (year + year/4 + i + 2 - c + c/4) % 7 AS j
  FROM v2
);
CREATE VIEW v4 WITH (security_barrier) AS (
  SELECT
    year,
    i - j AS p
  FROM v3
);
CREATE VIEW v5 WITH (security_barrier) AS (
  SELECT
    year,
    3 + (p + 26)/30 AS easter_month,
    1 + (p + 27 + (p + 6)/40) % 31 AS easter_day
  FROM v4
);

EXPLAIN ANALYZE VERBOSE
SELECT make_date(year, easter_month, easter_day) FROM v5;

                                                                        QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on v5  (cost=0.00..17943.00 rows=100000 width=4) (actual time=0.025..213.996 rows=100000 loops=1)
   Output: make_date(v5.year, v5.easter_month, v5.easter_day)
   ->  Subquery Scan on v4  (cost=0.00..16693.00 rows=100000 width=12) (actual time=0.024..191.448 rows=100000 loops=1)
         Output: v4.year, (3 + ((v4.p + 26) / 30)), (1 + (((v4.p + 27) + ((v4.p + 6) / 40)) % 31))
         ->  Subquery Scan on v3  (cost=0.00..13443.00 rows=100000 width=8) (actual time=0.022..155.690 rows=100000 loops=1)
               Output: v3.year, (v3.i - v3.j)
               ->  Subquery Scan on v2  (cost=0.00..12193.00 rows=100000 width=12) (actual time=0.021..135.485 rows=100000 loops=1)
                     Output: v2.year, v2.i, ((((((v2.year + (v2.year / 4)) + v2.i) + 2) - v2.c) + (v2.c / 4)) % 7)
                     ->  Subquery Scan on v1  (cost=0.00..9193.00 rows=100000 width=12) (actual time=0.019..97.935 rows=100000 loops=1)
                           Output: v1.year, v1.c, (v1.h - ((v1.h / 28) * (1 - (((v1.h / 28) * (29 / (v1.h + 1))) * ((21 - v1.g) / 11)))))
                           ->  Subquery Scan on v0  (cost=0.00..5443.00 rows=100000 width=16) (actual time=0.017..57.988 rows=100000 loops=1)
                                 Output: v0.year, v0.g, v0.c, (((((v0.c - (v0.c / 4)) - (((8 * v0.c) + 13) / 25)) + (19 * v0.g)) + 15) % 30)
                                 ->  Seq Scan on public.eastern  (cost=0.00..1943.00 rows=100000 width=12) (actual time=0.015..23.908 rows=100000 loops=1)
                                       Output: eastern.year, (eastern.year % 19), (eastern.year / 100)
Planning Time: 0.274 ms
Execution Time: 220.698 ms
(16 rows)

EXPLAIN ANALYZE VERBOSE
SELECT easter_plpgsql(year) FROM eastern;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.eastern  (cost=0.00..26443.00 rows=100000 width=4) (actual time=0.077..301.519 rows=100000 loops=1)
   Output: easter_plpgsql(year)
Planning Time: 0.049 ms
Execution Time: 309.119 ms
(4 rows)


On Tue, Dec 22, 2020, at 17:32, Tom Lane wrote:
"Joel Jacobson" <joel@compiler.org> writes:
> I think I was a bit unclear about my problem, and might have used the wrong terminology.
> In my LATERAL query, there are calculations in a certain order.
> For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
> However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded expression, which doesn't look very efficient.

Yeah, this isn't really about function inlining, it's about subquery
flattening (which is similar in some ways, but not the same thing).

Unfortunately, subquery flattening happens early enough in the planner
that there's no chance of making any useful cost comparisons to decide
whether to do it or not.  So we just do it unconditionally.  I'm
not really sure that failing to do it would provide a better outcome
in this situation anyway --- sure, you'd save a few scalar calculations,
but the overhead of running additional plan nodes could outweigh that.

The long and the short of it is that SQL isn't terribly well suited to
execute a fundamentally stepwise, imperative algorithm like this one.
Rather than hacking up cute tricks with LATERAL, you should just use
a language that *is* well suited.  That's why we provide PLs.

FWIW, another trick for inserting optimization fences is WITH.
So you could do something like

WITH Q1(g,c) AS MATERIALIZED
  (SELECT year % 19, year / 100),
Q2(h) AS MATERIALIZED
  (SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),
...
SELECT make_date(year, easter_month, easter_day) FROM Q6;

But I'd bet lunch that that won't be faster for this example,
because there's a lot of overhead in CTEs.

regards, tom lane


Kind regards,

Joel

Re: Avoid excessive inlining?

От
"Joel Jacobson"
Дата:
13.1

On Sat, Jan 2, 2021, at 20:37, Pavel Stehule wrote:
Hi

so 2. 1. 2021 v 20:07 odesílatel Joel Jacobson <joel@compiler.org> napsal:

I found yet another trick, which actually seems to be slightly faster than the plpgsql version.

What version of Postgres do you use?

plpgsql in Postgres 13 is significantly faster than on older versions

Regards

Pavel


Re: Avoid excessive inlining?

От
Pavel Stehule
Дата:


so 2. 1. 2021 v 21:02 odesílatel Joel Jacobson <joel@compiler.org> napsal:
13.1

ok there is plpgsql most fast

you can check if there are not some hidden implicit casting in your code.

Regards

Pavel


On Sat, Jan 2, 2021, at 20:37, Pavel Stehule wrote:
Hi

so 2. 1. 2021 v 20:07 odesílatel Joel Jacobson <joel@compiler.org> napsal:

I found yet another trick, which actually seems to be slightly faster than the plpgsql version.

What version of Postgres do you use?

plpgsql in Postgres 13 is significantly faster than on older versions

Regards

Pavel