Обсуждение: performance of sql and plpgsql functions

Поиск
Список
Период
Сортировка

performance of sql and plpgsql functions

От
Julius Tuskenis
Дата:

Dear Postgresql performance guru,

For some reason on our client server a function written in SQL language executes 100 times slower than the one written in plpgsql...

After updating to "PostgreSQL 12.18, compiled by Visual C++ build 1914, 64-bit" (from pg9.5) our client reported a performance issue. Everything boils down to a query that uses our function public.fnk_saskaitos_skola to calculate a visitors debt. The function is written in 'sql' language. 

The function is simple enough, marked STABLE

```

CREATE OR REPLACE FUNCTION public.fnk_saskaitos_skola(prm_saskaita integer)
 RETURNS numeric
 LANGUAGE sql
 STABLE SECURITY DEFINER
AS $function$
  SELECT
    COALESCE(sum(mok_nepadengta), 0)
  FROM
    public.b_pardavimai
    JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
  WHERE
    (pard_tipas = ANY('{1, 2, 6, 7}'))
    AND (mok_saskaita = $1)
$function$
;

```

The problem is when I use it, it takes like 50ms to execute (on our client server).

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
    SELECT * FROM fnk_saskaitos_skola(7141968)


"Function Scan on public.fnk_saskaitos_skola  (cost=0.25..0.26 rows=1 width=32) (actual time=59.824..59.825 rows=1 loops=1)"
"  Output: fnk_saskaitos_skola"
"  Function Call: fnk_saskaitos_skola(7141968)"
"  Buffers: shared hit=20"
"Planning Time: 0.044 ms"
"Execution Time: 59.848 ms"


How ever, if I rewrite the same function using plpgsql the result is quite different:

```

CREATE OR REPLACE FUNCTION public.fnk_saskaitos_skola_jt(IN prm_saskaita integer)
RETURNS numeric
LANGUAGE 'plpgsql'
STABLE SECURITY DEFINER
PARALLEL UNSAFE
COST 100
AS $BODY$
begin
  return (
      SELECT
        COALESCE(sum(mok_nepadengta), 0)
      FROM
        public.b_pardavimai
        JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
      WHERE
        (pard_tipas = ANY('{1, 2, 6, 7}'))
        AND (mok_saskaita = $1)
  );
end
$BODY$;

```


EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
    SELECT fnk_saskaitos_skola_jt(7141968)   


```

"Result  (cost=0.00..0.26 rows=1 width=32) (actual time=0.562..0.562 rows=1 loops=1)"
"  Output: fnk_saskaitos_skola_jt(7141968)"
"  Buffers: shared hit=20"
"Planning Time: 0.022 ms"
"Execution Time: 0.574 ms"

```


If I analyze the sql that is inside the function I get results similar to the ones of using plpgsql function:

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
  SELECT
    COALESCE(sum(mok_nepadengta), 0)
  FROM
    public.b_pardavimai
    JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
  WHERE
    (pard_tipas = ANY('{1, 2, 6, 7}'))
    AND (mok_saskaita = 7141968)

```

"Aggregate  (cost=2773.78..2773.79 rows=1 width=32) (actual time=0.015..0.016 rows=1 loops=1)"
"  Output: COALESCE(sum((b_mokejimai.mok_nepadengta)::numeric), '0'::numeric)"
"  Buffers: shared hit=4"
"  ->  Nested Loop  (cost=1.00..2771.96 rows=730 width=3) (actual time=0.013..0.013 rows=0 loops=1)"
"        Output: b_mokejimai.mok_nepadengta"
"        Inner Unique: true"
"        Buffers: shared hit=4"
"        ->  Index Scan using idx_saskaita on public.b_mokejimai  (cost=0.56..793.10 rows=746 width=7) (actual time=0.012..0.012 rows=0 loops=1)"
"              Output: b_mokejimai.mok_id, b_mokejimai.mok_moketojas, b_mokejimai.mok_pardavimas, b_mokejimai.mok_laikas, b_mokejimai.mok_suma, b_mokejimai.mok_budas, b_mokejimai.mok_terminas, b_mokejimai.mok_cekis, b_mokejimai.mok_saskaita, b_mokejimai.mok_suma_bazine, b_mokejimai.mok_nepadengta, b_mokejimai.mok_padengta, b_mokejimai.mok_laiko_diena"
"              Index Cond: (b_mokejimai.mok_saskaita = 7141968)"
"              Buffers: shared hit=4"
"        ->  Index Scan using pk_b_pardavimai_id on public.b_pardavimai  (cost=0.44..2.65 rows=1 width=4) (never executed)"
"              Output: b_pardavimai.pard_id, b_pardavimai.pard_preke, b_pardavimai.pard_kaina, b_pardavimai.pard_nuolaida, b_pardavimai.pard_kiekis, b_pardavimai.pard_kasos_nr, b_pardavimai.pard_laikas, b_pardavimai.pard_prekes_id, b_pardavimai.pard_pirkejo_id, b_pardavimai.pard_pardavejas, b_pardavimai.pard_spausdinta, b_pardavimai.pard_reikia_grazinti, b_pardavimai.pard_kam_naudoti, b_pardavimai.pard_susieta, b_pardavimai.pard_galima_anuliuoti, b_pardavimai.pard_tipas, b_pardavimai.pard_pvm, b_pardavimai.pard_apsilankymas, b_pardavimai.pard_fk, b_pardavimai.pard_kelintas, b_pardavimai.pard_precekis, b_pardavimai.pard_imone, b_pardavimai.pard_grazintas, b_pardavimai.pard_debeto_sutartis, b_pardavimai.pard_kaina_be_nld, b_pardavimai.pard_uzsakymas_pos, b_pardavimai.pard_pvm_suma, b_pardavimai.pard_uzsakymo_nr, b_pardavimai.pard_nuolaidos_id, b_pardavimai.pard_nuolaida_taikyti, b_pardavimai.pard_pirkeja_keisti_galima, b_pardavimai.pard_suma_keisti_galima"
"              Index Cond: (b_pardavimai.pard_id = b_mokejimai.mok_pardavimas)"
"              Filter: (b_pardavimai.pard_tipas = ANY ('{1,2,6,7}'::smallint[]))"
"Planning Time: 0.550 ms"
"Execution Time: 0.049 ms"

```


As I understand, the planning in case of sql functions is done everytime the functions is executed. I don't mind if planning would take 0.550 ms as when using plain SQL. But why execution takes ~59ms??...  What is it spent for?

Isn't PosgreSQL supposed to inline simple SQL functions that are stable or immutable?

Any advice on  where to look for the cause of this "anomaly" is highly appreciated?


I've tried executing the same query on different server and different database - I could not reproduce the behavior. Using SQL function produces results faster.

I'd be gratefull to receive some insights of how to investigate the behavior. I'm not keen on changing the language or the function not knowing why it is required or how it helps...



Regards,

Julius Tuskenis



Re: performance of sql and plpgsql functions

От
Pavel Stehule
Дата:


po 17. 6. 2024 v 11:35 odesílatel Julius Tuskenis <julius.tuskenis@gmail.com> napsal:

Dear Postgresql performance guru,

For some reason on our client server a function written in SQL language executes 100 times slower than the one written in plpgsql...

After updating to "PostgreSQL 12.18, compiled by Visual C++ build 1914, 64-bit" (from pg9.5) our client reported a performance issue. Everything boils down to a query that uses our function public.fnk_saskaitos_skola to calculate a visitors debt. The function is written in 'sql' language. 

The function is simple enough, marked STABLE

```

CREATE OR REPLACE FUNCTION public.fnk_saskaitos_skola(prm_saskaita integer)
 RETURNS numeric
 LANGUAGE sql
 STABLE SECURITY DEFINER
AS $function$
  SELECT
    COALESCE(sum(mok_nepadengta), 0)
  FROM
    public.b_pardavimai
    JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
  WHERE
    (pard_tipas = ANY('{1, 2, 6, 7}'))
    AND (mok_saskaita = $1)
$function$
;

```

The problem is when I use it, it takes like 50ms to execute (on our client server).

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
    SELECT * FROM fnk_saskaitos_skola(7141968)


"Function Scan on public.fnk_saskaitos_skola  (cost=0.25..0.26 rows=1 width=32) (actual time=59.824..59.825 rows=1 loops=1)"
"  Output: fnk_saskaitos_skola"
"  Function Call: fnk_saskaitos_skola(7141968)"
"  Buffers: shared hit=20"
"Planning Time: 0.044 ms"
"Execution Time: 59.848 ms"


How ever, if I rewrite the same function using plpgsql the result is quite different:

```

CREATE OR REPLACE FUNCTION public.fnk_saskaitos_skola_jt(IN prm_saskaita integer)
RETURNS numeric
LANGUAGE 'plpgsql'
STABLE SECURITY DEFINER
PARALLEL UNSAFE
COST 100
AS $BODY$
begin
  return (
      SELECT
        COALESCE(sum(mok_nepadengta), 0)
      FROM
        public.b_pardavimai
        JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
      WHERE
        (pard_tipas = ANY('{1, 2, 6, 7}'))
        AND (mok_saskaita = $1)
  );
end
$BODY$;

```


EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
    SELECT fnk_saskaitos_skola_jt(7141968)   


```

"Result  (cost=0.00..0.26 rows=1 width=32) (actual time=0.562..0.562 rows=1 loops=1)"
"  Output: fnk_saskaitos_skola_jt(7141968)"
"  Buffers: shared hit=20"
"Planning Time: 0.022 ms"
"Execution Time: 0.574 ms"

```


If I analyze the sql that is inside the function I get results similar to the ones of using plpgsql function:

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
  SELECT
    COALESCE(sum(mok_nepadengta), 0)
  FROM
    public.b_pardavimai
    JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
  WHERE
    (pard_tipas = ANY('{1, 2, 6, 7}'))
    AND (mok_saskaita = 7141968)

```

"Aggregate  (cost=2773.78..2773.79 rows=1 width=32) (actual time=0.015..0.016 rows=1 loops=1)"
"  Output: COALESCE(sum((b_mokejimai.mok_nepadengta)::numeric), '0'::numeric)"
"  Buffers: shared hit=4"
"  ->  Nested Loop  (cost=1.00..2771.96 rows=730 width=3) (actual time=0.013..0.013 rows=0 loops=1)"
"        Output: b_mokejimai.mok_nepadengta"
"        Inner Unique: true"
"        Buffers: shared hit=4"
"        ->  Index Scan using idx_saskaita on public.b_mokejimai  (cost=0.56..793.10 rows=746 width=7) (actual time=0.012..0.012 rows=0 loops=1)"
"              Output: b_mokejimai.mok_id, b_mokejimai.mok_moketojas, b_mokejimai.mok_pardavimas, b_mokejimai.mok_laikas, b_mokejimai.mok_suma, b_mokejimai.mok_budas, b_mokejimai.mok_terminas, b_mokejimai.mok_cekis, b_mokejimai.mok_saskaita, b_mokejimai.mok_suma_bazine, b_mokejimai.mok_nepadengta, b_mokejimai.mok_padengta, b_mokejimai.mok_laiko_diena"
"              Index Cond: (b_mokejimai.mok_saskaita = 7141968)"
"              Buffers: shared hit=4"
"        ->  Index Scan using pk_b_pardavimai_id on public.b_pardavimai  (cost=0.44..2.65 rows=1 width=4) (never executed)"
"              Output: b_pardavimai.pard_id, b_pardavimai.pard_preke, b_pardavimai.pard_kaina, b_pardavimai.pard_nuolaida, b_pardavimai.pard_kiekis, b_pardavimai.pard_kasos_nr, b_pardavimai.pard_laikas, b_pardavimai.pard_prekes_id, b_pardavimai.pard_pirkejo_id, b_pardavimai.pard_pardavejas, b_pardavimai.pard_spausdinta, b_pardavimai.pard_reikia_grazinti, b_pardavimai.pard_kam_naudoti, b_pardavimai.pard_susieta, b_pardavimai.pard_galima_anuliuoti, b_pardavimai.pard_tipas, b_pardavimai.pard_pvm, b_pardavimai.pard_apsilankymas, b_pardavimai.pard_fk, b_pardavimai.pard_kelintas, b_pardavimai.pard_precekis, b_pardavimai.pard_imone, b_pardavimai.pard_grazintas, b_pardavimai.pard_debeto_sutartis, b_pardavimai.pard_kaina_be_nld, b_pardavimai.pard_uzsakymas_pos, b_pardavimai.pard_pvm_suma, b_pardavimai.pard_uzsakymo_nr, b_pardavimai.pard_nuolaidos_id, b_pardavimai.pard_nuolaida_taikyti, b_pardavimai.pard_pirkeja_keisti_galima, b_pardavimai.pard_suma_keisti_galima"
"              Index Cond: (b_pardavimai.pard_id = b_mokejimai.mok_pardavimas)"
"              Filter: (b_pardavimai.pard_tipas = ANY ('{1,2,6,7}'::smallint[]))"
"Planning Time: 0.550 ms"
"Execution Time: 0.049 ms"

```


As I understand, the planning in case of sql functions is done everytime the functions is executed. I don't mind if planning would take 0.550 ms as when using plain SQL. But why execution takes ~59ms??...  What is it spent for?

Isn't PosgreSQL supposed to inline simple SQL functions that are stable or immutable?

no, PLpgSQL functions are not inlined

Regards

Pavel

 

Any advice on  where to look for the cause of this "anomaly" is highly appreciated?


I've tried executing the same query on different server and different database - I could not reproduce the behavior. Using SQL function produces results faster.

I'd be gratefull to receive some insights of how to investigate the behavior. I'm not keen on changing the language or the function not knowing why it is required or how it helps...



Regards,

Julius Tuskenis



Re: performance of sql and plpgsql functions

От
Julius Tuskenis
Дата:

Thank you Pavel for your input.

You wrote:

no, PLpgSQL functions are not inlined
Yes, I understand that. I was referring to SQL functions (not plpgsql).

Regards,

Julius Tuskenis


On 2024-06-17 12:44, Pavel Stehule wrote:


po 17. 6. 2024 v 11:35 odesílatel Julius Tuskenis <julius.tuskenis@gmail.com> napsal:

Dear Postgresql performance guru,

For some reason on our client server a function written in SQL language executes 100 times slower than the one written in plpgsql...

After updating to "PostgreSQL 12.18, compiled by Visual C++ build 1914, 64-bit" (from pg9.5) our client reported a performance issue. Everything boils down to a query that uses our function public.fnk_saskaitos_skola to calculate a visitors debt. The function is written in 'sql' language. 

The function is simple enough, marked STABLE

```

CREATE OR REPLACE FUNCTION public.fnk_saskaitos_skola(prm_saskaita integer)
 RETURNS numeric
 LANGUAGE sql
 STABLE SECURITY DEFINER
AS $function$
  SELECT
    COALESCE(sum(mok_nepadengta), 0)
  FROM
    public.b_pardavimai
    JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
  WHERE
    (pard_tipas = ANY('{1, 2, 6, 7}'))
    AND (mok_saskaita = $1)
$function$
;

```

The problem is when I use it, it takes like 50ms to execute (on our client server).

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
    SELECT * FROM fnk_saskaitos_skola(7141968)


"Function Scan on public.fnk_saskaitos_skola  (cost=0.25..0.26 rows=1 width=32) (actual time=59.824..59.825 rows=1 loops=1)"
"  Output: fnk_saskaitos_skola"
"  Function Call: fnk_saskaitos_skola(7141968)"
"  Buffers: shared hit=20"
"Planning Time: 0.044 ms"
"Execution Time: 59.848 ms"


How ever, if I rewrite the same function using plpgsql the result is quite different:

```

CREATE OR REPLACE FUNCTION public.fnk_saskaitos_skola_jt(IN prm_saskaita integer)
RETURNS numeric
LANGUAGE 'plpgsql'
STABLE SECURITY DEFINER
PARALLEL UNSAFE
COST 100
AS $BODY$
begin
  return (
      SELECT
        COALESCE(sum(mok_nepadengta), 0)
      FROM
        public.b_pardavimai
        JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
      WHERE
        (pard_tipas = ANY('{1, 2, 6, 7}'))
        AND (mok_saskaita = $1)
  );
end
$BODY$;

```


EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
    SELECT fnk_saskaitos_skola_jt(7141968)   


```

"Result  (cost=0.00..0.26 rows=1 width=32) (actual time=0.562..0.562 rows=1 loops=1)"
"  Output: fnk_saskaitos_skola_jt(7141968)"
"  Buffers: shared hit=20"
"Planning Time: 0.022 ms"
"Execution Time: 0.574 ms"

```


If I analyze the sql that is inside the function I get results similar to the ones of using plpgsql function:

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
  SELECT
    COALESCE(sum(mok_nepadengta), 0)
  FROM
    public.b_pardavimai
    JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
  WHERE
    (pard_tipas = ANY('{1, 2, 6, 7}'))
    AND (mok_saskaita = 7141968)

```

"Aggregate  (cost=2773.78..2773.79 rows=1 width=32) (actual time=0.015..0.016 rows=1 loops=1)"
"  Output: COALESCE(sum((b_mokejimai.mok_nepadengta)::numeric), '0'::numeric)"
"  Buffers: shared hit=4"
"  ->  Nested Loop  (cost=1.00..2771.96 rows=730 width=3) (actual time=0.013..0.013 rows=0 loops=1)"
"        Output: b_mokejimai.mok_nepadengta"
"        Inner Unique: true"
"        Buffers: shared hit=4"
"        ->  Index Scan using idx_saskaita on public.b_mokejimai  (cost=0.56..793.10 rows=746 width=7) (actual time=0.012..0.012 rows=0 loops=1)"
"              Output: b_mokejimai.mok_id, b_mokejimai.mok_moketojas, b_mokejimai.mok_pardavimas, b_mokejimai.mok_laikas, b_mokejimai.mok_suma, b_mokejimai.mok_budas, b_mokejimai.mok_terminas, b_mokejimai.mok_cekis, b_mokejimai.mok_saskaita, b_mokejimai.mok_suma_bazine, b_mokejimai.mok_nepadengta, b_mokejimai.mok_padengta, b_mokejimai.mok_laiko_diena"
"              Index Cond: (b_mokejimai.mok_saskaita = 7141968)"
"              Buffers: shared hit=4"
"        ->  Index Scan using pk_b_pardavimai_id on public.b_pardavimai  (cost=0.44..2.65 rows=1 width=4) (never executed)"
"              Output: b_pardavimai.pard_id, b_pardavimai.pard_preke, b_pardavimai.pard_kaina, b_pardavimai.pard_nuolaida, b_pardavimai.pard_kiekis, b_pardavimai.pard_kasos_nr, b_pardavimai.pard_laikas, b_pardavimai.pard_prekes_id, b_pardavimai.pard_pirkejo_id, b_pardavimai.pard_pardavejas, b_pardavimai.pard_spausdinta, b_pardavimai.pard_reikia_grazinti, b_pardavimai.pard_kam_naudoti, b_pardavimai.pard_susieta, b_pardavimai.pard_galima_anuliuoti, b_pardavimai.pard_tipas, b_pardavimai.pard_pvm, b_pardavimai.pard_apsilankymas, b_pardavimai.pard_fk, b_pardavimai.pard_kelintas, b_pardavimai.pard_precekis, b_pardavimai.pard_imone, b_pardavimai.pard_grazintas, b_pardavimai.pard_debeto_sutartis, b_pardavimai.pard_kaina_be_nld, b_pardavimai.pard_uzsakymas_pos, b_pardavimai.pard_pvm_suma, b_pardavimai.pard_uzsakymo_nr, b_pardavimai.pard_nuolaidos_id, b_pardavimai.pard_nuolaida_taikyti, b_pardavimai.pard_pirkeja_keisti_galima, b_pardavimai.pard_suma_keisti_galima"
"              Index Cond: (b_pardavimai.pard_id = b_mokejimai.mok_pardavimas)"
"              Filter: (b_pardavimai.pard_tipas = ANY ('{1,2,6,7}'::smallint[]))"
"Planning Time: 0.550 ms"
"Execution Time: 0.049 ms"

```


As I understand, the planning in case of sql functions is done everytime the functions is executed. I don't mind if planning would take 0.550 ms as when using plain SQL. But why execution takes ~59ms??...  What is it spent for?

Isn't PosgreSQL supposed to inline simple SQL functions that are stable or immutable?

no, PLpgSQL functions are not inlined

Regards

Pavel

 

Any advice on  where to look for the cause of this "anomaly" is highly appreciated?


I've tried executing the same query on different server and different database - I could not reproduce the behavior. Using SQL function produces results faster.

I'd be gratefull to receive some insights of how to investigate the behavior. I'm not keen on changing the language or the function not knowing why it is required or how it helps...



Regards,

Julius Tuskenis



Re: performance of sql and plpgsql functions

От
Philip Semanchuk
Дата:

> On Jun 17, 2024, at 5:35 AM, Julius Tuskenis <julius.tuskenis@gmail.com> wrote:
>
>
> Isn't PosgreSQL supposed to inline simple SQL functions that are stable or immutable?

Postgres inlines SQL functions under certain conditions:
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions

One of those conditions is "the function is not SECURITY DEFINER”. It looks like yours is defined that way, so that
mightbe why it’s not being inlined.  

Hope this helps
Philip


Re: performance of sql and plpgsql functions

От
Julius Tuskenis
Дата:
On 2024-06-17 15:59, Philip Semanchuk wrote:

On Jun 17, 2024, at 5:35 AM, Julius Tuskenis <julius.tuskenis@gmail.com> wrote:


Isn't PosgreSQL supposed to inline simple SQL functions that are stable or immutable?
Postgres inlines SQL functions under certain conditions:
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions

One of those conditions is "the function is not SECURITY DEFINER”. It looks like yours is defined that way, so that might be why it’s not being inlined. 

Hope this helps
Philip

Thank You, Philip.

The link you've provided helps a lot explaining why the body of my SQL function is not inlined.

Any thoughts on why the execution times differ so much? I see planning of a plain SQL is 0.550ms. So I expect the SQL function to spend that time planning (inside), but I get 50ms (100 times longer).


Regards,

Julius Tuskenis

Re: performance of sql and plpgsql functions

От
Pavel Stehule
Дата:


po 17. 6. 2024 v 15:55 odesílatel Julius Tuskenis <julius.tuskenis@gmail.com> napsal:
On 2024-06-17 15:59, Philip Semanchuk wrote:
On Jun 17, 2024, at 5:35 AM, Julius Tuskenis <julius.tuskenis@gmail.com> wrote:


Isn't PosgreSQL supposed to inline simple SQL functions that are stable or immutable?
Postgres inlines SQL functions under certain conditions:
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions

One of those conditions is "the function is not SECURITY DEFINER”. It looks like yours is defined that way, so that might be why it’s not being inlined. 

Hope this helps
Philip

Thank You, Philip.

The link you've provided helps a lot explaining why the body of my SQL function is not inlined.

Any thoughts on why the execution times differ so much? I see planning of a plain SQL is 0.550ms. So I expect the SQL function to spend that time planning (inside), but I get 50ms (100 times longer).

Attention planning time is time of optimizations, it is not planned (expected) execution time.

Second - The embedded SQL inside PL/pgSQL uses plan cache. Against it, SQL functions are inlined (and then are pretty fast), or not, and then are slower, because there is no plan cache.

I don't know exactly where the problem is, but I've got this issue many times, execution of an not inlined SQL function is slow. If you can, try to use a profiler.



Regards,

Julius Tuskenis

Re: performance of sql and plpgsql functions

От
Tom Lane
Дата:
Julius Tuskenis <julius.tuskenis@gmail.com> writes:
> EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
>    SELECT
>      COALESCE(sum(mok_nepadengta), 0)
>    FROM
>      public.b_pardavimai
>      JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
>    WHERE
>      (pard_tipas = ANY('{1, 2, 6, 7}'))
>      AND (mok_saskaita = 7141968)

I believe that the SQL-language function executor always uses generic
plans for parameterized queries (which is bad, but nobody's gotten
round to improving it).  So the above is a poor way of investigating
what will happen, because it corresponds to a custom plan for the
value 7141968.  You should try something like

PREPARE p(integer) AS
  SELECT COALESCE ...
  ... AND (mok_saskaita = $1);

SET plan_cache_mode TO force_generic_plan;

EXPLAIN ANALYZE EXECUTE p(7141968);

What I suspect is that the statistics for mok_saskaita are
highly skewed and so with a generic plan the planner will
not risk using a plan that depends on the parameter value
being infrequent, as the one you're showing does.

            regards, tom lane



Re: performance of sql and plpgsql functions

От
Julius Tuskenis
Дата:
On 2024-06-17 17:24, Tom Lane wrote:
Julius Tuskenis <julius.tuskenis@gmail.com> writes:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)   SELECT     COALESCE(sum(mok_nepadengta), 0)   FROM     public.b_pardavimai     JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)   WHERE     (pard_tipas = ANY('{1, 2, 6, 7}'))     AND (mok_saskaita = 7141968)
I believe that the SQL-language function executor always uses generic
plans for parameterized queries (which is bad, but nobody's gotten
round to improving it).  So the above is a poor way of investigating
what will happen, because it corresponds to a custom plan for the
value 7141968.  You should try something like

PREPARE p(integer) AS  SELECT COALESCE ...  ... AND (mok_saskaita = $1);

SET plan_cache_mode TO force_generic_plan;

EXPLAIN ANALYZE EXECUTE p(7141968);

What I suspect is that the statistics for mok_saskaita are
highly skewed and so with a generic plan the planner will
not risk using a plan that depends on the parameter value
being infrequent, as the one you're showing does.
			regards, tom lane


Thank you Tom Lane, for pointing the problem.

In deed, after setting plan_cache_mode to force_generic_plan I see very different plan:

```

"Finalize Aggregate  (cost=6901.01..6901.02 rows=1 width=32) (actual time=50.258..56.004 rows=1 loops=1)"
"  Output: COALESCE(sum((b_mokejimai.mok_nepadengta)::numeric), '0'::numeric)"
"  Buffers: shared hit=4"
"  ->  Gather  (cost=6900.89..6901.00 rows=1 width=32) (actual time=0.809..55.993 rows=2 loops=1)"
"        Output: (PARTIAL sum((b_mokejimai.mok_nepadengta)::numeric))"
"        Workers Planned: 1"
"        Workers Launched: 1"
"        Buffers: shared hit=4"
"        ->  Partial Aggregate  (cost=5900.89..5900.90 rows=1 width=32) (actual time=0.077..0.079 rows=1 loops=2)"
"              Output: PARTIAL sum((b_mokejimai.mok_nepadengta)::numeric)"
"              Buffers: shared hit=4"
"              Worker 0: actual time=0.052..0.053 rows=1 loops=1"
"              ->  Nested Loop  (cost=25.92..5897.69 rows=1280 width=3) (actual time=0.070..0.072 rows=0 loops=2)"
"                    Output: b_mokejimai.mok_nepadengta"
"                    Inner Unique: true"
"                    Buffers: shared hit=4"
"                    Worker 0: actual time=0.043..0.043 rows=0 loops=1"
"                    ->  Parallel Bitmap Heap Scan on public.b_mokejimai  (cost=25.48..2455.36 rows=1307 width=7) (actual time=0.069..0.070 rows=0 loops=2)"
"                          Output: b_mokejimai.mok_id, b_mokejimai.mok_moketojas, b_mokejimai.mok_pardavimas, b_mokejimai.mok_laikas, b_mokejimai.mok_suma, b_mokejimai.mok_budas, b_mokejimai.mok_terminas, b_mokejimai.mok_cekis, b_mokejimai.mok_saskaita, b_mokejimai.mok_suma_bazine, b_mokejimai.mok_nepadengta, b_mokejimai.mok_padengta, b_mokejimai.mok_laiko_diena"
"                          Recheck Cond: (b_mokejimai.mok_saskaita = $1)"
"                          Buffers: shared hit=4"
"                          Worker 0: actual time=0.042..0.042 rows=0 loops=1"
"                          ->  Bitmap Index Scan on idx_saskaita  (cost=0.00..24.93 rows=2222 width=0) (actual time=0.023..0.023 rows=0 loops=1)"
"                                Index Cond: (b_mokejimai.mok_saskaita = $1)"
"                                Buffers: shared hit=4"
"                    ->  Index Scan using pk_b_pardavimai_id on public.b_pardavimai  (cost=0.44..2.63 rows=1 width=4) (never executed)"
"                          Output: b_pardavimai.pard_id, b_pardavimai.pard_preke, b_pardavimai.pard_kaina, b_pardavimai.pard_nuolaida, b_pardavimai.pard_kiekis, b_pardavimai.pard_kasos_nr, b_pardavimai.pard_laikas, b_pardavimai.pard_prekes_id, b_pardavimai.pard_pirkejo_id, b_pardavimai.pard_pardavejas, b_pardavimai.pard_spausdinta, b_pardavimai.pard_reikia_grazinti, b_pardavimai.pard_kam_naudoti, b_pardavimai.pard_susieta, b_pardavimai.pard_galima_anuliuoti, b_pardavimai.pard_tipas, b_pardavimai.pard_pvm, b_pardavimai.pard_apsilankymas, b_pardavimai.pard_fk, b_pardavimai.pard_kelintas, b_pardavimai.pard_precekis, b_pardavimai.pard_imone, b_pardavimai.pard_grazintas, b_pardavimai.pard_debeto_sutartis, b_pardavimai.pard_kaina_be_nld, b_pardavimai.pard_uzsakymas_pos, b_pardavimai.pard_pvm_suma, b_pardavimai.pard_uzsakymo_nr, b_pardavimai.pard_nuolaidos_id, b_pardavimai.pard_nuolaida_taikyti, b_pardavimai.pard_pirkeja_keisti_galima, b_pardavimai.pard_suma_keisti_galima"
"                          Index Cond: (b_pardavimai.pard_id = b_mokejimai.mok_pardavimas)"
"                          Filter: (b_pardavimai.pard_tipas = ANY ('{1,2,6,7}'::integer[]))"
"Planning Time: 0.016 ms"
"Execution Time: 56.097 ms"

```

If I understand the plan correctly, the problem is the planner expects to find 2222 records for a provide value of `mok_saskaita`.  I've tried running analyze on `b_mokejimai`, but the plan remains the same - must be because some values of `mok_saskaita` do really return tens of thousands of records.

I don't know how the planner comes up with value 2222, because on average there are 15 b_mokejimai records for a single mok_saskaita (if NULL in mok_saskata is ignored), and 628 records if not.


Anyway...

Do you think rewriting a function in plpgsql is a way to go in such case? In pg documentation (https://www.postgresql.org/docs/12/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING) I read that the plan for the plpgsql function is calculated the first time the function is executed (for a connection). I'm concerned, that the function execution is not replanned: I will be stuck with a plan that corresponds to the `mok_saskaita`  parameter value passed on the first execution. Or am I wrong?

Is there a way to make PostgreSQL recalculate the plan on each execution of the function? The observed planning times are acceptable for my application.


Regards,

Julius Tuskenis