Re: performance of sql and plpgsql functions

Поиск
Список
Период
Сортировка
От Julius Tuskenis
Тема Re: performance of sql and plpgsql functions
Дата
Msg-id 09c67567-f351-415f-af2d-76f7e8661d12@gmail.com
обсуждение исходный текст
Ответ на Re: performance of sql and plpgsql functions  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
On 2024-06-17 17:24, Tom Lane wrote:
Julius Tuskenis <julius.tuskenis@gmail.com> writes:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)   SELECT     COALESCE(sum(mok_nepadengta), 0)   FROM     public.b_pardavimai     JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)   WHERE     (pard_tipas = ANY('{1, 2, 6, 7}'))     AND (mok_saskaita = 7141968)
I believe that the SQL-language function executor always uses generic
plans for parameterized queries (which is bad, but nobody's gotten
round to improving it).  So the above is a poor way of investigating
what will happen, because it corresponds to a custom plan for the
value 7141968.  You should try something like

PREPARE p(integer) AS  SELECT COALESCE ...  ... AND (mok_saskaita = $1);

SET plan_cache_mode TO force_generic_plan;

EXPLAIN ANALYZE EXECUTE p(7141968);

What I suspect is that the statistics for mok_saskaita are
highly skewed and so with a generic plan the planner will
not risk using a plan that depends on the parameter value
being infrequent, as the one you're showing does.
			regards, tom lane


Thank you Tom Lane, for pointing the problem.

In deed, after setting plan_cache_mode to force_generic_plan I see very different plan:

```

"Finalize Aggregate  (cost=6901.01..6901.02 rows=1 width=32) (actual time=50.258..56.004 rows=1 loops=1)"
"  Output: COALESCE(sum((b_mokejimai.mok_nepadengta)::numeric), '0'::numeric)"
"  Buffers: shared hit=4"
"  ->  Gather  (cost=6900.89..6901.00 rows=1 width=32) (actual time=0.809..55.993 rows=2 loops=1)"
"        Output: (PARTIAL sum((b_mokejimai.mok_nepadengta)::numeric))"
"        Workers Planned: 1"
"        Workers Launched: 1"
"        Buffers: shared hit=4"
"        ->  Partial Aggregate  (cost=5900.89..5900.90 rows=1 width=32) (actual time=0.077..0.079 rows=1 loops=2)"
"              Output: PARTIAL sum((b_mokejimai.mok_nepadengta)::numeric)"
"              Buffers: shared hit=4"
"              Worker 0: actual time=0.052..0.053 rows=1 loops=1"
"              ->  Nested Loop  (cost=25.92..5897.69 rows=1280 width=3) (actual time=0.070..0.072 rows=0 loops=2)"
"                    Output: b_mokejimai.mok_nepadengta"
"                    Inner Unique: true"
"                    Buffers: shared hit=4"
"                    Worker 0: actual time=0.043..0.043 rows=0 loops=1"
"                    ->  Parallel Bitmap Heap Scan on public.b_mokejimai  (cost=25.48..2455.36 rows=1307 width=7) (actual time=0.069..0.070 rows=0 loops=2)"
"                          Output: b_mokejimai.mok_id, b_mokejimai.mok_moketojas, b_mokejimai.mok_pardavimas, b_mokejimai.mok_laikas, b_mokejimai.mok_suma, b_mokejimai.mok_budas, b_mokejimai.mok_terminas, b_mokejimai.mok_cekis, b_mokejimai.mok_saskaita, b_mokejimai.mok_suma_bazine, b_mokejimai.mok_nepadengta, b_mokejimai.mok_padengta, b_mokejimai.mok_laiko_diena"
"                          Recheck Cond: (b_mokejimai.mok_saskaita = $1)"
"                          Buffers: shared hit=4"
"                          Worker 0: actual time=0.042..0.042 rows=0 loops=1"
"                          ->  Bitmap Index Scan on idx_saskaita  (cost=0.00..24.93 rows=2222 width=0) (actual time=0.023..0.023 rows=0 loops=1)"
"                                Index Cond: (b_mokejimai.mok_saskaita = $1)"
"                                Buffers: shared hit=4"
"                    ->  Index Scan using pk_b_pardavimai_id on public.b_pardavimai  (cost=0.44..2.63 rows=1 width=4) (never executed)"
"                          Output: b_pardavimai.pard_id, b_pardavimai.pard_preke, b_pardavimai.pard_kaina, b_pardavimai.pard_nuolaida, b_pardavimai.pard_kiekis, b_pardavimai.pard_kasos_nr, b_pardavimai.pard_laikas, b_pardavimai.pard_prekes_id, b_pardavimai.pard_pirkejo_id, b_pardavimai.pard_pardavejas, b_pardavimai.pard_spausdinta, b_pardavimai.pard_reikia_grazinti, b_pardavimai.pard_kam_naudoti, b_pardavimai.pard_susieta, b_pardavimai.pard_galima_anuliuoti, b_pardavimai.pard_tipas, b_pardavimai.pard_pvm, b_pardavimai.pard_apsilankymas, b_pardavimai.pard_fk, b_pardavimai.pard_kelintas, b_pardavimai.pard_precekis, b_pardavimai.pard_imone, b_pardavimai.pard_grazintas, b_pardavimai.pard_debeto_sutartis, b_pardavimai.pard_kaina_be_nld, b_pardavimai.pard_uzsakymas_pos, b_pardavimai.pard_pvm_suma, b_pardavimai.pard_uzsakymo_nr, b_pardavimai.pard_nuolaidos_id, b_pardavimai.pard_nuolaida_taikyti, b_pardavimai.pard_pirkeja_keisti_galima, b_pardavimai.pard_suma_keisti_galima"
"                          Index Cond: (b_pardavimai.pard_id = b_mokejimai.mok_pardavimas)"
"                          Filter: (b_pardavimai.pard_tipas = ANY ('{1,2,6,7}'::integer[]))"
"Planning Time: 0.016 ms"
"Execution Time: 56.097 ms"

```

If I understand the plan correctly, the problem is the planner expects to find 2222 records for a provide value of `mok_saskaita`.  I've tried running analyze on `b_mokejimai`, but the plan remains the same - must be because some values of `mok_saskaita` do really return tens of thousands of records.

I don't know how the planner comes up with value 2222, because on average there are 15 b_mokejimai records for a single mok_saskaita (if NULL in mok_saskata is ignored), and 628 records if not.


Anyway...

Do you think rewriting a function in plpgsql is a way to go in such case? In pg documentation (https://www.postgresql.org/docs/12/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING) I read that the plan for the plpgsql function is calculated the first time the function is executed (for a connection). I'm concerned, that the function execution is not replanned: I will be stuck with a plan that corresponds to the `mok_saskaita`  parameter value passed on the first execution. Or am I wrong?

Is there a way to make PostgreSQL recalculate the plan on each execution of the function? The observed planning times are acceptable for my application.


Regards,

Julius Tuskenis

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: performance of sql and plpgsql functions
Следующее
От: Thomas Simpson
Дата:
Сообщение: Row level security