Обсуждение: SRF called with optional NULL input runs 7x slower

Поиск
Список
Период
Сортировка

SRF called with optional NULL input runs 7x slower

От
Ow Mun Heng
Дата:
I've got 2 nearly identical SRFs to retrieve data from the DB(pg 8.2.4)
which goes something like this

================= Function 1 ======================
CREATE OR REPLACE FUNCTION hot(fromdate timestamp without time zone, todate timestamp without time zone, code text[])
  RETURNS SETOF trh_hot AS
$BODY$

DECLARE
rec RECORD;

BEGIN
    FOR rec IN
    SELECT
    foo.a,
    bar.b
    FROM d
    INNER JOIN ts
        ON ts.id  = D.id
    inner join trh
         ON ts.id  = trh.id
        AND ts.ttype = trh.ttype
        AND ts.run_date = trh.run_date
    WHERE d.record_update_date_time BETWEEN fromdate AND todate
      AND trh.run_date BETWEEN fromdate AND todate
      AND trh.ttype IN (select ttype from lookup_ttype where tsequence = 'hot')
      AND d.code = any (code)
    LOOP
    RETURN NEXT rec;
    END LOOP;
    RETURN;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;

===================================

========== Function 2 ===================
CREATE OR REPLACE FUNCTION hot(fromdate timestamp without time zone, todate timestamp without time zone, code text[],
sntext[]) 
  RETURNS SETOF trh_hot AS
$BODY$

DECLARE
rec RECORD;

BEGIN
    FOR rec IN
    SELECT
    foo.a,
    bar.b
    FROM d
    INNER JOIN ts
        ON ts.id  = D.id
    inner join trh
         ON ts.id  = trh.id
        AND ts.ttype = trh.ttype
        AND ts.run_date = trh.run_date
    WHERE d.record_update_date_time BETWEEN fromdate AND todate
      AND trh.run_date BETWEEN fromdate AND todate
      AND trh.ttype IN (select ttype from lookup_ttype where tsequence = 'hot')
-->      AND (CASE WHEN code IS NULL THEN true else d.code = any (code) END)
-->      AND (CASE WHEN sn IS NULL THEN TRUE else D.id = any(SN) END)
    LOOP
    RETURN NEXT rec;
    END LOOP;
    RETURN;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;

===================== >8 =============================

The main difference being these 2 lines

  AND  (CASE WHEN code IS NULL THEN true else d.code = any (code) END)
  AND (CASE WHEN sn IS NULL THEN TRUE else D.id = any(SN) END)

which the aim is to check for NULL input and determine whether or not a
filter is necessary for those items.

The query is called by these respectively

Func1:  select * from HOT('8/1/2007','9/30/2007','{HUA71}')
Func2:  select * from HOT2('8/1/2007','9/30/2007','{HUA71}',NULL)

If I put in all the SN in replacement of the NULL, it's slightly faster
at 37 secs to return the data.


Vmstat 5 on Function1:7 secs 8400 rows
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0    268  32296   2908 1096780    1    1    92    75 1083 1021 10  6 82  3
 4  1    268  33708   2864 1094856    0    0  5771    21 1134 3518 23 19 51  7
 6  0    268  39196   2876 1091396    0    0  2074    27 1230 3477 33 51  0 16

very little on cpu wait.

Vmstat 5 on Function2: 50secs 8400 rows
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 3  0    268  33648   4980 1087552    0    0  1606   103 1216 2943 21 32  0 47
 2  1    268  33888   4484 1089576    0    0  2352    14 1209 2811 19 32  0 49
 3  1    268  34380   4100 1090564    0    0  1270    13 1187 2890 13 21  0 66
 2  1    268  34184   4004 1091408    0    0  1107    98 1206 2949 12 19  0 69
 3  2    268  33004   3984 1093312    0    0  1357    34 1203 2931 11 20  0 69
 4  0    268  33008   3860 1093556    0    0  1302     8 1193 2897 13 21  0 65
 2  1    268  33608   3800 1093140    0    0  1286     6 1195 2954 12 18  0 70
 1  1    268  33552   3300 1093776    0    0  1282   109 1213 2896 12 20  0 67
 3  1    268  32404   3296 1095368    0    0  1202     1 1188 2908 12 19  0 69
 1  1    268  33976   2680 1094476    0    0  1184     0 1180 2905 11 19  0 70
 3  1    268  33336   2440 1095924    0    0  1291     8 1196 2944 11 19  0 70
 2  1    268  34112   2272 1095412    0    0  1245   118 1221 2932 13 19  0 68
 2  0    268  34844   2756 1095364    0    0   698     4 1145 3040 19 49  0 32

Just look at the CPU wait. What Gives? Does checkin on NULL causes a penalty? I
tried running the same query as is on pgadmin3 and it looked to be OK
(the null checking). I do notice that the system (my laptop) was doing
quite a bit of IO.

how can I debug or diagnose where the issues lies? Explain analyse
doesn't do much since this is a Function Scan anyway.

AS of right now, I'm trying to see if I can do Dynamic SQL instead.



Re: SRF called with optional NULL input runs 7x slower

От
Alvaro Herrera
Дата:
Ow Mun Heng wrote:

> how can I debug or diagnose where the issues lies? Explain analyse
> doesn't do much since this is a Function Scan anyway.

Take them out of the function and EXPLAIN ANALYZE them as plain SQL.
Note that you should take the parameters out of the literal, so you need
to do something like

PREPARE foo AS SELECT ...   replace code with $1, etc ...
EXPLAIN ANALYZE EXECUTE foo(code, ...)

Otherwise they would be treated as constants so the queries would be
planned differently.

(untested, so correct the syntax appropriately)

--
Alvaro Herrera                 http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are."  -- Charles J. Sykes' advice to teenagers

Re: SRF called with optional NULL input runs 7x slower

От
Ow Mun Heng
Дата:
On Mon, 2007-09-24 at 00:18 -0400, Alvaro Herrera wrote:
> Ow Mun Heng wrote:
>
> > how can I debug or diagnose where the issues lies? Explain analyse
> > doesn't do much since this is a Function Scan anyway.
>
> Take them out of the function and EXPLAIN ANALYZE them as plain SQL.
> Note that you should take the parameters out of the literal, so you need
> to do something like
>
> PREPARE foo AS SELECT ...   replace code with $1, etc ...
> EXPLAIN ANALYZE EXECUTE foo(code, ...)
>
> Otherwise they would be treated as constants so the queries would be
> planned differently.
>
> (untested, so correct the syntax appropriately)
>
Okay.. I tried that, but it seems like there's an issue w/ the CASE
statements.

When I tried the prepare w/

AND (CASE WHEN $3 IS NULL THEN true else d.code = any ($3) END)
AND (CASE WHEN $4 IS NULL THEN TRUE else D.id = any($4) END)


it ERRORs w/ could not determine data type of parameter $3

If I were to just use

AND D.code = ANY($3) then it would work.

Any other clues?

Re: SRF called with optional NULL input runs 7x slower

От
Alvaro Herrera
Дата:
Ow Mun Heng wrote:

> Okay.. I tried that, but it seems like there's an issue w/ the CASE
> statements.
>
> When I tried the prepare w/
>
> AND (CASE WHEN $3 IS NULL THEN true else d.code = any ($3) END)
> AND (CASE WHEN $4 IS NULL THEN TRUE else D.id = any($4) END)
>
>
> it ERRORs w/ could not determine data type of parameter $3

Cast it to the correct type.

--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad; jugar
al alfarero y ver qué formas se pueden sacar del otro" (C. Halloway en
La Feria de las Tinieblas, R. Bradbury)

Re: SRF called with optional NULL input runs 7x slower

От
Ow Mun Heng
Дата:
On Mon, 2007-09-24 at 02:31 -0400, Alvaro Herrera wrote:
> Ow Mun Heng wrote:
>
> > Okay.. I tried that, but it seems like there's an issue w/ the CASE
> > statements.
> >
> > When I tried the prepare w/
> >
> > AND (CASE WHEN $3 IS NULL THEN true else d.code = any ($3) END)
> > AND (CASE WHEN $4 IS NULL THEN TRUE else D.id = any($4) END)
> >
> >
> > it ERRORs w/ could not determine data type of parameter $3
>
> Cast it to the correct type.

how do I cast a NULL? Is it Varchar?

Anyway.. I tried

AND (CASE WHEN CAST($3 as VARCHAR) IS NULL THEN true else d.code = any
($3) END)

and I get  ERROR: op ANY/ALL (array) requires array on right side
>

Re: SRF called with optional NULL input runs 7x slower

От
Alvaro Herrera
Дата:
Ow Mun Heng wrote:
> On Mon, 2007-09-24 at 02:31 -0400, Alvaro Herrera wrote:

> > > it ERRORs w/ could not determine data type of parameter $3
> >
> > Cast it to the correct type.
>
> how do I cast a NULL? Is it Varchar?

I didn't say "pick a random datatype", I said "the correct type", which
in this case means the type "code" has in the function.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.