Обсуждение: plpgsql function is so slow

Поиск
Список
Период
Сортировка

plpgsql function is so slow

От
Euler Taveira de Oliveira
Дата:
Hi,

I recently saw a complaint that a simple PL/PgSQL code is slower than PL/SQL.
I did some benchmark and confirmed it is. I coded the same function
(function2) in C just to compare with something. According to OP [1], the
PL/SQL seems to run more than 15x faster than PL/PgSQL code.


euler=# select function1();function1
-----------100000000
(1 row)

Time: 62107,607 ms
euler=# select function2();function2
-----------100000000
(1 row)

Time: 419,673 ms

The PL/PgSQL function is:

CREATE OR REPLACE FUNCTION function1()
RETURNS INTEGER AS
$BODY$
DECLARE i INTEGER; s INTEGER := 0;
BEGIN FOR i IN 1 .. power(10, 8) LOOP    s := s + 1; END LOOP; RETURN s;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;

The C function is:

#include "postgres.h"
#include <math.h>
#include "fmgr.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(function2);

Datum function2(PG_FUNCTION_ARGS)
{int    i;int s = 0;
for (i = 1; i <= (int) pow(10, 8); i++)    s += 1;
PG_RETURN_INT32(s);
}

PL/PgSQL oprofile is:

samples  %        symbol name
2263     25.6024  AllocSetReset
1071     12.1168  ExecMakeFunctionResultNoSets
725       8.2023  AllocSetAlloc
664       7.5122  RevalidateCachedPlan
586       6.6297  ExecEvalParam
521       5.8943  AcquireExecutorLocks
463       5.2381  ResourceOwnerForgetPlanCacheRef
359       4.0615  AllocSetFreeIndex
329       3.7221  int4pl
262       2.9641  ExecEvalConst
248       2.8057  check_stack_depth
244       2.7605  MemoryContextReset
234       2.6474  list_head
143       1.6178  ReleaseCachedPlan
130       1.4708  MemoryContextAlloc
121       1.3689  pgstat_end_function_usage
111       1.2558  pgstat_init_function_usage
98        1.1087  list_head
94        1.0635  ResourceOwnerEnlargePlanCacheRefs
90        1.0182  ResourceOwnerRememberPlanCacheRef
44        0.4978  SPI_push
39        0.4412  SPI_pop

Any ideas?

[1]
http://listas.postgresql.org.br/pipermail/pgbr-geral/2009-September/017427.html


--  Euler Taveira de Oliveira http://www.timbira.com/


Re: plpgsql function is so slow

От
Tom Lane
Дата:
Euler Taveira de Oliveira <euler@timbira.com> writes:
> I recently saw a complaint that a simple PL/PgSQL code is slower than PL/SQL.
> I did some benchmark and confirmed it is. I coded the same function
> (function2) in C just to compare with something. According to OP [1], the
> PL/SQL seems to run more than 15x faster than PL/PgSQL code.

Does incrementing an integer in a tight loop represent the complainant's
typical plpgsql workload?  If so, maybe he should go use another PL.
plpgsql is intended for work that involves a lot of database access, and
so the overhead in operations like this isn't significant.  We offer
other PLs that make different tradeoffs.

FWIW, the high showing of AllocSetReset in your profile suggests to me
that you're timing an assert-enabled build, which wouldn't exactly be
a fair comparison to an Oracle production build anyhow.
        regards, tom lane


Re: plpgsql function is so slow

От
Euler Taveira de Oliveira
Дата:
Tom Lane escreveu:
> FWIW, the high showing of AllocSetReset in your profile suggests to me
> that you're timing an assert-enabled build, which wouldn't exactly be
> a fair comparison to an Oracle production build anyhow.
> 
Ops... forgot to remove it from other test. It seems much better but far from
the ideal. :( I've never taken a look at the pl/pgsql code but it could be
nice if there would be two path codes: access-data and non-access-data paths.
I have no idea if it will be possible (is path type too complex to detect?)
but it will certainly improve the non-access-data functions.

euler=# select function1();  -- PL/PgSQLfunction1
-----------100000000
(1 row)

Time: 34455,263 ms
euler=# select function3();  -- PL/Perlfunction3
-----------100000000
(1 row)

Time: 24986,016 ms

pl/pgsql oprofile is:

samples  %        symbol name
620      14.4961  ExecMakeFunctionResultNoSets
537      12.5555  AllocSetAlloc
484      11.3163  AllocSetReset
323       7.5520  RevalidateCachedPlan
292       6.8272  ExecEvalParam
222       5.1906  pgstat_init_function_usage
218       5.0970  int4pl
199       4.6528  ResourceOwnerForgetPlanCacheRef
196       4.5827  AcquireExecutorLocks
175       4.0917  ResourceOwnerRememberPlanCacheRef
166       3.8812  ReleaseCachedPlan
155       3.6240  pgstat_end_function_usage
143       3.3435  check_stack_depth
136       3.1798  ExecEvalConst
123       2.8758  ResourceOwnerEnlargePlanCacheRefs
100       2.3381  MemoryContextReset
99        2.3147  MemoryContextAlloc
48        1.1223  SPI_pop
41        0.9586  SPI_push


--  Euler Taveira de Oliveira http://www.timbira.com/



Re: plpgsql function is so slow

От
Pavel Stehule
Дата:
2009/9/24 Euler Taveira de Oliveira <euler@timbira.com>:
> Hi,
>
> I recently saw a complaint that a simple PL/PgSQL code is slower than PL/SQL.
> I did some benchmark and confirmed it is. I coded the same function
> (function2) in C just to compare with something. According to OP [1], the
> PL/SQL seems to run more than 15x faster than PL/PgSQL code.

sure - PL/SQL is translated to native code, so PL/pgSQL is only
interpret. What more - PL/SQL or C use native integer arithmetic, but
PL/pgSQL use PostgreSQL integer arithmetic.

so if you have to use fast code, use C, plperl, but not use PL/pgSQL -
it is just glue for SQL statements.

regards
Pavel Stehule

>
>
> euler=# select function1();
>  function1
> -----------
>  100000000
> (1 row)
>
> Time: 62107,607 ms
> euler=# select function2();
>  function2
> -----------
>  100000000
> (1 row)
>
> Time: 419,673 ms
>
> The PL/PgSQL function is:
>
> CREATE OR REPLACE FUNCTION function1()
> RETURNS INTEGER AS
> $BODY$
> DECLARE
>  i INTEGER;
>  s INTEGER := 0;
> BEGIN
>  FOR i IN 1 .. power(10, 8) LOOP
>     s := s + 1;
>  END LOOP;
>  RETURN s;
> END;
> $BODY$
> LANGUAGE 'plpgsql' IMMUTABLE;
>
> The C function is:
>
> #include "postgres.h"
> #include <math.h>
> #include "fmgr.h"
>
> #ifdef PG_MODULE_MAGIC
> PG_MODULE_MAGIC;
> #endif
>
> PG_FUNCTION_INFO_V1(function2);
>
> Datum function2(PG_FUNCTION_ARGS)
> {
>        int     i;
>        int s = 0;
>
>        for (i = 1; i <= (int) pow(10, 8); i++)
>                s += 1;
>
>        PG_RETURN_INT32(s);
> }
>
> PL/PgSQL oprofile is:
>
> samples  %        symbol name
> 2263     25.6024  AllocSetReset
> 1071     12.1168  ExecMakeFunctionResultNoSets
> 725       8.2023  AllocSetAlloc
> 664       7.5122  RevalidateCachedPlan
> 586       6.6297  ExecEvalParam
> 521       5.8943  AcquireExecutorLocks
> 463       5.2381  ResourceOwnerForgetPlanCacheRef
> 359       4.0615  AllocSetFreeIndex
> 329       3.7221  int4pl
> 262       2.9641  ExecEvalConst
> 248       2.8057  check_stack_depth
> 244       2.7605  MemoryContextReset
> 234       2.6474  list_head
> 143       1.6178  ReleaseCachedPlan
> 130       1.4708  MemoryContextAlloc
> 121       1.3689  pgstat_end_function_usage
> 111       1.2558  pgstat_init_function_usage
> 98        1.1087  list_head
> 94        1.0635  ResourceOwnerEnlargePlanCacheRefs
> 90        1.0182  ResourceOwnerRememberPlanCacheRef
> 44        0.4978  SPI_push
> 39        0.4412  SPI_pop
>
> Any ideas?
>
> [1]
> http://listas.postgresql.org.br/pipermail/pgbr-geral/2009-September/017427.html
>
>
> --
>  Euler Taveira de Oliveira
>  http://www.timbira.com/
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: plpgsql function is so slow

От
Andrew Gierth
Дата:
>>>>> "Euler" == Euler Taveira de Oliveira <euler@timbira.com> writes:
Euler> Ops... forgot to remove it from other test. It seems muchEuler> better but far from the ideal. :( I've never
takena look atEuler> the pl/pgsql code but it could be nice if there would be twoEuler> path codes: access-data and
non-access-datapaths.  I have noEuler> idea if it will be possible (is path type too complex toEuler> detect?)  but it
willcertainly improve the non-access-dataEuler> functions.
 

Like Tom said, this benchmark is silly. Some comparisons (note that in
all these cases I've replaced the power(10,8) with a constant, because
you weren't comparing like with like there):

plpgsql     13.3 sec
tcl85       29.9 sec
perl5.8      7.7 sec
python2.6   11.5 sec
C            0.242 sec

What this suggests to me is that plpgsql isn't so far off the norm for
interpreted scripting languages; sure it's slower than perl, but then
most things are; comparing it with C code is just silly.

There is, though, one genuine case that's come up a few times in IRC
regarding slowness of procedural code in pg, and that's any time
someone tries to implement some array-based algorithm in plpgsql. The
fact that a[i] is O(i) not O(1) (unless the array type is fixed length)
comes as a nasty shock since iterating over an array becomes O(n^2).

This is obviously a consequence of the array storage format; is there
any potential for changing that to some format which has, say, an array
of element offsets at the start, rather than relying on stepping over
length fields?

-- 
Andrew (irc:RhodiumToad)


Re: plpgsql function is so slow

От
Merlin Moncure
Дата:
On Fri, Sep 25, 2009 at 1:05 AM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
>>>>>> "Euler" == Euler Taveira de Oliveira <euler@timbira.com> writes:
>
>  Euler> Ops... forgot to remove it from other test. It seems much
>  Euler> better but far from the ideal. :( I've never taken a look at
>  Euler> the pl/pgsql code but it could be nice if there would be two
>  Euler> path codes: access-data and non-access-data paths.  I have no
>  Euler> idea if it will be possible (is path type too complex to
>  Euler> detect?)  but it will certainly improve the non-access-data
>  Euler> functions.
>
> Like Tom said, this benchmark is silly. Some comparisons (note that in
> all these cases I've replaced the power(10,8) with a constant, because
> you weren't comparing like with like there):
>
> plpgsql     13.3 sec
> tcl85       29.9 sec
> perl5.8      7.7 sec
> python2.6   11.5 sec
> C            0.242 sec
>
> What this suggests to me is that plpgsql isn't so far off the norm for
> interpreted scripting languages; sure it's slower than perl, but then
> most things are; comparing it with C code is just silly.
>
> There is, though, one genuine case that's come up a few times in IRC
> regarding slowness of procedural code in pg, and that's any time
> someone tries to implement some array-based algorithm in plpgsql. The
> fact that a[i] is O(i) not O(1) (unless the array type is fixed length)
> comes as a nasty shock since iterating over an array becomes O(n^2).
>
> This is obviously a consequence of the array storage format; is there
> any potential for changing that to some format which has, say, an array
> of element offsets at the start, rather than relying on stepping over
> length fields?

Couple points:
*) Surely, it's better to encourage use of 'unnest' style approaches
for array iteration
*) If an array has fixed length elements and doesn't have null
elements (a fairly common case), maybe it's worthwhile not
generating/storing the lengths vector?
*) Wouldn't it be possible to store offsets always, not lengths, since
you can calculate the length from the next offset?

merlin