Re: Mysterious performance of query because of plsql

Поиск
Список
Период
Сортировка
От mike g
Тема Re: Mysterious performance of query because of plsql
Дата
Msg-id 1088743187.23457.64.camel@localhost.localdomain
обсуждение исходный текст
Ответ на Mysterious performance of query because of plsql function in where condition  ("Peter Alberer" <h9351252@obelix.wu-wien.ac.at>)
Список pgsql-general
Do you have any indexes created on the submissions table?  If not
postgresql has no choice but read every row in the table.

If you do they are not being used. The tuning masters would really need
to see definition of the table if the indexes are not being used.

You might find a lot of pointers in the pgsql-performance mailing list
instead of this one.

Mike


On Thu, 2004-07-01 at 06:52, Peter Alberer wrote:
> Hi there,
>
> i have a problem with a query that uses the result of a plsql function
> In the where clause:
>
> SELECT
>    assignments.assignment_id,
>    assignments.package_id AS package_id,
>    assignments.title AS title,
>    COUNT(*) AS Count
> FROM
>    assignments INNER JOIN submissions ON
>    (assignments.assignment_id=submissions.assignment_id)
> WHERE
>    package_id=949589 AND
>    submission_status(submissions.submission_id)='closed'
> GROUP BY
>    assignments.assignment_id, assignments.package_id, assignments.title
> ORDER BY
>    assignments.title;
>
> Postgres seems to execute the function "submission_status" for every row
> of the submissions table (~1500 rows). The query therefore takes quite a
> lot time, although in fact no row is returned from the assignments table
> when the condition package_id=949589 is used.
>
>                                                           QUERY PLAN
>
> ------------------------------------------------------------------------
> ------------------------------------------------------
>  Sort  (cost=41.21..41.21 rows=1 width=35) (actual
> time=4276.978..4276.978 rows=0 loops=1)
>    Sort Key: assignments.title
>    ->  HashAggregate  (cost=41.19..41.20 rows=1 width=35) (actual
> time=4276.970..4276.970 rows=0 loops=1)
>          ->  Hash Join  (cost=2.40..41.18 rows=1 width=35) (actual
> time=4276.966..4276.966 rows=0 loops=1)
>                Hash Cond: ("outer".assignment_id =
> "inner".assignment_id)
>                ->  Seq Scan on submissions  (cost=0.00..38.73 rows=9
> width=4) (actual time=10.902..4276.745 rows=38 loops=1)
>                      Filter: (submission_status(submission_id) =
> 'closed'::text)
>                ->  Hash  (cost=2.40..2.40 rows=2 width=35) (actual
> time=0.058..0.058 rows=0 loops=1)
>                      ->  Seq Scan on assignments  (cost=0.00..2.40
> rows=2 width=35) (actual time=0.015..0.052 rows=2 loops=1)
>                            Filter: (package_id = 949589)
>  Total runtime: 4277.078 ms
> (11 rows)
>
> I therefore tried to rephrase the query, to make sure that the function
> is only used for the rows returned by the join but not even the
> following does help (the subselect t1 does not return a single row):
>
> select * from (
>     SELECT
>       a.assignment_id, a.package_id, a.title, s.submission_id,
> COUNT(*) AS Count
>     FROM
>       assignments a INNER JOIN submissions s ON
> (a.assignment_id=s.assignment_id)
>     WHERE
>         a.package_id=949589
>     GROUP BY
>         a.assignment_id, a.package_id, a.title, s.submission_id
> ) t1
> where
>    submission_status(t1.submission_id)='closed'
> order by
>    title;
>
>                                                               QUERY PLAN
>
> ------------------------------------------------------------------------
> --------------------------------------------------------------
>  Sort  (cost=41.21..41.22 rows=1 width=188) (actual
> time=4114.251..4114.251 rows=0 loops=1)
>    Sort Key: title
>    ->  Subquery Scan t1  (cost=41.20..41.20 rows=1 width=188) (actual
> time=4114.242..4114.242 rows=0 loops=1)
>          ->  HashAggregate  (cost=41.20..41.20 rows=1 width=39) (actual
> time=4114.238..4114.238 rows=0 loops=1)
>                ->  Hash Join  (cost=2.40..41.18 rows=1 width=39) (actual
> time=4114.235..4114.235 rows=0 loops=1)
>                      Hash Cond: ("outer".assignment_id =
> "inner".assignment_id)
>                      ->  Seq Scan on submissions s  (cost=0.00..38.73
> rows=9 width=8) (actual time=7.179..4113.984 rows=38 loops=1)
>                            Filter: (submission_status(submission_id) =
> 'closed'::text)
>                      ->  Hash  (cost=2.40..2.40 rows=2 width=35) (actual
> time=0.100..0.100 rows=0 loops=1)
>                            ->  Seq Scan on assignments a
> (cost=0.00..2.40 rows=2 width=35) (actual time=0.045..0.094 rows=2
> loops=1)
>                                  Filter: (package_id = 949589)
>  Total runtime: 4114.356 ms
> (12 rows)
>
> The function is nevertheless executed for every row in the submissions
> table. A simple "select *, submission_status(submission_id) from
> submissions" takes about the same time as the 2 queries stated above.
>
> The whole database has been vacuum analysed right before the explain
> analyse output has been captured.
>
> What can I do to reduce the time this query takes? And why is the
> function executed although there is no row in the result set of t1 in my
> rephrased query?
>
> TIA, peter
>
> --
> peter.alberer@wu-wien.ac.at  Tel: +43/1/31336/4341
> Abteilung für Wirtschaftsinformatik, Wirtschaftsuniversitaet Wien,
> Austria
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: COPY command with blank incoming values
Следующее
От: Dennis Gearon
Дата:
Сообщение: administrating permisions, groups, owners