Re: [SQL] can i make this sql query more efficiant?

Поиск
Список
Период
Сортировка
От Manfred Koizar
Тема Re: [SQL] can i make this sql query more efficiant?
Дата
Msg-id d0ir8vgn2sp6uh2h61nkm4tdasqqn4o18l@4ax.com
обсуждение исходный текст
Ответы Re: [SQL] can i make this sql query more efficiant?  (Josh Berkus <josh@agliodbs.com>)
Список pgsql-performance
On Fri, 4 Apr 2003 08:16:01 -0800, Josh Berkus <josh@agliodbs.com>
wrote:
>That version is only more efficient for small data sets.    I've generally
>found that case statements are slower than subselects for large data sets.

I'd be honestly interested in the circumstances where you made that
observation.

>YMMV.

Yes, it does :-)  Out of curiosity I did a few tests with PG 7.2 on my
old notebook:

CREATE TABLE baz (event int, level int);
INSERT INTO baz SELECT (100*random()+0.5), (3*random()+0.5);
INSERT INTO baz SELECT (100*random()+0.5), (3*random()+0.5) FROM baz;
...
INSERT INTO baz SELECT (100*random()+0.5), (3*random()+0.5) FROM baz;
CREATE INDEX baz_event ON baz(event);
ANALYSE baz;

SELECT event,
       SUM (CASE level WHEN 1 THEN 1 ELSE 0 END) AS ones,
       SUM (CASE level WHEN 2 THEN 1 ELSE 0 END) AS twos,
       SUM (CASE level WHEN 3 THEN 1 ELSE 0 END) AS threes
  FROM baz GROUP BY event;

SELECT event,
  (SELECT count(*) FROM baz a
    WHERE level = 1 AND a.event=baz.event) AS ones,
  (SELECT count(*) FROM baz a
    WHERE level = 2 and a.event=baz.event) AS twos,
  (SELECT count(*) FROM baz a
    WHERE level = 3 and a.event=baz.event) AS threes
  FROM baz GROUP BY event;

tuples   case         subselect
  8K    718.48 msec   16199.88 msec
 32K   6168.18 msec   74742.85 msec
128K  25072.34 msec  304585.61 msec

CLUSTER baz_event ON baz; ANALYSE baz;
This changes the subselect plan from seq scan to index scan.

128K  12116.07 msec   17530.85 msec

Add 128K more tuples, so that only the first half of the relation is
clustered.

256K  45663.35 msec  117748.23 msec

CLUSTER baz_event ON baz; ANALYSE baz;

256K  23691.81 msec   35138.26 msec

Maybe it is just the data distribution (100 events, 3 levels,
thousands of tuples) that makes CASE look faster than subselects ...

Servus
 Manfred


В списке pgsql-performance по дате отправления:

Предыдущее
От: Josh Berkus
Дата:
Сообщение: Re: OSS database needed for testing
Следующее
От: Josh Berkus
Дата:
Сообщение: Re: [SQL] can i make this sql query more efficiant?