Re: reforming query for 7.0.2

Поиск
Список
Период
Сортировка
От Markus Bertheau
Тема Re: reforming query for 7.0.2
Дата
Msg-id 1011272192.15907.24.camel@entwicklung01.cenes.de
обсуждение исходный текст
Ответ на Re: reforming query for 7.0.2  ("Josh Berkus" <josh@agliodbs.com>)
Список pgsql-sql
Josh,

> Markus,
>
> > how do I reform this query to work with 7.0.2?
>
> Better question:  Why are you working with 7.0.2?  Even the mass-market Linux
>  distros (like Red Hat and SuSE) now come with 7.1.x.

Yeah, I know... An upgrade is definetely in the queue. But it will not
happen before 7.2 (if my imagination of its release date is correct), so
we will most probably go to that.
>
> > select * from personen join (select count(personen_id), personen_id from
> > orders group by personen_id) as ordertemp on personen.personen_id =
> > ordertemp.personen_id
>
> Acutally, this query needs some reforming on its own.  While it will work, the
>  following version will achieve the same result, much faster, in 7.1 (and 7.2,
>  for that matter):
>
> SELECT personen.field1, personen.field2, personen.field3, count(order_id) as
>  no_orders
> FROM personen LEFT OUTER JOIN orders ON personen.personen_id =
>  orders.personen_id
> GROUP BY  personen.field1, personen.field2, personen.field3
>
> ... you see, your subselect above is completely unnecessary.  And slower than a
>  LEFT OUTER JOIN.  Simplicity, simplicity, simplicity!

Thank you for that.

> Now, we have the problem of no LEFT JOIN support in 7.0, so:
>
> SELECT personen.field1, personen.field2, personen.field3, count(order_id) as
>  no_orders
> FROM personen JOIN orders ON personen.personen_id = orders.personen_id
> GROUP BY  personen.field1, personen.field2, personen.field3
> UNION
> SELECT personen.field1, personen.field2, personen.field3, 0 as no_orders
> FROM personen
> WHERE NOT EXISTS (SELECT personen_id FROM orders WHERE personen_id =
>  personen.personen_id);

Well, I actually don't need the left outer join, because I don't want
persons included that have no orders. I think the original query didn't
include them. So I turn the left outer join into a join and have the
same query on both versions :-). (We use 7.1.3 on our development
server).

Thanks again.

But let's take a look at explain. The old query shows a much smaller
cost estimate compared to the new one. Both databases are vacuumed on a
regular (daily) basis. First 7.1.3:

cenes_test=> select version();                          version
-------------------------------------------------------------PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC
2.96
(1 row)

// old query
cenes_test=> explain select * from personen join (select
count(orders_id), personen_id from orders group by personen_id) as
ordertemp on ordertemp.personen_id = personen.personen_id;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=11.76..42.32 rows=15 width=272) ->  Subquery Scan ordertemp  (cost=11.76..12.50 rows=15 width=8)
 ->  Aggregate  (cost=11.76..12.50 rows=15 width=8)             ->  Group  (cost=11.76..12.13 rows=147 width=8)
         ->  Sort  (cost=11.76..11.76 rows=147 width=8)                         ->  Seq Scan on orders
(cost=0.00..6.47
rows=147 width=8) ->  Index Scan using personen_pkey on personen  (cost=0.00..2.02
rows=1 width=264)

EXPLAIN

// new query

cenes_test=> explain select personen.nachname, personen.vorname,
personen.firma, personen.personen_id, count(orders_id) from personen
join orders on personen.personen_id = orders.personen_id group by
personen.nachname, personen.vorname, personen.firma,
personen.personen_id;
NOTICE:  QUERY PLAN:

Aggregate  (cost=162.53..164.37 rows=15 width=48) ->  Group  (cost=162.53..164.00 rows=147 width=48)       ->  Sort
(cost=162.53..162.53rows=147 width=48)             ->  Merge Join  (cost=139.59..157.24 rows=147 width=48)
    ->  Sort  (cost=127.83..127.83 rows=1265 width=40)                         ->  Seq Scan on personen
(cost=0.00..62.65
rows=1265 width=40)                   ->  Sort  (cost=11.76..11.76 rows=147 width=8)                         ->  Seq
Scanon orders  (cost=0.00..6.47 
rows=147 width=8)

EXPLAIN

// That's a lot more. About row estimates:

cenes_test=> select count(personen_id) from personen;count
------- 1272
(1 row)

cenes_test=> select count(orders_id) from orders;count
-------  189
(1 row)

cenes_test=> select count(orders_id) from orders group by personen_id
order by count desc;count
-------   41   33   15   12   10    8    7    5    4    3    2    2    2    2    2    2    2    1
// 36 x 1 snipped
(54 rows)

On the production db:
cenes=> select version();                           version
---------------------------------------------------------------PostgreSQL 7.0.2 on i686-pc-linux-gnu, compiled by gcc
2.95.2
(1 row)

// old query doesn't work
// new query
cenes=> explain select personen.nachname, personen.vorname,
personen.firma, personen.personen_id, count(orders_id) from personen
join orders on personen.personen_id = orders.personen_id group by
personen.nachname, personen.vorname, personen.firma,
personen.personen_id;
NOTICE:  QUERY PLAN:

Aggregate  (cost=201.91..204.87 rows=24 width=48) ->  Group  (cost=201.91..204.28 rows=237 width=48)       ->  Sort
(cost=201.91..201.91rows=237 width=48)             ->  Merge Join  (cost=170.85..192.56 rows=237 width=48)
    ->  Sort  (cost=18.72..18.72 rows=237 width=8)                         ->  Seq Scan on orders  (cost=0.00..9.37 
rows=237 width=8)                   ->  Sort  (cost=152.13..152.13 rows=1500 width=40)                         ->  Seq
Scanon personen  (cost=0.00..73.00 
rows=1500 width=40)

EXPLAIN

cenes=> select count(personen_id) from personen;count
------- 1501
(1 row)

cenes=> select count(orders_id) from orders;count
-------  238
(1 row)
count
-------   34   16   12   12    8    7    6    5    4    2    2    2    2    2    2    2    2    2    2    2    2    2
2    1 
// the rest is 1
(129 rows)

How's that?

Markus Bertheau


В списке pgsql-sql по дате отправления:

Предыдущее
От: Bhuvan A
Дата:
Сообщение: Pattern Matching on Columns
Следующее
От: Jeff Eckermann
Дата:
Сообщение: Re: Pattern Matching on Columns