Обсуждение: reforming query for 7.0.2

Поиск
Список
Период
Сортировка

reforming query for 7.0.2

От
Markus Bertheau
Дата:
Hello,

how do I reform this query to work with 7.0.2?


select * from personen join (select count(personen_id), personen_id from
orders group by personen_id) as ordertemp on personen.personen_id =
ordertemp.personen_id

If the table structure is needed I will happily post it here. Although I
don't think it is.

TIA

Markus Bertheau


Re: reforming query for 7.0.2

От
"Josh Berkus"
Дата:
Markus,

> how do I reform this query to work with 7.0.2?

Better question:  Why are you working with 7.0.2?  Even the mass-market Linuxdistros (like Red Hat and SuSE) now come
with7.1.x.
 

> select * from personen join (select count(personen_id), personen_id from
> orders group by personen_id) as ordertemp on personen.personen_id =
> ordertemp.personen_id

Acutally, this query needs some reforming on its own.  While it will work, thefollowing version will achieve the same
result,much faster, in 7.1 (and 7.2,for that matter):
 

SELECT personen.field1, personen.field2, personen.field3, count(order_id) asno_orders
FROM personen LEFT OUTER JOIN orders ON personen.personen_id =orders.personen_id
GROUP BY  personen.field1, personen.field2, personen.field3

... you see, your subselect above is completely unnecessary.  And slower than aLEFT OUTER JOIN.  Simplicity,
simplicity,simplicity!
 

Now, we have the problem of no LEFT JOIN support in 7.0, so:

SELECT personen.field1, personen.field2, personen.field3, count(order_id) asno_orders
FROM personen JOIN orders ON personen.personen_id = orders.personen_id
GROUP BY  personen.field1, personen.field2, personen.field3
UNION
SELECT personen.field1, personen.field2, personen.field3, 0 as no_orders
FROM personen
WHERE NOT EXISTS (SELECT personen_id FROM orders WHERE personen_id =personen.personen_id);

Share & Enjoy!  

-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________                                      Josh Berkus Complete
informationtechnology      josh@agliodbs.com  and data management solutions       (415) 565-7293 for law firms, small
businesses       fax 621-2533   and non-profit organizations.      San Francisco
 


Re: reforming query for 7.0.2

От
Markus Bertheau
Дата:
Josh,

> Markus,
>
> > how do I reform this query to work with 7.0.2?
>
> Better question:  Why are you working with 7.0.2?  Even the mass-market Linux
>  distros (like Red Hat and SuSE) now come with 7.1.x.

Yeah, I know... An upgrade is definetely in the queue. But it will not
happen before 7.2 (if my imagination of its release date is correct), so
we will most probably go to that.
>
> > select * from personen join (select count(personen_id), personen_id from
> > orders group by personen_id) as ordertemp on personen.personen_id =
> > ordertemp.personen_id
>
> Acutally, this query needs some reforming on its own.  While it will work, the
>  following version will achieve the same result, much faster, in 7.1 (and 7.2,
>  for that matter):
>
> SELECT personen.field1, personen.field2, personen.field3, count(order_id) as
>  no_orders
> FROM personen LEFT OUTER JOIN orders ON personen.personen_id =
>  orders.personen_id
> GROUP BY  personen.field1, personen.field2, personen.field3
>
> ... you see, your subselect above is completely unnecessary.  And slower than a
>  LEFT OUTER JOIN.  Simplicity, simplicity, simplicity!

Thank you for that.

> Now, we have the problem of no LEFT JOIN support in 7.0, so:
>
> SELECT personen.field1, personen.field2, personen.field3, count(order_id) as
>  no_orders
> FROM personen JOIN orders ON personen.personen_id = orders.personen_id
> GROUP BY  personen.field1, personen.field2, personen.field3
> UNION
> SELECT personen.field1, personen.field2, personen.field3, 0 as no_orders
> FROM personen
> WHERE NOT EXISTS (SELECT personen_id FROM orders WHERE personen_id =
>  personen.personen_id);

Well, I actually don't need the left outer join, because I don't want
persons included that have no orders. I think the original query didn't
include them. So I turn the left outer join into a join and have the
same query on both versions :-). (We use 7.1.3 on our development
server).

Thanks again.

But let's take a look at explain. The old query shows a much smaller
cost estimate compared to the new one. Both databases are vacuumed on a
regular (daily) basis. First 7.1.3:

cenes_test=> select version();                          version
-------------------------------------------------------------PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC
2.96
(1 row)

// old query
cenes_test=> explain select * from personen join (select
count(orders_id), personen_id from orders group by personen_id) as
ordertemp on ordertemp.personen_id = personen.personen_id;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=11.76..42.32 rows=15 width=272) ->  Subquery Scan ordertemp  (cost=11.76..12.50 rows=15 width=8)
 ->  Aggregate  (cost=11.76..12.50 rows=15 width=8)             ->  Group  (cost=11.76..12.13 rows=147 width=8)
         ->  Sort  (cost=11.76..11.76 rows=147 width=8)                         ->  Seq Scan on orders
(cost=0.00..6.47
rows=147 width=8) ->  Index Scan using personen_pkey on personen  (cost=0.00..2.02
rows=1 width=264)

EXPLAIN

// new query

cenes_test=> explain select personen.nachname, personen.vorname,
personen.firma, personen.personen_id, count(orders_id) from personen
join orders on personen.personen_id = orders.personen_id group by
personen.nachname, personen.vorname, personen.firma,
personen.personen_id;
NOTICE:  QUERY PLAN:

Aggregate  (cost=162.53..164.37 rows=15 width=48) ->  Group  (cost=162.53..164.00 rows=147 width=48)       ->  Sort
(cost=162.53..162.53rows=147 width=48)             ->  Merge Join  (cost=139.59..157.24 rows=147 width=48)
    ->  Sort  (cost=127.83..127.83 rows=1265 width=40)                         ->  Seq Scan on personen
(cost=0.00..62.65
rows=1265 width=40)                   ->  Sort  (cost=11.76..11.76 rows=147 width=8)                         ->  Seq
Scanon orders  (cost=0.00..6.47 
rows=147 width=8)

EXPLAIN

// That's a lot more. About row estimates:

cenes_test=> select count(personen_id) from personen;count
------- 1272
(1 row)

cenes_test=> select count(orders_id) from orders;count
-------  189
(1 row)

cenes_test=> select count(orders_id) from orders group by personen_id
order by count desc;count
-------   41   33   15   12   10    8    7    5    4    3    2    2    2    2    2    2    2    1
// 36 x 1 snipped
(54 rows)

On the production db:
cenes=> select version();                           version
---------------------------------------------------------------PostgreSQL 7.0.2 on i686-pc-linux-gnu, compiled by gcc
2.95.2
(1 row)

// old query doesn't work
// new query
cenes=> explain select personen.nachname, personen.vorname,
personen.firma, personen.personen_id, count(orders_id) from personen
join orders on personen.personen_id = orders.personen_id group by
personen.nachname, personen.vorname, personen.firma,
personen.personen_id;
NOTICE:  QUERY PLAN:

Aggregate  (cost=201.91..204.87 rows=24 width=48) ->  Group  (cost=201.91..204.28 rows=237 width=48)       ->  Sort
(cost=201.91..201.91rows=237 width=48)             ->  Merge Join  (cost=170.85..192.56 rows=237 width=48)
    ->  Sort  (cost=18.72..18.72 rows=237 width=8)                         ->  Seq Scan on orders  (cost=0.00..9.37 
rows=237 width=8)                   ->  Sort  (cost=152.13..152.13 rows=1500 width=40)                         ->  Seq
Scanon personen  (cost=0.00..73.00 
rows=1500 width=40)

EXPLAIN

cenes=> select count(personen_id) from personen;count
------- 1501
(1 row)

cenes=> select count(orders_id) from orders;count
-------  238
(1 row)
count
-------   34   16   12   12    8    7    6    5    4    2    2    2    2    2    2    2    2    2    2    2    2    2
2    1 
// the rest is 1
(129 rows)

How's that?

Markus Bertheau