Re: reforming query for 7.0.2
От | Markus Bertheau |
---|---|
Тема | Re: reforming query for 7.0.2 |
Дата | |
Msg-id | 1011272192.15907.24.camel@entwicklung01.cenes.de обсуждение исходный текст |
Ответ на | Re: reforming query for 7.0.2 ("Josh Berkus" <josh@agliodbs.com>) |
Список | pgsql-sql |
Josh, > Markus, > > > how do I reform this query to work with 7.0.2? > > Better question: Why are you working with 7.0.2? Even the mass-market Linux > distros (like Red Hat and SuSE) now come with 7.1.x. Yeah, I know... An upgrade is definetely in the queue. But it will not happen before 7.2 (if my imagination of its release date is correct), so we will most probably go to that. > > > select * from personen join (select count(personen_id), personen_id from > > orders group by personen_id) as ordertemp on personen.personen_id = > > ordertemp.personen_id > > Acutally, this query needs some reforming on its own. While it will work, the > following version will achieve the same result, much faster, in 7.1 (and 7.2, > for that matter): > > SELECT personen.field1, personen.field2, personen.field3, count(order_id) as > no_orders > FROM personen LEFT OUTER JOIN orders ON personen.personen_id = > orders.personen_id > GROUP BY personen.field1, personen.field2, personen.field3 > > ... you see, your subselect above is completely unnecessary. And slower than a > LEFT OUTER JOIN. Simplicity, simplicity, simplicity! Thank you for that. > Now, we have the problem of no LEFT JOIN support in 7.0, so: > > SELECT personen.field1, personen.field2, personen.field3, count(order_id) as > no_orders > FROM personen JOIN orders ON personen.personen_id = orders.personen_id > GROUP BY personen.field1, personen.field2, personen.field3 > UNION > SELECT personen.field1, personen.field2, personen.field3, 0 as no_orders > FROM personen > WHERE NOT EXISTS (SELECT personen_id FROM orders WHERE personen_id = > personen.personen_id); Well, I actually don't need the left outer join, because I don't want persons included that have no orders. I think the original query didn't include them. So I turn the left outer join into a join and have the same query on both versions :-). (We use 7.1.3 on our development server). Thanks again. But let's take a look at explain. The old query shows a much smaller cost estimate compared to the new one. Both databases are vacuumed on a regular (daily) basis. First 7.1.3: cenes_test=> select version(); version -------------------------------------------------------------PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC 2.96 (1 row) // old query cenes_test=> explain select * from personen join (select count(orders_id), personen_id from orders group by personen_id) as ordertemp on ordertemp.personen_id = personen.personen_id; NOTICE: QUERY PLAN: Nested Loop (cost=11.76..42.32 rows=15 width=272) -> Subquery Scan ordertemp (cost=11.76..12.50 rows=15 width=8) -> Aggregate (cost=11.76..12.50 rows=15 width=8) -> Group (cost=11.76..12.13 rows=147 width=8) -> Sort (cost=11.76..11.76 rows=147 width=8) -> Seq Scan on orders (cost=0.00..6.47 rows=147 width=8) -> Index Scan using personen_pkey on personen (cost=0.00..2.02 rows=1 width=264) EXPLAIN // new query cenes_test=> explain select personen.nachname, personen.vorname, personen.firma, personen.personen_id, count(orders_id) from personen join orders on personen.personen_id = orders.personen_id group by personen.nachname, personen.vorname, personen.firma, personen.personen_id; NOTICE: QUERY PLAN: Aggregate (cost=162.53..164.37 rows=15 width=48) -> Group (cost=162.53..164.00 rows=147 width=48) -> Sort (cost=162.53..162.53rows=147 width=48) -> Merge Join (cost=139.59..157.24 rows=147 width=48) -> Sort (cost=127.83..127.83 rows=1265 width=40) -> Seq Scan on personen (cost=0.00..62.65 rows=1265 width=40) -> Sort (cost=11.76..11.76 rows=147 width=8) -> Seq Scanon orders (cost=0.00..6.47 rows=147 width=8) EXPLAIN // That's a lot more. About row estimates: cenes_test=> select count(personen_id) from personen;count ------- 1272 (1 row) cenes_test=> select count(orders_id) from orders;count ------- 189 (1 row) cenes_test=> select count(orders_id) from orders group by personen_id order by count desc;count ------- 41 33 15 12 10 8 7 5 4 3 2 2 2 2 2 2 2 1 // 36 x 1 snipped (54 rows) On the production db: cenes=> select version(); version ---------------------------------------------------------------PostgreSQL 7.0.2 on i686-pc-linux-gnu, compiled by gcc 2.95.2 (1 row) // old query doesn't work // new query cenes=> explain select personen.nachname, personen.vorname, personen.firma, personen.personen_id, count(orders_id) from personen join orders on personen.personen_id = orders.personen_id group by personen.nachname, personen.vorname, personen.firma, personen.personen_id; NOTICE: QUERY PLAN: Aggregate (cost=201.91..204.87 rows=24 width=48) -> Group (cost=201.91..204.28 rows=237 width=48) -> Sort (cost=201.91..201.91rows=237 width=48) -> Merge Join (cost=170.85..192.56 rows=237 width=48) -> Sort (cost=18.72..18.72 rows=237 width=8) -> Seq Scan on orders (cost=0.00..9.37 rows=237 width=8) -> Sort (cost=152.13..152.13 rows=1500 width=40) -> Seq Scanon personen (cost=0.00..73.00 rows=1500 width=40) EXPLAIN cenes=> select count(personen_id) from personen;count ------- 1501 (1 row) cenes=> select count(orders_id) from orders;count ------- 238 (1 row) count ------- 34 16 12 12 8 7 6 5 4 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 // the rest is 1 (129 rows) How's that? Markus Bertheau
В списке pgsql-sql по дате отправления: