Discussion: Query running a lot faster with enable_nestloop=false
Hi guys,

I have a query that runs a lot slower (~5 minutes) with the default enable_nestloop=true than with enable_nestloop=false (~10 secs). The actual query is available here: http://pastie.org/2754424 . It is a reporting query with many joins, as the database is mainly used for transaction processing.

EXPLAIN ANALYZE results for both cases:

Machine A nestloop=true - http://explain.depesz.com/s/nkj0 (~5 minutes)
Machine A nestloop=false - http://explain.depesz.com/s/wBM (~10 secs)

On a different, slightly slower machine (Machine B), copying the database over and leaving the default enable_nestloop=true, it takes ~20 secs.

Machine B nestloop=true - http://explain.depesz.com/s/dYO (~20 secs)

For all the cases above I ensured that I ran ANALYZE before running the queries. There were no other queries running in parallel.

Both machines are running PostgreSQL 8.4.6. Machine B is using the default configuration provided by the package, while for Machine A we applied the changes suggested by pgtune - http://pastie.org/2755113 . Machine A is running Ubuntu 10.04 32 bit, while Machine B is running Ubuntu 8.04 32 bit.

Machine A spec:
Intel(R) Xeon(R) CPU X3450 @ 2.67GHz (8 cores)
8GB RAM (2 x 4GB)
4 x 300GB 15k SAS

Machine B spec:
Intel(R) Pentium(R) D CPU 2.80GHz x 2
2GB RAM
1 x 80GB SATA HDD

1. For Machine A, what can I do to make the planner choose the faster plan without setting enable_nestloop=false?

2. From the research I have done, it seems the planner is choosing the suboptimal plan because of the huge difference between the estimated and actual row counts. How can I bring these figures closer together?

3. If I should rewrite the query, what should I change?

4. Why does the planner seem to do the right thing for Machine B without setting enable_nestloop=false? What should I compare between the two machines to understand the difference in the planner's choice?

I have tried reading through the manual sections "55.1. Row Estimation Examples" and "14.2. Statistics Used by the Planner". I am still trying to fully apply the information to my specific case above, so any help or pointers would be greatly appreciated.

In a last-ditch effort we also tried upgrading Machine A to PostgreSQL 9.1, and that did not rectify the issue. We have reverted the upgrade for now.

Thank you for your time.

--
Mohan
Hi Mohanaraj,

One thing you should certainly try is to increase the default_statistics_target value from 50 up to about 1000 for the larger tables. Large tables tend to produce poor estimates with smaller values here.

I guess I am not helping much here, but apart from your query, those estimates on Machine B seem odd, because they shoot up from 10k to the order of billions without any big change in row count. Beats me.

--
Robins Tharakan

> 1. For Machine A, what can I do to make the planner choose the faster
> plan without setting enable_nestloop=false ?
>
> 2. From the research I have done it seems to be that the reason the
> planner is choosing the unoptimal query is because of the huge
> difference between the estimated and actual rows. How can I get this
> figure closer ?
>
> 3. If I should rewrite the query, what should I change ?
>
> 4. Why is it that the planner seems to be doing the right thing for
> Machine B without setting enable_nestloop=false. What should I be
> comparing in both the machines to understand the difference in choice
> that the planner made ?
>
> I have tried reading through the manual section "55.1. Row Estimation
> Examples", "14.2. Statistics Used by the Planner". I am still trying
> to fully apply the information to my specific case above and hence any
> help or pointers would be greatly appreciated.
>
> In a last ditch effort we also tried upgrading Machine A to
> PostgreSQL 9.1 and that did not rectify the issue. We have reverted
> the upgrade for now.
>
> Thank you for your time.
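On 8.4 the statistics target can also be raised per column rather than globally, which keeps ANALYZE cheap for the rest of the schema. A minimal sketch, using columns that appear in the posted plans (adapt to the real schema):

```sql
-- Raise the sample size only for the columns the planner misestimates,
-- then re-gather statistics. 1000 is the upper end suggested above;
-- the global default can stay at 50.
ALTER TABLE invoices ALTER COLUMN create_date SET STATISTICS 1000;
ALTER TABLE invoices ALTER COLUMN status      SET STATISTICS 1000;
ANALYZE invoices;

-- Or, for a quick experiment, raise it for the current session only
-- and re-analyze everything at the higher target:
SET default_statistics_target = 1000;
ANALYZE;
```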
On Tue, Oct 25, 2011 at 5:09 AM, Mohanaraj Gopala Krishnan <mohangk@gmail.com> wrote:
> I have a query that runs a lot slower (~5 minutes) when I run it with
> the default enable_nestloop=true and enable_nestloop=false (~10 secs).
> The actual query is available here http://pastie.org/2754424 . It is a
> reporting query with many joins as the database is mainly used for
> transaction processing.
>
> Explain analyse result for both cases:
>
> Machine A nestloop=true - http://explain.depesz.com/s/nkj0 (~5 minutes)
> Machine A nestloop=false - http://explain.depesz.com/s/wBM (~10 secs)

A good start might be to refactor this:

Seq Scan on retailer_categories retailer_category_leaf_nodes
  (cost=0.00..18.52 rows=1 width=16)
  (actual time=0.016..0.392 rows=194 loops=1)
  Filter: ((tree_right - tree_left) = 1)

And this:

Seq Scan on product_categories product_category_leaf_nodes
  (cost=0.00..148.22 rows=2 width=32)
  (actual time=0.031..1.109 rows=383 loops=1)
  Filter: ((tree_right - tree_left) = 1)

The query planner really has no idea what selectivity to assign to that condition, so it's basically guessing, and it's way off. You could probably improve the estimate a lot by adding a column that stores the value of tree_right - tree_left and is updated manually or by triggers as you insert and update data. Then you could just check tree_left_right_difference = 1, which should get a much more accurate estimate, and hopefully therefore a better plan.

You've also got a fairly large estimation error here:

Index Scan using invoices_create_date_idx on invoices
  (cost=0.00..8.28 rows=1 width=4)
  (actual time=0.055..0.305 rows=109 loops=1)
  Index Cond: ((create_date >= '2011-09-15'::date) AND (create_date <= '2011-09-15'::date))
  Filter: (status = 128)

Apparently, status 128 is considerably more common among rows in that date range than it is overall.
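The trigger-maintained column described above could look roughly like this on 8.4 (the column, function, and trigger names here are made up for illustration):

```sql
-- Materialize tree_right - tree_left so ANALYZE can collect
-- statistics on it directly.
ALTER TABLE product_categories ADD COLUMN tree_width integer;
UPDATE product_categories SET tree_width = tree_right - tree_left;

CREATE FUNCTION product_categories_set_width() RETURNS trigger AS $$
BEGIN
    NEW.tree_width := NEW.tree_right - NEW.tree_left;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER set_tree_width
    BEFORE INSERT OR UPDATE ON product_categories
    FOR EACH ROW EXECUTE PROCEDURE product_categories_set_width();

ANALYZE product_categories;

-- The query then filters on "tree_width = 1" instead of
-- "(tree_right - tree_left) = 1", giving the planner real statistics.
```

The same pattern would apply to retailer_categories, which has the identical leaf-node filter.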
Unfortunately, it's not so easy to fix this kind of estimation error, unless you can refactor your schema to avoid needing to filter on both create_date and status at the same time.

It might be worth using temporary tables here - factor out sections of the query that are referenced multiple times, like the join between sales_order_items and invoices, into a temporary table, ANALYZE it, and then use it to run the main query.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
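The temporary-table idea might be sketched like this (the join key and selected columns are guesses, since the full schema isn't shown in the thread):

```sql
-- Factor the repeated sales_order_items/invoices join out into a
-- temp table, so ANALYZE sees its true row count before the main
-- reporting query is planned.
CREATE TEMP TABLE inv_items AS
SELECT soi.*, i.create_date, i.status
FROM sales_order_items soi
JOIN invoices i ON i.id = soi.invoice_id   -- hypothetical join key
WHERE i.create_date BETWEEN '2011-09-15' AND '2011-09-15'
  AND i.status = 128;

ANALYZE inv_items;

-- ...then reference inv_items in the main query wherever the
-- sales_order_items/invoices join was repeated.
```

Because the temp table has exact statistics after ANALYZE, the planner no longer has to multiply the mis-estimated selectivities of create_date and status through the rest of the join tree.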