Обсуждение: PostgreSQL performance enhancement when query planner fails to guess the right plan
PostgreSQL performance enhancement when query planner fails to guess the right plan
От
Constantin Teodorescu
Дата:
Hello all, hope you are remembering me, some years ago I've designed the PgAccess , the Tcl/Tk visual interface to PostgreSQL. Thought you haven't received any news from me, I continued working with PostgreSQL, being involved in very big projects in Romania. Right now, the national identification of the cows, sheep, goats and pigs in Romania runs on PostgreSQL on a very big database. Once again , I had to thank you all for keeping up maintaining and improving PostgreSQL. My message to all of you is related to this big project (a government sustained project) and some performance issues. Very few words about the database: approx. 60 tables, 30 of them containing 10 millions to 50 millions records , the whole database is approx 40 Gb size ! In order to get a good performance, the database is operated on a dual XEON with 6 Gb RAM IBM x235 server, the database with tables and indexes carefully distributed on 6 different SCSI disks, in different tablespaces in such a manner to allow parallelizing reads and HDD head movements on different devices when joining those big tables. We have tuned every possible parameter in config file, we have reorganized queries, analyzing explains in order to get the best results for all big queries and we succeeded most of the time. But we have encountered some problems. Due to constant updates and inserts into the database, it's size is growing continuously. Of course we are doing DAILY the needed maintaince, vacuums, analyzes and backups. Due to permanent changes in database size and statistics there are queries that sometimes change their execution plan, badly choosing another plan and executing those queries in 2,3 minutes instead of 10 seconds, the usual execution time since the query plan is "switched". We have done any effort in changing subselects and the query sentence in order to "force" using some indexes, continuously watching the explain results. We have faced yesterday with such a problem with a query that "switched" the query plan to a very bad one, almost putting the whole system down. The only way that we have succeeded to make it work again was by using the "SET ENABLE_MERGE_JOIN to OFF". For the moment it works but in our opinion this is NOT the best approach to guide the planner to a better query-plan variant. Our suggestion would be : extending the EXPLAIN and SELECT commands like that: EXPLAIN VARIANTS SELECT ...... (and so on) that will display the different query plans analyzed by the planner and their "estimated time values" , not just the "best guess" . assuming that the EXPLAIN VARIANTS will show 3 or 4 different query plans, the database manager will be able to experiment, to test, and to decide by himself what is "THE BEST PLAN FOR ME", instead of letting postgresql planner to to that. Doing this, we would be able to clearly specify then in the SELECT statement the "version" of the query-plan that would be used in execution like in the following example: SELECT .... (very big and complex query) ... USING PLAN 3; Specifying the desired plan could be of course, different. I realise that it would be probably better that the query-plan will guess the right and optimal plan. I agree that this can be done be tweaking parameters and costs BUT THIS CAN TAKE A LOT OF TIME, much more than a couple of tests on the real database. An experimented database admin can detect much easier the appropriate plan and "force" the executor to select that one that he desires. In our opinion, this would be the simplest and the most non-intrusive method of "manual choosing" another query plan rather than indirectly setting ON or OFFS various parameters that could affect badly other queries. First of all, it's assumed that the query planner HAS ALREADY evaluated different variants and it decides to use one based upon the statistics informations of the involved tables and "costs" for various types of access. Unfortunately, due to a very difficult adjustment of those costs and timings of the HDD performance, IO transfer speeds, PostgreSQL is choosing sometimes a wrong plan. If we would have the power of choosing and experimenting different plans with "SELECT .... USING PLAN <that-one>" we can select than the right one in our real world. The "... USING PLAN 9" extension to the language I hope that it's the most delicate and innocent :-) that I hope that it can be accepted and it will give a extremely powerful way of controlling the execution performance. Hope that you know the old joke with someone who is receiving an email message : "Hi , I'm the Albanian virus. Due to our poor technologies, I cannot do much so please delete some of your files and pretend to be scared" :-) I admit that this approach might be called "the Albanian way of choosing the best query plan" :-) but you must admit also that it does not place a big burden on the developers, it does not change anything in what have been done since now and it allows the developers and database administrators to dive into the query plan ocean and to get out of there the best of the quickest of the fastest query plan. :-) Hope that this long message have not disturb you so much ... so I'm waiting for your comments and suggestions. Best regards, Constantin Teodorescu Braila, ROMANIA
Constantin Teodorescu wrote: > EXPLAIN VARIANTS SELECT ...... (and so on) that will display the > different query plans analyzed by the planner and their "estimated time > values" , not just the "best guess" . > > assuming that the EXPLAIN VARIANTS will show 3 or 4 different query > plans, the database manager will be able to experiment, to test, and to > decide by himself what is "THE BEST PLAN FOR ME", instead of letting > postgresql planner to to that. Doing this, we would be able to clearly > specify then in the SELECT statement the "version" of the query-plan > that would be used in execution like in the following example: > > SELECT .... (very big and complex query) ... USING PLAN 3; I think something like what Sybase has would be better. A friend told me that in Sybase you can dump a query plan into a meta structure you can handtweak (leaving out non relevant aspects) and force the usage of that handtweaked query plan, where the missing pieces are filled in at run time. It might still be helpful to get a dump of alternative query plans via EXPLAINto get started. I guess generally the idea must be to improve the planner to not even need things like this, but it seems like a very useful fallback for the time between finding a planner limitation and getting the improvement into production. An alternative approach that is already possible today is to handtweak the table stats in order to generate a stable query plan. But this seems like a very indirect way to get something you explicitly know you want. regards, Lukas
Constantin, What binding are you using ? We here use Java+JDBC, and we were able to get stable query plans by forcing server side prepared statements (using PGStatement#setPrepareThreshold with 1 as the threshold), where the query is prepared without knowing the parameter values. This can backfire too, but for our purposes it was the right thing (probably sacrificing some performance, but getting a stable system). The plans in this case are made to work with guessed mean values for the estimates, and that's usually resulting in a stable plan, so once you got it right it will stay like that. Cheers, Csaba. On Mon, 2006-08-07 at 22:02, Constantin Teodorescu wrote: > Hello all, hope you are remembering me, some years ago I've designed the > PgAccess , the Tcl/Tk visual interface to PostgreSQL. > > Thought you haven't received any news from me, I continued working with > PostgreSQL, being involved in very big projects in Romania. > Right now, the national identification of the cows, sheep, goats and > pigs in Romania runs on PostgreSQL on a very big database. > Once again , I had to thank you all for keeping up maintaining and > improving PostgreSQL. > > My message to all of you is related to this big project (a government > sustained project) and some performance issues. > > Very few words about the database: approx. 60 tables, 30 of them > containing 10 millions to 50 millions records , the whole database is > approx 40 Gb size ! > > In order to get a good performance, the database is operated on a dual > XEON with 6 Gb RAM IBM x235 server, the database with tables and indexes > carefully distributed on 6 different SCSI disks, in different > tablespaces in such a manner to allow parallelizing reads and HDD head > movements on different devices when joining those big tables. > > We have tuned every possible parameter in config file, we have > reorganized queries, analyzing explains in order to get the best results > for all big queries and we succeeded most of the time. > But we have encountered some problems. Due to constant updates and > inserts into the database, it's size is growing continuously. > Of course we are doing DAILY the needed maintaince, vacuums, analyzes > and backups. > Due to permanent changes in database size and statistics there are > queries that sometimes change their execution plan, badly choosing > another plan and executing those queries in 2,3 minutes instead of 10 > seconds, the usual execution time since the query plan is "switched". We > have done any effort in changing subselects and the query sentence in > order to "force" using some indexes, continuously watching the explain > results. > > We have faced yesterday with such a problem with a query that "switched" > the query plan to a very bad one, almost putting the whole system down. > The only way that we have succeeded to make it work again was by using > the "SET ENABLE_MERGE_JOIN to OFF". > For the moment it works but in our opinion this is NOT the best approach > to guide the planner to a better query-plan variant. > > Our suggestion would be : extending the EXPLAIN and SELECT commands like > that: > > EXPLAIN VARIANTS SELECT ...... (and so on) that will display the > different query plans analyzed by the planner and their "estimated time > values" , not just the "best guess" . > > assuming that the EXPLAIN VARIANTS will show 3 or 4 different query > plans, the database manager will be able to experiment, to test, and to > decide by himself what is "THE BEST PLAN FOR ME", instead of letting > postgresql planner to to that. Doing this, we would be able to clearly > specify then in the SELECT statement the "version" of the query-plan > that would be used in execution like in the following example: > > SELECT .... (very big and complex query) ... USING PLAN 3; > > Specifying the desired plan could be of course, different. > I realise that it would be probably better that the query-plan will > guess the right and optimal plan. I agree that this can be done be > tweaking parameters and costs BUT THIS CAN TAKE A LOT OF TIME, much more > than a couple of tests on the real database. An experimented database > admin can detect much easier the appropriate plan and "force" the > executor to select that one that he desires. > > In our opinion, this would be the simplest and the most non-intrusive > method of "manual choosing" another query plan rather than indirectly > setting ON or OFFS various parameters that could affect badly other > queries. > First of all, it's assumed that the query planner HAS ALREADY evaluated > different variants and it decides to use one based upon the statistics > informations of the involved tables and "costs" for various types of > access. > Unfortunately, due to a very difficult adjustment of those costs and > timings of the HDD performance, IO transfer speeds, PostgreSQL is > choosing sometimes a wrong plan. > If we would have the power of choosing and experimenting different plans > with "SELECT .... USING PLAN <that-one>" we can select than the right > one in our real world. > > The "... USING PLAN 9" extension to the language I hope that it's the > most delicate and innocent :-) that I hope that it can be accepted and > it will give a extremely powerful way of controlling the execution > performance. > > Hope that you know the old joke with someone who is receiving an email > message : "Hi , I'm the Albanian virus. Due to our poor technologies, I > cannot do much so please delete some of your files and pretend to be > scared" :-) > I admit that this approach might be called "the Albanian way of choosing > the best query plan" :-) but you must admit also that it does not place > a big burden on the developers, it does not change anything in what have > been done since now and it allows the developers and database > administrators to dive into the query plan ocean and to get out of there > the best of the quickest of the fastest query plan. :-) > > Hope that this long message have not disturb you so much ... so I'm > waiting for your comments and suggestions. > > Best regards, > Constantin Teodorescu > Braila, ROMANIA > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings
On Tue, 2006-08-08 at 12:36, Constantin Teodorescu wrote: > We have tried PGStatement#setPrepareThreshold with 1 as the threshold > but it's not a good solution. > Actually is worst. Considering that you have 5 different query plans, > you are selecting approx. random one of them, not taking into account > the statistics. Wrong, you'll select _the same_ plan, that's what matters. If it's not the plan you wanted, you have to rewrite the query, and try again, but once you got the plan you wanted, it's pretty much you'll get always the same plan. So you only need to test as long as you get the right query to trigger the right plan... but of course this requires that your queries are so constructed to always be OK with that plan, regardless the parameter values. Usually this means a suboptimal plan, but stable execution times. If you need to give hints to the DB based on the parameter values and choose different plans for different parameter values, then you basically do the job of the planner in your application, and I guess sooner or later you'll make wrong choices too. Some hinting mechanism would be good for cases where the developer really know better how the data is laid out (e.g. forcing the use of a specific access method for one table in a complex join), but that forcing a complete plan is probably not good. Even the hinting is only a workaround for the planner fixes which will cannot make it to the stable version... On the daydreaming part, how about a 2 phase planner ? Modus operandi: Phase 1: compile and cache plan decision tree: - collect all reasonable plans without taking into account the parameter values; - check the parameter bounds where each plan is the fastest; - compile a decision tree which based on theparameter values chooses one plan or the other; - cache this plan decision tree; - there's no need to cache plans which will always loose to someother plan no matter what parameter values you give (to limit the size of the decision tree); Phase 2: run the decision tree to chose the best cached plan for the parameter values; You could use variables coming from the statistics system in the decision tree so it doesn't have to be recalculated too often on statistics changes. With a system like this, you could at system startup make the decision tree for all your frequently used queries and have fast planning at runtime which is optimized for the parameter values (takes the decision tree from the cache, runs it with the current parameters). Or just store the whole thing in a system table... or tweak the decision tree manually... This is actually not addressing the plan stability issue, but if manual tweaking would be allowed, it would... Cheers, Csaba.
Csaba Nagy wrote: > Constantin, > > What binding are you using ? We here use Java+JDBC, and we were able to > get stable query plans by forcing server side prepared statements (using > PGStatement#setPrepareThreshold with 1 as the threshold), where the > query is prepared without knowing the parameter values. This can > backfire too, but for our purposes it was the right thing (probably > sacrificing some performance, but getting a stable system). The plans in > this case are made to work with guessed mean values for the estimates, > and that's usually resulting in a stable plan, so once you got it right > it will stay like that. > We have tried PGStatement#setPrepareThreshold with 1 as the threshold but it's not a good solution. Actually is worst. Considering that you have 5 different query plans, you are selecting approx. random one of them, not taking into account the statistics. The situation is simpler than it's at the first view. Guessing what is the best plan, based on statistics and costs, IS NOT A EASY THING TO DO. Tweaking costs and statistics CAN TAKE A VERY LONG TIME and need strong knowledge about database architecture, hardware performances and many other things. Not every average user of PostgreSQL can do that! Experimenting the first 3 or 4 query plans in the descending order of their estimated cost, IS SIMPLER and it can take less than an hour and can be done by less experimented people. Choosing the "proved" better query plan IS SIMPLER and that means "PERFORMANCE EVEN FOR THE AVERAGE USER". We are talking about open-source, free-source and the freedom of choice, isn't it? So, why not give the user the freedom of choosing a different query plan that will give a better performances. Maybe I'm not interested in developing WHY the query planner is choosing wrong. Of course , the developers will enhance it but until then, let's give the user the power of manually selecting the right query plan. The final result may be something like that : "I heard that PostgreSQL has a very handy tool that gives you a better performance in queries. It gives you the ability to make fine adjustments". Sound good, isn't it ? :-) Teo
On Mon, Aug 07, 2006 at 10:47:39PM +0200, Lukas Smith wrote: > Constantin Teodorescu wrote: > > >EXPLAIN VARIANTS SELECT ...... (and so on) that will display the > >different query plans analyzed by the planner and their "estimated time > >values" , not just the "best guess" . > > > >assuming that the EXPLAIN VARIANTS will show 3 or 4 different query > >plans, the database manager will be able to experiment, to test, and to > >decide by himself what is "THE BEST PLAN FOR ME", instead of letting > >postgresql planner to to that. Doing this, we would be able to clearly > >specify then in the SELECT statement the "version" of the query-plan > >that would be used in execution like in the following example: > > > >SELECT .... (very big and complex query) ... USING PLAN 3; > > I think something like what Sybase has would be better. A friend told me > that in Sybase you can dump a query plan into a meta structure you can > handtweak (leaving out non relevant aspects) and force the usage of that > handtweaked query plan, where the missing pieces are filled in at run > time. It might still be helpful to get a dump of alternative > query plans via EXPLAIN to get > started. Another option would be adding planner hints. > I guess generally the idea must be to improve the planner to not even > need things like this, but it seems like a very useful fallback for the > time between finding a planner limitation and getting the improvement > into production. Yeah, this comes up at least twice a year and every time the answer is "We won't add planner hints because we should just fix the planner". Fact is, all of the 'big 3' have planner hints, even though they've invested litterally millions (if not billions) of dollars in their planners. If they can't get it right I think it's hubris for us to think we'll magically be able to, especially when there are known deficiencies in our statistics system. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Csaba Nagy wrote: > On Tue, 2006-08-08 at 12:36, Constantin Teodorescu wrote: >> We have tried PGStatement#setPrepareThreshold with 1 as the threshold >> but it's not a good solution. >> Actually is worst. Considering that you have 5 different query plans, >> you are selecting approx. random one of them, not taking into account >> the statistics. > > Wrong, you'll select _the same_ plan, that's what matters. If it's not > the plan you wanted, you have to rewrite the query, and try again, but > once you got the plan you wanted, it's pretty much you'll get always the > same plan. So you only need to test as long as you get the right query > to trigger the right plan... but of course this requires that your > queries are so constructed to always be OK with that plan, regardless > the parameter values. Usually this means a suboptimal plan, but stable > execution times. Well it should usually be possible to find a query that gives a stable query plan. However in some cases stable query plan means varying performance which is also not ideal. So you have to actually find a query that will give you stable performance (which often means finding a query that is a good compromise and that producses a stable plan). But if you have changing data, very different selectivity for values etc this can become very hard, maybe even impossible. For these kinds of queries it might just be easier to put in the effort to specify (parts of) the query plan explicitly. Especially as an interim solution until a new stable release comes around that fixes the underlying planner issue (which will usually be atleast 6-12 months). regards, Lukas