Re: An Idea for planner hints
От | Arturo Pérez |
---|---|
Тема | Re: An Idea for planner hints |
Дата | |
Msg-id | 32EA4E1D-7005-4ED4-8D49-C7A172EFF3F5@ethicist.net обсуждение исходный текст |
Ответ на | Re: An Idea for planner hints ("Jim C. Nasby" <jnasby@pervasive.com>) |
Ответы |
Re: An Idea for planner hints
(Peter Eisentraut <peter_e@gmx.net>)
|
Список | pgsql-hackers |
On Aug 15, 2006, at 10:40 AM, Jim C. Nasby wrote: > On Mon, Aug 14, 2006 at 11:41:29PM +0300, Hannu Krosing wrote: >> ??hel kenal p??eval, E, 2006-08-14 kell 18:21, kirjutas Peter >> Eisentraut: >>> Perez wrote: >>>> I thought, from watching the list for a while, that the planner >>>> statistics needed were known but that how to gather the statistics >>>> was not? >>> >>> I think over the course of the discussion we have figured out >>> that we >>> would like to have cross-column correlation statistics. The precise >>> mathematical incarnation hasn't been determined yet, as far as I can >>> see. Collecting the statistics thereafter isn't that hard, but >>> there >>> needs to be a way to not collect an exponential volume of >>> statistics on >>> all column combinations. >> >> I understood that the proposal was to collect only the stats where >> needed (determined by user/dba) and use some rule-of-thumb values >> if no >> collected stats were available. > > Yeah, unless someone comes up with some kind of 'magic', I think > trying > to handle every cross-column possibility is a non-starter. IIRC, that > argument is what's stalled cross-column stats every time in the > past. It > makes a lot more sense to allow defining what combinations of > columns we > need stats for. > > After that's done, it'd be easy to then write a script that will tell > the database to collect stats on all multi-column indexes, RI, etc. > Down > the road, the planner could even be made to log (in a machine-readable > format) every time it needs cross-column stats, and that data could be > used to add stats that are needed. If we're talking about my random neuron firing then I think the responses have gone off a bit. My thought was to just tell the planner the statistics that are of interest. An example of what I'm thinking would probably be helpful. Let's say that the DBA knows, through whatever means at his/her disposal (heck! the magic you mention) that column a & column b have some sort of correlation that the planner can't determine on its own but can use if it had it. The DBA therefore pokes the right information into the planner's statistical tables (or, perhaps, a more human- manageable one that gets "compiled" into the planner's stats). For this to work we'd have to 1. Define the types of statistics that the planner could use in its planning that it cannot currently (or ever) collect itself. Cross-column correlations, suitable join selectivity, anything that would be useful to the planner. 2. Create a table or other data structure to contain this planner information. Modify the planner to use this information. 3. Document what these stats are, and the influence they have in a format suitable for use by DBAs, and how to add the stats to the above table. Mere mortals can tinker with this feature at their own peril :-) Now, when a DBA has information that could steer the planner in the right direction he/she has a mechanism to do so that does not involve hinting the specific query. My hope would be that this information wouldn't go stale as fast as a query hint would. Furthermore, the DBA can improve an application's performance without having to go into every query it executes. The planner would look in that table and say Ah! there's information in here that says that when a is joined to be it's going to eliminate 90% of my I/O. Seems to me that such a feature would be a cool knob and address most/ all of the need for query hints. One other possibility for the above information would be just to have a place for the planner to save information for itself when it finds a plan to be either horribly over-optimistic or pessimistic. Hope this blathering makes some kind of sense... -arturo
В списке pgsql-hackers по дате отправления: