Discussion: Restructuring plancache.c API
I've been thinking about supporting automatic replan of cached plans using specific parameter values, as has been discussed several times, at greatest length in this thread:
http://archives.postgresql.org/pgsql-hackers/2010-02/msg00607.php
There doesn't seem to be full consensus about what the control method ought to be, but right at the moment I'm thinking about mechanism not policy. I think that what we need to do is restructure the API of plancache.c to make it more amenable to returning "throwaway" plans. It can already do that to some extent using the fully_planned = false code path, but that's not the design center and it was shoehorned in in perhaps a less than clean fashion. I want to rearrange it so there's an explicit notion of three levels of cacheable object:

1. Raw parse tree + source string. These obviously never change.

2. The result tree of parsing and rewriting (ie, the output of pg_analyze_and_rewrite applied to level 1). This can change, but only as a result of schema changes on the tables and other objects referenced in the query. We already have entirely adequate mechanisms for recognizing when this has to be rebuilt.

3. The finished plan (ie, the output of pg_plan_queries applied to level 2). This might be either cached for reuse, or a throwaway object, depending on the control mechanism's decisions.

I think we could get rid of the fully_planned switch and instead design the API around caching levels 1 and 2. Then there's a GetCachedPlan function (replacing RevalidateCachedPlan) that returns a finished plan, but it's unspecified whether you get a persistent cached plan or a throwaway one. The control mechanism would execute inside this function. We'd still have ReleaseCachedPlan, which would take care of throwing away the plan if it's throwaway.

Right now the API is structured so that the initial creator of a cacheable plan has to build levels 2 and 3 first, and the plancache.c code just copies that data into persistent storage. I'm thinking that might have been a mistake. Maybe we should just have the caller hand over the data for level 1, with parse analysis + rewrite done solely internally within plancache.c. The level-2 data wouldn't be exposed outside plancache.c at all.

With this focus, the name "plancache" becomes a little bit of a misnomer, but I am inclined to stick with it because a better name isn't apparent. "rewritecache" isn't an improvement really.

Comments?

regards, tom lane
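
To make the proposed shape concrete, here is a minimal toy sketch in C of the three levels and of a get/release pair. It is not the plancache.c code: every type and function name (ToyPlan, ToyPlanSource, toy_get_cached_plan, and so on) is hypothetical, plain strings stand in for parse trees and plans, and the want_throwaway flag is only a placeholder for whatever control mechanism is eventually chosen.

```c
/* Toy model only: strings stand in for parse trees and plans. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyPlan
{
    char *text;      /* stand-in for the finished plan (level 3) */
    bool  is_saved;  /* true: kept by the cache; false: throwaway */
    int   refcount;  /* references held by callers and by the cache */
} ToyPlan;

typedef struct ToyPlanSource
{
    char    *query_string;  /* level 1: never changes */
    char    *rewritten;     /* level 2: rebuilt after invalidation */
    bool     is_valid;      /* false until level 2 is (re)built */
    ToyPlan *saved_plan;    /* cached level-3 plan, or NULL */
    int      rewrite_count; /* how often level 2 was built */
} ToyPlanSource;

/* Return a malloc'd string of the form "prefix(inner)". */
static char *
toy_wrap(const char *prefix, const char *inner)
{
    size_t len = strlen(prefix) + strlen(inner) + 3; /* '(' + ')' + NUL */
    char  *out = malloc(len);

    assert(out != NULL);
    snprintf(out, len, "%s(%s)", prefix, inner);
    return out;
}

/* The caller hands over level 1 only; levels 2 and 3 are built on demand. */
ToyPlanSource *
toy_create_source(const char *query_string)
{
    ToyPlanSource *src = calloc(1, sizeof(ToyPlanSource));

    assert(src != NULL);
    src->query_string = malloc(strlen(query_string) + 1);
    assert(src->query_string != NULL);
    strcpy(src->query_string, query_string);
    return src;
}

/* Drop one reference; the plan is freed when nobody uses it any more. */
void
toy_release_cached_plan(ToyPlan *plan)
{
    assert(plan->refcount > 0);
    if (--plan->refcount == 0)
    {
        free(plan->text);
        free(plan);
    }
}

/* Simulate a schema change: level 2 and any saved plan become stale. */
void
toy_invalidate(ToyPlanSource *src)
{
    src->is_valid = false;
    if (src->saved_plan != NULL)
    {
        toy_release_cached_plan(src->saved_plan); /* the cache's reference */
        src->saved_plan = NULL;
    }
}

/* Return a finished plan; it may be the saved one or a throwaway. */
ToyPlan *
toy_get_cached_plan(ToyPlanSource *src, bool want_throwaway)
{
    ToyPlan *plan;

    if (!src->is_valid)
    {
        free(src->rewritten);
        src->rewritten = toy_wrap("rewritten", src->query_string);
        src->rewrite_count++;
        src->is_valid = true;
    }
    if (want_throwaway || src->saved_plan == NULL)
    {
        plan = malloc(sizeof(ToyPlan));
        assert(plan != NULL);
        plan->text = toy_wrap("plan", src->rewritten);
        plan->is_saved = !want_throwaway;
        plan->refcount = plan->is_saved ? 1 : 0; /* cache keeps one reference */
        if (plan->is_saved)
            src->saved_plan = plan;
    }
    else
        plan = src->saved_plan;
    plan->refcount++;                            /* the caller's reference */
    return plan;
}
```

In this sketch the caller always pairs toy_get_cached_plan with toy_release_cached_plan and never needs to know which kind of plan it received: a throwaway plan is freed on release, while a saved plan survives until the source is invalidated and the last user lets go of it.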
On 2010-11-11 23:21, Tom Lane wrote:
> I've been thinking about supporting automatic replan of cached plans
> using specific parameter values, as has been discussed several times,
> at greatest length in this thread:
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg00607.php
..
> I want to rearrange it so there's
> an explicit notion of three levels of cacheable object:
>
> 1. Raw parse tree + source string. These obviously never change.

In the context of cached plans and specific parameter values, an idea for the future might be to also consider a cached plan when planning simple queries. A way to do this is by regarding all constants in a simple query as parameters, and looking for a cached plan for that parameterized query. To lower the chance of choosing a bad plan for the actual parameter values, a cached plan could also store the actual parameter values used during planning. (Where planning was done with constants, not parameters, this would require substituting the actual values back in as constants in the parameterized query.) Based on an exact match on the raw parse tree of the parameterized source tree, and on the neighbourhood of the actual parameter values of the cached and current query, a plan could be chosen or not. If replanning was chosen, this new plan could also be stored as a new cached plan of the same query but with different parameter values.

It would require one more level in the plan cache:

1. raw parse tree of the parameterized query
2. one or more "source string + actual parameter values" entries (these were the replaced constants)

then, for each entry in level 2, the remaining levels.

regards,
Yeb Havinga
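
The lookup described above can be sketched roughly as follows. This is a toy illustration under several assumptions that are not in the message: it works on the query string instead of a raw parse tree, it only recognizes non-negative integer literals, and "neighbourhood" is taken to mean that every value lies within a caller-supplied max_distance of the value used for the cached plan. All names (ToyParamQuery, toy_parameterize, toy_can_reuse_plan) are hypothetical.

```c
/* Toy model only: string scanning stands in for parse-tree matching. */
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_PARAMS 8

typedef struct ToyParamQuery
{
    char text[256];              /* query with constants replaced by $1, $2, ... */
    long values[TOY_MAX_PARAMS]; /* the constants that were replaced */
    int  nvalues;
} ToyParamQuery;

/*
 * Replace each integer literal in "query" with a numbered placeholder and
 * remember its value. Returns false if the query does not fit the toy limits.
 */
bool
toy_parameterize(const char *query, ToyParamQuery *out)
{
    size_t pos = 0;
    bool   in_word = false;      /* inside an identifier such as "t1" */

    assert(query != NULL && out != NULL);
    out->nvalues = 0;
    while (*query != '\0')
    {
        if (isdigit((unsigned char) *query) && !in_word)
        {
            char *end;
            int   written;

            if (out->nvalues == TOY_MAX_PARAMS)
                return false;
            out->values[out->nvalues++] = strtol(query, &end, 10);
            written = snprintf(out->text + pos, sizeof(out->text) - pos,
                               "$%d", out->nvalues);
            if (written < 0 || (size_t) written >= sizeof(out->text) - pos)
                return false;
            pos += (size_t) written;
            query = end;         /* skip past the literal */
            continue;
        }
        if (pos + 1 >= sizeof(out->text))
            return false;
        in_word = isalnum((unsigned char) *query) || *query == '_';
        out->text[pos++] = *query++;
    }
    out->text[pos] = '\0';
    return true;
}

/*
 * Reuse the cached plan only if the parameterized text matches exactly and
 * every current value is within max_distance of the cached value.
 */
bool
toy_can_reuse_plan(const ToyParamQuery *cached, const ToyParamQuery *current,
                   long max_distance)
{
    if (strcmp(cached->text, current->text) != 0 ||
        cached->nvalues != current->nvalues)
        return false;
    for (int i = 0; i < cached->nvalues; i++)
    {
        /* Toy code: ignores overflow for extreme values. */
        if (labs(cached->values[i] - current->values[i]) > max_distance)
            return false;
    }
    return true;
}
```

A fixed numeric distance is only the simplest stand-in for "neighbourhood"; a real criterion would more plausibly depend on column statistics, which is beyond what this sketch tries to show.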
Excerpts from Tom Lane's message of Thu Nov 11 19:21:34 -0300 2010:
> I've been thinking about supporting automatic replan of cached plans
> using specific parameter values, as has been discussed several times,
> at greatest length in this thread:
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg00607.php
> There doesn't seem to be full consensus about what the control method
> ought to be, but right at the moment I'm thinking about mechanism not
> policy. I think that what we need to do is restructure the API of
> plancache.c to make it more amenable to returning "throwaway" plans.
> It can already do that to some extent using the fully_planned = false
> code path, but that's not the design center and it was shoehorned in
> in perhaps a less than clean fashion. I want to rearrange it so there's
> an explicit notion of three levels of cacheable object:

I was wondering if this could help with the separation of labour of functions in postgres.c that we were talking about a couple of weeks ago. The main impedance mismatch, so to speak, is that those functions aren't at all related to caching of any sort; but then, since you're looking for a new name for the source file, I return to my earlier suggestion of a generic "queries.c" or some such, which could handle all these issues. (Of course, querycache.c doesn't make any sense.)

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Tom Lane's message of Thu Nov 11 19:21:34 -0300 2010:
>> I think that what we need to do is restructure the API of
>> plancache.c to make it more amenable to returning "throwaway" plans.

> I was wondering if this could help with the separation of labour of
> functions in postgres.c that we were talking about a couple of weeks
> ago.

Yeah, it was in the back of my mind that this patch might create some merge conflicts for that one, but I figured we could deal with that when the time came. I wasn't intending to refactor the behavior of pg_analyze_and_rewrite or pg_plan_queries, just change where they might get called from, so I think any conflict will be inessential and easily resolved.

> The main impedance mismatch, so to speak, is that those functions
> aren't at all related to caching of any sort; but then, since you're
> looking for a new name for the source file, I return to my earlier
> suggestion of a generic "queries.c" or some such, which could handle all
> these issues. (Of course, querycache.c doesn't make any sense.)

I thought about querycache.c too, but it seems to carry the wrong connotations --- in mysql-land I believe they use that term to imply caching a query's *results*. But queries.c seems so generic as to convey no information at all.

regards, tom lane