Обсуждение: Re: [PERFORM] Hints proposal
[ trying once again to push this thread over to -hackers where it belongs ] Arjen van der Meijden <acmmailing@tweakers.net> writes: > On 12-10-2006 21:07 Jeff Davis wrote: >> On Thu, 2006-10-12 at 19:15 +0200, Csaba Nagy wrote: >> To formalize the proposal a litte, you could have syntax like: >> CREATE HINT [FOR USER username] MATCHES regex APPLY HINT some_hint; >> >> Where "some_hint" would be a hinting language perhaps like Jim's, except >> not guaranteed to be compatible between versions of PostgreSQL. The >> developers could change the hinting language at every release and people >> can just re-write the hints without changing their application. Do you have any idea how much push-back there would be to that? In practice we'd be bound by backwards-compatibility concerns for the hints too. > There are some disadvantages of not writing the hints in a query. But of > course there are disadvantages to do as well ;) > One I can think of is that it can be very hard to define which hint > should apply where. Especially in complex queries, defining at which > point exaclty you'd like your hint to work is not a simple matter, > unless you can just place a comment right at that position. The problems that you are seeing all come from the insistence that a hint should be textually associated with a query. Using a regex is a little better than putting it right into the query, but the only thing that really fixes is not having the hints directly embedded into client-side code. It's still wrong at the conceptual level. The right way to think about it is to ask why is the planner not picking the right plan to start with --- is it missing a statistical correlation, or are its cost parameters wrong for a specific case, or is it perhaps unable to generate the desired plan at all? (If the latter, no amount of hinting is going to help.) If it's a statistics or costing problem, I think the right thing is to try to fix it with hints at that level. You're much more likely to fix the behavior across a class of queries than you will be with a hint textually matched to a specific query. regards, tom lane
On Thu, 2006-10-12 at 17:28 -0400, Tom Lane wrote: > [ trying once again to push this thread over to -hackers where it belongs ] > > Arjen van der Meijden <acmmailing@tweakers.net> writes: > > On 12-10-2006 21:07 Jeff Davis wrote: > >> On Thu, 2006-10-12 at 19:15 +0200, Csaba Nagy wrote: > >> To formalize the proposal a litte, you could have syntax like: > >> CREATE HINT [FOR USER username] MATCHES regex APPLY HINT some_hint; > >> > >> Where "some_hint" would be a hinting language perhaps like Jim's, except > >> not guaranteed to be compatible between versions of PostgreSQL. The > >> developers could change the hinting language at every release and people > >> can just re-write the hints without changing their application. > > Do you have any idea how much push-back there would be to that? In > practice we'd be bound by backwards-compatibility concerns for the hints > too. > No, I don't have any idea, except that it would be less push-back than changing a language that's embedded in client code. Also, I see no reason to think that a hint would not be obsolete upon a new release anyway. > The problems that you are seeing all come from the insistence that a > hint should be textually associated with a query. Using a regex is a > little better than putting it right into the query, but the only thing "Little better" is all I was going for. I was just making the observation that we can separate two concepts: (1) Embedding code in the client's queries, which I see as very undesirable and unnecessary (2) Providing very specific hints which at least gives us a place to talk about the debate more reasonably. > that really fixes is not having the hints directly embedded into > client-side code. It's still wrong at the conceptual level. > I won't disagree with that. I will just say it's no more wrong than applying the same concept in addition to embedding the hints in client queries. > The right way to think about it is to ask why is the planner not picking > the right plan to start with --- is it missing a statistical > correlation, or are its cost parameters wrong for a specific case, or > is it perhaps unable to generate the desired plan at all? (If the > latter, no amount of hinting is going to help.) If it's a statistics or > costing problem, I think the right thing is to try to fix it with hints > at that level. You're much more likely to fix the behavior across a > class of queries than you will be with a hint textually matched to a > specific query. > Agreed. Regards, Jeff Davis
Quoth pgsql@j-davis.com (Jeff Davis): > On Thu, 2006-10-12 at 17:28 -0400, Tom Lane wrote: >> [ trying once again to push this thread over to -hackers where it belongs ] >> >> Arjen van der Meijden <acmmailing@tweakers.net> writes: >> > On 12-10-2006 21:07 Jeff Davis wrote: >> >> On Thu, 2006-10-12 at 19:15 +0200, Csaba Nagy wrote: >> >> To formalize the proposal a litte, you could have syntax like: >> >> CREATE HINT [FOR USER username] MATCHES regex APPLY HINT some_hint; >> >> >> >> Where "some_hint" would be a hinting language perhaps like >> >> Jim's, except not guaranteed to be compatible between versions >> >> of PostgreSQL. The developers could change the hinting language >> >> at every release and people can just re-write the hints without >> >> changing their application. >> >> Do you have any idea how much push-back there would be to that? In >> practice we'd be bound by backwards-compatibility concerns for the >> hints too. > > No, I don't have any idea, except that it would be less push-back > than changing a language that's embedded in client code. Also, I see > no reason to think that a hint would not be obsolete upon a new > release anyway. I see *plenty* of reason. 1. Suppose the scenario where Hint h was useful hasn't been affected by *any* changes in how the query planner works inthe new version, it *obviously* continues to be necessary. 2. If Version n+0.1 hasn't resolved all/most cases where Hint h was useful in Version n, then people will entirely reasonablyexpect for Hint h to continue to be in effect in version n+0.1 3. Suppose support for Hint h is introduced in PostgreSQL version n, and an optimization that makes it obsolete does notarrive until version n+0.3, which is quite possible. That hint has been carried forward for 2 versions already, longenough for client code that contains it to start to ossify. (After all, if developers get promoted to new projectsevery couple of years, two versions is plenty of time for the original programmer to be gone...) That's not just one good reason, but three. >> The problems that you are seeing all come from the insistence that a >> hint should be textually associated with a query. Using a regex is a >> little better than putting it right into the query, but the only thing > > "Little better" is all I was going for. I was just making the > observation that we can separate two concepts: > (1) Embedding code in the client's queries, which I see as very > undesirable and unnecessary > (2) Providing very specific hints > > which at least gives us a place to talk about the debate more > reasonably. It seems to me that there is a *LOT* of merit in trying to find alternatives to embedding code into client queries, to be sure. >> that really fixes is not having the hints directly embedded into >> client-side code. It's still wrong at the conceptual level. > > I won't disagree with that. I will just say it's no more wrong than > applying the same concept in addition to embedding the hints in client > queries. > >> The right way to think about it is to ask why is the planner not >> picking the right plan to start with --- is it missing a >> statistical correlation, or are its cost parameters wrong for a >> specific case, or is it perhaps unable to generate the desired plan >> at all? (If the latter, no amount of hinting is going to help.) >> If it's a statistics or costing problem, I think the right thing is >> to try to fix it with hints at that level. You're much more likely >> to fix the behavior across a class of queries than you will be with >> a hint textually matched to a specific query. > > Agreed. That's definitely a useful way to look at the issue, which seems to be lacking in many of the cries for hints. Perhaps I'm being unfair, but it often seems that people demanding hinting systems are uninterested in why the planner is getting things wrong. Yes, they have an immediate problem (namely the wrong plan that is getting generated) that they want to resolve. But I'm not sure that you can get anything out of hinting without coming close to answering "why the planner got it wrong." -- "cbbrowne","@","gmail.com" http://linuxfinances.info/info/lsf.html "Optimization hinders evolution." -- Alan Perlis
On Thu, 2006-10-12 at 23:12 -0400, Christopher Browne wrote: > > No, I don't have any idea, except that it would be less push-back > > than changing a language that's embedded in client code. Also, I see > > no reason to think that a hint would not be obsolete upon a new > > release anyway. > > I see *plenty* of reason. > > 1. Suppose the scenario where Hint h was useful hasn't been affected > by *any* changes in how the query planner works in the new > version, it *obviously* continues to be necessary. > > 2. If Version n+0.1 hasn't resolved all/most cases where Hint h was > useful in Version n, then people will entirely reasonably expect > for Hint h to continue to be in effect in version n+0.1 > Fair enough. I had considered those situations, but a lot of people are talking about "I need a better plan now, can't wait for planner improvements". Also, even if the hint is still useful, I would think that on a new version you'd want to test to see how useful it still is. > 3. Suppose support for Hint h is introduced in PostgreSQL version > n, and an optimization that makes it obsolete does not arrive > until version n+0.3, which is quite possible. That hint has been > carried forward for 2 versions already, long enough for client > code that contains it to start to ossify. (After all, if > developers get promoted to new projects every couple of years, > two versions is plenty of time for the original programmer to > be gone...) Ok, that is a good reason. But it's not helped at all by putting the hints in the queries themselves. > > "Little better" is all I was going for. I was just making the > > observation that we can separate two concepts: > > (1) Embedding code in the client's queries, which I see as very > > undesirable and unnecessary > > (2) Providing very specific hints > > > > which at least gives us a place to talk about the debate more > > reasonably. > > It seems to me that there is a *LOT* of merit in trying to find > alternatives to embedding code into client queries, to be sure. > I think almost any alternative to client query hints is worth considering. > >> The right way to think about it is to ask why is the planner not > >> picking the right plan to start with --- is it missing a > >> statistical correlation, or are its cost parameters wrong for a > >> specific case, or is it perhaps unable to generate the desired plan > >> at all? (If the latter, no amount of hinting is going to help.) > >> If it's a statistics or costing problem, I think the right thing is > >> to try to fix it with hints at that level. You're much more likely > >> to fix the behavior across a class of queries than you will be with > >> a hint textually matched to a specific query. > > > > Agreed. > > That's definitely a useful way to look at the issue, which seems to be > lacking in many of the cries for hints. > > Perhaps I'm being unfair, but it often seems that people demanding > hinting systems are uninterested in why the planner is getting things > wrong. Yes, they have an immediate problem (namely the wrong plan > that is getting generated) that they want to resolve. > > But I'm not sure that you can get anything out of hinting without > coming close to answering "why the planner got it wrong." Right. And it's not always easy to determine why the planner got it wrong without making it execute other plans through hinting :) Note: I'll restate this just to be clear. I'm not advocating an overly- specific, band-aid style hinting language. My only real concern is that if one appears, I would not like it to appear in the client's queries. Same goes for more general kinds of hints. We don't want a bunch of client queries to contain comments like "table foo has a random_page_cost of 1.1". That belongs in the system catalogs. Regards,Jeff Davis