Discussion: Re: [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

Andrew Dunstan <andrew@dunslane.net> writes:
> On 08/06/2015 03:36 PM, Tom Lane wrote:
>> Further fixes for degenerate outer join clauses.

> Looks like this might have upset brolga on 9.0 and 9.1 - it's coming up
> with a different plan from what's expected.

I looked into this, and while I can't be certain of the diagnosis without
looking at the assembly code brolga generates (no, I don't want to), I'm
pretty sure I know what's going on here. The two plans in question have
exactly the same estimated costs on my machine, which results in add_path
discarding the second one to arrive. Evidently, on brolga the second path
gets some fractionally cheaper estimate due to some compiler-specific
variation in the way the arithmetic is done, so it gets kept rather than
the first one.

In 9.2 and up we got rid of that class of problems with this patch:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_2_BR [33e99153e] 2012-04-21 00:51:14 -0400

    Use fuzzy not exact cost comparison for the final tie-breaker in add_path.

    Instead of an exact cost comparison, use a fuzzy comparison with 1e-10
    delta after all other path metrics have proved equal. This is to avoid
    having platform-specific roundoff behaviors determine the choice when
    two paths are really the same to our cost estimators. Adjust the
    recently-added test case that made it obvious we had a problem here.

but 9.0 and 9.1 still have platform-dependent treatment of paths with
essentially equal cost estimates.

I'm not entirely sure what to do about this. We could back-patch that
patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
planner behavioral changes. The only other idea that seems practical is
to remove regression test cases that have platform-specific results in
those branches. Probably that wouldn't result in a real reduction in the
quality of the test coverage for those branches (we could still execute
the query, just not EXPLAIN it). But it seems like a pretty ad-hoc
answer, and the next case might be one that hurts more not to test.

Thoughts?

			regards, tom lane

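The tie-breaker problem described in this message can be illustrated with a minimal Python sketch. This is not PostgreSQL's C code: the names `fuzzy_cost_cmp` and `choose_path` are hypothetical, and applying the 1e-10 delta as a relative tolerance is an assumption made for the illustration. The two cost values are the same sum evaluated in two different orders, standing in for the compiler-specific arithmetic variation mentioned above.

```python
FUZZ = 1e-10  # delta quoted in the commit message; relative use is an assumption


def fuzzy_cost_cmp(cost1, cost2, fuzz=FUZZ):
    """Return +1 if cost1 is meaningfully higher, -1 if meaningfully lower, 0 for a tie."""
    if cost1 > cost2 * (1.0 + fuzz):
        return 1
    if cost2 > cost1 * (1.0 + fuzz):
        return -1
    return 0


def choose_path(old_cost, new_cost, exact):
    """Decide which path survives when every other metric is already equal.

    The old (first-arrived) path is kept unless the new one is cheaper.
    """
    if exact:
        return "new" if new_cost < old_cost else "old"
    return "new" if fuzzy_cost_cmp(new_cost, old_cost) < 0 else "old"


# The same three cost components summed in two different orders.
cost_first = (0.1 + 0.2) + 0.3   # rounds slightly above 0.6
cost_second = 0.1 + (0.2 + 0.3)  # rounds to 0.6

print(choose_path(cost_first, cost_second, exact=True))   # roundoff decides
print(choose_path(cost_first, cost_second, exact=False))  # treated as a tie
```

With the exact comparison the second path wins purely because of roundoff; with the fuzzy comparison the two costs count as equal and the first path to arrive is kept, whichever way the arithmetic happened to round.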
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I'm not entirely sure what to do about this. We could back-patch that
> patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
> planner behavioral changes. The only other idea that seems practical is
> to remove regression test cases that have platform-specific results in
> those branches. Probably that wouldn't result in a real reduction in the
> quality of the test coverage for those branches (we could still execute
> the query, just not EXPLAIN it). But it seems like a pretty ad-hoc
> answer, and the next case might be one that hurts more not to test.
>
> Thoughts?

Have an alternate file for those other cases, rather than remove the
test? The complaint was about one buildfarm member, so hopefully that's
practical and doesn't require a lot of different permutations.

	Thanks!

		Stephen

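The alternate-file idea suggested here can be sketched roughly as follows. This is a hypothetical Python helper, not pg_regress itself; the only assumption is that a test passes when its output matches any one of several accepted expected outputs. The strings are placeholders standing in for EXPLAIN output, not real plans.

```python
def matches_any(actual, expected_variants):
    """Return the index of the first expected variant equal to the actual output, or None."""
    for i, expected in enumerate(expected_variants):
        # Compare line by line so a trailing newline does not matter.
        if actual.splitlines() == expected.splitlines():
            return i
    return None


# Placeholder texts, one per accepted plan; not real EXPLAIN output.
expected = ["plan shape A\n", "plan shape B\n"]

print(matches_any("plan shape B\n", expected))  # matches the alternate variant
print(matches_any("plan shape C\n", expected))  # matches nothing, so the test fails
```

Each additional platform-specific plan needs another variant in the list, which is the maintenance cost at issue in this thread.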
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> I'm not entirely sure what to do about this. We could back-patch that
>> patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
>> planner behavioral changes. The only other idea that seems practical is
>> to remove regression test cases that have platform-specific results in
>> those branches. Probably that wouldn't result in a real reduction in the
>> quality of the test coverage for those branches (we could still execute
>> the query, just not EXPLAIN it). But it seems like a pretty ad-hoc
>> answer, and the next case might be one that hurts more not to test.
>>
>> Thoughts?

> Have an alternate file for those other cases, rather than remove the
> test? The complaint was about one buildfarm member, so hopefully that's
> practical and doesn't require a lot of different permutations.

I considered that but don't find it practical or attractive, especially
not if the only way to keep such a file updated is to wait and see whether
the buildfarm complains.

On the whole I'm leaning towards back-patching 33e99153e. While the case
of exactly equal plan costs does come up in the regression tests (which
tend to inspect plans for queries on small simple tables), I think it's
relatively unlikely to happen with real-world data.

			regards, tom lane

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> I'm not entirely sure what to do about this. We could back-patch that
> >> patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
> >> planner behavioral changes. The only other idea that seems practical is
> >> to remove regression test cases that have platform-specific results in
> >> those branches. Probably that wouldn't result in a real reduction in the
> >> quality of the test coverage for those branches (we could still execute
> >> the query, just not EXPLAIN it). But it seems like a pretty ad-hoc
> >> answer, and the next case might be one that hurts more not to test.
> >>
> >> Thoughts?
>
> > Have an alternate file for those other cases, rather than remove the
> > test? The complaint was about one buildfarm member, so hopefully that's
> > practical and doesn't require a lot of different permutations.
>
> I considered that but don't find it practical or attractive, especially
> not if the only way to keep such a file updated is to wait and see whether
> the buildfarm complains.

I agree, that's a bit unfortunate, but it strikes me as pretty unlikely
that we're ever going to change those tests or that a code change would
end up causing yet another different plan before 9.1 is completely out
of support in the next couple years.

> On the whole I'm leaning towards back-patching 33e99153e. While the case
> of exactly equal plan costs does come up in the regression tests (which
> tend to inspect plans for queries on small simple tables), I think it's
> relatively unlikely to happen with real-world data.

I agree it's unlikely, but I don't particularly like changing our mind
on a back-patching decision 3 years later to satisfy our regression
tests.

Still, I don't feel particularly strongly about either side of this, so
I'm happy with you making the decision.

	Thanks!

		Stephen

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Stephen Frost <sfrost@snowman.net> writes:
>>> Have an alternate file for those other cases, rather than remove the
>>> test? The complaint was about one buildfarm member, so hopefully that's
>>> practical and doesn't require a lot of different permutations.

>> I considered that but don't find it practical or attractive, especially
>> not if the only way to keep such a file updated is to wait and see whether
>> the buildfarm complains.

> I agree, that's a bit unfortunate, but it strikes me as pretty unlikely
> that we're ever going to change those tests or that a code change would
> end up causing yet another different plan before 9.1 is completely out
> of support in the next couple years.

Meh. 9.0 will be dead in a month or two, but 9.1 still has a ways to go,
and I don't really want to be dealing with this issue every time Andreas'
fuzz testing finds another corner case :-(. It was hard enough devising
regression tests that used stable table data for those cases as it was.

Frequently, when building a planner regression test, it's important that
a particular plan shape be selected (eg because what you're trying to test
is that setrefs.c postprocessing does the right thing with a given case).
So if we take supporting brolga's behavior as-is as a requirement, it's
not always going to be good enough to just maintain an alternate output
file; we'd have to try to adjust the test query to make it do the right
thing. It's bad enough dealing with 32/64bit and endianness dependencies
for that, I don't want minute compiler codegen choices entering into it
as well.

>> On the whole I'm leaning towards back-patching 33e99153e. While the case
>> of exactly equal plan costs does come up in the regression tests (which
>> tend to inspect plans for queries on small simple tables), I think it's
>> relatively unlikely to happen with real-world data.

> I agree it's unlikely, but I don't particularly like changing our mind
> on a back-patching decision 3 years later to satisfy our regression
> tests.

Actually, I'd take that the other way around: now that the patch has been
out for awhile, and we've not heard squawks that could be traced to it,
there's a much better argument that back-patching would be safe than there
was at the time.

There is certainly some risk in making changes that are only for ease of
maintenance and don't fix a provable bug ... but it's not like the other
changes I've committed into 9.0 and 9.1 lately don't change any
corner-case planner behaviors. Besides, you could argue that it is a bug
if rebuilding with a different compiler can change the planner's behavior.

			regards, tom lane

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I agree, that's a bit unfortunate, but it strikes me as pretty unlikely
> > that we're ever going to change those tests or that a code change would
> > end up causing yet another different plan before 9.1 is completely out
> > of support in the next couple years.
>
> Meh. 9.0 will be dead in a month or two, but 9.1 still has a ways to go,
> and I don't really want to be dealing with this issue every time Andreas'
> fuzz testing finds another corner case :-(. It was hard enough devising
> regression tests that used stable table data for those cases as it was.
>
> Frequently, when building a planner regression test, it's important that
> a particular plan shape be selected (eg because what you're trying to test
> is that setrefs.c postprocessing does the right thing with a given case).
> So if we take supporting brolga's behavior as-is as a requirement, it's
> not always going to be good enough to just maintain an alternate output
> file; we'd have to try to adjust the test query to make it do the right
> thing. It's bad enough dealing with 32/64bit and endianness dependencies
> for that, I don't want minute compiler codegen choices entering into it
> as well.

Certainly a good point.

> >> On the whole I'm leaning towards back-patching 33e99153e. While the case
> >> of exactly equal plan costs does come up in the regression tests (which
> >> tend to inspect plans for queries on small simple tables), I think it's
> >> relatively unlikely to happen with real-world data.
>
> > I agree it's unlikely, but I don't particularly like changing our mind
> > on a back-patching decision 3 years later to satisfy our regression
> > tests.
>
> Actually, I'd take that the other way around: now that the patch has been
> out for awhile, and we've not heard squawks that could be traced to it,
> there's a much better argument that back-patching would be safe than there
> was at the time.

I definitely agree with that when it comes to bug cases, but I'm a bit
more wary around planner changes. Still, I don't recall any specific
complaints about plan changes from pre-9.2 to 9.2+ in such corner cases
and I do agree with you that real data would make this far less likely
to happen.

> There is certainly some risk in making changes that are only for ease of
> maintenance and don't fix a provable bug ... but it's not like the other
> changes I've committed into 9.0 and 9.1 lately don't change any
> corner-case planner behaviors. Besides, you could argue that it is a bug
> if rebuilding with a different compiler can change the planner's behavior.

Good point.

	Thanks!

		Stephen