Discussion: Re: Rewriting the test of pg_upgrade as a TAP test
Michael,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Tue, Apr 4, 2017 at 10:52 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> > On 4/3/17 11:32, Andres Freund wrote:
> >> That doesn't strike me as particularly future-proof.  We intentionally leave objects behind after pg_regress runs, but that only works if we actually run them...
> >
> > I generally agree with the sentiments expressed later in this thread.  But just to clarify what I meant here: we don't need to run a, say, one-minute serial test to load a few "left behind" objects for the pg_upgrade test if we can load the same set of objects using dedicated scripting in, say, two seconds.  This would both make the pg_upgrade tests faster and reduce the hidden dependencies in the main tests about which kinds of objects need to be left behind.
>
> Making the tests run shorter while maintaining the current code coverage is nice.  But it complicates maintenance of the test suite, as it needs either a dedicated regression schedule or an extra test suite where objects are created just for the sake of pg_upgrade.  This increases the risk of the test suite rotting over time if patch authors and reviewers are not careful.

I believe that what Peter was getting at is that the pg_dump TAP tests create a whole slew of objects in just a few seconds and are then able to exercise those code paths in pg_dump, without needing to run the entire serial regression suite.

I'm still not completely convinced that we actually need to independently test pg_upgrade by creating all the objects which the pg_dump TAP tests do, given that pg_upgrade just runs pg_dump underneath.  If we really want to do that, however, what we should do is abstract the pg_dump set of tests out into a place where both the pg_dump and pg_upgrade TAP tests could use it to create all the types of objects which are supported, to perform their tests.

Thanks!

Stephen
Stephen Frost <sfrost@snowman.net> writes:
> I believe that what Peter was getting at is that the pg_dump TAP tests create a whole slew of objects in just a few seconds and are then able to exercise those code paths in pg_dump, without needing to run the entire serial regression suite.

Right.  But there's a certain amount of serendipity involved in using the core regression tests' final results.  For example, I don't know how long it would've taken us to understand the problems around dumping and reloading child tables with inconsistent column orders had there not been examples of that in the regression tests.  I worry that creating a sterile set of objects for testing pg_dump will leave blind spots, because it will mean that we only test cases that we explicitly created test cases for.

> I'm still not completely convinced that we actually need to independently test pg_upgrade by creating all the objects which the pg_dump TAP tests do, given that pg_upgrade just runs pg_dump underneath.  If we really want to do that, however, what we should do is abstract the pg_dump set of tests out into a place where both the pg_dump and pg_upgrade TAP tests could use it to create all the types of objects which are supported, to perform their tests.

I think it's largely pointless to test pg_dump --binary-upgrade except as a part of pg_upgrade.

			regards, tom lane
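[Editor's note: the child-table case Tom refers to can be illustrated with a small sketch.  The statements below are illustrative, not taken from the regression suite; inheritance in PostgreSQL matches columns by name, so a child's physical column order can legitimately differ from its parent's, which is exactly the kind of layout pg_dump has to reconstruct correctly.  The script only writes the SQL to a file; running it through psql against a test cluster is left out.]

```shell
# Hypothetical illustration of a child table whose column order differs
# from its parent.  We only write the SQL out; nothing is executed
# against a database here.
cat > column_order_case.sql <<'EOF'
CREATE TABLE parent (a int, b text);
-- Same columns as the parent, but in the opposite physical order.
CREATE TABLE child (b text, a int);
-- INHERIT matches columns by name, so this succeeds despite the
-- differing order -- the case pg_dump must handle on dump/reload.
ALTER TABLE child INHERIT parent;
EOF
```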
Tom,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I believe that what Peter was getting at is that the pg_dump TAP tests create a whole slew of objects in just a few seconds and are then able to exercise those code paths in pg_dump, without needing to run the entire serial regression suite.
>
> Right.  But there's a certain amount of serendipity involved in using the core regression tests' final results.  For example, I don't know how long it would've taken us to understand the problems around dumping and reloading child tables with inconsistent column orders had there not been examples of that in the regression tests.  I worry that creating a sterile set of objects for testing pg_dump will leave blind spots, because it will mean that we only test cases that we explicitly created test cases for.

We don't need to create only sterile sets of objects in the pg_dump TAP tests.  I don't believe we need to populate GIN indexes or vacuum them to test pg_dump/pg_upgrade either (at least, not if we're going to stick to the pg_upgrade test basically being a check that pg_dump returns the same results before and after).

I'm all for adding tests to pg_dump which do things like drop columns, change column names, and cover other cases which could affect whether the pg_dump output is correct, and there's nothing preventing those tests from being added in the existing structure.  Certainly, before we remove the coverage provided by running the serial test suite and then using pg_upgrade, we should analyze what is being tested and ensure that we provide that same set of testing in the pg_dump TAP tests.

> > I'm still not completely convinced that we actually need to independently test pg_upgrade by creating all the objects which the pg_dump TAP tests do, given that pg_upgrade just runs pg_dump underneath.  If we really want to do that, however, what we should do is abstract the pg_dump set of tests out into a place where both the pg_dump and pg_upgrade TAP tests could use it to create all the types of objects which are supported, to perform their tests.
>
> I think it's largely pointless to test pg_dump --binary-upgrade except as a part of pg_upgrade.

That's how I discovered that comments and security labels weren't being pulled through to the new cluster for blobs, so I have to disagree with this.  Frankly, it's also much more straightforward to run pg_dump --binary-upgrade than it is to get pg_upgrade to do the same.

Still, I'm not actually against centralizing the tests done with pg_dump so that they could also be used by pg_upgrade.  Creating all those objects takes less than a second, at least on my system, so it would still be quite a bit faster than running the serial regression suite.  We might also consider whether there's a way to change the format of those tests to make them a bit less impenetrable for non-Perl folks to work with, and to make it simpler to add new tests as new features are added.

Thanks!

Stephen
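[Editor's note: the before/after comparison discussed here can be sketched roughly as follows.  This is a simplified stand-in, not the actual test harness: a real run would take the two dumps with pg_dump --binary-upgrade (or pg_dumpall) against the old and new clusters, while here two placeholder dump files keep the example self-contained and runnable without a cluster.]

```shell
# Simplified sketch of the core pg_upgrade-style check: the dump taken
# before the upgrade must match the one taken after.  The two files
# below stand in for real pg_dump output.
printf 'CREATE TABLE t (a integer);\n' > dump_before.sql
printf 'CREATE TABLE t (a integer);\n' > dump_after.sql
if diff -u dump_before.sql dump_after.sql > dump.diff; then
    echo "dumps match"
else
    echo "dumps differ:"
    cat dump.diff
    exit 1
fi
```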
Hi,

On 2017-04-05 10:40:41 -0400, Stephen Frost wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > Stephen Frost <sfrost@snowman.net> writes:
> > > I believe that what Peter was getting at is that the pg_dump TAP tests create a whole slew of objects in just a few seconds and are then able to exercise those code paths in pg_dump, without needing to run the entire serial regression suite.
> >
> > Right.  But there's a certain amount of serendipity involved in using the core regression tests' final results.  For example, I don't know how long it would've taken us to understand the problems around dumping and reloading child tables with inconsistent column orders had there not been examples of that in the regression tests.  I worry that creating a sterile set of objects for testing pg_dump will leave blind spots, because it will mean that we only test cases that we explicitly created test cases for.
>
> We don't need to create only sterile sets of objects in the pg_dump TAP tests.

I really, really don't understand why we're conflating making the pg_upgrade tests less fragile / duplicative with changing what we use to test it.  This seems to have the sole result that we're not going to get anywhere.

> I don't believe we need to populate GIN indexes or vacuum them to test pg_dump/pg_upgrade either (at least, not if we're going to stick to the pg_upgrade test basically being a check that pg_dump returns the same results before and after).

I think we *should* have populated GIN indexes.  Yes, the coverage isn't perfect, but the VACUUM definitely gives a decent amount of coverage of whether the GIN index looks halfway sane after the upgrade.

Greetings,

Andres Freund
Andres,

* Andres Freund (andres@anarazel.de) wrote:
> On 2017-04-05 10:40:41 -0400, Stephen Frost wrote:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > > Stephen Frost <sfrost@snowman.net> writes:
> > > > I believe that what Peter was getting at is that the pg_dump TAP tests create a whole slew of objects in just a few seconds and are then able to exercise those code paths in pg_dump, without needing to run the entire serial regression suite.
> > >
> > > Right.  But there's a certain amount of serendipity involved in using the core regression tests' final results.  For example, I don't know how long it would've taken us to understand the problems around dumping and reloading child tables with inconsistent column orders had there not been examples of that in the regression tests.  I worry that creating a sterile set of objects for testing pg_dump will leave blind spots, because it will mean that we only test cases that we explicitly created test cases for.
> >
> > We don't need to create only sterile sets of objects in the pg_dump TAP tests.
>
> I really, really don't understand why we're conflating making the pg_upgrade tests less fragile / duplicative with changing what we use to test it.  This seems to have the sole result that we're not going to get anywhere.

Probably because the point was brought up that the regression tests for pg_upgrade spend a bunch of time doing things which, ultimately, don't actually add any real value.  Yes, there are bits of the core regression tests that currently add value over what we have through other approaches, but that's not where the bulk of the time running those tests goes.

> > I don't believe we need to populate GIN indexes or vacuum them to test pg_dump/pg_upgrade either (at least, not if we're going to stick to the pg_upgrade test basically being a check that pg_dump returns the same results before and after).
>
> I think we *should* have populated GIN indexes.  Yes, the coverage isn't perfect, but the VACUUM definitely gives a decent amount of coverage of whether the GIN index looks halfway sane after the upgrade.

We don't look at the GIN index after the upgrade in the current pg_upgrade testing, so I don't see why you feel it's at all valuable.  If we *did* do that (and I'm all for adding such tests), then perhaps this argument would make sense, but we don't today, and I haven't seen anyone propose changing that.

Thanks!

Stephen
On 2017-04-05 10:50:19 -0400, Stephen Frost wrote:
> Andres,
>
> * Andres Freund (andres@anarazel.de) wrote:
> > On 2017-04-05 10:40:41 -0400, Stephen Frost wrote:
> > > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > > > Stephen Frost <sfrost@snowman.net> writes:
> > > > > I believe that what Peter was getting at is that the pg_dump TAP tests create a whole slew of objects in just a few seconds and are then able to exercise those code paths in pg_dump, without needing to run the entire serial regression suite.
> > > >
> > > > Right.  But there's a certain amount of serendipity involved in using the core regression tests' final results.  For example, I don't know how long it would've taken us to understand the problems around dumping and reloading child tables with inconsistent column orders had there not been examples of that in the regression tests.  I worry that creating a sterile set of objects for testing pg_dump will leave blind spots, because it will mean that we only test cases that we explicitly created test cases for.
> > >
> > > We don't need to create only sterile sets of objects in the pg_dump TAP tests.
> >
> > I really, really don't understand why we're conflating making the pg_upgrade tests less fragile / duplicative with changing what we use to test it.  This seems to have the sole result that we're not going to get anywhere.
>
> Probably because the point was brought up that the regression tests for pg_upgrade spend a bunch of time doing things which, ultimately, don't actually add any real value.  Yes, there are bits of the core regression tests that currently add value over what we have through other approaches, but that's not where the bulk of the time running those tests goes.

Create a separate patch [& thread] about that; don't conflate the topics.  I'm very much in favor of this rewrite, and I'm very much not in favor of only using some targeted test suite.  By combining two independent changes, you're just making it less likely that anything happens.

> > > I don't believe we need to populate GIN indexes or vacuum them to test pg_dump/pg_upgrade either (at least, not if we're going to stick to the pg_upgrade test basically being a check that pg_dump returns the same results before and after).
> >
> > I think we *should* have populated GIN indexes.  Yes, the coverage isn't perfect, but the VACUUM definitely gives a decent amount of coverage of whether the GIN index looks halfway sane after the upgrade.
>
> We don't look at the GIN index after the upgrade in the current pg_upgrade testing, so I don't see why you feel it's at all valuable.

It'd be trivial to add a VACUUM at the point where analyze_new_cluster is currently run.  And I've previously run more manual tests.  Is that perfect - no, definitely not.

Greetings,

Andres Freund
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Right.  But there's a certain amount of serendipity involved in using the core regression tests' final results.  For example, I don't know how long it would've taken us to understand the problems around dumping and reloading child tables with inconsistent column orders had there not been examples of that in the regression tests.  I worry that creating a sterile set of objects for testing pg_dump will leave blind spots, because it will mean that we only test cases that we explicitly created test cases for.

> I'm all for adding tests to pg_dump which do things like drop columns, change column names, and cover other cases which could affect whether the pg_dump output is correct, and there's nothing preventing those tests from being added in the existing structure.  Certainly, before we remove the coverage provided by running the serial test suite and then using pg_upgrade, we should analyze what is being tested and ensure that we provide that same set of testing in the pg_dump TAP tests.

I don't think you grasped my basic point, which is that I'm worried about emergent cases that we don't foresee needing to test (and that no amount of code coverage checking would have shown up as being overlooked).  Admittedly, relying on the core regression tests to trigger such cases is a pretty haphazard strategy, but it's way better than no strategy at all.

> We might also consider whether there's a way to change the format of those tests to make them a bit less impenetrable for non-Perl folks to work with, and to make it simpler to add new tests as new features are added.

TBH, that's part of my allergy to this concept, ie that this test mechanism seems pretty write-only.  I do not think that people will add pg_dump test cases except when required to by project policy, so we will end up with a very skeletal set of tests that won't find any unforeseen behaviors.

The TAP tests in general are utterly developer-unfriendly from where I sit: not only is the code pretty unreadable, but god help you when you need to try to debug a failure.  I think some serious effort needs to be spent on improving that situation before we imagine that we can throw away the other test mechanisms we have today in favor of TAP tests.

			regards, tom lane
Andres,

* Andres Freund (andres@anarazel.de) wrote:
> On 2017-04-05 10:50:19 -0400, Stephen Frost wrote:
> > Probably because the point was brought up that the regression tests for pg_upgrade spend a bunch of time doing things which, ultimately, don't actually add any real value.  Yes, there are bits of the core regression tests that currently add value over what we have through other approaches, but that's not where the bulk of the time running those tests goes.
>
> Create a separate patch [& thread] about that; don't conflate the topics.  I'm very much in favor of this rewrite, and I'm very much not in favor of only using some targeted test suite.  By combining two independent changes, you're just making it less likely that anything happens.

I've made it clear, I thought, a couple of times that I agree with the rewrite and that we should move forward with it.  Nothing on this sub-thread changes that.  It's also registered in the 2017-07 commitfest, so I wouldn't think there's a risk of it being forgotten, or that we need to cut off all discussion about what may change between now and July that would be relevant to this patch.

> > We don't look at the GIN index after the upgrade in the current pg_upgrade testing, so I don't see why you feel it's at all valuable.
>
> It'd be trivial to add a VACUUM at the point where analyze_new_cluster is currently run.  And I've previously run more manual tests.  Is that perfect - no, definitely not.

Being trivial doesn't mean it's something we're actually doing today.  Given that we aren't actually changing anything in the index during a same-version pg_upgrade, nor are we changing the code that's run by that VACUUM, I'm curious just what we'd end up testing that's different from restarting the existing cluster and running a new VACUUM.

Thanks!

Stephen
Tom,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I'm all for adding tests to pg_dump which do things like drop columns, change column names, and cover other cases which could affect whether the pg_dump output is correct, and there's nothing preventing those tests from being added in the existing structure.  Certainly, before we remove the coverage provided by running the serial test suite and then using pg_upgrade, we should analyze what is being tested and ensure that we provide that same set of testing in the pg_dump TAP tests.
>
> I don't think you grasped my basic point, which is that I'm worried about emergent cases that we don't foresee needing to test (and that no amount of code coverage checking would have shown up as being overlooked).  Admittedly, relying on the core regression tests to trigger such cases is a pretty haphazard strategy, but it's way better than no strategy at all.

The tests that were added to the core regression suite were added for a reason, and hopefully we can identify cases where it'd make sense to also run those tests for pg_upgrade/pg_dump testing.  More-or-less anything that materially changes the catalog should be included, I would think.  Things that really only exercise the heap/index files don't need to be performed, because the pg_upgrade process doesn't change those.

> > We might also consider whether there's a way to change the format of those tests to make them a bit less impenetrable for non-Perl folks to work with, and to make it simpler to add new tests as new features are added.
>
> TBH, that's part of my allergy to this concept, ie that this test mechanism seems pretty write-only.  I do not think that people will add pg_dump test cases except when required to by project policy, so we will end up with a very skeletal set of tests that won't find any unforeseen behaviors.

I certainly agree that the current structure of the tests isn't trivial to work with, and I would welcome suggestions on how to improve it.  Now that we've had this testing structure for a year and have added quite a bit more to it, it's definitely clear that we need to find a more developer-friendly approach.

> The TAP tests in general are utterly developer-unfriendly from where I sit: not only is the code pretty unreadable, but god help you when you need to try to debug a failure.  I think some serious effort needs to be spent on improving that situation before we imagine that we can throw away the other test mechanisms we have today in favor of TAP tests.

Agreed.

Thanks!

Stephen
On Wed, Apr 05, 2017 at 11:13:33AM -0400, Stephen Frost wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > Stephen Frost <sfrost@snowman.net> writes:
> > > I'm all for adding tests to pg_dump which do things like drop columns, change column names, and cover other cases which could affect whether the pg_dump output is correct, and there's nothing preventing those tests from being added in the existing structure.  Certainly, before we remove the coverage provided by running the serial test suite and then using pg_upgrade, we should analyze what is being tested and ensure that we provide that same set of testing in the pg_dump TAP tests.
> >
> > I don't think you grasped my basic point, which is that I'm worried about emergent cases that we don't foresee needing to test (and that no amount of code coverage checking would have shown up as being overlooked).  Admittedly, relying on the core regression tests to trigger such cases is a pretty haphazard strategy, but it's way better than no strategy at all.
>
> The tests that were added to the core regression suite were added for a reason, and hopefully we can identify cases where it'd make sense to also run those tests for pg_upgrade/pg_dump testing.

I think you _are_ missing Tom's point.  We've caught pg_dump and pg_upgrade bugs thanks to regression database objects created for purposes unrelated to pg_dump.  It's true that there exist other test strategies that are more efficient or that catch more bugs overall.  None of them substitutes 100% for the serendipity seen in testing dump/restore on the regression database.

> More-or-less anything that materially changes the catalog should be included, I would think.  Things that really only exercise the heap/index files don't need to be performed, because the pg_upgrade process doesn't change those.

That is formally true.  Also, I agree with Andres that this is not a thread for discussing test changes beyond a mechanical translation of the pg_upgrade test suite.