Обсуждение: Re: pgsql: Clean up role created in new subscription test.
On Thu, Mar 30, 2023 at 1:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Clean up role created in new subscription test. > > This oversight broke repeated runs of "make installcheck". GAAAAH. You would think that I would have learned better by now, but evidently not. Is there some way we can add an automated guard against this? -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Mar 30, 2023 at 1:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> This oversight broke repeated runs of "make installcheck". > GAAAAH. You would think that I would have learned better by now, but > evidently not. Is there some way we can add an automated guard against > this? Hm. We could add a final test step that prints out all still-existing roles, but the trick is to have it not fail in a legitimate installcheck context (ie, when there are indeed some pre-existing roles). Maybe it'd be close enough to expect there to be no roles named "regress_xxx". In combination with -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS, that would prevent us from accidentally leaving stuff behind, and we could hope that it doesn't cause false failures in real installations. Another idea could be for pg_regress to enforce that "select count(*) from pg_roles" gives the same answer before and after the test run. That would then be enforced against all pg_regress suites not just the main one, but perhaps that's good. Likewise for tablespaces, subscriptions, and other globally-visible objects, of course. regards, tom lane
> On 30 Mar 2023, at 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Maybe it'd be close enough to expect there to be no roles named > "regress_xxx". In combination with > -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS, that would prevent us > from accidentally leaving stuff behind, and we could hope that it doesn't > cause false failures in real installations. Would that check be always on or only when pg_regress is compiled with -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS? > Another idea could be for pg_regress to enforce that "select count(*) > from pg_roles" gives the same answer before and after the test run. That wouldn't prevent the contents of pg_roles to have changed though, so there is a (slim) false positive risk with that no? -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes: >> On 30 Mar 2023, at 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Maybe it'd be close enough to expect there to be no roles named >> "regress_xxx". In combination with >> -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS, that would prevent us >> from accidentally leaving stuff behind, and we could hope that it doesn't >> cause false failures in real installations. > Would that check be always on or only when pg_regress is compiled with > -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS? I envisioned it as being on all the time. >> Another idea could be for pg_regress to enforce that "select count(*) >> from pg_roles" gives the same answer before and after the test run. > That wouldn't prevent the contents of pg_roles to have changed though, so there > is a (slim) false positive risk with that no? Well, we could do "select rolname from pg_roles order by 1" and actually compare the results of the two selects. That might be advisable anyway, in order to produce a complaint with useful detail when there is something wrong. regards, tom lane
> On 30 Mar 2023, at 22:29, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Daniel Gustafsson <daniel@yesql.se> writes: >>> On 30 Mar 2023, at 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Another idea could be for pg_regress to enforce that "select count(*) >>> from pg_roles" gives the same answer before and after the test run. > >> That wouldn't prevent the contents of pg_roles to have changed though, so there >> is a (slim) false positive risk with that no? > > Well, we could do "select rolname from pg_roles order by 1" and > actually compare the results of the two selects. That might be > advisable anyway, in order to produce a complaint with useful > detail when there is something wrong. I can see the value in doing something like this to keep us honest. -- Daniel Gustafsson
> On 30 Mar 2023, at 22:29, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Well, we could do "select rolname from pg_roles order by 1" and > actually compare the results of the two selects. That might be > advisable anyway, in order to produce a complaint with useful > detail when there is something wrong. I took a look at this and came up with the attached. This adds a new parameter to pg_regress for specifying a test which will be executed before and after the suite, where the first invocation creates the expectfile for the second. For storing the expecfile the temp dir creation is somewhat refactored. I've added a sample test in the patch (to regress, not ECPG), but I'm sure it can be expanded to be a bit more interesting. The comment which is now incorrectly formatted was left like that to make review easier, if this gets committed it will be fixed then. I opted for this to use the machinery that pg_regress already has rather than add a new mechanism (and dependency) for running and verifying queries. This also avoids hardcoding the test making it easier to have custom queries during hacking etc. Looking at this I also found a bug introduced in the TAP format patch, which made failed single run tests report as 0ms due to the parameters being mixed up in the report function call. This is in 0002, which I'll apply to HEAD regardless of 0001 as they are unrelated. -- Daniel Gustafsson
Вложения
> On 15 May 2023, at 10:59, Daniel Gustafsson <daniel@yesql.se> wrote: > Looking at this I also found a bug introduced in the TAP format patch, which > made failed single run tests report as 0ms due to the parameters being mixed up > in the report function call. This is in 0002, which I'll apply to HEAD > regardless of 0001 as they are unrelated. With 0002 applied, attached is just the 0001 rebased to keep the CFBot from being angry when applying an already applied patch. Parked in the July CF for now. -- Daniel Gustafsson
Вложения
> On 16 May 2023, at 11:17, Daniel Gustafsson <daniel@yesql.se> wrote: > Parked in the July CF for now. Rebased to fix a trivial conflict highlighted by the CFBot. -- Daniel Gustafsson
Вложения
On 06.07.23 00:00, Daniel Gustafsson wrote: >> On 16 May 2023, at 11:17, Daniel Gustafsson <daniel@yesql.se> wrote: > >> Parked in the July CF for now. > > Rebased to fix a trivial conflict highlighted by the CFBot. I think the problem with this approach is that one would need to reapply it to each regression test suite separately. For example, there are several tests under contrib/ that create roles. These would not be covered by this automatically. I think the earlier idea of just counting roles, tablespaces, etc. before and after would be sufficient.
On 2023-Nov-08, Peter Eisentraut wrote: > I think the earlier idea of just counting roles, tablespaces, etc. before > and after would be sufficient. Maybe record global objects in a permanent table in test_setup.sql create table global_objs as select 'role', rolname from pg_roles union all select 'tablespace', spcname from pg_tablespace; and at the end (maybe in test tablespace, though it's unrelated but it's what runs last and drops regress_tablespace), have (select 'role', rolname from pg_roles union all select 'tablespace', spcname from pg_tablespace) except select * from global_objs; and check the expected as empty. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "La verdad no siempre es bonita, pero el hambre de ella sí"
> On 8 Nov 2023, at 08:55, Peter Eisentraut <peter@eisentraut.org> wrote: > > On 06.07.23 00:00, Daniel Gustafsson wrote: >>> On 16 May 2023, at 11:17, Daniel Gustafsson <daniel@yesql.se> wrote: >>> Parked in the July CF for now. >> Rebased to fix a trivial conflict highlighted by the CFBot. > > I think the problem with this approach is that one would need to reapply it to each regression test suite separately. For example, there are several tests under contrib/ that create roles. These would not be covered by this automatically. > > I think the earlier idea of just counting roles, tablespaces, etc. before and after would be sufficient. It's been a while but if memory serves me right, one of the reasons for this approach was that pg_regress didn't use libpq so running queries and storing results for comparisons other than diffing .out files was painful at best. Since 66d6086cbc pg_regress does have a dependency on libpq so we can now perform that bookkeeping a bit easier. I still find it more elegant to at least compare the contents and not just the count, but I can take a stab at a revised patch since this approach doesn't seem to appeal to the thread. -- Daniel Gustafsson
> On 8 Nov 2023, at 12:42, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > On 2023-Nov-08, Peter Eisentraut wrote: >> I think the earlier idea of just counting roles, tablespaces, etc. before >> and after would be sufficient. > > Maybe record global objects in a permanent table in test_setup.sql Since test_setup.sql is part of the regress schedule and not pg_regress we would have to implement this for each test run (regress, contribs etc), which is what Peter didn't like about the original suggestion. -- Daniel Gustafsson
On 2023-Nov-08, Daniel Gustafsson wrote: > Since test_setup.sql is part of the regress schedule and not pg_regress we > would have to implement this for each test run (regress, contribs etc), which > is what Peter didn't like about the original suggestion. Oh, somehow that aspect of his reply failed to register with me. I agree with your approach of using libpq in pg_regress then. I suppose you're just thinking of using PQexec() or whatever, run one query with sufficient ORDER BY, save the result, and at the end of the test run just run the same query and compare that they are cell-by-cell identical? This sounds a lot simpler than the patch you posted. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
> On 8 Nov 2023, at 13:32, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2023-Nov-08, Daniel Gustafsson wrote: > >> Since test_setup.sql is part of the regress schedule and not pg_regress we >> would have to implement this for each test run (regress, contribs etc), which >> is what Peter didn't like about the original suggestion. > > Oh, somehow that aspect of his reply failed to register with me. I > agree with your approach of using libpq in pg_regress then. > > I suppose you're just thinking of using PQexec() or whatever, run one > query with sufficient ORDER BY, save the result, and at the end of the > test run just run the same query and compare that they are cell-by-cell > identical? This sounds a lot simpler than the patch you posted. Correct, that's my plan. The rationale for the earlier patch was to avoid adding a dependency on libpq, but with that already discussed and done we can leverage the fact that we can run such queries easy. -- Daniel Gustafsson
> On 8 Nov 2023, at 13:32, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > I suppose you're just thinking of using PQexec() or whatever, run one > query with sufficient ORDER BY, save the result, and at the end of the > test run just run the same query and compare that they are cell-by-cell > identical? This sounds a lot simpler than the patch you posted. I found some spare cycles for this and came up with the attached. The idea was to keep it in-line with how pg_regress already today manipulate and traverse _stringlists for various things. With the addition of the 0001 patch to clean up global objects left in test_pg_dump it passes check-world. -- Daniel Gustafsson
Вложения
On 01/12/2023 13:22, Daniel Gustafsson wrote: >> On 8 Nov 2023, at 13:32, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > >> I suppose you're just thinking of using PQexec() or whatever, run one >> query with sufficient ORDER BY, save the result, and at the end of the >> test run just run the same query and compare that they are cell-by-cell >> identical? This sounds a lot simpler than the patch you posted. > > I found some spare cycles for this and came up with the attached. The idea was > to keep it in-line with how pg_regress already today manipulate and traverse > _stringlists for various things. With the addition of the 0001 patch to clean > up global objects left in test_pg_dump it passes check-world. Do we want to impose this policy to all extensions too? > + /* > + * Store the global objects before the test starts such that we can check > + * for any objects left behind after the tests finish. > + */ > + query_to_stringlist("postgres", > + "(SELECT rolname AS obj FROM pg_catalog.pg_roles ORDER BY 1) " > + "UNION ALL " > + "(SELECT spcname AS obj FROM pg_catalog.pg_tablespace ORDER BY 1) " > + "UNION ALL " > + "(SELECT subname AS obj FROM pg_catalog.pg_subscription ORDER BY 1)", > + &globals_before); > + Strictly speaking, the order of this query isn't guaranteed to be stable, although in practice it probably is. Maybe something like this: (SELECT 'role', rolname AS obj FROM pg_catalog.pg_roles UNION ALL SELECT 'tablespace', spcname AS obj FROM pg_catalog.pg_tablespace UNION ALL SELECT 'subscription', subname AS obj FROM pg_catalog.pg_subscription ) ORDER BY 1, 2 Is it OK to leave behind extra databases? -- Heikki Linnakangas Neon (https://neon.tech)
> On 1 Dec 2023, at 12:37, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > On 01/12/2023 13:22, Daniel Gustafsson wrote: >>> On 8 Nov 2023, at 13:32, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >>> I suppose you're just thinking of using PQexec() or whatever, run one >>> query with sufficient ORDER BY, save the result, and at the end of the >>> test run just run the same query and compare that they are cell-by-cell >>> identical? This sounds a lot simpler than the patch you posted. >> I found some spare cycles for this and came up with the attached. The idea was >> to keep it in-line with how pg_regress already today manipulate and traverse >> _stringlists for various things. With the addition of the 0001 patch to clean >> up global objects left in test_pg_dump it passes check-world. > > Do we want to impose this policy to all extensions too? I don't think it would be bad, and as of today the policy holds for all of check-world apart from this one test module. >> + /* >> + * Store the global objects before the test starts such that we can check >> + * for any objects left behind after the tests finish. >> + */ >> + query_to_stringlist("postgres", >> + "(SELECT rolname AS obj FROM pg_catalog.pg_roles ORDER BY 1) " >> + "UNION ALL " >> + "(SELECT spcname AS obj FROM pg_catalog.pg_tablespace ORDER BY 1) " >> + "UNION ALL " >> + "(SELECT subname AS obj FROM pg_catalog.pg_subscription ORDER BY 1)", >> + &globals_before); >> + > > Strictly speaking, the order of this query isn't guaranteed to be stable, although in practice it probably is. Of course, will fix. I originally had three separate query_to_stringlist calls and had a brainfade when combining. It seemed like pointless use of cycles when we can get everything in one connection. > Is it OK to leave behind extra databases? The test suite for pg_upgrade can make use of left behind databases to seed the old cluster, so I think that's allowed by design. -- Daniel Gustafsson
Isn't it simpler to use DROP OWNED BY in 0001? -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Hay que recordar que la existencia en el cosmos, y particularmente la elaboración de civilizaciones dentro de él no son, por desgracia, nada idílicas" (Ijon Tichy)
> On 1 Dec 2023, at 13:19, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > Isn't it simpler to use DROP OWNED BY in 0001? I suppose it is, I kind of like the explicit drops but we do use OWNED BY quite a lot in the regression tests so changed to that in the attached v5. -- Daniel Gustafsson
Вложения
On Fri, 1 Dec 2023 at 18:23, Daniel Gustafsson <daniel@yesql.se> wrote: > > > On 1 Dec 2023, at 13:19, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > Isn't it simpler to use DROP OWNED BY in 0001? > > I suppose it is, I kind of like the explicit drops but we do use OWNED BY quite > a lot in the regression tests so changed to that in the attached v5. There are a lot of failures in CFBot at [1] with: # test failed ----------------------------------- stderr ----------------------------------- # left over global object detected: regress_test_bypassrls # 1 of 2 tests failed. More details of the same are available at [2]. Do we need to clean up the objects leftover for the reported issues in the test? [1] - https://cirrus-ci.com/task/6222185975513088 [2] - https://api.cirrus-ci.com/v1/artifact/task/6222185975513088/meson_log/build/meson-logs/testlog-running.txt Regards, Vignesh
> On 18 Jan 2024, at 01:57, vignesh C <vignesh21@gmail.com> wrote: > There are a lot of failures in CFBot at [1] with: > More details of the same are available at [2]. > Do we need to clean up the objects leftover for the reported issues in the test? Not really, these should not need cleaning up, and it's quite odd that it only happens on FreeBSD. I need to investigate further so I'll mark this waiting on author in the meantime. -- Daniel Gustafsson
On 19.01.24 15:26, Daniel Gustafsson wrote: >> On 18 Jan 2024, at 01:57, vignesh C <vignesh21@gmail.com> wrote: > >> There are a lot of failures in CFBot at [1] with: > >> More details of the same are available at [2]. >> Do we need to clean up the objects leftover for the reported issues in the test? > > Not really, these should not need cleaning up, and it's quite odd that it only > happens on FreeBSD. I need to investigate further so I'll mark this waiting on > author in the meantime Most likely because only the FreeBSD job uses ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS.
Hi, On 2024-01-19 15:40:21 +0100, Peter Eisentraut wrote: > On 19.01.24 15:26, Daniel Gustafsson wrote: > > > On 18 Jan 2024, at 01:57, vignesh C <vignesh21@gmail.com> wrote: > > > > > There are a lot of failures in CFBot at [1] with: > > > > > More details of the same are available at [2]. > > > Do we need to clean up the objects leftover for the reported issues in the test? > > > > Not really, these should not need cleaning up, and it's quite odd that it only > > happens on FreeBSD. I need to investigate further so I'll mark this waiting on > > author in the meantime > > Most likely because only the FreeBSD job uses > ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS. I don't think it's that, but that the freebsd task tests the installcheck equivalent in meson. I haven't checked what your patch is doing, but perhaps the issue is that it's seeing global objects concurrently created by another test? Greetings, Andres Freund
> On 25 Mar 2024, at 19:48, Andres Freund <andres@anarazel.de> wrote: > I don't think it's that, but that the freebsd task tests the installcheck > equivalent in meson. I haven't checked what your patch is doing, but perhaps > the issue is that it's seeing global objects concurrently created by another > test? Sorry, I had a look when Peter replied a while back but forgot to update this thread. The reason for the failure is that when running multiple pg_regress in parallel against a single instance it is impossible to avoid global object pollution from other tests concurrently executing. Since pg_regress has no idea about the contents of the tests it also cannot apply any smarts in filtering out such objects. The CI failures comes from the contrib tests which run in parallel. The only option is to make the check opt-in via a command-line parameter for use it in the main regress tests, and not use it for the contrib tests. The attached v7 does that and passes CI, but I wonder if it's worth it all with that restriction? The 0001 cleanup patch is still relevant (and was found by using this feature) though but that might be all for this thread. -- Daniel Gustafsson
Вложения
Daniel Gustafsson <daniel@yesql.se> writes: > The only option is to make the check opt-in via a command-line parameter for > use it in the main regress tests, and not use it for the contrib tests. The > attached v7 does that and passes CI, but I wonder if it's worth it all with > that restriction? Yeah, that seems hardly worth it --- and it's only an accident of implementation that we don't run the main tests in parallel with something else, anyway. Too bad, but ... regards, tom lane
> On 27 Mar 2024, at 16:26, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >> The only option is to make the check opt-in via a command-line parameter for >> use it in the main regress tests, and not use it for the contrib tests. The >> attached v7 does that and passes CI, but I wonder if it's worth it all with >> that restriction? > > Yeah, that seems hardly worth it --- and it's only an accident of > implementation that we don't run the main tests in parallel with > something else, anyway. Too bad, but ... Agreed. The excercise did catch the leftovers in 0001 so I'll go ahead with those before closing the CF entry. -- Daniel Gustafsson