Discussion: Re: BUG #17212: pg_amcheck fails on checking temporary relations
> On Oct 3, 2021, at 10:04 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>> On Oct 2, 2021, at 10:32 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>>
>> On Sat, Oct 2, 2021 at 4:49 AM PG Bug reporting form <noreply@postgresql.org> wrote:
>>> Although you can add --exclude-relation=*.pg_temp*.*, this behaviour differs
>>> from the behaviour of pg_dump and friends, which skip such relations
>>> silently.
>>
>> I agree -- this behavior is a bug.
>>
>> Can you propose a fix, Mark?
>
> The attached patch includes a test case for this, which shows the problems against the current pg_amcheck.c, and a new version of pg_amcheck.c which fixes the bug. Could you review it?
>
> Thanks for bringing this to my attention.

Reposting to pgsql-hackers in preparation for making a commitfest entry.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments
Hello Mark,

04.10.2021 01:20, Mark Dilger wrote:
> The attached patch includes a test case for this, which shows the problems against the current pg_amcheck.c, and a new version of pg_amcheck.c which fixes the bug. Could you review it?
>
> Thanks for bringing this to my attention.

There is another issue, that maybe should be discussed separately (or this thread could be renamed to "... on checking specific relations"), but the solution could be similar to that. pg_amcheck also fails on checking invalid indexes, that could be created legitimately by the CREATE INDEX CONCURRENTLY command. For example, consider the following script:

psql -c "CREATE TABLE t(i numeric); INSERT INTO t VALUES (generate_series(1, 10000000));"
psql -c "CREATE INDEX CONCURRENTLY t_idx ON t(i);" &
pg_amcheck -a --install-missing --heapallindexed --rootdescend --progress || echo "FAIL"

pg_amcheck fails with:

btree index "regression.public.t_idx":
ERROR: cannot check index "t_idx"
DETAIL: Index is not valid.
781/781 relations (100%), 2806/2806 pages (100%)
FAIL

When the index is created without CONCURRENTLY, the check runs successfully.

Besides that, it seems that pg_amcheck produces a deadlock in such a case:

2021-10-04 11:23:38.584 MSK [1451296] ERROR: deadlock detected
2021-10-04 11:23:38.584 MSK [1451296] DETAIL: Process 1451296 waits for ShareLock on virtual transaction 5/542; blocked by process 1451314.
        Process 1451314 waits for ShareLock on relation 16385 of database 16384; blocked by process 1451296.
        Process 1451296: CREATE INDEX CONCURRENTLY t_idx ON t(i);
        Process 1451314: SELECT * FROM "pg_catalog".bt_index_parent_check(index := '16390'::regclass, heapallindexed := true, rootdescend := true)
2021-10-04 11:23:38.584 MSK [1451296] HINT: See server log for query details.
2021-10-04 11:23:38.584 MSK [1451296] STATEMENT: CREATE INDEX CONCURRENTLY t_idx ON t(i);

I think that the deadlock is yet another issue, as invalid indexes could appear in other circumstances too.

Best regards,
Alexander
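For reference, indexes in the "invalid" or "not ready" state described above can be listed from the system catalogs. A minimal sketch (the catalog and column names are standard; the query itself is only illustrative):

SELECT n.nspname, c.relname, i.indisvalid, i.indisready
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid OR NOT i.indisready;

An index left behind by a failed, or still running, CREATE INDEX CONCURRENTLY shows up here with indisvalid = false.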
> On Oct 4, 2021, at 2:00 AM, Alexander Lakhin <exclusion@gmail.com> wrote:

Thank you, Alexander, for these bug reports.

> There is another issue, that maybe should be discussed separately (or
> this thread could be renamed to "... on checking specific relations"),
> but the solution could be similar to that.
> pg_amcheck also fails on checking invalid indexes, that could be created
> legitimately by the CREATE INDEX CONCURRENTLY command.

I believe this is a bug in amcheck's btree checking functions. Peter, can you take a look?

> For example, consider the following script:
> psql -c "CREATE TABLE t(i numeric); INSERT INTO t VALUES
> (generate_series(1, 10000000));"
> psql -c "CREATE INDEX CONCURRENTLY t_idx ON t(i);" &
> pg_amcheck -a --install-missing --heapallindexed --rootdescend
> --progress || echo "FAIL"
>
> pg_amcheck fails with:
> btree index "regression.public.t_idx":
> ERROR: cannot check index "t_idx"
> DETAIL: Index is not valid.
> 781/781 relations (100%), 2806/2806 pages (100%)
> FAIL

Yes, I can reproduce this following your steps. (It's always appreciated to have steps to reproduce.)

I can also get this failure without pg_amcheck, going directly to the btree checking code. Having already built the table as you prescribe:

amcheck % psql -c "CREATE INDEX CONCURRENTLY t_idx ON t(i);" & sleep 0.1 && psql -c "SELECT * FROM pg_catalog.bt_index_parent_check(index := 't_idx'::regclass, heapallindexed := true, rootdescend := true)"
[1] 9553
ERROR: deadlock detected
DETAIL: Process 9555 waits for ShareLock on virtual transaction 5/11; blocked by process 9558.
Process 9558 waits for ShareLock on relation 16406 of database 16384; blocked by process 9555.
HINT: See server log for query details.
ERROR: cannot check index "t_idx"
DETAIL: Index is not valid.
[1]  + exit 1     psql -c "CREATE INDEX CONCURRENTLY t_idx ON t(i);"

If Peter agrees that this is not pg_amcheck specific, then we should start a new thread to avoid confusing the commitfest tickets for these two items.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 8:10 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > There is another issue, that maybe should be discussed separately (or > > this thread could be renamed to "... on checking specific relations"), > > but the solution could be similar to that. > > pg_amcheck also fails on checking invalid indexes, that could be created > > legitimately by the CREATE INDEX CONCURRENTLY command. > > I believe this is a bug in amcheck's btree checking functions. Peter, can you take a look? Why do you say that? verify_nbtree.c will throw an error when called with an invalid index -- which is what we actually see here. Obviously that is the intended behavior, and always has been. This hasn't been a problem before now, probably because the sample verification query in the docs (under bt_index_check()) accounts for this directly. Why shouldn't we expect pg_amcheck to do the same thing, at the SQL level? It's practically the same thing as the temp table issue. Indeed, verify_nbtree.c will throw an error on a temp table (at least if it's from another session). > I can also get this failure without pg_amcheck, going directly to the btree checking code. Having already built the tableas you prescribe: > ERROR: deadlock detected > DETAIL: Process 9555 waits for ShareLock on virtual transaction 5/11; blocked by process 9558. > Process 9558 waits for ShareLock on relation 16406 of database 16384; blocked by process 9555. > HINT: See server log for query details. > ERROR: cannot check index "t_idx" > DETAIL: Index is not valid. I think that the deadlock is just another symptom of the same problem. -- Peter Geoghegan
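The documented verification query Peter refers to is roughly of the following shape (reproduced from memory, so treat it as a sketch rather than the exact text of the docs): it restricts verification to btree indexes that are ready and valid, and skips temp relations.

SELECT bt_index_check(index => c.oid, heapallindexed => i.indisunique)
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog'
-- Don't check temp tables, which may be from another session:
AND c.relpersistence != 't'
-- Function may throw an error when this is omitted:
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC LIMIT 10;

The point at issue in this thread is whether pg_amcheck should apply the same filters when it builds its own list of btree indexes to check.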
On Mon, Oct 4, 2021 at 2:00 AM Alexander Lakhin <exclusion@gmail.com> wrote: > There is another issue, that maybe should be discussed separately (or > this thread could be renamed to "... on checking specific relations"), > but the solution could be similar to that. Thanks for the report! I wonder if verify_heapam.c does the right thing with unlogged tables when verification runs on a standby -- a brief glance at the code leaves me with the impression that it's not handled there. Note that verify_nbtree.c initially got it wrong. The issue was fixed by bugfix commit 6754fe65. Before then nbtree verification could throw a nasty low-level smgr error, just because we had an unlogged table in hot standby mode. Note that we deliberately skip indexes when this happens (we don't error out), unlike the temp buffers (actually temp table) case. This seems like the right set of behaviors. We really don't want to have to throw an "invalid object type" style error just because verification runs during recovery. Plus it just seems logical to assume that unlogged indexes/tables don't have storage when we're in hot standby mode, and so must simply have nothing for us to verify. -- Peter Geoghegan
> On Oct 4, 2021, at 10:58 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Mon, Oct 4, 2021 at 8:10 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>>> There is another issue, that maybe should be discussed separately (or
>>> this thread could be renamed to "... on checking specific relations"),
>>> but the solution could be similar to that.
>>> pg_amcheck also fails on checking invalid indexes, that could be created
>>> legitimately by the CREATE INDEX CONCURRENTLY command.
>>
>> I believe this is a bug in amcheck's btree checking functions. Peter, can you take a look?
>
> Why do you say that?

Because REINDEX CONCURRENTLY and the bt_index_parent_check() function seem to have lock upgrade hazards that are unrelated to pg_amcheck.

> This hasn't been a
> problem before now, probably because the sample verification query in
> the docs (under bt_index_check()) accounts for this directly.

It doesn't say anything about deadlocks, but yes, it mentions errors will be raised unless the caller filters out indexes that are invalid or not ready.

On to pg_amcheck's behavior....

I see no evidence in the OP's complaint that pg_amcheck is misbehaving. It launches a worker to check each relation, prints for the user's benefit any errors those checks raise, and finally returns 0 if they all pass and 2 otherwise. Since not all relations could be checked, 2 is returned. Returning 0 would be misleading, as it implies everything was checked and passed, and it can't honestly say that. The return value 2 does not mean that anything failed. It means that not all checks passed. When a 2 is returned, the user is expected to read the output and decide what, if anything, they want to do about it. In this case, the user might decide to wait until the reindex finishes and check again, or they might decide they don't care.

It is true that pg_amcheck is calling bt_index_parent_check() on an invalid index, but so what? If it chose not to do so, it would still need to print a message about the index being unavailable for checking, and it would still have to return 2. It can't return 0, and it is unhelpful to leave the user in the dark about the fact that not all indexes are in the right state for checking. So it would still print the same error message and still return 2.

I think this bug report is really a feature request. The OP appears to want an option to toggle on/off the printing of such information, perhaps with not printing it as the default.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 3:36 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> >> I believe this is a bug in amcheck's btree checking functions. Peter, can you take a look?
> >
> > Why do you say that?
>
> Because REINDEX CONCURRENTLY and the bt_index_parent_check() function seem to have lock upgrade hazards that are unrelated to pg_amcheck.

The problem with that argument is that the bt_index_parent_check() function isn't doing anything particularly special, apart from dropping the lock. That has been its behavior for many years now.

> On to pg_amcheck's behavior....
>
> I see no evidence in the OP's complaint that pg_amcheck is misbehaving. It launches a worker to check each relation, prints for the user's benefit any errors those checks raise, and finally returns 0 if they all pass and 2 otherwise. Since not all relations could be checked, 2 is returned. Returning 0 would be misleading, as it implies everything was checked and passed, and it can't honestly say that. The return value 2 does not mean that anything failed. It means that not all checks passed. When a 2 is returned, the user is expected to read the output and decide what, if anything, they want to do about it. In this case, the user might decide to wait until the reindex finishes and check again, or they might decide they don't care.
>
> It is true that pg_amcheck is calling bt_index_parent_check() on an invalid index, but so what? If it chose not to do so, it would still need to print a message about the index being unavailable for checking, and it would still have to return 2.

Why would it have to print such a message?

You seem to be presenting this as if there is some authoritative, precise, relevant definition of "the relations that pg_amcheck sees". But to me the relevant details look arbitrary at best.

> It can't return 0, and it is unhelpful to leave the user in the dark about the fact that not all indexes are in the right state for checking.

Why is that unhelpful? More to the point, *why* would this alternative behavior constitute "leaving the user in the dark"?

What about the case where the pg_class entry isn't visible to our MVCC snapshot? Why is "skipping" such a relation not just as unhelpful?

> I think this bug report is really a feature request. The OP appears to want an option to toggle on/off the printing of such information, perhaps with not printing it as the default.

And I don't understand why you think that clearly-accidental implementation details (really just bugs) should be treated as axiomatic truths about how pg_amcheck must work. Should we now "fix" pg_dump so that it matches pg_amcheck?

All of the underlying errors are cases that were clearly intended to catch user error -- every single one. But apparently pg_amcheck is incapable of error, by definition. Like HAL 9000.

--
Peter Geoghegan
> On Oct 4, 2021, at 4:10 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> And I don't understand why you think that clearly-accidental
> implementation details (really just bugs) should be treated as
> axiomatic truths about how pg_amcheck must work. Should we now "fix"
> pg_dump so that it matches pg_amcheck?
>
> All of the underlying errors are cases that were clearly intended to
> catch user error -- every single one. But apparently pg_amcheck is
> incapable of error, by definition. Like HAL 9000.

On the contrary, I got all the way finished writing a patch to have pg_amcheck do as you suggest before it dawned on me to wonder if that was the right way to go. I certainly don't assume pg_amcheck is correct by definition. I already posted a patch for the temporary tables bug upthread having never argued that it was anything other than a bug. I also wrote a patch for verify_heapam to fix the problem with unlogged tables on standbys, and was developing a test for that, when I got your email. I'm not arguing against that being a bug, either. Hopefully, I can get that properly tested and post it before too long.

I am concerned about giving the user the false impression that an index (or table) was checked when it was not. I don't see the logic in

pg_amcheck -i idx1 -i idx2 -i idx3

skipping all three indexes and then reporting success. What if the user launches the pg_amcheck command precisely because they see error messages in the logs during a long running reindex command, and are curious if the index so generated is corrupt. You can't assume the user knows the index is still being reindexed. If the last message logged was some time ago, they might assume the process has finished. So something other than a silent success is needed to let them know what is going on.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> On Oct 4, 2021, at 4:28 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> pg_amcheck -i idx1 -i idx2 -i idx3

I forgot to mention: There's a continuum between `pg_amcheck -a`, which checks everything in all databases of the cluster, and `pg_amcheck -i just_one_index`. There are any number of combinations of object names, schema names, database names, and patterns over the same, which select anything from an empty set to a huge set of things to check. I'm trying to keep the behavior the same for all of those, and that's why I'm trying to avoid having `pg_amcheck -a` silently skip indexes that are unavailable for checking while having `pg_amcheck -i just_one_index` give a report about the index. I wouldn't know where to draw the line between reporting the issue and not, and I doubt whatever line I choose will be intuitive to users.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 4:28 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> I am concerned about giving the user the false impression that an index (or table) was checked when it was not. I don't see the logic in
>
> pg_amcheck -i idx1 -i idx2 -i idx3
>
> skipping all three indexes and then reporting success.

This is the first time that anybody mentioned the -i option on the thread. I read your previous remarks as making a very broad statement, about every single issue.

Anyway, the issue with -i doesn't seem like it changes that much. Why not just behave as if there is no such "visible" index? That's the same condition, for all practical purposes. If that approach doesn't seem good enough, then the error message can be refined to make the user aware of the specific issue.

> What if the user launches the pg_amcheck command precisely because they see error messages in the logs during a long running reindex command, and are curious if the index so generated is corrupt.

I'm guessing that you meant REINDEX CONCURRENTLY. Since you're talking about the case where it has an error, the whole index build must have failed. So the user would get exactly what they'd expect -- verification of the original index, without any hindrance from the new/failed index.

(Thinks for a moment...)

Actually, I think that we'd only verify the original index, even before the error with CONCURRENTLY (though I've not checked that point myself).

--
Peter Geoghegan
> On Oct 4, 2021, at 4:45 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> I'm guessing that you meant REINDEX CONCURRENTLY.

Yes.

> Since you're talking about the case where it has an error

Sorry, I realized after hitting <send> that you might take it that way, but I mean the logs generally, not just postgres logs. If somebody runs "reindex concurrently" on all tables at midnight every night, and they see one morning in the (non-postgres) logs from the midnight hour weird error messages about their RAID controller, they may well want to check all their indexes to see if any of them were corrupted. This is a totally made-up example, but the idea that a user might want to check their indexes, tables, or both owing to errors of some kind is not far-fetched.

> , the whole
> index build must have failed. So the user would get exactly what
> they'd expect -- verification of the original index, without any
> hindrance from the new/failed index.

Right, in that case, but not if hardware errors corrupted the index, and generated logs, without happening to trip up the reindex concurrently statement itself.

> (Thinks for a moment...)
>
> Actually, I think that we'd only verify the original index, even
> before the error with CONCURRENTLY (though I've not checked that point
> myself).

To get back on track, let me say that I'm not taking the position that what pg_amcheck currently does is necessarily correct, but just that I'd like to be careful about what we change, if anything. There are three things that seem to irritate people:

1) A non-zero exit code means "not everything was checked and passed" rather than "at least one thing is definitely corrupt".
2) Skipping of indexes is reported to the user with the word 'ERROR' in the report rather than, say, 'NOTICE'.
3) Deadlocks can occur

I have resisted changing #1 on the theory that `pg_amcheck --all && ./post_all_checks_pass.sh` should only run the post_all_checks_pass.sh if indeed all checks have passed, and I'm interpreting skipping an index check as being contrary to that. But maybe that's wrong of me. I don't know. There is already sloppiness between the time that pg_amcheck resolves which database relations are matched by --all, --table, --index, etc. and the time that all the checks are started, and again between that time and when the last one is complete. Database objects could be created or dropped during those spans of time, in which case --all doesn't have quite so well defined a meaning. But the user running pg_amcheck might also *know* that they aren't running any such DDL, and therefore expect --all to actually result in everything being checked.

I find it strange that I should do anything about #2 in pg_amcheck, since it's the function in verify_nbtree that phrases the situation as an error. But I suppose I can just ignore that and have it print as a notice. I'm genuinely not trying to give you grief here -- I simply don't like that pg_amcheck is adding commentary atop what the checking functions are doing. I see a clean division between what pg_amcheck is doing and what amcheck is doing, and this feels to me to put that on the wrong side of the divide. If refusing to check the index because it is not in the requisite state is a notice, then why wouldn't verify_nbtree raise it as one and return early rather than raising an error?
I also find it strange that #3 is being attributed to pg_amcheck's choice of how to call the checking function, because I can't think of any other function where we require the SQL caller to do anything like what you are requiring here in order to prevent deadlocks, and also because the docs for the functions don't say that a deadlock is possible, merely that the function may return with an error. I was totally content to get an error back, since errors are how the verify_nbtree functions communicate everything else, and the handler for those functions is already prepared to deal with the error messages so returned. But it clearly annoys you that pg_amcheck is doing this, so I'll go forward with the patch that I already have written which does otherwise.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 5:32 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> Sorry, I realized after hitting <send> that you might take it that way, but I mean the logs generally, not just postgres logs. If somebody runs "reindex concurrently" on all tables at midnight every night, and they see one morning in the (non-postgres) logs from the midnight hour weird error messages about their RAID controller, they may well want to check all their indexes to see if any of them were corrupted.

I don't see what the point of this example is. Why is the REINDEX CONCURRENTLY index special here?

Presumably the user is using pg_amcheck with its -i option in this scenario, since you've scoped it that way. Where did they get that index name from? Presumably it's just the original familiar index name? How did an error message that's not from the Postgres logs (or something similar) contain any index name?

If the overnight rebuild completed successfully then we'll verify the newly rebuilt smgr relfilenode for the index. If it failed then we'll just verify the original. In other words, if we treat the validity of indexes as a "visibility concern", everything works out just fine.

If there is an orphaned index (because of the implementation issue with CONCURRENTLY) then it is *definitely* "corrupt" -- but not in any sense that pg_amcheck ought to concern itself with. Such an orphaned index can never actually be used by anybody. (We should fix this wart in the CONCURRENTLY implementation some day.)

> To get back on track, let me say that I'm not taking the position that what pg_amcheck currently does is necessarily correct, but just that I'd like to be careful about what we change, if anything. There are three things that seem to irritate people:
>
> 1) A non-zero exit code means "not everything was checked and passed" rather than "at least one thing is definitely corrupt".

Right.

> 2) Skipping of indexes is reported to the user with the word 'ERROR' in the report rather than, say, 'NOTICE'.

Right -- but it's also the specifics of the error. These are errors that only make sense when there was specific human error. Which is clearly not the case at all, except perhaps in the narrow -i case.

> 3) Deadlocks can occur

Right.

> I have resisted changing #1 on the theory that `pg_amcheck --all && ./post_all_checks_pass.sh` should only run the post_all_checks_pass.sh if indeed all checks have passed, and I'm interpreting skipping an index check as being contrary to that.

You're also interpreting it as "skipping". This is a subjective interpretation. Which is fair enough - I can see why you'd put it that way. But that's not how I see it.

Again, consider that pg_dump cares about the "indisready" status of indexes, for a variety of reasons. Now, the pg_dump example doesn't necessarily mean that pg_amcheck *must* do the same thing (though it certainly suggests as much). To me the important point is that we are perfectly entitled to define "the indexes that pg_amcheck can see" in whatever way seems to make most sense overall, based on practical considerations.

> But the user running pg_amcheck might also *know* that they aren't running any such DDL, and therefore expect --all to actually result in everything being checked.

The user would also have to know precisely how the system catalogs work during DDL. They'd have to know that the pg_class entry might become visible very early on, rather than at the end, in some cases. They'd know all that, but still be surprised by the current pg_amcheck behavior.
Which is itself not consistent with pg_dump.

> I find it strange that I should do anything about #2 in pg_amcheck, since it's the function in verify_nbtree that phrases the situation as an error.

I don't find it strange. It does that because it *is* an error. There is simply no alternative. The solution for amcheck is the same as it has always been: just write the SQL query in a way that avoids it entirely.

> I'm genuinely not trying to give you grief here -- I simply don't like that pg_amcheck is adding commentary atop what the checking functions are doing.

pg_amcheck would not be adding commentary if this was addressed in the way that I have in mind. It would merely be dealing with the issue in the way that the amcheck docs have recommended, for years. The problem here (as I see it) is that pg_amcheck is already adding commentary, by not just doing that.

> I also find it strange that #3 is being attributed to pg_amcheck's choice of how to call the checking function, because I can't think of any other function where we require the SQL caller to do anything like what you are requiring here in order to prevent deadlocks, and also because the docs for the functions don't say that a deadlock is possible, merely that the function may return with an error.

I will need to study the deadlock issue separately.

--
Peter Geoghegan
> On Oct 4, 2021, at 6:19 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> I don't see what the point of this example is.

It doesn't matter.

I am changing pg_amcheck to filter out indexes as you say. Since the btree check should no longer error in these cases, the issue of pg_amcheck exit(2) sorts itself out without further code changes.

I am changing verify_heapam to skip unlogged tables during recovery. In testing, checking such a table results in a simple notice:

NOTICE: cannot verify unlogged relation "u_tbl" during recovery, skipping

While testing, I also created an index on the unlogged table and checked that index using bt_index_parent_check, and was surprised that checking it using bt_index_parent_check raises an error:

ERROR: cannot acquire lock mode ShareLock on database objects while recovery is in progress
HINT: Only RowExclusiveLock or less can be acquired on database objects during recovery.

It doesn't get as far as btree_index_mainfork_expected(). So I am changing pg_amcheck to filter out indexes when pg_is_in_recovery() is true and relpersistence='u'. Does that sound right to you?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 8:19 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> I am changing pg_amcheck to filter out indexes as you say. Since the btree check should no longer error in these cases, the issue of pg_amcheck exit(2) sorts itself out without further code changes.

Cool.

> I am changing verify_heapam to skip unlogged tables during recovery. In testing, checking such a table results in a simple notice:
>
> NOTICE: cannot verify unlogged relation "u_tbl" during recovery, skipping

That makes sense to me.

> While testing, I also created an index on the unlogged table and checked that index using bt_index_parent_check, and was surprised that checking it using bt_index_parent_check raises an error:
>
> ERROR: cannot acquire lock mode ShareLock on database objects while recovery is in progress
> HINT: Only RowExclusiveLock or less can be acquired on database objects during recovery.

Calling bt_index_parent_check() in hot standby mode is kind of asking for it to error-out. It requires a ShareLock on the relation, which is inherently not possible during recovery. So I don't feel too badly about letting it just happen.

> So I am changing pg_amcheck to filter out indexes when pg_is_in_recovery() is true and relpersistence='u'. Does that sound right to you?

Yes, that all sounds right to me. Thanks

--
Peter Geoghegan
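To illustrate the locking distinction Peter is drawing (the index name below is made up): bt_index_check() takes only an AccessShareLock, which a hot standby can grant, whereas bt_index_parent_check() needs a ShareLock, which it cannot, hence the error quoted above.

-- On a hot standby, i.e. where pg_is_in_recovery() returns true:
SELECT bt_index_check('my_btree_index'::regclass);         -- AccessShareLock: permitted during recovery
SELECT bt_index_parent_check('my_btree_index'::regclass);  -- ShareLock: fails with the error shown above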
On Mon, Oct 4, 2021 at 7:10 PM Peter Geoghegan <pg@bowt.ie> wrote: > All of the underlying errors are cases that were clearly intended to > catch user error -- every single one. But apparently pg_amcheck is > incapable of error, by definition. Like HAL 9000. After some thought, I agree with the idea that pg_amcheck ought to skip relations that can't be expected to be valid -- which includes both unlogged relations while in recovery, and also invalid indexes left behind by failed index builds. Otherwise it can only find non-problems, which we don't want to do. But this comment seems like mockery to me, and I don't like that. -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, Oct 5, 2021 at 9:41 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Oct 4, 2021 at 7:10 PM Peter Geoghegan <pg@bowt.ie> wrote: > > All of the underlying errors are cases that were clearly intended to > > catch user error -- every single one. But apparently pg_amcheck is > > incapable of error, by definition. Like HAL 9000. > But this comment seems like mockery to me, and I don't like that. It was certainly not a constructive way of getting my point across. I apologize to Mark. -- Peter Geoghegan
> On Oct 5, 2021, at 9:58 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> I apologize to Mark.

I took no offense. Actually, I owe you a thank-you for having put so much effort into debating the behavior with me. I think the patch to fix bug #17212 will be better for it.

(And thanks to Robert for the concern.)

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Oct 5, 2021 at 10:03 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > I took no offense. Actually, I owe you a thank-you for having put so much effort into debating the behavior with me. I think the patch to fix bug #17212 will be better for it. Glad that you think so. And, thanks for working on the issue so promptly. This was a question of fundamental definitions. Those are often very tricky. -- Peter Geoghegan
Hello Mark, Peter, Robert,

05.10.2021 20:22, Peter Geoghegan wrote:
> On Tue, Oct 5, 2021 at 10:03 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>> I took no offense. Actually, I owe you a thank-you for having put so much effort into debating the behavior with me. I think the patch to fix bug #17212 will be better for it.
> Glad that you think so. And, thanks for working on the issue so promptly.
>
> This was a question of fundamental definitions. Those are often very tricky.

Thanks for the discussion and fixing the issues! (I haven't found the latest fix in the thread yet, but I agree with the approach.)

I think that ideally pg_amcheck should not fail on a live database that does not contain corrupted data, and should not affect the database usage by other users (as it's "only a check"). So, for example, pg_amcheck should run successfully in parallel with `make installcheck` and should not cause any of the tests to fail. (There could be nuances with, say, volatile functions called by the index expressions, but in general it should be possible.)

I tried to run the following script:

(for i in `seq 100`; do echo "=== iteration $i ===" >>pg_amcheck.log; pg_amcheck -a --install-missing --heapallindexed --rootdescend --progress >>pg_amcheck.log 2>&1 || echo "FAIL" >>pg_amcheck.log; done) &
make installcheck

And got several deadlocks again (manifested by some tests failing) and also "ERROR: could not open relation with OID xxxx" - that could be considered a transient state (it is possible without locking), but it causes pg_amcheck to report an overall error.

Best regards,
Alexander
On Tue, Oct 5, 2021 at 11:00 PM Alexander Lakhin <exclusion@gmail.com> wrote: > I think that ideally pg_amcheck should not fail on a live database, that > does not contain corrupted data, and should not affect the database > usage by other users (as it's "only a check"). I agree that that's ideal. As you said, one or two narrow exceptions may need to be made -- cases where there is unavoidable though weird ambiguity (and not a report of true corruption). Overall the user should never see failure from pg_amcheck unless the database is corrupt, or unless things are defined in a pretty odd way, that creates ambiguity. Ordinary DDL certainly doesn't count as unusual here. -- Peter Geoghegan
Hi, hackers!
We've looked through the initial patch, and excluding temporary tables from pg_amcheck seems like the right thing. It is also not something anyone here disagrees with, so we propose committing it alone.
Supplementary things/features might be left for further discussion but refusing to check temporary tables is the only option IMO.
The patch applies cleanly, tests succeed. I'd propose to set it as RFC.
--
> On Oct 6, 2021, at 8:14 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> We've looked through the initial patch, and excluding temporary tables from pg_amcheck seems like the right thing. It is also not something anyone here disagrees with, so we propose committing it alone.

Thanks for reviewing!

I expect to post a new version shortly.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Oct 6, 2021 at 9:25 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > Thanks for reviewing! > > I expect to post a new version shortly. Not sure how much it matters, but I have some thoughts on the return value of pg_amcheck. (I'm mostly going into this now because it seems related to how we discuss these issues generally.) A return value of 0 cannot be said to indicate that the database is not corrupt; strictly speaking the verification process doesn't actually verify anything. The null hypothesis is that the database isn't corrupt. pg_amcheck looks for disconfirmatory evidence (evidence of corruption) on a best-effort basis. This seems fundamental. If this philosophy of science stuff seems too abstract, then I can be more concrete: pg_amcheck doesn't even attempt to verify indexes that aren't B-Tree indexes. Clearly we cannot be sure that the database contains no corruption when there happens to be even one such index. And yet the return value from pg_amcheck is still 0 (barring problems elsewhere). I think that it'll always be possible to make *some* argument like that, even in a world where pg_amcheck + amcheck are very feature complete. As I said, it seems fundamental. -- Peter Geoghegan
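One way to see how much falls outside that coverage is to count indexes per access method (a simple illustrative query; per the point above, only the 'btree' rows have any amcheck verification support):

SELECT am.amname, count(*) AS index_count
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE c.relkind = 'i'
GROUP BY am.amname
ORDER BY am.amname;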
> On Oct 6, 2021, at 10:16 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> A return value of 0 cannot be said to indicate that the database is
> not corrupt;

Nor can a non-zero value be said to indicate that the database is corrupt.

These invocations will still return a non-zero exit status:

pg_amcheck -D no_privs_database
pg_amcheck --index="no_such_index"
pg_amcheck --table="somebody_elses_temp_table"
pg_amcheck --index="somebody_elses_temp_index"

but these have been modified to no longer do so:

pg_amcheck -D my_database_in_recovery --parent-check
pg_amcheck -D my_database_in_recovery --heapallindexed
pg_amcheck --all

Please compare to:

find /private || echo "FAIL"
rm /not/my/file || echo "FAIL"

I'm not sure how the idea that pg_amcheck should never give back a failure code unless there is corruption got inserted into this thread, but I'm not on board with that as an invariant statement.

The differences in the upcoming version are

1) --all no longer means "all relations" but rather "all checkable relations"
2) checking options should be automatically downgraded under circumstances where they cannot be applied
3) unlogged relations during replication are by definition not corrupt

I think #1 and #3 are unsurprising enough that they need no documentation update. #2 is slightly surprising (at least to me) so I updated the docs for it.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Oct 6, 2021 at 10:19 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> > A return value of 0 cannot be said to indicate that the database is
> > not corrupt;
>
> Nor can a non-zero value be said to indicate that the database is corrupt.

I never said otherwise. I think it's perfectly fine that there are multiple non-zero return values. It's totally unrelated.

> I'm not sure how the idea that pg_amcheck should never give back a failure code unless there is corruption got inserted into this thread, but I'm not on board with that as an invariant statement.

I agree; I'm also not on board with it as an invariant statement.

> The differences in the upcoming version are
>
> 1) --all no longer means "all relations" but rather "all checkable relations"

Clearly pg_amcheck never checked all relations, because it never checked indexes that are not B-Tree indexes. I'm pretty sure that I can poke big holes in almost any positivist statement like that with little effort.

> 2) checking options should be automatically downgraded under circumstances where they cannot be applied
> 3) unlogged relations during replication are by definition not corrupt
>
> I think #1 and #3 are unsurprising enough that they need no documentation update. #2 is slightly surprising (at least to me) so I updated the docs for it.

To me #2 sounds like a tautology. It could almost be phrased as "pg_amcheck does not check the things that it cannot check".

--
Peter Geoghegan
> On Oct 6, 2021, at 10:39 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>
>> The differences in the upcoming version are
>>
>> 1) --all no longer means "all relations" but rather "all checkable relations"
>
> Clearly pg_amcheck never checked all relations, because it never
> checked indexes that are not B-Tree indexes. I'm pretty sure that I
> can poke big holes in almost any positivist statement like that with
> little effort.

There is a distinction here that you are (intentionally?) failing to acknowledge. On the one hand, there are relation types that cannot be checked because no checking functions for them exist. (Hash, gin, gist, etc.) On the other hand, there are relations which could be checked but for the current state of the system, or could be checked in some particular way but for the current state of the system. One of those has to do with code that doesn't exist, and the other has to do with the state of the system. I'm only talking about the second.

>> 2) checking options should be automatically downgraded under circumstances where they cannot be applied
>> 3) unlogged relations during replication are by definition not corrupt
>>
>> I think #1 and #3 are unsurprising enough that they need no documentation update. #2 is slightly surprising (at least to me) so I updated the docs for it.
>
> To me #2 sounds like a tautology. It could almost be phrased as
> "pg_amcheck does not check the things that it cannot check".

I totally disagree. It is uncomfortable to me that `pg_amcheck --parent-check` will now silently not perform the parent check that was explicitly requested. That reported an error before, and now it just downgrades the check. This is hardly tautological. I'm only willing to post a patch with that change because I can see a practical argument that somebody might run that as a cron job and they don't want the cron job failing when the database happens to go into recovery. But again, not at all tautological.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Oct 6, 2021 at 10:57 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > Clearly pg_amcheck never checked all relations, because it never > > checked indexes that are not B-Tree indexes. I'm pretty sure that I > > can poke big holes in almost any positivist statement like that with > > little effort. > > There is a distinction here that you are (intentionally?) failing to acknowledge. On the one hand, there are relation typesthat cannot be checked because no checking functions for them exist. (Hash, gin, gist, etc.) On the other hand, thereare relations which could be check but for the current state of the system, or could be checked in some particular waybut for the current state of the system. One of those has to do with code that doesn't exist, and the other has to dowith the state of the system. I'm only talking about the second. I specifically acknowledge and reject that distinction. That's my whole point. Your words were: '--all no longer means "all relations" but rather "all checkable relations"'. But somehow the original clean definition of "--all" was made no less clean by not including GiST indexes and so on from the start. You're asking me to believe that it was really implied all along that "all checkable relations" didn't include the relations that obviously weren't checkable. You're probably going to have to keep making post-hoc amendments to your original statement like this. Obviously the gap in functionality from non-standard index AMs is far more important than the totally theoretical issue with failed CONCURRENTLY indexes. But even if they were equally important, your emphasis on the latter would still be arbitrary. > I totally disagree. It is uncomfortable to me that `pg_amcheck --parent-check` will now silently not perform the parentcheck that was explicitly requested. But the whole definition of "check that was explicitly requested" relies on your existing understanding of what pg_amcheck is supposed to do. That's not actually essential. I don't see it that way, for example. -- Peter Geoghegan
On Wed, Oct 6, 2021 at 1:57 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> > To me #2 sounds like a tautology. It could almost be phrased as
> > "pg_amcheck does not check the things that it cannot check".
>
> I totally disagree. It is uncomfortable to me that `pg_amcheck --parent-check` will now silently not perform the parent check that was explicitly requested. That reported an error before, and now it just downgrades the check. This is hardly tautological. I'm only willing to post a patch with that change because I can see a practical argument that somebody might run that as a cron job and they don't want the cron job failing when the database happens to go into recovery. But again, not at all tautological.

Yeah, I don't think that's OK. -1 from me on making any such change. If I say pg_amcheck --heapallindexed, I expect it to pass heapallindexed = true to bt_index_check(). I don't expect it to make a decision internally whether I really meant it when I said I wanted --heapallindexed checking.

All of the decisions we're talking about here really have to do with determining the user's intent. I think that if the user says pg_amcheck --all, there's a good argument that they don't want us to check unlogged relations on a standby which will never be valid, or failed index builds which need not be valid. But even that is not necessarily true. If the user typed pg_amcheck -i some_index_that_failed_to_build, there is a pretty strong argument that they want us to check that index and maybe fail, not skip checking that index and report success without doing anything. I think it's reasonable to accept that unfortunate deviation from the user's intent in order to get the benefit of not failing for silly reasons when, as will normally be the case, somebody just tries to check the entire database, or some subset of tables and their corresponding indexes. In those cases the user pretty clearly only wants to check the valid things. So I agree, with some reservations, that excluding unlogged relations while in recovery and invalid indexes is probably the thing which is most likely to give the users what they want.

But how can we possibly say that a user who specifies --heapallindexed doesn't really mean what they said?

--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Oct 6, 2021 at 11:32 AM Robert Haas <robertmhaas@gmail.com> wrote: > All of the decisions we're talking about here really have to do with > determining the user's intent. I think that if the user says > pg_amcheck --all, there's a good argument that they don't want us to > check unlogged relations on a standby which will never be valid, or > failed index builds which need not be valid. But even that is not > necessarily true. If the user typed pg_amcheck -i > some_index_that_failed_to_build, there is a pretty strong argument > that they want us to check that index and maybe fail, not skip > checking that index and report success without doing anything. I think > it's reasonable to accept that unfortunate deviation from the user's > intent in order to get the benefit of not failing for silly reasons > when, as will normally be the case, somebody just tries to check the > entire database, or some subset of tables and their corresponding > indexes. In those cases the user pretty clearly only wants to check > the valid things. So I agree, with some reservations, that excluding > unlogged relations while in recovery and invalid indexes is probably > the thing which is most likely to give the users what they want. > > But how can we possibly say that a user who specifies --heapallindexed > doesn't really mean what they said? I am pretty sure that I agree with you about all these details. We need to tease them apart some more. --heapallindexed doesn't complicate things for us at all. It changes nothing about the locking considerations. It's just an additive thing, some extra checks with the same basic underlying requirements. Maybe you meant to say --parent-check, not --heapallindexed? --parent-check does present us with the question of what to do in Hot Standby mode, where it will surely fail (because it requires a relation level ShareLock, etc). But I actually don't think it's complicated: we must throw an error, because it's fundamentally not something that will ever work (with any index). Whether the error comes from pg_amcheck or amcheck proper doesn't seem important to me. I think it's pretty clear that verify_heapam.c (from amcheck proper) should just follow verify_nbtree.c when directly invoked against an unlogged index in Hot Standby. That is, it should assume that the relation has no storage, but still "verify" it conceptually. Just show a NOTICE about it. Assume no storage to verify. Finally, there is the question of what happens inside pg_amcheck (not amcheck proper) deals with unlogged relations in Hot Standby mode. There are two reasonable options: it can either "verify" the indexes (actually just show those NOTICE messages), or skip them entirely. I lean towards the former option, on the grounds that I don't think it should be special-cased. But I don't feel very strongly about it. -- Peter Geoghegan
On Wed, Oct 6, 2021 at 11:55 AM Peter Geoghegan <pg@bowt.ie> wrote: > I am pretty sure that I agree with you about all these details. We > need to tease them apart some more. I think that what I've said boils down to this: * pg_amcheck shouldn't attempt to verify temp relations, on the grounds that this is fundamentally not useful, and not something that could ever be sensibly interpreted as "just doing what the user asked for". * pg_amcheck calls to bt_index_check()/bt_index_parent_check() must only be made with "i.indisready AND i.indisvalid" indexes, just like the old query from the docs. (Actually, the same query also filters out temp relations -- which is why I view this issue as almost identical to the first.) Why would the user ask for something that fundamentally doesn't make any sense? The argument "that's just what they asked for" has it backwards, because *not* asking for it is very difficult, while asking for it (which, remember, fundamentally makes no sense) is very easy. * --parent-check can and should fail in hot standby mode. The argument "that's just what the user asked for" works perfectly here. -- Peter Geoghegan
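For concreteness, the selection rule described in the second bullet (only "i.indisready AND i.indisvalid" indexes, with temp relations filtered out as well) amounts to something like the following sketch; the actual query pg_amcheck builds is more elaborate, so this is illustrative only:

SELECT c.oid::regclass AS index
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relpersistence <> 't'         -- skip temp relations
  AND i.indisready AND i.indisvalid;  -- skip invalid / not-ready indexes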
On Wed, Oct 6, 2021 at 2:56 PM Peter Geoghegan <pg@bowt.ie> wrote: > --heapallindexed doesn't complicate things for us at all. It changes > nothing about the locking considerations. It's just an additive thing, > some extra checks with the same basic underlying requirements. Maybe > you meant to say --parent-check, not --heapallindexed? To me, it doesn't matter which specific option we're talking about. If I tell pg_amcheck to pass a certain flag to the underlying functions, then it should do that. If the behavior needs to be changed, it should be changed in those underlying functions, not in pg_amcheck. If we start putting some of the intelligence into amcheck itself, and some of it into pg_amcheck, I think it's going to become confusing and in fact I think it's going to become unreliable, at least from the user point of view. People will get confused if they run pg_amcheck and get some result (either pass or fail) and then they do the same thing with pg_amcheck and get a different result. > --parent-check does present us with the question of what to do in Hot > Standby mode, where it will surely fail (because it requires a > relation level ShareLock, etc). But I actually don't think it's > complicated: we must throw an error, because it's fundamentally not > something that will ever work (with any index). Whether the error > comes from pg_amcheck or amcheck proper doesn't seem important to me. That detail, to me, is actually very important. > I think it's pretty clear that verify_heapam.c (from amcheck proper) > should just follow verify_nbtree.c when directly invoked against an > unlogged index in Hot Standby. That is, it should assume that the > relation has no storage, but still "verify" it conceptually. Just show > a NOTICE about it. Assume no storage to verify. I haven't checked the code, but that sounds right. I interpret this to mean that the different sub-parts of amcheck don't handle this case in ways that are consistent with each other, and that seems wrong. We should make them consistent. > Finally, there is the question of what happens inside pg_amcheck (not > amcheck proper) deals with unlogged relations in Hot Standby mode. > There are two reasonable options: it can either "verify" the indexes > (actually just show those NOTICE messages), or skip them entirely. I > lean towards the former option, on the grounds that I don't think it > should be special-cased. But I don't feel very strongly about it. I like having it do this: ereport(NOTICE, (errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION), errmsg("cannot verify unlogged index \"%s\" during recovery, skipping", RelationGetRelationName(rel)))); I think the fewer decisions the command-line tool makes, the better. We should put the policy decisions in amcheck itself. -- Robert Haas EDB: http://www.enterprisedb.com
> On Oct 6, 2021, at 12:28 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> I think that what I've said boils down to this:
>
> * pg_amcheck shouldn't attempt to verify temp relations, on the
> grounds that this is fundamentally not useful, and not something that
> could ever be sensibly interpreted as "just doing what the user asked
> for".

Right. I don't think there has been any disagreement on this. There is a bug in pg_amcheck with respect to this issue, and we all agree on that.

> * pg_amcheck calls to bt_index_check()/bt_index_parent_check() must
> only be made with "i.indisready AND i.indisvalid" indexes, just like
> the old query from the docs. (Actually, the same query also filters
> out temp relations -- which is why I view this issue as almost
> identical to the first.)
>
> Why would the user ask for something that fundamentally doesn't make
> any sense?

The user may not know that the system has changed.

For example, if I see errors in the logs suggesting corruption in a relation named "mark" and run pg_amcheck --relation=mark, I expect that to check the relation. If that relation is a temporary table, I'd like to know that it's not going to be checked, not just have pg_amcheck report that everything is ok.

As another example, if I change my environment variables to connect to the standby rather than the primary, and forget that I did so, and then run pg_amcheck --relation=unlogged_relation, I'd rather get a complaint that I can't check an unlogged relation on a standby than get nothing. Sure, what I did doesn't make sense, but why should the application paper over that mistake?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Oct 6, 2021 at 12:33 PM Robert Haas <robertmhaas@gmail.com> wrote: > To me, it doesn't matter which specific option we're talking about. If > I tell pg_amcheck to pass a certain flag to the underlying functions, > then it should do that. If the behavior needs to be changed, it should > be changed in those underlying functions, not in pg_amcheck. I agree, with the stipulation that the caller (in this case pg_amcheck) is required to know certain basic things about the relation in order to get useful behavior. For example, if you use bt_index_check() with a GIN index, you're going to get an error. That much we can all agree on, I'm sure. Where I might go further than you or Mark (not sure) is on this: I also think that it's the caller's job to not call the functions with temp relations, or (in the case of the index verification stuff) with !indisready or !indisvalid relations. I believe that these ought to also be treated as basic questions about the relation, just like in my GIN example. But that's as far as I go here. > If we > start putting some of the intelligence into amcheck itself, and some > of it into pg_amcheck, I think it's going to become confusing and in > fact I think it's going to become unreliable, at least from the user > point of view. People will get confused if they run pg_amcheck and get > some result (either pass or fail) and then they do the same thing with > pg_amcheck and get a different result. Agreed on all that. > > --parent-check does present us with the question of what to do in Hot > > Standby mode, where it will surely fail (because it requires a > > relation level ShareLock, etc). But I actually don't think it's > > complicated: we must throw an error, because it's fundamentally not > > something that will ever work (with any index). Whether the error > > comes from pg_amcheck or amcheck proper doesn't seem important to me. > > That detail, to me, is actually very important. I believe that you actually reached the same conclusion, though: we should let it just fail. That makes this question easy. > > I think it's pretty clear that verify_heapam.c (from amcheck proper) > > should just follow verify_nbtree.c when directly invoked against an > > unlogged index in Hot Standby. That is, it should assume that the > > relation has no storage, but still "verify" it conceptually. Just show > > a NOTICE about it. Assume no storage to verify. > > I haven't checked the code, but that sounds right. I interpret this to > mean that the different sub-parts of amcheck don't handle this case in > ways that are consistent with each other, and that seems wrong. We > should make them consistent. I agree that nbtree and heapam verification ought to agree here. But my point was just that this behavior just makes sense: what we have is something just like an empty relation. > > Finally, there is the question of what happens inside pg_amcheck (not > > amcheck proper) deals with unlogged relations in Hot Standby mode. > > There are two reasonable options: it can either "verify" the indexes > > (actually just show those NOTICE messages), or skip them entirely. I > > lean towards the former option, on the grounds that I don't think it > > should be special-cased. But I don't feel very strongly about it. > > I like having it do this: > > ereport(NOTICE, > (errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION), > errmsg("cannot verify unlogged index \"%s\" > during recovery, skipping", > RelationGetRelationName(rel)))); > > I think the fewer decisions the command-line tool makes, the better. 
> We should put the policy decisions in amcheck itself. Wait, so you're arguing that we should change amcheck (both nbtree and heapam verification) to simply reject unlogged indexes during recovery? That doesn't seem like very friendly or self-consistent behavior. At first (in hot standby) it fails. As soon as the DB is promoted, we'll then also have no on-disk storage for the same unlogged relation, but now suddenly it's okay, just because of that. I find it far more logical to just assume that there is no relfilenode storage to check when in hot standby. This isn't the same as the --parent-check thing at all, because that's about an implementation restriction of Hot Standby. Whereas this is about the physical index structure itself. -- Peter Geoghegan
On Wed, Oct 6, 2021 at 12:36 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > The user may not know that the system has changed. > > For example, if I see errors in the logs suggesting corruption in a relation named "mark" and run pg_amcheck --relation=mark, I expect that to check the relation. If that relation is a temporary table, I'd like to know that it's not going to be checked, not just have pg_amcheck report that everything is ok. This is just a detail to me. I agree that it's reasonable to say "I can't do that specific thing you asked for with the temp relation", instead of "no such verifiable relation" -- but only because it's more specific and user friendly. Providing a slightly friendlier error message like this does not actually conflict with the idea of generally treating temp relations as "not visible to pg_amcheck". Ditto for the similar !indisready/!i.indisvalid B-Tree case. > As another example, if I change my environment variables to connect to the standby rather than the primary, and forget that I did so, and then run pg_amcheck --relation=unlogged_relation, I'd rather get a complaint that I can't check an unlogged relation on a standby than get nothing. Sure, what I did doesn't make sense, but why should the application paper over that mistake? I think that it shouldn't get an error at all -- this should be treated like an empty relation, per the verify_nbtree.c precedent. pg_amcheck doesn't need to concern itself with this at all. -- Peter Geoghegan
On Wed, Oct 6, 2021 at 3:56 PM Peter Geoghegan <pg@bowt.ie> wrote: > I agree, with the stipulation that the caller (in this case > pg_amcheck) is required to know certain basic things about the > relation in order to get useful behavior. For example, if you use > bt_index_check() with a GIN index, you're going to get an error. That > much we can all agree on, I'm sure. Yes. > Where I might go further than you or Mark (not sure) is on this: I > also think that it's the caller's job to not call the functions with > temp relations, or (in the case of the index verification stuff) with > !indisready or !indisvalid relations. I believe that these ought to > also be treated as basic questions about the relation, just like in my > GIN example. But that's as far as I go here. I am on board with this, with slight trepidation. > > > --parent-check does present us with the question of what to do in Hot > > > Standby mode, where it will surely fail (because it requires a > > > relation level ShareLock, etc). But I actually don't think it's > > > complicated: we must throw an error, because it's fundamentally not > > > something that will ever work (with any index). Whether the error > > > comes from pg_amcheck or amcheck proper doesn't seem important to me. > > > > That detail, to me, is actually very important. > > I believe that you actually reached the same conclusion, though: we > should let it just fail. That makes this question easy. Great. > > > I think it's pretty clear that verify_heapam.c (from amcheck proper) > > > should just follow verify_nbtree.c when directly invoked against an > > > unlogged index in Hot Standby. That is, it should assume that the > > > relation has no storage, but still "verify" it conceptually. Just show > > > a NOTICE about it. Assume no storage to verify. > > > > I haven't checked the code, but that sounds right. I interpret this to > > mean that the different sub-parts of amcheck don't handle this case in > > ways that are consistent with each other, and that seems wrong. We > > should make them consistent. > > I agree that nbtree and heapam verification ought to agree here. But > my point was just that this behavior just makes sense: what we have is > something just like an empty relation. I am not confident that this behavior is optimal. It's pretty arbitrary. It's like saying "well, you asked me to check that everyone in the car was wearing seatbelts, and the car has no seatbelts, so we're good!" To which I respond: maybe. Were we trying to verify that people are complying with safety regulations as well as may be possible under the circumstances, or that people are actually safe? The analogy here is: are we trying to verify that the relations are valid? Or are we just trying to verify that they are as valid as we can expect them to be? For me, the deciding point is that verify_nbtree.c was here first, and it set a precedent. Unless there is a compelling reason to do otherwise, we should make later things conform to that precedent. Whether that's actually best, I'm not certain. It might be, but I'm not sure that it is. > > > Finally, there is the question of what happens when pg_amcheck (not > > > amcheck proper) deals with unlogged relations in Hot Standby mode. > > > There are two reasonable options: it can either "verify" the indexes > > > (actually just show those NOTICE messages), or skip them entirely. I > > > lean towards the former option, on the grounds that I don't think it > > > should be special-cased. But I don't feel very strongly about it. 
> > > > I like having it do this: > > > > ereport(NOTICE, > > (errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION), > > errmsg("cannot verify unlogged index \"%s\" > > during recovery, skipping", > > RelationGetRelationName(rel)))); > > > > I think the fewer decisions the command-line tool makes, the better. > > We should put the policy decisions in amcheck itself. > > Wait, so you're arguing that we should change amcheck (both nbtree and > heapam verification) to simply reject unlogged indexes during > recovery? No, that's existing code from btree_index_mainfork_expected. I thought you were saying that verify_heapam.c should adopt the same approach, and I was agreeing, not because I think it's necessarily the perfect approach, but for consistency. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Oct 6, 2021 at 1:15 PM Robert Haas <robertmhaas@gmail.com> wrote: > > Where I might go further than you or Mark (not sure) is on this: I > > also think that it's the caller's job to not call the functions with > > temp relations, or (in the case of the index verification stuff) with > > !indisready or !indisvalid relations. I believe that these ought to > > also be treated as basic questions about the relation, just like in my > > GIN example. But that's as far as I go here. > > I am on board with this, with slight trepidation. It may not be a great design, or even a good one. My argument is just that it's the least worst design overall. It is the most consistent with the general design of the system, for reasons that are pretty deeply baked into the system. I'm reminded of the fact that REINDEX CONCURRENTLY's completion became blocked due to similar trepidations. Understandably so. > > I agree that nbtree and heapam verification ought to agree here. But > > my point was just that this behavior just makes sense: what we have is > > something just like an empty relation. > > I am not confident that this behavior is optimal. It's pretty > arbitrary. It's like saying "well, you asked me to check that everyone > in the car was wearing seatbelts, and the car has no seatbelts, so > we're good!" I prefer to think of it as "there is nobody in the car, so we're all good!". > The analogy here is: are we trying to verify that the relations are > valid? Or are we just trying to verify that they are as valid as we > can expect them to be? I think that we do the latter (or something much closer to the latter than to the former). It's actually a very Karl Popper thing. Absence of evidence isn't evidence of absence -- period. We can get into a conversation about degrees of confidence, but that doesn't seem like it'll ever affect how we go about designing these things. A lot of my disagreements around this stuff (especially with Mark) seem to stem from this basic understanding of things, in one way or another. > No, that's existing code from btree_index_mainfork_expected. I thought > you were saying that verify_heapam.c should adopt the same approach, > and I was agreeing, not because I think it's necessarily the perfect > approach, but for consistency. Sorry, I somehow read that code as having an ERROR, not a NOTICE. (Even though I wrote the code myself.) -- Peter Geoghegan
> It is the most consistent with the general design of the system, for
> reasons that are pretty deeply baked into the system. I'm reminded of
> the fact that REINDEX CONCURRENTLY's completion became blocked due to
> similar trepidations. Understandably so.
I may be mistaken, but I recall the fact that all index builds started during some other (long) index build do not finish with indexes usable for selects until that long index is built. This may or may not be a source of amcheck misbehavior. Just a note on what could possibly be considered.
Best regards,
Pavel Borisov
> On Oct 6, 2021, at 1:49 PM, Peter Geoghegan <pg@bowt.ie> wrote: > >> The analogy here is: are we trying to verify that the relations are >> valid? Or are we just trying to verify that they are as valid as we >> can expect them to be? > > I think that we do the latter (or something much closer to the latter > than to the former). It's actually a very Karl Popper thing. Absence > of evidence isn't evidence of absence -- period. We can get into a > conversation about degrees of confidence, but that doesn't seem like > it'll ever affect how we go about designing these things. > > A lot of my disagreements around this stuff (especially with Mark) > seem to stem from this basic understanding of things, in one way or > another. I think the disagreements are about something else. Talking about pg_amcheck "checking" a database, or "checking" a relation, is actually short-hand for saying that pg_amcheck handed off the objects to amcheck's functions. The pg_amcheck client application itself isn't checking anything. This short-hand leads to misunderstandings that make it really hard for me to understand what people mean in this thread. Your comments suggest that I (or pg_amcheck) take some view on whether the database is corrupt, or whether we've proven that it is corrupt, or whether we've proven that it is not corrupt. In truth, all the pg_amcheck frontend client can take a view on is whether it was able to issue all the commands to the backend that it was asked to issue, and whether any of those commands responded with an error. Talking about pg_amcheck "failing" is also confusing. I don't understand what people mean by this. The example towards the top of this thread from Alexander was about pg_amcheck || echo "fail", but that suggests that failure is just a question of whether pg_amcheck exited with non-zero exit code. In other parts of the thread, talking about pg_amcheck "failing" seems to be used to mean "pg_amcheck has diagnosed corruption". This all gets muddled together. Upthread, I decided to just make the changes to pg_amcheck that you seemed to want, but now I don't know what you want. Can you opine on each of the following? I need to know what they should print, and whether they should return with a non-zero exit status. I genuinely can't post a patch until I know what these are supposed to do, because I need to update the regression tests accordingly: pg_amcheck -d db1 -d db2 -d db3 --table=mytable In this case, mytable is a regular table on db1, a temporary table on db2, and an unlogged table on db3, and db3 is in recovery. pg_amcheck --all --index="*accounting*" --parent-check --table="*human_resources*" --table="*peter*" --relation="*alexander*" Assume a multitude of databases, some primary, some standby, some indexes logged, some unlogged, some temporary. Some of the human resources tables are unlogged, some not, and they're scattered across different databases, some in recovery, some not. There is exactly one table per database that matches the pattern /*peter*/, but its circumstances are different from one database to the next, and likewise for the pattern /*alexander*/ except that in some databases it matches an index and in others it matches a table. I thought that we were headed toward a decision where (despite my discomfort) pg_amcheck would downgrade options as necessary, but now it sounds like that's not so. So what should it do? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On Oct 6, 2021, at 2:45 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > and db3 is in recovery. <snip> > they're scattered across different databases, some in recovery, some not. What I mean here is that, since pg_amcheck might run for many hours, and databases may start in recovery but then exit recovery, or may be restarted and go into recovery while we're not connected to them, the tool may see differences when processing a pattern against one database at one point in time and the same or different patterns against the same or different databases at some other point in time. We don't get the luxury of assuming that nothing changes out from under us. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Oct 6, 2021 at 2:45 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > I think the disagreements are about something else. Informally speaking, you could say that pg_amcheck and amcheck verify relations. More formally speaking, amcheck (whether called by pg_amcheck or some other thing) can only prove the presence of corruption. It cannot prove its absence. (The amcheck docs have always said almost these exact words.) This seems to come up a lot because at various points you seem to be concerned about introducing specific imperfections. But it's not like your starting point was ever perfection, or ever could be. I can always describe a scenario in which amcheck misses real corruption -- a scenario which may be very contrived. So the mere fact that some new theoretical possibility of corruption is introduced by some action does not in itself mean much. We're dealing with that constantly, and always will be. Let's suppose I was to "directly fix amcheck + !indisvalid indexes". I don't even know what that means -- I honestly don't have a clue. You're focussing on one small piece of code in verify_nbtree.c, that seems to punt responsibility, but the fact is that there are deeply baked-in reasons why it does so. That's a reflection of how many things about the system work, in general. Attributing blame to any one small snippet of code (code in verify_nbtree.c, or wherever) just isn't helpful. > In truth, all the pg_amcheck frontend client can take a view on is whether it was able to issue all the commands to the backend that it was asked to issue, and whether any of those commands responded with an error. AFAICT all that pg_amcheck has to do is follow the amcheck user docs, by generalizing from the example SQL query for the B-Tree stuff. And, it should separately filter non-temp relations for the heap stuff, for the same reasons (exactly the same situation there). > pg_amcheck -d db1 -d db2 -d db3 --table=mytable > > In this case, mytable is a regular table on db1, a temporary table on db2, and an unlogged table on db3, and db3 is in recovery. I don't think that pg_amcheck needs to care about being in recovery, at all. I agreed with you about using pg_is_in_recovery() from SQL at one point. That was a mistake on my part. > I thought that we were headed toward a decision where (despite my discomfort) pg_amcheck would downgrade options as necessary, but now it sounds like that's not so. So what should it do? Downgrade is how you refer to it. I just think of it as making sure that pg_amcheck only asks amcheck to verify relations that are basically capable of being verified (e.g., not a temp table). -- Peter Geoghegan
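To make the "generalize from the example SQL query" idea concrete, here is a rough sketch of the kind of relation selection being discussed. It is illustrative only -- not the SQL that pg_amcheck actually generates -- and it only covers the B-Tree side; the heap side would apply the same relpersistence filter when choosing tables for verify_heapam():

    -- Rough sketch, not pg_amcheck's real query: hand bt_index_check() only
    -- btree indexes that are non-temporary, valid, and ready.
    SELECT c.oid::regclass AS index_name
    FROM pg_index i
    JOIN pg_class c ON c.oid = i.indexrelid
    JOIN pg_am am ON am.oid = c.relam
    WHERE am.amname = 'btree'
      AND c.relpersistence <> 't'   -- skip temporary relations
      AND i.indisvalid              -- skip invalid indexes (e.g. a failed CREATE INDEX CONCURRENTLY)
      AND i.indisready;             -- skip indexes that are not yet ready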
> On Oct 6, 2021, at 3:20 PM, Peter Geoghegan <pg@bowt.ie> wrote: > >> I think the disagreements are about something else. > > Informally speaking, you could say that pg_amcheck and amcheck verify > relations. More formally speaking, amcheck (whether called by > pg_amcheck or some other thing) can only prove the presence of > corruption. It cannot prove its absence. (The amcheck docs have > always said almost these exact words.) I totally agree that the amcheck functions cannot prove the absence of corruption. I prefer not to even use language about proving the presence of corruption when discussing pg_amcheck. I have let that slide upthread as a convenient short-hand, but I think it doesn't help. For pg_amcheck to take any view whatsoever on whether a btree index is corrupt, it would have to introspect the error message that it gets back from bt_index_check(). It doesn't do that, nor do I think that it should. It just prints the contents of the error for the user and records that fact and eventually exits with a non-zero exit code. The error might have been something about the command exiting due to the crash of another backend, or to do with a deadlock against some other process, or whatever, and pg_amcheck has no opinion about whether any of that is to do with corruption or not. > This seems to come up a lot because at various points you seem to be > concerned about introducing specific imperfections. But it's not like > your starting point was ever perfection, or ever could be. From the point of view of detecting corruptions, I agree that it never could be. But I'm not talking about that. I'm talking about whether pg_amcheck issues all the commands that it is supposed to issue. If I work for Daddy Warbucks and he gives me 30 classic cars to take to 10 different mechanics, I can do that job perfectly even if the mechanics do less than perfect work. If I leave three cars in the driveway, that's on me. Likewise, it's not on pg_amcheck if the checking functions can't do perfect work, but it is on pg_amcheck if it doesn't issue all the expected commands. But later on in this email, it appears we don't have any remaining disagreements about that. Read on.... > I can > always describe a scenario in which amcheck misses real corruption -- > a scenario which may be very contrived. So the mere fact that some new > theoretical possibility of corruption is introduced by some action > does not in itself mean much. We're dealing with that constantly, and > always will be. I wish we could stop discussing this. I really don't think this ticket has anything to do with how well or how poorly or how completely the amcheck functions work. > Let's suppose I was to "directly fix amcheck + !indisvalid indexes". I > don't even know what that means -- I honestly don't have a clue. > You're focussing on one small piece of code in verify_nbtree.c, that > seems to punt responsibility, but the fact is that there are deeply > baked-in reasons why it does so. That's a reflection of how many > things about the system work, in general. Attributing blame to any one > small snippet of code (code in verify_nbtree.c, or wherever) just > isn't helpful. I think we have agreed that pg_amcheck can filter out invalid indexes. I don't have a problem with that. I admit that I did have a problem with that upthread, but it's been a while since I conceded that point so I'd rather not have to argue it again. 
>> In truth, all the pg_amcheck frontend client can take a view on is whether it was able to issue all the commands to the backend that it was asked to issue, and whether any of those commands responded with an error. > > AFAICT all that pg_amcheck has to do is follow the amcheck user docs, by > generalizing from the example SQL query for the B-Tree stuff. And, it > should separately filter non-temp relations for the heap stuff, for > the same reasons (exactly the same situation there). I think we have agreed on that one, too, without me having ever argued it. I posted a patch to filter out the temporary tables already. >> pg_amcheck -d db1 -d db2 -d db3 --table=mytable >> >> In this case, mytable is a regular table on db1, a temporary table on db2, and an unlogged table on db3, and db3 is in recovery. > > I don't think that pg_amcheck needs to care about being in recovery, > at all. I agreed with you about using pg_is_in_recovery() from SQL at one > point. That was a mistake on my part. Ok, excellent, that was probably the only thing that had me really hung up. I thought you were still asking for pg_amcheck to filter out the --parent-check option when in recovery, but if you're not asking for that, then I might have enough to go on now. >> I thought that we were headed toward a decision where (despite my discomfort) pg_amcheck would downgrade options as necessary, but now it sounds like that's not so. So what should it do? > > Downgrade is how you refer to it. I just think of it as making sure > that pg_amcheck only asks amcheck to verify relations that are > basically capable of being verified (e.g., not a temp table). I was using "downgrading" to mean downgrading from bt_index_parent_check() to bt_index_check() when pg_is_in_recovery() is true, but you've clarified that you're not requesting that downgrade, so I think we've now gotten past the last sticking point about that whole issue. There are other sticking points that don't seem to be things you have taken a view on. Specifically, pg_amcheck complains if a relation pattern doesn't match anything, so that pg_amcheck --table="*acountng*" will complain if no tables match, giving the user the opportunity to notice that they spelled "accounting" wrong. If there happens to be a table named "xyzacountngo", and that matches, too bad. There isn't any way pg_amcheck can be responsible for that. But if there is a temporary table named "xyzacountngo" and that gets skipped because it's a temp table, I don't know what feedback the user should get. That's a thorny user interfaces question, not a corruption checking question, and I don't think you need to weigh in unless you want to. I'll most likely go with whatever is the simplest to code and/or most similar to what is currently in the tree, because I don't see any knock-down arguments one way or the other. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Oct 6, 2021 at 3:47 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > I totally agree that the amcheck functions cannot prove the absence of corruption. > > I prefer not to even use language about proving the presence of corruption when discussing pg_amcheck. I agree that it doesn't usually help. But sometimes it is important. > > This seems to come up a lot because at various points you seem to be > > concerned about introducing specific imperfections. But it's not like > > your starting point was ever perfection, or ever could be. > From the point of view of detecting corruptions, I agree that it never could be. But I'm not talking about that. I'm talking about whether pg_amcheck issues all the commands that it is supposed to issue. If I work for Daddy Warbucks and he gives me 30 classic cars to take to 10 different mechanics, I can do that job perfectly even if the mechanics do less than perfect work. If I leave three cars in the driveway, that's on me. Likewise, it's not on pg_amcheck if the checking functions can't do perfect work, but it is on pg_amcheck if it doesn't issue all the expected commands. But later on in this email, it appears we don't have any remaining disagreements about that. Read on.... When you say "expected commands", I am entitled to ask: expected by whom, based on what underlying principle? Similarly, when you suggest that amcheck should directly deal with !indisvalid indexes itself, it naturally leads to a tricky discussion of the precise definition of a relation (in particular in the presence of REINDEX CONCURRENTLY), and the limits of what is possible with amcheck. That's just where the discussion has to go. You cannot say that amcheck must (say) "directly deal with indisvalid indexes", without at least saying why. pg_amcheck works by querying pg_class, finding relations to verify. There is no way that that can work that allows pg_amcheck to completely sidestep these awkward questions -- just like with pg_dump. There is no safe neutral starting point for a program like that. > > I can > > always describe a scenario in which amcheck misses real corruption -- > > a scenario which may be very contrived. So the mere fact that some new > > theoretical possibility of corruption is introduced by some action > > does not in itself mean much. We're dealing with that constantly, and > > always will be. > > I wish we could stop discussing this. I really don't think this ticket has anything to do with how well or how poorly or how completely the amcheck functions work. It's related to !indisvalid indexes. At one point you were concerned about not having coverage of them in certain scenarios. Which is fine. But the inevitable direction of that conversation is towards fundamental definitional questions. Quite happy to drop all of this now, though. > Ok, excellent, that was probably the only thing that had me really hung up. I thought you were still asking for pg_amcheck to filter out the --parent-check option when in recovery, but if you're not asking for that, then I might have enough to go on now. Sorry about that. I realized my mistake (not specifically addressing pg_is_in_recovery()) after I hit "send", and should have corrected the record sooner. > I was using "downgrading" to mean downgrading from bt_index_parent_check() to bt_index_check() when pg_is_in_recovery() is true, but you've clarified that you're not requesting that downgrade, so I think we've now gotten past the last sticking point about that whole issue. Right. 
I never meant anything like making a would-be bt_index_parent_check() call into a bt_index_check() call, just because of the state of the system (e.g., it's in recovery). That seems awful, in fact. > will complain if no tables match, giving the user the opportunity to notice that they spelled "accounting" wrong. If there happens to be a table named "xyzacountngo", and that matches, too bad. There isn't any way pg_amcheck can be responsible for that. But if there is a temporary table named "xyzacountngo" and that gets skipped because it's a temp table, I don't know what feedback the user should get. That's a thorny user interfaces question, not a corruption checking question, and I don't think you need to weigh in unless you want to. I'll most likely go with whatever is the simplest to code and/or most similar to what is currently in the tree, because I don't see any knock-down arguments one way or the other. I agree with you that this is a UI thing, since in any case the temp table is pretty much "not visible to pg_amcheck". I have no particular feelings about it. Thanks -- Peter Geoghegan
On Wed, Oct 6, 2021 at 2:28 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote: >> It is the most consistent with the general design of the system, for >> reasons that are pretty deeply baked into the system. I'm reminded of >> the fact that REINDEX CONCURRENTLY's completion became blocked due to >> similar trepidations. Understandably so. > > > I may be mistaken, but I recall the fact that all index builds started during some other (long) index build do not finish with indexes usable for selects until that long index is built. This may or may not be a source of amcheck misbehavior. Just a note on what could possibly be considered. I may have been unclear. I meant that work on the REINDEX CONCURRENTLY feature (several years ago) was very difficult. It seemed to challenge what "Postgres relation" really means. Various community members had concerns about the definition at the time. Remember, plain REINDEX actually gets a full AccessExclusiveLock on the target index relation. This is practically as bad as getting the same lock on the table itself for most users -- which is very disruptive indeed. It's much more disruptive than plain CREATE INDEX -- CREATE INDEX generally only blocks write DML. Whereas REINDEX tends to block both writes and reads (in practice, barring some narrow cases with prepared statements that are too confusing to users to be worth discussing). Which is surprising in itself to users. Why should plain REINDEX be so different to plain CREATE INDEX? The weird (but also helpful) thing about the implementation of REINDEX CONCURRENTLY is that we can have *two* pg_class entries for what the user thinks of as one index/relation. Having two pg_class entries is also why plain REINDEX had problems that plain CREATE INDEX never had -- having only one pg_class entry was actually the true underlying problem, all along. Sometimes we have to make a difficult choice between "weird rules but nice behavior" (as with REINDEX CONCURRENTLY), and "nice rules but weird behavior" (as with plain REINDEX). -- Peter Geoghegan
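To illustrate the "two pg_class entries" point (names here are hypothetical): while REINDEX CONCURRENTLY is rebuilding an index called some_idx, a catalog query along these lines would typically show both the original entry and the transient sibling, with pg_index recording which of the two is valid and ready at that moment:

    -- Hypothetical index name; the transient entry created by REINDEX
    -- CONCURRENTLY usually carries a "_ccnew" suffix until the swap happens.
    SELECT c.relname, i.indisvalid, i.indisready, i.indislive
    FROM pg_class c
    JOIN pg_index i ON i.indexrelid = c.oid
    WHERE c.relname LIKE 'some_idx%';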
> On Oct 6, 2021, at 4:12 PM, Peter Geoghegan <pg@bowt.ie> wrote: > >> >> Ok, excellent, that was probably the only thing that had me really hung up. I thought you were still asking for pg_amcheck to filter out the --parent-check option when in recovery, but if you're not asking for that, then I might have enough to go on now. > > Sorry about that. I realized my mistake (not specifically addressing > pg_is_in_recovery()) after I hit "send", and should have corrected the > record sooner. > >> I was using "downgrading" to mean downgrading from bt_index_parent_check() to bt_index_check() when pg_is_in_recovery() is true, but you've clarified that you're not requesting that downgrade, so I think we've now gotten past the last sticking point about that whole issue. > > Right. I never meant anything like making a would-be > bt_index_parent_check() call into a bt_index_check() call, just > because of the state of the system (e.g., it's in recovery). That > seems awful, in fact. Please find attached the latest version of the patch which includes the changes we discussed. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachments
On Mon, Oct 11, 2021 at 9:53 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > Right. I never meant anything like making a would-be > > bt_index_parent_check() call into a bt_index_check() call, just > > because of the state of the system (e.g., it's in recovery). That > > seems awful, in fact. > > Please find attached the latest version of the patch which includes the changes we discussed. This mostly looks good to me. Just one thing occurs to me: I suspect that we don't need to call pg_is_in_recovery() from SQL at all. What's wrong with just letting verify_heapam() (the C function from amcheck proper) show those notice messages where appropriate? In general I don't like the idea of making the behavior of pg_amcheck conditioned on the state of the system (e.g., whether we're in recovery) -- we should just let amcheck throw "invalid option" type errors when that's the logical outcome (e.g., when --parent-check is used on a replica). To me this seems rather different than not checking temporary tables, because that's something that inherently won't work. (Also, I consider the index-is-being-built stuff to be very similar to the temp table stuff -- same basic situation.) -- Peter Geoghegan
> On Oct 11, 2021, at 10:10 AM, Peter Geoghegan <pg@bowt.ie> wrote: > > This mostly looks good to me. Just one thing occurs to me: I suspect > that we don't need to call pg_is_in_recovery() from SQL at all. What's > wrong with just letting verify_heapam() (the C function from amcheck > proper) show those notice messages where appropriate? I thought a big part of the debate upthread was over exactly this point, that pg_amcheck should not attempt to check (a) temporary relations, (b) indexes that are invalid or unready, and (c) unlogged relations during recovery. > In general I don't like the idea of making the behavior of pg_amcheck > conditioned on the state of the system (e.g., whether we're in > recovery) -- we should just let amcheck throw "invalid option" type > errors when that's the logical outcome (e.g., when --parent-check is > used on a replica). To me this seems rather different than not > checking temporary tables, because that's something that inherently > won't work. (Also, I consider the index-is-being-built stuff to be > very similar to the temp table stuff -- same basic situation.) I don't like having pg_amcheck parse the error message that comes back from amcheck. If amcheck throws an error, pg_amcheck considers that a failure and ultimately exits with a non-zero status. So, if we're going to have amcheck handle these cases, it will have to be with a NOTICE (or perhaps a WARNING) rather than an error. That's not what happens now, but if you'd rather we fixed this problem that way, I can go do that, or perhaps as the author of the bt_*_check functions, you can do that and I can just do the pg_amcheck changes. How shall we proceed? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Oct 11, 2021 at 10:46 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > On Oct 11, 2021, at 10:10 AM, Peter Geoghegan <pg@bowt.ie> wrote: > > This mostly looks good to me. Just one thing occurs to me: I suspect > > that we don't need to call pg_is_in_recovery() from SQL at all. What's > > wrong with just letting verify_heapam() (the C function from amcheck > > proper) show those notice messages where appropriate? > > I thought a big part of the debate upthread was over exactly this point, that pg_amcheck should not attempt to check (a) temporary relations, (b) indexes that are invalid or unready, and (c) unlogged relations during recovery. Again, I consider (a) and (b) very similar to each other, but very dissimilar to (c). Only (a) and (b) are *inherently* not verifiable by amcheck. To me, giving pg_amcheck responsibility for only calling amcheck functions when (a) and (b) are sane is akin to expecting pg_amcheck to only call bt_index_check() with a B-Tree index. Giving pg_amcheck these responsibilities is not a case of "pg_amcheck presuming to know what's best for the user, or too much about amcheck", because amcheck itself pretty clearly expects this from the user (and always has). The user is no worse off for having used pg_amcheck rather than calling amcheck functions from SQL themselves. pg_amcheck is literally just fulfilling basic expectations held by amcheck, that are pretty much documented as such. Sure, the user might not be happy with --parent-check throwing an error on a replica. But in practice most users won't want to do that anyway. Even on a primary it's usually not possible as a practical matter, because the locking implications are *bad* -- it's just too disruptive, for too little extra coverage. And so when --parent-check fails on a replica, it really is very likely that the user should just not do that. Which is easy: just remove --parent-check, and try again. Most scenarios where --parent-check is useful involve the user already knowing that there is some corruption. In other words, scenarios where almost nothing could be considered overkill. Presumably this is very rare. > I don't like having pg_amcheck parse the error message that comes back from amcheck. > How shall we proceed? What's the problem with just having pg_amcheck pass through the notice to the user, without it affecting anything else? Why should a simple notice message need to affect its return code, or anything else? It's not like I feel very strongly about this question. Ultimately it probably doesn't matter very much -- if pg_amcheck just can't deal with these notice messages for some reason, then I can let it go. But what's the reason? If there is a good reason, then maybe we should just not have the notice messages (so we would just remove the existing one from verify_nbtree.c, while still interpreting the case in the same way -- index has no storage to check, and so is trivially verified). -- Peter Geoghegan
> On Oct 11, 2021, at 11:12 AM, Peter Geoghegan <pg@bowt.ie> wrote: > > What's the problem with just having pg_amcheck pass through the notice > to the user, without it affecting anything else? Why should a simple > notice message need to affect its return code, or anything else? That's fine by me, but I was under the impression that people wanted the extraneous noise removed. Since pg_amcheck can know the command is going to draw a "you can't check that right now" type message, one might argue that it is drawing these notices for no particular benefit. Somebody could quite reasonably complain about this on a hot standby with millions of unlogged relations. Actual ERROR messages might get lost in all the noise. It's true that these NOTICEs do not change the return code. I was thinking about the ERRORs we get on failed lock acquisition, but that is unrelated. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Oct 11, 2021 at 11:12 AM Peter Geoghegan <pg@bowt.ie> wrote: > Sure, the user might not be happy with --parent-check throwing an > error on a replica. But in practice most users won't want to do that > anyway. Even on a primary it's usually not possible as a practical > matter, because the locking implications are *bad* -- it's just too > disruptive, for too little extra coverage. And so when --parent-check > fails on a replica, it really is very likely that the user should just > not do that. Which is easy: just remove --parent-check, and try again. We should have a warning box about this in the pg_amcheck docs. Users should think carefully about ever using --parent-check, since it alone totally changes the locking requirements (actually --rootdescend will do that too, but only because that option also implies --parent-check). -- Peter Geoghegan
> On Oct 11, 2021, at 11:26 AM, Peter Geoghegan <pg@bowt.ie> wrote: > > We should have a warning box about this in the pg_amcheck docs. Users > should think carefully about ever using --parent-check, since it alone > totally changes the locking requirements (actually --rootdescend will > do that too, but only because that option also implies > --parent-check). The recently submitted patch already contains a short paragraph for each of these, but not a warning box. Should I reformat those as warning boxes? I don't know the current thinking on the appropriateness of that documentation style. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Oct 11, 2021 at 11:29 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > The recently submitted patch already contains a short paragraph for each of these, but not a warning box. Should I reformat those as warning boxes? I don't know the current thinking on the appropriateness of that documentation style. I definitely think that it warrants a warning box. This is a huge practical difference. Note that I'm talking about a standard thing, which there are certainly a dozen or more examples of in the docs already. Just grep for "<warning> </warning>" tags to see the existing warning boxes. -- Peter Geoghegan
> On Oct 11, 2021, at 11:33 AM, Peter Geoghegan <pg@bowt.ie> wrote: > > I definitely think that it warrants a warning box. This is a huge > practical difference. > > Note that I'm talking about a standard thing, which there are > certainly a dozen or more examples of in the docs already. Just grep > for "<warning> </warning>" tags to see the existing warning boxes. Yes, sure, I know they exist. It's just that I have a vague recollection of a discussion on -hackers about whether we should be using them so much. The documentation for contrib/amcheck has a paragraph but not a warning box. Should that be changed also? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Oct 11, 2021 at 11:26 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > That's fine by me, but I was under the impression that people wanted the extraneous noise removed. A NOTICE message is supposed to be surfaced to clients (but not stored in the server log), pretty much by definition. It's not unreasonable to argue that I was mistaken to ever think that about this particular message. In fact, I suspect that I was. > Since pg_amcheck can know the command is going to draw a "you can't check that right now" type message, one might argue that it is drawing these notices for no particular benefit. But technically it *was* checked. That's how I think of it, at least. If a replica comes out of recovery, and we run pg_amcheck immediately afterwards, are we now "checking it for real"? I don't think that distinction is meaningful. > Somebody could quite reasonably complain about this on a hot standby with millions of unlogged relations. Actual ERROR messages might get lost in all the noise. That's a good point. -- Peter Geoghegan
On Mon, Oct 11, 2021 at 11:37 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > The documentation for contrib/amcheck has a paragraph but not a warning box. Should that be changed also? Maybe. I think that the pg_amcheck situation is a lot worse, because users could easily interpret --parent-check as an additive thing. Totally changing the general locking requirements seems like a POLA violation. Besides, amcheck proper is now very much the low level tool that most users won't ever bother with. -- Peter Geoghegan
On Mon, Oct 11, 2021 at 11:40 AM Peter Geoghegan <pg@bowt.ie> wrote: > A NOTICE message is supposed to be surfaced to clients (but not stored > in the server log), pretty much by definition. > > It's not unreasonable to argue that I was mistaken to ever think that > about this particular message. In fact, I suspect that I was. > > Somebody could quite reasonably complain about this on a hot standby with millions of unlogged relations. Actual ERROR messages might get lost in all the noise. How about this: we can just lower the elevel, from NOTICE to DEBUG1. We'd then be able to keep the message we have today in verify_nbtree.c. We'd also add a matching message (and logic) to verify_heapam.c, keeping them consistent. I find your argument about spammy messages convincing. But it's no less valid for any other user of amcheck. So we really should just fix that at the amcheck level. That way you can get rid of the call to pg_is_in_recovery() from the SQL statements in pg_amcheck, while still fixing everything that needs to be fixed in pg_amcheck. -- Peter Geoghegan
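For illustration, here is roughly what that proposal would look like from a client session on a hot standby. The table and index names are hypothetical, and the quiet behavior described in the comments assumes the NOTICE-to-DEBUG1 change sketched above is in place:

    -- Hypothetical unlogged table "u" with btree index "u_idx", on a standby.
    SET client_min_messages = debug1;              -- only then is the skip message visible
    SELECT * FROM verify_heapam('u'::regclass);    -- returns no rows, no error: nothing to verify
    SELECT bt_index_check('u_idx'::regclass);      -- likewise treated as having no storage to check
    RESET client_min_messages;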
> On Oct 11, 2021, at 11:53 AM, Peter Geoghegan <pg@bowt.ie> wrote: > > On Mon, Oct 11, 2021 at 11:40 AM Peter Geoghegan <pg@bowt.ie> wrote: >> A NOTICE message is supposed to be surfaced to clients (but not stored >> in the server log), pretty much by definition. >> >> It's not unreasonable to argue that I was mistaken to ever think that >> about this particular message. In fact, I suspect that I was. > >>> Somebody could quite reasonably complain about this on a hot standby with millions of unlogged relations. Actual ERROR messages might get lost in all the noise. > > How about this: we can just lower the elevel, from NOTICE to DEBUG1. > We'd then be able to keep the message we have today in > verify_nbtree.c. We'd also add a matching message (and logic) to > verify_heapam.c, keeping them consistent. > > I find your argument about spammy messages convincing. But it's no > less valid for any other user of amcheck. So we really should just fix > that at the amcheck level. That way you can get rid of the call to > pg_is_in_recovery() from the SQL statements in pg_amcheck, while still > fixing everything that needs to be fixed in pg_amcheck. Your proposal sounds good. Let me try it and get back to you shortly. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On Oct 11, 2021, at 12:25 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > Your proposal sounds good. Let me try it and get back to you shortly. Ok, I went with this suggestion, and also your earlier suggestion to have a <warning> in the pg_amcheck docs about using --parent-check and/or --rootdescend against servers in recovery. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachments
On Mon, Oct 11, 2021 at 1:20 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > Ok, I went with this suggestion, and also your earlier suggestion to have a <warning> in the pg_amcheck docs about using --parent-check and/or --rootdescend against servers in recovery. My concern with --parent-check (and with --rootdescend) had little to do with Hot Standby. I suggested using a warning because these options alone can pretty much cause bedlam on a production database. At least if they're used carelessly. Again, bt_index_parent_check()'s relation level locks will block all DML, as well as VACUUM. That isn't the case with any of the other pg_amcheck options, including those that call bt_index_check(), and including the heapam verification functionality. It's also true that --parent-check won't work in Hot Standby mode, of course. So it couldn't hurt to mention that in passing, at the same point. But that's a secondary point, at best. We don't need to use a warning box because of that. Overall, your approach looks good to me. Will Robert take care of committing this, or should I? -- Peter Geoghegan
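Spelled out in terms of the underlying functions (the index name is hypothetical; lock levels are as documented for amcheck), the difference being warned about is:

    -- bt_index_check(): AccessShareLock on the index and its table, so ordinary
    -- INSERT/UPDATE/DELETE and VACUUM keep running while it works.
    SELECT bt_index_check('some_idx'::regclass, true);

    -- bt_index_parent_check(): ShareLock on the index and its table, which
    -- blocks all DML and VACUUM for the duration, and cannot run during recovery.
    SELECT bt_index_parent_check('some_idx'::regclass, true, true);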
> On Oct 11, 2021, at 2:33 PM, Peter Geoghegan <pg@bowt.ie> wrote: > > On Mon, Oct 11, 2021 at 1:20 PM Mark Dilger > <mark.dilger@enterprisedb.com> wrote: >> Ok, I went with this suggestion, and also your earlier suggestion to have a <warning> in the pg_amcheck docs about using --parent-check and/or --rootdescend against servers in recovery. > > My concern with --parent-check (and with --rootdescend) had little to > do with Hot Standby. I suggested using a warning because these options > alone can pretty much cause bedlam on a production database. Ok, that makes more sense. Would you care to rephrase them? I don't think we need another round of patches posted. > At least > if they're used carelessly. Again, bt_index_parent_check()'s relation > level locks will block all DML, as well as VACUUM. That isn't the case > with any of the other pg_amcheck options, including those that call > bt_index_check(), and including the heapam verification functionality. > > It's also true that --parent-check won't work in Hot Standby mode, of > course. So it couldn't hurt to mention that in passing, at the same > point. But that's a secondary point, at best. We don't need to use a > warning box because of that. > > Overall, your approach looks good to me. Will Robert take care of > committing this, or should I? I'd appreciate if you could fix up the <warning> in the docs and do the commit. Thanks! — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Oct 11, 2021 at 2:41 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > Overall, your approach looks good to me. Will Robert take care of > > committing this, or should I? > > I'd appreciate if you could fix up the <warning> in the docs and do the commit. Cool. I pushed just the amcheck changes a moment ago. I attach the remaining changes from your v3, with a new draft commit message (no real changes). I didn't push the rest (what remains in the attached revision) just yet because I'm not quite sure about the approach used to exclude temp tables. Do we really need the redundancy between prepare_btree_command(), prepare_heap_command(), and compile_relation_list_one_db()? All three exclude temp relations, plus you have extra stuff in prepare_btree_command(). There is some theoretical value in delaying the index specific stuff until the query actually runs, at least in theory. But it also seems unlikely to make any appreciable difference to the overall level of coverage in practice. Would it be simpler to do it all together, in compile_relation_list_one_db()? Were you concerned about things changing when parallel workers are run? Or something else? Many thanks -- Peter Geoghegan
Attachments
> On Oct 11, 2021, at 5:37 PM, Peter Geoghegan <pg@bowt.ie> wrote: > > Cool. I pushed just the amcheck changes a moment ago. I attach the > remaining changes from your v3, with a new draft commit message (no > real changes). I didn't push the rest (what remains in the attached > revision) just yet because I'm not quite sure about the approach used > to exclude temp tables. Thanks for that. > Do we really need the redundancy between prepare_btree_command(), > prepare_heap_command(), and compile_relation_list_one_db()? All three > exclude temp relations, plus you have extra stuff in > prepare_btree_command(). There is some theoretical value in delaying > the index specific stuff until the query actually runs, at least in > theory. But it also seems unlikely to make any appreciable difference > to the overall level of coverage in practice. I agree that it is unlikely to make much difference in practice. Another session running reindex concurrently is, I think, the most likely to conflict, but it is just barely imaginable that a relation will be dropped, and its OID reused for something unrelated, by the time the check command gets run. The new object might be temporary where the old object was not. On a properly functioning database, that may be too remote a possibility to be worth worrying about, but on a corrupt database, most bets are off, and I can't really tell you if that's a likely scenario, because it is hard to think about all the different ways corruption might cause a database to behave. On the other hand, the join against pg_class might fail due to unspecified corruption, so my attempt to play it safe may backfire. I don't feel strongly about this. If you'd like me to remove those checks, I can do so. These are just my thoughts on the subject. > Would it be simpler to do it all together, in > compile_relation_list_one_db()? Were you concerned about things > changing when parallel workers are run? Or something else? Yeah, I was contemplating things changing by the time the parallel workers run the command. I don't know how important that is. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
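To spell out what that per-command redundancy amounts to, here is a rough sketch -- not the SQL pg_amcheck actually emits -- of a check command that repeats the guards at execution time, so that an index that became invalid, not ready, or temporary after the relation list was compiled is quietly skipped by whichever (possibly parallel) worker eventually runs it:

    -- Rough sketch only; 12345 stands in for the OID captured when the
    -- relation list was compiled.
    SELECT bt_index_check(index => i.indexrelid, heapallindexed => true)
    FROM pg_index i
    JOIN pg_class c ON c.oid = i.indexrelid
    WHERE i.indexrelid = 12345
      AND i.indisvalid
      AND i.indisready
      AND c.relpersistence <> 't';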
On Mon, Oct 11, 2021 at 7:22 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > I agree that it is unlikely to make much difference in practice. > I don't feel strongly about this. If you'd like me to remove those checks, I can do so. These are just my thoughts on the subject. Okay. I don't feel strongly about it either. I just pushed v4, with the additional minor pg_amcheck documentation updates we talked about. No other changes. Thanks again -- Peter Geoghegan
On Wed, Oct 13, 2021 at 2:09 PM Peter Geoghegan <pg@bowt.ie> wrote: > I just pushed v4, with the additional minor pg_amcheck documentation > updates we talked about. No other changes. Any idea what the problems on drongo are? https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-10-14%2001%3A27%3A19 -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > Any idea what the problems on drongo are? > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-10-14%2001%3A27%3A19 It says # pg_ctl start failed; logfile: 2021-10-14 02:10:33.996 UTC [491848:1] LOG: starting PostgreSQL 14.0, compiled by Visual C++ build 1923, 64-bit 2021-10-14 02:10:33.999 UTC [491848:2] LOG: could not bind IPv4 address "127.0.0.1": Only one usage of each socket address (protocol/network address/port) is normally permitted. 2021-10-14 02:10:33.999 UTC [491848:3] HINT: Is another postmaster already running on port 54407? If not, wait a few seconds and retry. 2021-10-14 02:10:33.999 UTC [491848:4] WARNING: could not create listen socket for "127.0.0.1" 2021-10-14 02:10:33.999 UTC [491848:5] FATAL: could not create any TCP/IP sockets 2021-10-14 02:10:34.000 UTC [491848:6] LOG: database system is shut down Bail out! pg_ctl start failed Looks like a transient/phase of the moon issue to me. regards, tom lane
On Wed, Oct 13, 2021 at 9:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Looks like a transient/phase of the moon issue to me. Yeah, I noticed that drongo is prone to them, though only after I hit send. Thanks -- Peter Geoghegan
On 10/14/21 12:15 AM, Tom Lane wrote: > Peter Geoghegan <pg@bowt.ie> writes: >> Any idea what the problems on drongo are? >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-10-14%2001%3A27%3A19 > It says > > # pg_ctl start failed; logfile: > 2021-10-14 02:10:33.996 UTC [491848:1] LOG: starting PostgreSQL 14.0, compiled by Visual C++ build 1923, 64-bit > 2021-10-14 02:10:33.999 UTC [491848:2] LOG: could not bind IPv4 address "127.0.0.1": Only one usage of each socket address (protocol/network address/port) is normally permitted. > 2021-10-14 02:10:33.999 UTC [491848:3] HINT: Is another postmaster already running on port 54407? If not, wait a few seconds and retry. > 2021-10-14 02:10:33.999 UTC [491848:4] WARNING: could not create listen socket for "127.0.0.1" > 2021-10-14 02:10:33.999 UTC [491848:5] FATAL: could not create any TCP/IP sockets > 2021-10-14 02:10:34.000 UTC [491848:6] LOG: database system is shut down > Bail out! pg_ctl start failed > > Looks like a transient/phase of the moon issue to me. > > Bowerbird is having similar issues, so I don't think this is just a transient. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
> On Oct 14, 2021, at 1:50 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > Bowerbird is having similar issues, so I don't think this is just a > transient. The pg_amcheck patch Peter committed for me adds a new test, src/bin/pg_amcheck/t/006_bad_targets.pl, which creates two PostgresNode objects (a primary and a standby) and uses PostgresNode::background_psql(). It doesn't bother to "finish" the returned harness, which may be the cause of an installation hanging around long enough to be in the way when another test tries to start. Assuming this is right, the fix is just one line. Thoughts? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Andrew Dunstan <andrew@dunslane.net> writes: > On 10/14/21 12:15 AM, Tom Lane wrote: >> Looks like a transient/phase of the moon issue to me. > Bowerbird is having similar issues, so I don't think this is just a > transient. Yeah, I noticed that too today, and poked around a bit. But I don't see what this test is doing differently from other tests that bowerbird is running successfully. It's failing while trying to crank up a replica using init_from_backup, which has lots of precedent. I do see that bowerbird is skipping some comparable tests due to using "--skip-steps misc-check". But it's not skipping, eg, pg_rewind's 008_min_recovery_point.pl; and the setup steps in that sure look just the same. What's different? (BTW, I wondered if PostgresNode->new's own_host hack could fix this. But AFAICS that's dead code, with no existing test using it. Seems like we should nuke it.) regards, tom lane
Mark Dilger <mark.dilger@enterprisedb.com> writes: > The pg_amcheck patch Peter committed for me adds a new test, src/bin/pg_amcheck/t/006_bad_targets.pl, which creates two PostgresNode objects (a primary and a standby) and uses PostgresNode::background_psql(). It doesn't bother to "finish" the returned harness, which may be the cause of an installation hanging around long enough to be in the way when another test tries to start. (a) Isn't that just holding open one connection, not the whole instance? (b) Wouldn't finish()ing that connection cause the temp tables to be dropped, negating the entire point of the test? TBH, I seriously doubt this test case is worth expending buildfarm cycles on forevermore. I'm more than a bit tempted to just drop it, rather than also expending developer time figuring out why it's not as portable as it looks. regards, tom lane
> On Oct 14, 2021, at 2:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > (a) Isn't that just holding open one connection, not the whole instance? Yes. > (b) Wouldn't finish()ing that connection cause the temp tables to be > dropped, negating the entire point of the test? The finish() would have to be the last line of the test. > TBH, I seriously doubt this test case is worth expending buildfarm > cycles on forevermore. I'm more than a bit tempted to just drop > it, rather than also expending developer time figuring out why it's > not as portable as it looks. I'm curious if the test is indicating something about the underlying test system. Only one other test in the tree uses background_psql(). I was hoping Andrew would have something to say about whether this is a bug with that function or just user error on my part. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 14, 2021 at 2:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > TBH, I seriously doubt this test case is worth expending buildfarm > cycles on forevermore. I'm more than a bit tempted to just drop > it, rather than also expending developer time figuring out why it's > not as portable as it looks. I agree. I can go remove the whole file now, and will. Mark: Any objections? -- Peter Geoghegan
> On Oct 14, 2021, at 2:21 PM, Peter Geoghegan <pg@bowt.ie> wrote: > > I agree. I can go remove the whole file now, and will. > > Mark: Any objections? None of the "pride of ownership" type, but I would like to see something more about the limitations of background_psql(). It's the closest thing we have to being able to run things in parallel from TAP tests. There's no isolationtester equivalent, and PostgresNode doesn't allow you to fork() in tests without hacking PostgresNode's END{} block. So if we don't debug this, we never get any further towards parallel testing from perl. Or do you have a different way forward for that? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
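For concreteness, the parallel-session pattern being discussed looks roughly like the following sketch, assuming $node is an already-initialized, running PostgresNode; the query and the 'ready' marker are invented for the example and not taken from any existing test:

    use IPC::Run qw(timer);

    my ($in, $out) = ('', '');
    my $timeout = timer(180);

    # Keep one psql session open in the background while the driver continues.
    my $session = $node->background_psql('postgres', \$in, \$out, $timeout);

    # Issue work in the background session and wait for its output ...
    $in .= "CREATE TEMPORARY TABLE tmp_probe (i int);\nSELECT 'ready';\n";
    $session->pump until $out =~ /ready/;

    # ... while the foreground script stays free to run pg_amcheck, start
    # other nodes, and so on.  Reap the session at the end.
    $in .= "\\q\n";
    $session->finish;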
On Thu, Oct 14, 2021 at 2:24 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > None of the "pride of ownership" type, but I would like to see something more about the limitations of background_psql(). I'm not sure what that means for the buildfarm. Are you suggesting that we leave things as-is pending an investigation on affected BF animals, or something else? > Or do you have a different way forward for that? I don't know enough about this stuff to be able to comment. -- Peter Geoghegan
> On Oct 14, 2021, at 2:28 PM, Peter Geoghegan <pg@bowt.ie> wrote: > > I'm not sure what that means for the buildfarm. Are you suggesting > that we leave things as-is pending an investigation on affected BF > animals, or something else? I was just waiting a couple minutes to see if Andrew wanted to jump in. Having heard nothing, I guess you can revert it. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Mark Dilger <mark.dilger@enterprisedb.com> writes: >> On Oct 14, 2021, at 2:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> (b) Wouldn't finish()ing that connection cause the temp tables to be >> dropped, negating the entire point of the test? > The finish() would have to be the last line of the test. > ... > I'm curious if the test is indicating something about the underlying test system. Only one other test in the tree uses background_psql(). I was hoping Andrew would have something to say about whether this is a bug with that function or just user error on my part. Neither of these things could explain the problem at hand, AFAICS, because it's failing to start up the standby. regards, tom lane
On 10/14/21 5:09 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 10/14/21 12:15 AM, Tom Lane wrote: >>> Looks like a transient/phase of the moon issue to me. >> Bowerbird is having similar issues, so I don't think this is just a >> transient. > Yeah, I noticed that too today, and poked around a bit. But I don't > see what this test is doing differently from other tests that > bowerbird is running successfully. It's failing while trying to crank > up a replica using init_from_backup, which has lots of precedent. Yes, that's been puzzling me too. I've just been staring at it again and nothing jumps out. But maybe we can investigate that offline if this test is deemed not worth keeping. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Thu, Oct 14, 2021 at 2:31 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > I was just waiting a couple minutes to see if Andrew wanted to jump in. Having heard nothing, I guess you can revert it. Okay. Pushed a commit removing the test case just now. Thanks -- Peter Geoghegan
Andrew Dunstan <andrew@dunslane.net> writes: > Yes, that's been puzzling me too. I've just been staring at it again and > nothing jumps out. But maybe we can investigate that offline if this > test is deemed not worth keeping. As Mark says, it'd be interesting to know whether the use of background_psql is related, because if it is, we'd want to debug that. (I don't really see how it could be related, but maybe I just lack sufficient imagination today.) Beyond that, ISTM this is blocking all TAP testing on the Windows machines, which is pretty bad to leave in place for long. regards, tom lane
On 10/14/21 5:52 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Yes, that's been puzzling me too. I've just been staring at it again and >> nothing jumps out. But maybe we can investigate that offline if this >> test is deemed not worth keeping. > As Mark says, it'd be interesting to know whether the use of > background_psql is related, because if it is, we'd want to debug that. > (I don't really see how it could be related, but maybe I just lack > sufficient imagination today.) Yeah. I'm working on getting a cut-down reproducible failure case. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 10/15/21 10:46 AM, Andrew Dunstan wrote: > On 10/14/21 5:52 PM, Tom Lane wrote: >> Andrew Dunstan <andrew@dunslane.net> writes: >>> Yes, that's been puzzling me too. I've just been staring at it again and >>> nothing jumps out. But maybe we can investigate that offline if this >>> test is deemed not worth keeping. >> As Mark says, it'd be interesting to know whether the use of >> background_psql is related, because if it is, we'd want to debug that. >> (I don't really see how it could be related, but maybe I just lack >> sufficient imagination today.) > > Yeah. I'm working on getting a cut-down reproducible failure case.

I spent a good deal of time poking at this on Friday and Saturday. It's quite clear that the use of

    my $h = $node->background_psql(...);
    $h->pump_nb;

is the root of the problem. If that code is commented out, or even just moved to just after the standby is started and before we check that replication has caught up (which should meet the needs of the case where we found this), then the problem goes away. IPC::Run deals with this setup in a different way on Windows, mainly because its select() only works on sockets and not other types of file handles. It does appear that TestLib::get_free_port() is not sufficiently robust, as it should guarantee that the port/address can be bound. I haven't got further than that, and I have other things I need to be doing, but for now I think we just need to be careful wherever possible to set up servers before calling start/pump. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
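Restated as code, the reordering Andrew describes would look roughly like this sketch, assuming $primary and $standby are PostgresNode objects, $backup_name a backup label, and $timeout an IPC::Run timer; the names are illustrative and this is not a patch against 006_bad_targets.pl:

    # Do all the server setup first: primary, base backup, standby.
    $primary->backup($backup_name);
    $standby->init_from_backup($primary, $backup_name, has_streaming => 1);
    $standby->start;

    # Only then open the background psql session and pump it, before checking
    # that replication has caught up and running the pg_amcheck targets.
    my ($in, $out) = ('', '');
    my $h = $primary->background_psql('postgres', \$in, \$out, $timeout);
    $h->pump_nb;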