Обсуждение: pgstatindex vs. !indisready
Running pgstatindex on some !indisready indexes fails with "ERROR: XX001: could not read block 0 in file". This reproduces it: begin; drop table if exists not_indisready_t; create table not_indisready_t (c int); insert into not_indisready_t values (1),(1); commit; create unique index concurrently not_indisready_i on not_indisready_t(c); begin; create extension pgstattuple; \set VERBOSITY verbose select * from pgstatindex('not_indisready_i'); \set VERBOSITY default rollback; Since XX001 = ERRCODE_DATA_CORRUPTED appears in the "can't-happen" class, it's not a good fit for this scenario. I propose to have pgstatindex fail early on !indisready indexes. We could go a step further and also fail on indisready&&!indisvalid indexes, which are complete enough to accept inserts but not complete enough to answer queries. I don't see a reason to do that, but maybe someone else will. This made me wonder about functions handling unlogged rels during recovery. I used the attached hack to test most regclass-arg functions. While some of these errors from src/test/recovery/tmp_check/log/001_stream_rep_standby_2.log may deserve improvement, there were no class-XX messages: 2023-10-01 12:24:05.992 PDT [646914:11] 001_stream_rep.pl ERROR: 58P01: could not open file "base/5/16862": No such fileor directory 2023-10-01 12:24:05.996 PDT [646914:118] 001_stream_rep.pl ERROR: 22023: fork "main" does not exist for this relation Thanks, nm
Вложения
Noah Misch <noah@leadboat.com> writes: > Running pgstatindex on some !indisready indexes fails with "ERROR: XX001: > could not read block 0 in file". This reproduces it: > ... > Since XX001 = ERRCODE_DATA_CORRUPTED appears in the "can't-happen" class, it's > not a good fit for this scenario. I propose to have pgstatindex fail early on > !indisready indexes. +1 > We could go a step further and also fail on > indisready&&!indisvalid indexes, which are complete enough to accept inserts > but not complete enough to answer queries. I don't see a reason to do that, > but maybe someone else will. Hmm. Seems like the numbers pgstatindex would produce for a not-yet-complete index would be rather irrelevant, even if the case doesn't risk any outright problems. I'd be inclined to be conservative and insist on indisvalid being true too. regards, tom lane
On Sun, Oct 01, 2023 at 04:37:25PM -0400, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > Running pgstatindex on some !indisready indexes fails with "ERROR: XX001: > > could not read block 0 in file". This reproduces it: > > ... > > Since XX001 = ERRCODE_DATA_CORRUPTED appears in the "can't-happen" class, it's > > not a good fit for this scenario. I propose to have pgstatindex fail early on > > !indisready indexes. > > +1 > > > We could go a step further and also fail on > > indisready&&!indisvalid indexes, which are complete enough to accept inserts > > but not complete enough to answer queries. I don't see a reason to do that, > > but maybe someone else will. > > Hmm. Seems like the numbers pgstatindex would produce for a > not-yet-complete index would be rather irrelevant, even if the case > doesn't risk any outright problems. I'd be inclined to be > conservative and insist on indisvalid being true too. Okay. If no other preferences appear, I'll do pgstatindex that way. > > This made me wonder about functions handling unlogged rels during recovery. I > > used the attached hack to test most regclass-arg functions. I forgot to test the same battery of functions on !indisready indexes. I've now done that, using the attached script. While I didn't get additional class-XX errors, more should change: [pgstatginindex pgstathashindex pgstattuple] currently succeed. Like pgstatindex, they should ERROR, including in the back branches. [brin_desummarize_range brin_summarize_new_values brin_summarize_range gin_clean_pending_list] currently succeed. I propose to make them emit a DEBUG1 message and return early, like amcheck does, except on !indisready. This would allow users to continue running them on all indexes of the applicable access method. Doing these operations on an indisready&&!indisvalid index is entirely reasonable, since they relate to INSERT/UPDATE/DELETE operations. [pg_freespace pg_indexes_size pg_prewarm] currently succeed, and I propose to leave them that way. No undefined behavior arises. pg_freespace needs to be resilient to garbage data anyway, given the lack of WAL for the FSM. One could argue for a relkind check in pg_indexes_size. One could argue for treating pg_prewarm like amcheck (DEBUG1 and return).
Вложения
On Sun, Oct 1, 2023 at 2:00 PM Noah Misch <noah@leadboat.com> wrote: > Okay. If no other preferences appear, I'll do pgstatindex that way. Sounds reasonable. > [pgstatginindex pgstathashindex pgstattuple] currently succeed. Like > pgstatindex, they should ERROR, including in the back branches. Makes sense. > [brin_desummarize_range brin_summarize_new_values brin_summarize_range > gin_clean_pending_list] currently succeed. I propose to make them emit a > DEBUG1 message and return early, like amcheck does, except on !indisready. > This would allow users to continue running them on all indexes of the > applicable access method. Doing these operations on an > indisready&&!indisvalid index is entirely reasonable, since they relate to > INSERT/UPDATE/DELETE operations. +1 to all that (including the part about these operations being a little different to the amcheck functions in one particular respect). FWIW, amcheck's handling of unlogged relations when in hot standby mode is based on the following theory: if recovery were to end, the index would become an empty index, so just treat it as if it was already empty during recovery. Not everybody thought that this behavior was ideal, but ISTM that it has fewer problems than any alternative approach you can think of. The same argument works just as well with any function that accepts a regclass argument IMV. -- Peter Geoghegan
On Sun, Oct 01, 2023 at 04:20:42PM -0700, Peter Geoghegan wrote: > On Sun, Oct 1, 2023 at 2:00 PM Noah Misch <noah@leadboat.com> wrote: > > Okay. If no other preferences appear, I'll do pgstatindex that way. > > Sounds reasonable. > >> [pgstatginindex pgstathashindex pgstattuple] currently succeed. Like >> pgstatindex, they should ERROR, including in the back branches. > > Makes sense. These are already restrictive, makes sense. >> [brin_desummarize_range brin_summarize_new_values brin_summarize_range >> gin_clean_pending_list] currently succeed. I propose to make them emit a >> DEBUG1 message and return early, like amcheck does, except on !indisready. >> This would allow users to continue running them on all indexes of the >> applicable access method. Doing these operations on an >> indisready&&!indisvalid index is entirely reasonable, since they relate to >> INSERT/UPDATE/DELETE operations. Hmm. Still slightly incorrect in some cases? Before being switched to indisvalid, an index built concurrently may miss some tuples deleted before the reference snapshot used to build the index was taken. > +1 to all that (including the part about these operations being a > little different to the amcheck functions in one particular respect). Making them return early sounds sensible here. > FWIW, amcheck's handling of unlogged relations when in hot standby > mode is based on the following theory: if recovery were to end, the > index would become an empty index, so just treat it as if it was > already empty during recovery. Not everybody thought that this > behavior was ideal, but ISTM that it has fewer problems than any > alternative approach you can think of. The same argument works just as > well with any function that accepts a regclass argument IMV. It depends, I guess, on how "user-friendly" all that should be. I have seen in the past as argument that it may be sometimes better to have a function do nothing rather than ERROR when these are used across a scan of pg_class, for example, particularly for non-SRFs. Even if sometimes errors can be bypassed with more quals on the relkind or such (aka less complicated queries with less JOINs to write). -- Michael
Вложения
On Sun, Oct 1, 2023 at 5:24 PM Michael Paquier <michael@paquier.xyz> wrote: > > FWIW, amcheck's handling of unlogged relations when in hot standby > > mode is based on the following theory: if recovery were to end, the > > index would become an empty index, so just treat it as if it was > > already empty during recovery. Not everybody thought that this > > behavior was ideal, but ISTM that it has fewer problems than any > > alternative approach you can think of. The same argument works just as > > well with any function that accepts a regclass argument IMV. > > It depends, I guess, on how "user-friendly" all that should be. I > have seen in the past as argument that it may be sometimes better to > have a function do nothing rather than ERROR when these are used > across a scan of pg_class, for example, particularly for non-SRFs. > Even if sometimes errors can be bypassed with more quals on the > relkind or such (aka less complicated queries with less JOINs to > write). I think of recovery as a property of the whole system. So throwing an error about one particular unlogged index that the user (say) checked during recovery doesn't seem sensible. After all, the answer that RecoveryInProgress() gives can change in a way that's observable within individual transactions. Again, I wouldn't claim that this is very elegant. Just that it seems to have the fewest problems. -- Peter Geoghegan
On Mon, Oct 02, 2023 at 09:24:33AM +0900, Michael Paquier wrote: > On Sun, Oct 01, 2023 at 04:20:42PM -0700, Peter Geoghegan wrote: > > On Sun, Oct 1, 2023 at 2:00 PM Noah Misch <noah@leadboat.com> wrote: > >> [brin_desummarize_range brin_summarize_new_values brin_summarize_range > >> gin_clean_pending_list] currently succeed. I propose to make them emit a > >> DEBUG1 message and return early, like amcheck does, except on !indisready. > >> This would allow users to continue running them on all indexes of the > >> applicable access method. Doing these operations on an > >> indisready&&!indisvalid index is entirely reasonable, since they relate to > >> INSERT/UPDATE/DELETE operations. > > Hmm. Still slightly incorrect in some cases? Before being switched > to indisvalid, an index built concurrently may miss some tuples > deleted before the reference snapshot used to build the index was > taken. The !indisvalid index may be missing tuples, yes. In what way does that make one of those four operations incorrect?
On Sun, Oct 01, 2023 at 06:31:26PM -0700, Noah Misch wrote: > The !indisvalid index may be missing tuples, yes. In what way does that make > one of those four operations incorrect? Reminding myself of what these four do, it looks that you're right and that the state of indisvalid is not going to change what they report. Still, I'd like to agree with Tom's point to be more conservative and check also for indisvalid which is what the planner does. These functions will be used in SELECTs, and one thing that worries me is that checks based on indisready may get copy-pasted somewhere else, leading to incorrect results where they get copied. (indisready && !indisvalid) is a "short"-term combination in a concurrent build process, as well (depends on how long one waits for the old snapshots before switching indisvalid, but that's way shorter than the period of time where the built indexes remain valid). -- Michael
Вложения
On Wed, Oct 04, 2023 at 09:00:23AM +0900, Michael Paquier wrote: > On Sun, Oct 01, 2023 at 06:31:26PM -0700, Noah Misch wrote: > > The !indisvalid index may be missing tuples, yes. In what way does that make > > one of those four operations incorrect? > > Reminding myself of what these four do, it looks that you're right and > that the state of indisvalid is not going to change what they report. > > Still, I'd like to agree with Tom's point to be more conservative and > check also for indisvalid which is what the planner does. These > functions will be used in SELECTs, and one thing that worries me is > that checks based on indisready may get copy-pasted somewhere else, > leading to incorrect results where they get copied. (indisready && > !indisvalid) is a "short"-term combination in a concurrent build > process, as well (depends on how long one waits for the old snapshots > before switching indisvalid, but that's way shorter than the period of > time where the built indexes remain valid). Neither choice would harm the user experience in an important way, so I've followed your preference in the attached patch.
Вложения
On Sun, Oct 22, 2023 at 02:02:45PM -0700, Noah Misch wrote: > - /* OK, do it */ > - brinsummarize(indexRel, heapRel, heapBlk, true, &numSummarized, NULL); > + /* see gin_clean_pending_list() */ > + if (indexRel->rd_index->indisvalid) > + brinsummarize(indexRel, heapRel, heapBlk, true, &numSummarized, NULL); > + else > + ereport(DEBUG1, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("index \"%s\" is not valid", > + RelationGetRelationName(indexRel)))); brinsummarize() could return 0 even for a valid index, and we would not be able to make the difference with an invalid index. Perhaps you are right and this is not a big deal in practice to do as you are suggesting. -- Michael