Thread: PANIC during crash recovery of a recently promoted standby
Hello,
I recently investigated a problem where a standby is promoted to be the new master. The promoted standby crashes shortly thereafter for whatever reason. Upon running crash recovery, the promoted standby (now master) PANICs with a message such as:
PANIC,XX000,"WAL contains references to invalid pages",,,,,,,," XLogCheckInvalidPages, xlogutils.c:242",""
After investigation, I was able to create a reproduction scenario for this problem. The attached TAP test (thanks to Alvaro for converting my bash script to a TAP test) demonstrates the problem. The test is probably sensitive to timing, but it reproduces the problem consistently, at least at my end. While the original report was for 9.6, I can reproduce it on master, so it probably affects all supported releases.
Investigations point to a possible bug where we fail to update minRecoveryPoint after completing the ongoing restart point upon promotion. IMV, after promotion the new master must always recover to the end of the WAL to ensure that all changes are applied correctly. But what we have instead is that minRecoveryPoint remains set to a prior location because of this:
    /*
     * Update pg_control, using current time.  Check that it still shows
     * IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing;
     * this is a quick hack to make sure nothing really bad happens if somehow
     * we get here after the end-of-recovery checkpoint.
     */
    LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
    if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY &&
        ControlFile->checkPointCopy.redo < lastCheckPoint.redo)
    {
        ControlFile->checkPoint = lastCheckPointRecPtr;
        ControlFile->checkPointCopy = lastCheckPoint;
        ControlFile->time = (pg_time_t) time(NULL);

        /*
         * Ensure minRecoveryPoint is past the checkpoint record.  Normally,
         * this will have happened already while writing out dirty buffers,
         * but not necessarily - e.g. because no buffers were dirtied.  We do
         * this because a non-exclusive base backup uses minRecoveryPoint to
         * determine which WAL files must be included in the backup, and the
         * file (or files) containing the checkpoint record must be included,
         * at a minimum.  Note that for an ordinary restart of recovery there's
         * no value in having the minimum recovery point any earlier than this
         * anyway, because redo will begin just after the checkpoint record.
         */
        if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr)
        {
            ControlFile->minRecoveryPoint = lastCheckPointEndPtr;
            ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID;

            /* update local copy */
            minRecoveryPoint = ControlFile->minRecoveryPoint;
            minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
        }
        if (flags & CHECKPOINT_IS_SHUTDOWN)
            ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
        UpdateControlFile();
    }
    LWLockRelease(ControlFileLock);
After promotion, minRecoveryPoint is only updated (cleared) when the first regular checkpoint completes. If a crash happens before that, we will run crash recovery with a stale minRecoveryPoint, which results in the PANIC that we diagnosed. The test case was written to reproduce the issue as reported to us; thus it TRUNCATEs and extends the table at hand after promotion. The crash shortly thereafter leaves the pages in an uninitialised state because the shared buffers have not yet been flushed to disk.
During crash recovery, we see uninitialised pages for the WAL records written before the promotion. These pages are remembered, and we expect to see either a DROP TABLE or a TRUNCATE WAL record before minRecoveryPoint is reached. But since minRecoveryPoint still points to a WAL location prior to the TRUNCATE operation, crash recovery hits minRecoveryPoint before seeing the TRUNCATE WAL record. That results in the PANIC.
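To make that bookkeeping easier to follow, here is a deliberately simplified, standalone C model of the invalid-page tracking (my own illustration for this thread, not the real code, which lives in xlogutils.c and uses a hash table keyed by relation and block): pages referenced by WAL but found missing or uninitialised are remembered, a later drop/truncate record forgets them, and anything still remembered when replay reaches the consistency point triggers the PANIC.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>

/* One remembered reference to a page that did not exist when replayed. */
typedef struct InvalidPage
{
    int         relid;          /* toy stand-in for RelFileNode */
    uint32_t    blkno;
    bool        forgotten;      /* cleared by a later drop/truncate record */
} InvalidPage;

static InvalidPage invalid_pages[16];
static int  ninvalid = 0;

/* Redo touched a page that is missing or all-zeroes: remember it. */
static void
remember_invalid_page(int relid, uint32_t blkno)
{
    invalid_pages[ninvalid++] = (InvalidPage) {relid, blkno, false};
}

/* Redo replayed a drop or truncate covering the relation: forget it. */
static void
forget_invalid_pages(int relid)
{
    for (int i = 0; i < ninvalid; i++)
        if (invalid_pages[i].relid == relid)
            invalid_pages[i].forgotten = true;
}

/* Replay reached minRecoveryPoint: nothing may remain unexplained. */
static void
check_invalid_pages(void)
{
    for (int i = 0; i < ninvalid; i++)
        if (!invalid_pages[i].forgotten)
        {
            fprintf(stderr, "PANIC: WAL contains references to invalid pages\n");
            exit(1);
        }
    printf("consistent state reached\n");
}

int
main(void)
{
    remember_invalid_page(16384, 42);   /* record replayed before the TRUNCATE */

    /*
     * With a stale minRecoveryPoint, replay stops before the TRUNCATE record,
     * so forget_invalid_pages() never runs and the check below PANICs.
     */
    check_invalid_pages();
    return 0;
}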
I propose that we should always clear the minRecoveryPoint after promotion to ensure that crash recovery always runs to the end if a just-promoted standby crashes before completing its first regular checkpoint. A WIP patch is attached.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, May 10, 2018 at 10:52:12AM +0530, Pavan Deolasee wrote:
> I propose that we should always clear the minRecoveryPoint after promotion
> to ensure that crash recovery always runs to the end if a just-promoted
> standby crashes before completing its first regular checkpoint. A WIP patch
> is attached.

I have been playing with your patch and upgraded the test to check as well for cascading standbys. We could use that in the final patch. That's definitely something to add in the recovery test suite, and the sleep phases should be replaced by waits on replay and/or flush.

Still, that approach looks sensitive to me. A restart point could be running while the end-of-recovery record is inserted, so your patch could update minRecoveryPoint to InvalidXLogRecPtr, and then a restart point would happily update again the control file's minRecoveryPoint to lastCheckPointEndPtr because it would see that the former is older than lastCheckPointEndPtr (let's not forget that InvalidXLogRecPtr is 0), so you could still crash on invalid pages?

I need to think a bit more about that stuff, but one idea would be to use a special state in the control file to mark it as ending recovery, this way we would control race conditions with restart points.

--
Michael
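To see concretely why the zero value matters, here is a tiny standalone illustration (my own sketch for this thread, not code from either patch) of the comparison at the end of CreateRestartPoint: once minRecoveryPoint has been cleared to InvalidXLogRecPtr, which is simply 0, any valid lastCheckPointEndPtr compares as newer and overwrites it again.

#include <stdio.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

#define InvalidXLogRecPtr ((XLogRecPtr) 0)

int
main(void)
{
    /* Promotion has just cleared minRecoveryPoint in the control file ... */
    XLogRecPtr  minRecoveryPoint = InvalidXLogRecPtr;

    /* ... while a concurrent restart point finishes with this end pointer. */
    XLogRecPtr  lastCheckPointEndPtr = 0x31BC2B0;   /* hypothetical LSN */

    /* The same test as in CreateRestartPoint: 0 is "older" than any valid LSN. */
    if (minRecoveryPoint < lastCheckPointEndPtr)
    {
        minRecoveryPoint = lastCheckPointEndPtr;
        printf("minRecoveryPoint silently set back to 0/%lX\n",
               (unsigned long) minRecoveryPoint);
    }
    return 0;
}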
Michael Paquier wrote:
> On Thu, May 10, 2018 at 10:52:12AM +0530, Pavan Deolasee wrote:
> > I propose that we should always clear the minRecoveryPoint after promotion
> > to ensure that crash recovery always runs to the end if a just-promoted
> > standby crashes before completing its first regular checkpoint. A WIP patch
> > is attached.
>
> I have been playing with your patch and upgraded the test to check as
> well for cascading standbys. We could use that in the final patch.
> That's definitely something to add in the recovery test suite, and the
> sleep phases should be replaced by waits on replay and/or flush.
>
> Still, that approach looks sensitive to me. A restart point could be
> running while the end-of-recovery record is inserted, so your patch
> could update minRecoveryPoint to InvalidXLogRecPtr, and then a restart
> point would happily update again the control file's minRecoveryPoint to
> lastCheckPointEndPtr because it would see that the former is older than
> lastCheckPointEndPtr (let's not forget that InvalidXLogRecPtr is 0), so
> you could still crash on invalid pages?

Yeah, I had this exact comment, but I was unable to come up with a test case that would cause a problem.

> I need to think a bit more about that stuff, but one idea would be to
> use a special state in the control file to mark it as ending recovery,
> this way we would control race conditions with restart points.

Hmm. Can we change the control file in released branches? (It should be possible to make the new server understand both old and new formats, but I think this is breaking new ground and it looks easy to introduce more bugs there.)

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, May 11, 2018 at 12:09:58PM -0300, Alvaro Herrera wrote:
> Yeah, I had this exact comment, but I was unable to come up with a test
> case that would cause a problem.

pg_ctl promote would wait for the control file to be updated, so you cannot use it in the TAP tests to trigger the promotion. Still, I think I found one after waking up. Please note I have not tested it:
- Use a custom trigger file and then trigger promotion with a signal.
- Use a sleep command in recovery_end_command to increase the window, as what matters is sleeping after CreateEndOfRecoveryRecord updates the control file.
- Issue a restart point on the standby, which will update the control file.
- Stop the standby with immediate mode.
- Start the standby; it should see unreferenced pages.

> Hmm. Can we change the control file in released branches? (It should
> be possible to make the new server understand both old and new formats,
> but I think this is breaking new ground and it looks easy to introduce
> more bugs there.)

We definitely can't, even if the new value is added at the end of DBState :(

A couple of wild ideas, not tested, again after waking up:
1) We could abuse existing values by using DB_IN_CRASH_RECOVERY or DB_STARTUP. Still, that's not completely accurate, as the cluster may be open for business as a hot standby.
2) Invent a new special value for XLogRecPtr, normally impossible to reach, which uses high bits.

--
Michael
On Sat, May 12, 2018 at 07:41:33AM +0900, Michael Paquier wrote:
> pg_ctl promote would wait for the control file to be updated, so you
> cannot use it in the TAP tests to trigger the promotion. Still, I think
> I found one after waking up. Please note I have not tested it:
> - Use a custom trigger file and then trigger promotion with a signal.
> - Use a sleep command in recovery_end_command to increase the window, as
>   what matters is sleeping after CreateEndOfRecoveryRecord updates the
>   control file.
> - Issue a restart point on the standby, which will update the control
>   file.
> - Stop the standby with immediate mode.
> - Start the standby; it should see unreferenced pages.

I have been looking at that this morning, and I have actually been able to create a test case where minRecoveryPoint goes backwards using Pavan's patch. Using a sleep in recovery_end_command proved to not be enough, so I had to patch the backend with a couple of sleeps and some extra processing, mainly:
- One sleep in CreateRestartPoint to make a restart point wait before updating the control file, which I set at 5s.
- One sleep just after calling CreateEndOfRecoveryRecord, which has been set at 20s.
- Trigger an asynchronous checkpoint using IPC::Run::start.
- Trigger a promotion with a trigger file and SIGUSR2 sent to the postmaster.

The rest of the test is crafted with some magic wait numbers and adds some logs to ease the monitoring of the issue. In order to get a crash, I think that you would need to crash the backend after creating the last WAL records which generate the invalid page references, and also slow down the last restart point which makes minRecoveryPoint go backwards, which is, err... complicated to make deterministic. Still, if you apply the patch attached you would see log entries on the standby as follows:

2018-05-14 12:24:15.065 JST [17352] LOG:  selected new timeline ID: 2
2018-05-14 12:24:15.074 JST [17352] LOG:  archive recovery complete
2018-05-14 12:24:15.074 JST [17352] WARNING:  CreateEndOfRecoveryRecord: minRecoveryPoint is 0/32C4258 before update
2018-05-14 12:24:15.074 JST [17352] WARNING:  CreateEndOfRecoveryRecord: minRecoveryPoint is 0/0 after update
2018-05-14 12:24:17.067 JST [17353] WARNING:  checkPointCopy.redo = 0/30B3D70, lastCheckPoint.redo = 0/31BC208
2018-05-14 12:24:17.067 JST [17353] WARNING:  CreateRestartPoint: minRecoveryPoint is 0/0 before update, lastCheckPointEndPtr is 0/31BC2B0
2018-05-14 12:24:17.067 JST [17353] WARNING:  CreateRestartPoint: minRecoveryPoint is 0/31BC2B0 after update

So minRecoveryPoint can definitely go backwards here, which is not good. Attached is a patch which includes Pavan's fix on top of the test case I crafted from what is upthread. You just need to apply it on HEAD.

--
Michael
On Fri, May 11, 2018 at 8:39 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Michael Paquier wrote:
> On Thu, May 10, 2018 at 10:52:12AM +0530, Pavan Deolasee wrote:
> > I propose that we should always clear the minRecoveryPoint after promotion
> > to ensure that crash recovery always runs to the end if a just-promoted
> > standby crashes before completing its first regular checkpoint. A WIP patch
> > is attached.
>
> I have been playing with your patch and upgraded the test to check as
> well for cascading standbys. We could use that in the final patch.
> That's definitely something to add in the recovery test suite, and the
> sleep phases should be replaced by waits on replay and/or flush.
>
> Still, that approach looks sensitive to me. A restart point could be
> running while the end-of-recovery record is inserted, so your patch
> could update minRecoveryPoint to InvalidXLogRecPtr, and then a restart
> point would happily update again the control file's minRecoveryPoint to
> lastCheckPointEndPtr because it would see that the former is older than
> lastCheckPointEndPtr (let's not forget that InvalidXLogRecPtr is 0), so
> you could still crash on invalid pages?
Yeah, I had this exact comment, but I was unable to come up with a test
case that would cause a problem.
Looks like I didn't understand Alvaro's comment when he mentioned it to me off-list. But I now see what Michael and Alvaro mean, and that indeed seems like a problem. I was thinking that the test for (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) would ensure that minRecoveryPoint can't be updated after the standby is promoted. While that's true once the state is DB_IN_PRODUCTION, a restart point may finish after we have written the end-of-recovery record but before we're in production, and thus minRecoveryPoint may be set again.
> I need to think a bit more about that stuff, but one idea would be to
> use a special state in the control file to mark it as ending recovery,
> this way we would control race conditions with restart points.
Hmm. Can we change the control file in released branches? (It should
be possible to make the new server understand both old and new formats,
but I think this is breaking new ground and it looks easy to introduce
more bugs there.)
Can't we just remember that in shared memory state instead of writing to the control file? So if we've already performed end-of-recovery, we don't update the minRecoveryPoint when RestartPoint completes.
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, May 14, 2018 at 01:14:22PM +0530, Pavan Deolasee wrote:
> Looks like I didn't understand Alvaro's comment when he mentioned it to me
> off-list. But I now see what Michael and Alvaro mean, and that indeed seems
> like a problem. I was thinking that the test for (ControlFile->state ==
> DB_IN_ARCHIVE_RECOVERY) would ensure that minRecoveryPoint can't be updated
> after the standby is promoted. While that's true once the state is
> DB_IN_PRODUCTION, a restart point may finish after we have written the
> end-of-recovery record but before we're in production, and thus
> minRecoveryPoint may be set again.

Yeah, this is something I considered first as well, but I was not confident enough that setting minRecoveryPoint to InvalidXLogRecPtr was actually a safe thing for timeline switches.

So I have spent a good portion of today testing and playing with it to be confident enough that this was right, and I have finished with the attached. The patch adds a new flag to XLogCtl which marks whether the control file has been updated after the end-of-recovery record has been written, so that minRecoveryPoint does not get updated because of a restart point running in parallel.

I have also reworked the test case you sent, removing the manual sleeps and replacing them with correct wait points. There is also no point in waiting after promotion, as pg_ctl promote implies a wait. Another important thing is that you need to use wal_log_hints = off to see a crash, which is something that allows_streaming actually enables.

Comments are welcome.

--
Michael
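As a rough, self-contained sketch of the kind of guard described here (my own illustration; the struct and field names below are hypothetical, not the ones used in the actual patch): the startup process sets a shared flag once the end-of-recovery record has updated the control file, and the restart-point path skips the minRecoveryPoint update when that flag is set.

#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Toy stand-ins for XLogCtl and the control file (names are hypothetical). */
static struct
{
    bool        controlFileUpdatedAtEndOfRecovery;
} XLogCtlToy;

static struct
{
    XLogRecPtr  minRecoveryPoint;
} ControlFileToy;

/* Startup process: after writing the end-of-recovery record. */
static void
end_of_recovery(void)
{
    ControlFileToy.minRecoveryPoint = 0;    /* cleared: recover to end of WAL */
    XLogCtlToy.controlFileUpdatedAtEndOfRecovery = true;
}

/* Checkpointer: tail end of a restart point. */
static void
restart_point_finish(XLogRecPtr lastCheckPointEndPtr)
{
    /* Skip the update if recovery has already ended under us. */
    if (XLogCtlToy.controlFileUpdatedAtEndOfRecovery)
        return;
    if (ControlFileToy.minRecoveryPoint < lastCheckPointEndPtr)
        ControlFileToy.minRecoveryPoint = lastCheckPointEndPtr;
}

int
main(void)
{
    end_of_recovery();
    restart_point_finish(0x31BC2B0);    /* hypothetical LSN */
    printf("minRecoveryPoint = %lX\n",  /* stays 0: no backward move */
           (unsigned long) ControlFileToy.minRecoveryPoint);
    return 0;
}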
Hello.

At Thu, 24 May 2018 16:57:07 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20180524075707.GE15445@paquier.xyz>
> On Mon, May 14, 2018 at 01:14:22PM +0530, Pavan Deolasee wrote:
> > Looks like I didn't understand Alvaro's comment when he mentioned it to me
> > off-list. But I now see what Michael and Alvaro mean, and that indeed seems
> > like a problem. I was thinking that the test for (ControlFile->state ==
> > DB_IN_ARCHIVE_RECOVERY) would ensure that minRecoveryPoint can't be updated
> > after the standby is promoted. While that's true once the state is
> > DB_IN_PRODUCTION, a restart point may finish after we have written the
> > end-of-recovery record but before we're in production, and thus
> > minRecoveryPoint may be set again.
>
> Yeah, this is something I considered first as well, but I was not
> confident enough that setting minRecoveryPoint to InvalidXLogRecPtr
> was actually a safe thing for timeline switches.
>
> So I have spent a good portion of today testing and playing with it to
> be confident enough that this was right, and I have finished with the
> attached. The patch adds a new flag to XLogCtl which marks whether the
> control file has been updated after the end-of-recovery record has been
> written, so that minRecoveryPoint does not get updated because of a
> restart point running in parallel.
>
> I have also reworked the test case you sent, removing the manual sleeps
> and replacing them with correct wait points. There is also no point in
> waiting after promotion, as pg_ctl promote implies a wait. Another
> important thing is that you need to use wal_log_hints = off to see a
> crash, which is something that allows_streaming actually enables.
>
> Comments are welcome.

As the result of some poking around, my diagnosis is different from yours.

(I believe that) By definition, recovery doesn't end until the end-of-recovery checkpoint ends, so from that viewpoint I think it is wrong to clear ControlFile->minRecoveryPoint before the end.

Invalid-page checking during crash recovery is harmful rather than useless. It is done by CheckRecoveryConsistency even in crash recovery, against its expectation, because there's a case where minRecoveryPoint is valid but InArchiveRecovery is false. The two variables there seem to be in contradiction with each other.

The immediate cause of the contradiction is that StartupXLOG wrongly initializes the local minRecoveryPoint from the control file's value even under crash recovery. updateMinRecoveryPoint should also be turned off during crash recovery. The case of a crash after the last checkpoint end has been treated in UpdateMinRecoveryPoint, but it forgets to consider this case, where crash recovery has been started while the control file is still holding a valid minRecoveryPoint.

Finally, the attached patch seems to fix the issue. It passes 015_promotion.pl, and the problem doesn't happen with 014_promotion_bug.pl. It also passes the existing tests 002-014.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

From 192c36bc96135d9108c499d01016f7eb6fc28fd1 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 7 Jun 2018 16:02:41 +0900
Subject: [PATCH] Fix wrong behavior during crash recovery

After being killed before the end-of-recovery checkpoint ends, the server
restarts in crash recovery mode. However, it can wrongly perform
invalid-page checks of WAL records, which is not expected during crash
recovery, and then result in a PANIC. This patch prevents the wrong
checking by suppressing needless updates of minRecoveryPoint during crash
recovery.
---
 src/backend/access/transam/xlog.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index adbd6a2126..283c26cb6c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -821,8 +821,11 @@ static XLogSource XLogReceiptSource = 0;    /* XLOG_FROM_* code */
 static XLogRecPtr ReadRecPtr;   /* start of last record read */
 static XLogRecPtr EndRecPtr;    /* end+1 of last record read */
 
-static XLogRecPtr minRecoveryPoint; /* local copy of
-                                     * ControlFile->minRecoveryPoint */
+/*
+ * local copy of ControlFile->minRecoveryPoint.  InvalidXLogRecPtr means we're
+ * in crash recovery
+ */
+static XLogRecPtr minRecoveryPoint = InvalidXLogRecPtr;
 static TimeLineID minRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
@@ -2717,14 +2720,7 @@ UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force)
     minRecoveryPoint = ControlFile->minRecoveryPoint;
     minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
 
-    /*
-     * An invalid minRecoveryPoint means that we need to recover all the WAL,
-     * i.e., we're doing crash recovery.  We never modify the control file's
-     * value in that case, so we can short-circuit future checks here too.
-     */
-    if (minRecoveryPoint == 0)
-        updateMinRecoveryPoint = false;
-    else if (force || minRecoveryPoint < lsn)
+    if (force || minRecoveryPoint < lsn)
     {
         XLogRecPtr  newMinRecoveryPoint;
         TimeLineID  newMinRecoveryPointTLI;
@@ -6841,6 +6837,7 @@ StartupXLOG(void)
                             ControlFile->checkPointCopy.ThisTimeLineID,
                             recoveryTargetTLI)));
         ControlFile->state = DB_IN_CRASH_RECOVERY;
+        updateMinRecoveryPoint = false;
     }
     ControlFile->checkPoint = checkPointLoc;
     ControlFile->checkPointCopy = checkPoint;
@@ -6852,6 +6849,10 @@ StartupXLOG(void)
             ControlFile->minRecoveryPoint = checkPoint.redo;
             ControlFile->minRecoveryPointTLI = checkPoint.ThisTimeLineID;
         }
+
+        /* also initialize our local copies */
+        minRecoveryPoint = ControlFile->minRecoveryPoint;
+        minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
     }
 
     /*
@@ -6889,10 +6890,6 @@ StartupXLOG(void)
     /* No need to hold ControlFileLock yet, we aren't up far enough */
     UpdateControlFile();
 
-    /* initialize our local copy of minRecoveryPoint */
-    minRecoveryPoint = ControlFile->minRecoveryPoint;
-    minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
-
     /*
      * Reset pgstat data, because it may be invalid after recovery.
      */
@@ -7858,6 +7855,8 @@ CheckRecoveryConsistency(void)
     if (XLogRecPtrIsInvalid(minRecoveryPoint))
         return;
 
+    Assert(InArchiveRecovery);
+
     /*
      * assume that we are called in the startup process, and hence don't need
      * a lock to read lastReplayedEndRecPtr
-- 
2.16.3
On Thu, Jun 07, 2018 at 07:58:29PM +0900, Kyotaro HORIGUCHI wrote:
> Invalid-page checking during crash recovery is harmful rather than
> useless. It is done by CheckRecoveryConsistency even in crash recovery,
> against its expectation, because there's a case where minRecoveryPoint
> is valid but InArchiveRecovery is false. The two variables there seem
> to be in contradiction with each other.
>
> The immediate cause of the contradiction is that StartupXLOG wrongly
> initializes the local minRecoveryPoint from the control file's value
> even under crash recovery. updateMinRecoveryPoint should also be turned
> off during crash recovery. The case of a crash after the last
> checkpoint end has been treated in UpdateMinRecoveryPoint, but it
> forgets to consider this case, where crash recovery has been started
> while the control file is still holding a valid minRecoveryPoint.

Hmm. This patch looks interesting, and those issues need a very careful lookup. This is one of those things which should be fixed as part of the extra CF, so I am planning to look at it in detail very soon, perhaps by the end of this week...

By the way, I have noticed that nothing was registered in the CF app:
https://commitfest.postgresql.org/18/1680/

I have also added a bullet point to the v11 open items page for older bugs.

--
Michael
On Thu, Jun 07, 2018 at 07:58:29PM +0900, Kyotaro HORIGUCHI wrote:
> (I believe that) By definition, recovery doesn't end until the
> end-of-recovery checkpoint ends, so from that viewpoint I think it is
> wrong to clear ControlFile->minRecoveryPoint before the end.
>
> Invalid-page checking during crash recovery is harmful rather than
> useless. It is done by CheckRecoveryConsistency even in crash recovery,
> against its expectation, because there's a case where minRecoveryPoint
> is valid but InArchiveRecovery is false. The two variables there seem
> to be in contradiction with each other.
>
> The immediate cause of the contradiction is that StartupXLOG wrongly
> initializes the local minRecoveryPoint from the control file's value
> even under crash recovery. updateMinRecoveryPoint should also be turned
> off during crash recovery. The case of a crash after the last
> checkpoint end has been treated in UpdateMinRecoveryPoint, but it
> forgets to consider this case, where crash recovery has been started
> while the control file is still holding a valid minRecoveryPoint.

I have been digesting your proposal and I reviewed it, and I think that what you are proposing here is on the right track. However, the updates around minRecoveryPoint and minRecoveryPointTLI ought to be more consistent, because that could cause future issues. I have spotted two bugs where I think the problem is not fixed: when replaying a WAL record XLOG_PARAMETER_CHANGE, minRecoveryPoint and minRecoveryPointTLI would still get updated from the control file values, which can still lead to failures as CheckRecoveryConsistency could still happily trigger a PANIC, so I think that we had better maintain those values consistent as long as crash recovery runs. And XLogNeedsFlush() also has a similar problem.

Note that there is as well the case where the startup process switches from crash recovery to archive recovery, in which case the update of the local copies has to happen once the switch is done. Your patch covers that by just updating updateMinRecoveryPoint each time crash recovery happens, but that's not completely consistent, and it seems that it also misses the fact that updateMinRecoveryPoint needs to be switched back to true, as the startup process can update the control file. Actually, updateMinRecoveryPoint needs to be switched back to true in that case, or the startup process would not be able to update minRecoveryPoint when it calls XLogFlush, for example.

The idea of trying to get rid of updateMinRecoveryPoint has crossed my mind, but that's not wise, as its default value allows the checkpointer to update minRecoveryPoint when started, which also has to happen once the startup process gets out of recovery and tells the postmaster to start the checkpointer. For backends as well, that's a sane default. There is also no point in taking ControlFileLock when checking if the local copy of minRecoveryPoint is valid or not, so this can be bypassed. The assertion in CheckRecoveryConsistency is definitely a good idea, as this should only be called by the startup process, so we can keep it.

In the attached, I have fixed the issues I found and added the test case which should be included in the final commit. Its concept is pretty simple: the idea is to keep the local copy of minRecoveryPoint at InvalidXLogRecPtr as long as crash recovery runs, and to switch it back to normal if there is a switch to archive recovery after crash recovery.

This is not really a complicated patch, and it took a lot of energy from me the last couple of days per the nature of the many scenarios to think about... So an extra pair of eyes from another committer would be welcome. I am letting that cool down for a couple of days now.

--
Michael
On Fri, Jun 22, 2018 at 9:28 AM, Michael Paquier <michael@paquier.xyz> wrote:
Thanks,
This is not really a complicated patch, and it took a lot of energy from
me the last couple of days per the nature of the many scenarios to think
about...
Thanks for the efforts. It wasn't an easy bug to chase to begin with. So I am not surprised there were additional problems that I missed.
So an extra pair of eyes from another committer would be
welcome. I am letting that cool down for a couple of days now.
I am not a committer, so I don't know if my pair of eyes counts, but FWIW the patch looks good to me except for a couple of minor points.
+/*
+ * Local copies of equivalent fields in the control file. When running
+ * crash recovery, minRecoveryPoint is set to InvalidXLogRecPtr as we
+ * expect to replay all the WAL available, and updateMinRecoveryPoint is
+ * switched to false to prevent any updates while replaying records.
+ * Those values are kept consistent as long as crash recovery runs.
+ */
static XLogRecPtr minRecoveryPoint; /* local copy of
* ControlFile->minRecoveryPoint */
The inline comment looks unnecessary now that we have a comment at the top.
@@ -4266,6 +4276,12 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
minRecoveryPoint = ControlFile->minRecoveryPoint;
minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+ /*
+ * The startup process can update its local copy of
+ * minRecoveryPoint from that point.
+ */
s/that/this
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jun 22, 2018 at 10:08:24AM +0530, Pavan Deolasee wrote:
> On Fri, Jun 22, 2018 at 9:28 AM, Michael Paquier <michael@paquier.xyz> wrote:
>> So an extra pair of eyes from another committer would be
>> welcome. I am letting that cool down for a couple of days now.
>
> I am not a committer, so I don't know if my pair of eyes counts, but FWIW the
> patch looks good to me except for a couple of minor points.

Thanks for grabbing some time, Pavan. Any help is welcome!

> +/*
> + * Local copies of equivalent fields in the control file. When running
> + * crash recovery, minRecoveryPoint is set to InvalidXLogRecPtr as we
> + * expect to replay all the WAL available, and updateMinRecoveryPoint is
> + * switched to false to prevent any updates while replaying records.
> + * Those values are kept consistent as long as crash recovery runs.
> + */
> static XLogRecPtr minRecoveryPoint; /* local copy of
>                                      * ControlFile->minRecoveryPoint */
>
> The inline comment looks unnecessary now that we have a comment at the top.

I'll fix that.

> @@ -4266,6 +4276,12 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
>     minRecoveryPoint = ControlFile->minRecoveryPoint;
>     minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
>
> +   /*
> +    * The startup process can update its local copy of
> +    * minRecoveryPoint from that point.
> +    */
>
> s/that/this

This one too.

--
Michael
Hello, sorry for the absence, and I have looked at the second patch.

At Fri, 22 Jun 2018 13:45:21 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20180622044521.GC5215@paquier.xyz>
> On Fri, Jun 22, 2018 at 10:08:24AM +0530, Pavan Deolasee wrote:
> > On Fri, Jun 22, 2018 at 9:28 AM, Michael Paquier <michael@paquier.xyz> wrote:
> >> So an extra pair of eyes from another committer would be
> >> welcome. I am letting that cool down for a couple of days now.
> >
> > I am not a committer, so I don't know if my pair of eyes counts, but FWIW the
> > patch looks good to me except for a couple of minor points.
>
> Thanks for grabbing some time, Pavan. Any help is welcome!

In a previous mail:
> I have spotted two
> bugs where I think the problem is not fixed: when replaying a WAL record
> XLOG_PARAMETER_CHANGE, minRecoveryPoint and minRecoveryPointTLI would
> still get updated from the control file values, which can still lead to
> failures as CheckRecoveryConsistency could still happily trigger a
> PANIC, so I think that we had better maintain those values consistent as

The fix of StartupXLOG, CheckRecoveryConsistency, ReadRecord and xlog_redo looks (functionally, as mentioned below) fine.

> long as crash recovery runs. And XLogNeedsFlush() also has a similar
> problem.

Here, on the other hand, this patch turns off updateMinRecoveryPoint if minRecoveryPoint is invalid when RecoveryInProgress() == true. However, RecoveryInProgress() == true means archive recovery is already started, and minRecoveryPoint *should* be updated in that condition. Actually, minRecoveryPoint is updated just below. If this is really the right thing, I think that some explanation of the reason is required here.

In xlog_redo there is still a "minRecoveryPoint != 0", which ought to be described as "!XLogRecPtrIsInvalid(minRecoveryPoint)". (It seems to be the only one. Double negation is a bit uneasy, but there are many instances of this kind of coding.)

# I'll go all-out next week.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Jun 22, 2018 at 02:34:02PM +0900, Kyotaro HORIGUCHI wrote:
> Hello, sorry for the absence, and I have looked at the second patch.

Thanks for the review!

> At Fri, 22 Jun 2018 13:45:21 +0900, Michael Paquier
> <michael@paquier.xyz> wrote in <20180622044521.GC5215@paquier.xyz>
>> long as crash recovery runs. And XLogNeedsFlush() also has a similar
>> problem.
>
> Here, on the other hand, this patch turns off updateMinRecoveryPoint if
> minRecoveryPoint is invalid when RecoveryInProgress() == true. However,
> RecoveryInProgress() == true means archive recovery is already started,
> and minRecoveryPoint *should* be updated in that condition. Actually,
> minRecoveryPoint is updated just below. If this is really the right
> thing, I think that some explanation of the reason is required here.

LocalRecoveryInProgress is just a local copy of SharedRecoveryInProgress, so RecoveryInProgress also returns true if crash recovery is running. But perhaps I am missing what you mean? The point here is that redo can call XLogNeedsFlush, no?

> In xlog_redo there is still a "minRecoveryPoint != 0", which ought to be
> described as "!XLogRecPtrIsInvalid(minRecoveryPoint)". (It seems to be
> the only one. Double negation is a bit uneasy, but there are many
> instances of this kind of coding.)

It is possible to use a comparison with InvalidXLogRecPtr directly instead of a double negation.

> # I'll go all-out next week.

Enjoy your vacations!

--
Michael
On Fri, Jun 22, 2018 at 03:25:48PM +0900, Michael Paquier wrote:
> On Fri, Jun 22, 2018 at 02:34:02PM +0900, Kyotaro HORIGUCHI wrote:
>> Hello, sorry for the absence, and I have looked at the second patch.
>
> Thanks for the review!

I have been spending some time testing and torturing the patch for all stable branches, and I have finished with the set of patches attached. My testing has involved using the TAP suite, where I have actually, and roughly, backported the v10 infrastructure down to older versions, which required tweaking Makefile.global.in and finding out again that pg_ctl start has switched to wait mode by default in 10.

I have spent a bit of time testing this on HEAD, 10 and 9.6. For 9.5, 9.4 and 9.3 I have reproduced the failure and tested the patch, but I lacked time to perform more tests. The patch set for 9.3~9.5 applies without conflict across the 3 branches. 9.6 has a conflict in a comment, and v10 had an extra comment conflict.

Feel free to have a look; I am not completely done with this stuff and I'll work more tomorrow on checking 9.3~9.5.

--
Michael
Adding Heikki and Andres in CC here for awareness.

On Wed, Jun 27, 2018 at 05:29:38PM +0900, Michael Paquier wrote:
> I have spent a bit of time testing this on HEAD, 10 and 9.6. For 9.5,
> 9.4 and 9.3 I have reproduced the failure and tested the patch, but I
> lacked time to perform more tests. The patch set for 9.3~9.5 applies
> without conflict across the 3 branches. 9.6 has a conflict in a
> comment, and v10 had an extra comment conflict.
>
> Feel free to have a look; I am not completely done with this stuff and
> I'll work more tomorrow on checking 9.3~9.5.

And I have been able to spend the time I wanted on this patch series, testing 9.3 to 9.5. Attached are a couple of patches you can use to reproduce the failures on all the branches:
- For master and 10, the tests are included in the patch and are proposed for commit.
- On 9.6, I had to tweak the TAP scripts, as pg_ctl start has switched to use wait mode by default.
- On 9.5, there is a tweak to src/Makefile.global.in which cleans up tmp_check, and a couple of GUCs are not compatible.
- On 9.4, I had to tweak src/Makefile.global.in so that the temporary installation path is correct. Again, some GUCs had to be tweaked.
- On 9.3, there is no TAP infrastructure, so I tweaked src/test/recovery/Makefile to be able to run the tests.

I have also created a bash script which emulates what the TAP test does, which is attached. Apparently because of some timing reasons, I have not been able to reproduce the problem with it. Anyway, running (and actually sort of back-porting) the TAP suite so that the problematic test case can be run is possible with the sets attached, and it shows the failure, so we can use that.

Thoughts? I would love more input about the patch concept.

--
Michael
Hello.

At Fri, 22 Jun 2018 15:25:48 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20180622062548.GE5215@paquier.xyz>
> On Fri, Jun 22, 2018 at 02:34:02PM +0900, Kyotaro HORIGUCHI wrote:
> > Hello, sorry for the absence, and I have looked at the second patch.
>
> Thanks for the review!
>
> > At Fri, 22 Jun 2018 13:45:21 +0900, Michael Paquier
> > <michael@paquier.xyz> wrote in <20180622044521.GC5215@paquier.xyz>
> >> long as crash recovery runs. And XLogNeedsFlush() also has a similar
> >> problem.
> >
> > Here, on the other hand, this patch turns off updateMinRecoveryPoint if
> > minRecoveryPoint is invalid when RecoveryInProgress() == true. However,
> > RecoveryInProgress() == true means archive recovery is already started,
> > and minRecoveryPoint *should* be updated in that condition. Actually,
> > minRecoveryPoint is updated just below. If this is really the right
> > thing, I think that some explanation of the reason is required here.
>
> LocalRecoveryInProgress is just a local copy of SharedRecoveryInProgress,
> so RecoveryInProgress also returns true if crash recovery is running.
> But perhaps I am missing what you mean? The point here is that redo can
> call XLogNeedsFlush, no?

My concern at the time was the necessity of turning off updateMinRecoveryPoint on the fly. (The previous comment seems a bit confused, sorry.) When minRecoveryPoint is invalid, there are only two possible cases: we may be at the very beginning of archive recovery, or we may be running crash recovery. In the latter case, we have detected crash recovery before redo starts. So we can turn off updateMinRecoveryPoint immediately; no further check is needed, and it is (I think) easier to understand.

> > In xlog_redo there is still a "minRecoveryPoint != 0", which ought to be
> > described as "!XLogRecPtrIsInvalid(minRecoveryPoint)". (It seems to be
> > the only one. Double negation is a bit uneasy, but there are many
> > instances of this kind of coding.)
>
> It is possible to use a comparison with InvalidXLogRecPtr directly
> instead of a double negation.

I'm not sure whether it is an abstraction of the invalid value or just a shortcut for the value. That's right if it's the latter. (There are several places where an invalid LSN is assumed to be smaller than any valid value in the patch.)

The second diff is the difference between the first patch and promote_panic_master.diff.

On further thought, as we confirmed upthread (and as the existing comments are saying), (minRecoveryPoint == 0) and !InArchiveRecovery are always equivalent, and updateMinRecoveryPoint becomes equivalent to them with the v3 patch. That is, we can just remove the variable, and the attached third patch does that. It also passes all recovery tests, including the new 015.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dcfef36591..d86137ae8b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -821,8 +821,14 @@ static XLogSource XLogReceiptSource = 0;    /* XLOG_FROM_* code */
 static XLogRecPtr ReadRecPtr;   /* start of last record read */
 static XLogRecPtr EndRecPtr;    /* end+1 of last record read */
 
-static XLogRecPtr minRecoveryPoint; /* local copy of
-                                     * ControlFile->minRecoveryPoint */
+/*
+ * Local copies of equivalent fields in the control file.  When running
+ * crash recovery, minRecoveryPoint is set to InvalidXLogRecPtr as we
+ * expect to replay all the WAL available, and updateMinRecoveryPoint is
+ * switched to false to prevent any updates while replaying records.
+ * Those values are kept consistent as long as crash recovery runs.
+ */
+static XLogRecPtr minRecoveryPoint;
 static TimeLineID minRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
@@ -2707,7 +2713,7 @@ XLogGetReplicationSlotMinimumLSN(void)
 static void
 UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force)
 {
-    /* Quick check using our local copy of the variable */
+    /* Check using our local copy of minRecoveryPoint */
     if (!updateMinRecoveryPoint || (!force && lsn <= minRecoveryPoint))
         return;
 
@@ -2717,14 +2723,7 @@ UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force)
     minRecoveryPoint = ControlFile->minRecoveryPoint;
     minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
 
-    /*
-     * An invalid minRecoveryPoint means that we need to recover all the WAL,
-     * i.e., we're doing crash recovery.  We never modify the control file's
-     * value in that case, so we can short-circuit future checks here too.
-     */
-    if (minRecoveryPoint == 0)
-        updateMinRecoveryPoint = false;
-    else if (force || minRecoveryPoint < lsn)
+    if (force || minRecoveryPoint < lsn)
     {
         XLogRecPtr  newMinRecoveryPoint;
         TimeLineID  newMinRecoveryPointTLI;
@@ -3110,7 +3109,7 @@ XLogNeedsFlush(XLogRecPtr record)
      */
     if (RecoveryInProgress())
     {
-        /* Quick exit if already known updated */
+        /* Quick exit if already known to be updated or cannot be updated */
         if (record <= minRecoveryPoint || !updateMinRecoveryPoint)
             return false;
 
@@ -3124,20 +3123,8 @@ XLogNeedsFlush(XLogRecPtr record)
         minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
         LWLockRelease(ControlFileLock);
 
-        /*
-         * An invalid minRecoveryPoint means that we need to recover all the
-         * WAL, i.e., we're doing crash recovery.  We never modify the control
-         * file's value in that case, so we can short-circuit future checks
-         * here too.
-         */
-        if (minRecoveryPoint == 0)
-            updateMinRecoveryPoint = false;
-
-        /* check again */
-        if (record <= minRecoveryPoint || !updateMinRecoveryPoint)
-            return false;
-        else
-            return true;
+        return record > minRecoveryPoint;
     }
 
     /* Quick exit if already known flushed */
@@ -4269,6 +4256,12 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
                 minRecoveryPoint = ControlFile->minRecoveryPoint;
                 minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
 
+                /*
+                 * The startup process can update its local copy of
+                 * minRecoveryPoint from this point.
+                 */
+                updateMinRecoveryPoint = true;
+
                 UpdateControlFile();
                 LWLockRelease(ControlFileLock);
 
@@ -6892,9 +6885,31 @@ StartupXLOG(void)
     /* No need to hold ControlFileLock yet, we aren't up far enough */
     UpdateControlFile();
 
-    /* initialize our local copy of minRecoveryPoint */
-    minRecoveryPoint = ControlFile->minRecoveryPoint;
-    minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+    /*
+     * Initialize our local copy of minRecoveryPoint.  When doing crash
+     * recovery we want to replay up to the end of WAL.  Particularly, in
+     * the case of a promoted standby minRecoveryPoint value in the
+     * control file is only updated after the first checkpoint.  However,
+     * if the instance crashes before the first post-recovery checkpoint
+     * is completed then recovery will use a stale location causing the
+     * startup process to think that there are still invalid page
+     * references when checking for data consistency.
+     */
+    if (InArchiveRecovery)
+    {
+        minRecoveryPoint = ControlFile->minRecoveryPoint;
+        minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+    }
+    else
+    {
+        /*
+         * We are to run crash recovery.  We shouldn't update
+         * minRecoveryPoint until crash recovery ends.
+         */
+        updateMinRecoveryPoint = false;
+        minRecoveryPoint = InvalidXLogRecPtr;
+        minRecoveryPointTLI = 0;
+    }
 
     /*
      * Reset pgstat data, because it may be invalid after recovery.
@@ -7861,6 +7876,8 @@ CheckRecoveryConsistency(void)
     if (XLogRecPtrIsInvalid(minRecoveryPoint))
         return;
 
+    Assert(InArchiveRecovery);
+
     /*
      * assume that we are called in the startup process, and hence don't need
      * a lock to read lastReplayedEndRecPtr
@@ -9949,11 +9966,16 @@ xlog_redo(XLogReaderState *record)
          * Update minRecoveryPoint to ensure that if recovery is aborted, we
          * recover back up to this point before allowing hot standby again.
          * This is important if the max_* settings are decreased, to ensure
-         * you don't run queries against the WAL preceding the change.
+         * you don't run queries against the WAL preceding the change.  The
+         * local copies cannot be updated as long as crash recovery is
+         * happening and we expect all the WAL to be replayed.
          */
-        minRecoveryPoint = ControlFile->minRecoveryPoint;
-        minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
-        if (minRecoveryPoint != 0 && minRecoveryPoint < lsn)
+        if (InArchiveRecovery)
+        {
+            minRecoveryPoint = ControlFile->minRecoveryPoint;
+            minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+        }
+        if (minRecoveryPoint != InvalidXLogRecPtr && minRecoveryPoint < lsn)
         {
             ControlFile->minRecoveryPoint = lsn;
            ControlFile->minRecoveryPointTLI = ThisTimeLineID;

2716c2716
<     /* Quick check using our local copy of the variable */
---
>     /* Check using our local copy of minRecoveryPoint */
2720,2732d2719
<     /*
<      * An invalid minRecoveryPoint means that we need to recover all the WAL,
<      * i.e., we're doing crash recovery.  We never modify the control file's
<      * value in that case, so we can short-circuit future checks here too.  The
<      * local values of minRecoveryPoint and minRecoveryPointTLI should not be
<      * updated until crash recovery finishes.
<      */
<     if (XLogRecPtrIsInvalid(minRecoveryPoint))
<     {
<         updateMinRecoveryPoint = false;
<         return;
<     }
<
3125,3133d3111
<         /*
<          * An invalid minRecoveryPoint means that we need to recover all the
<          * WAL, i.e., we're doing crash recovery.  We never modify the control
<          * file's value in that case, so we can short-circuit future checks
<          * here too.
<          */
<         if (XLogRecPtrIsInvalid(minRecoveryPoint))
<             updateMinRecoveryPoint = false;
<
6926a6905,6909
>         /*
>          * We are to run crash recovery.  We shouldn't update
>          * minRecoveryPoint until crash recovery ends.
>          */
>         updateMinRecoveryPoint = false;

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dcfef36591..8a37784653 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -821,10 +821,13 @@ static XLogSource XLogReceiptSource = 0;    /* XLOG_FROM_* code */
 static XLogRecPtr ReadRecPtr;   /* start of last record read */
 static XLogRecPtr EndRecPtr;    /* end+1 of last record read */
 
-static XLogRecPtr minRecoveryPoint; /* local copy of
-                                     * ControlFile->minRecoveryPoint */
+/*
+ * Local copies of equivalent fields in the control file.  When running
+ * crash recovery, minRecoveryPoint is set to InvalidXLogRecPtr as we
+ * expect to replay all the WAL available.
+ */
+static XLogRecPtr minRecoveryPoint;
 static TimeLineID minRecoveryPointTLI;
-static bool updateMinRecoveryPoint = true;
 
 /*
  * Have we reached a consistent database state? In crash recovery, we have
@@ -2707,8 +2710,13 @@ XLogGetReplicationSlotMinimumLSN(void)
 static void
 UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force)
 {
-    /* Quick check using our local copy of the variable */
-    if (!updateMinRecoveryPoint || (!force && lsn <= minRecoveryPoint))
+    /*
+     * Quick check using our local copy of minRecoveryPoint.  An invalid
+     * minRecoveryPoint value means we are running crash recovery and don't
+     * need to update it.
+     */
+    if (XLogRecPtrIsInvalid(minRecoveryPoint) ||
+        (!force && lsn <= minRecoveryPoint))
         return;
 
     LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
@@ -2717,14 +2725,7 @@ UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force)
     minRecoveryPoint = ControlFile->minRecoveryPoint;
     minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
 
-    /*
-     * An invalid minRecoveryPoint means that we need to recover all the WAL,
-     * i.e., we're doing crash recovery.  We never modify the control file's
-     * value in that case, so we can short-circuit future checks here too.
-     */
-    if (minRecoveryPoint == 0)
-        updateMinRecoveryPoint = false;
-    else if (force || minRecoveryPoint < lsn)
+    if (force || minRecoveryPoint < lsn)
     {
         XLogRecPtr  newMinRecoveryPoint;
         TimeLineID  newMinRecoveryPointTLI;
@@ -3110,8 +3111,8 @@ XLogNeedsFlush(XLogRecPtr record)
      */
     if (RecoveryInProgress())
     {
-        /* Quick exit if already known updated */
-        if (record <= minRecoveryPoint || !updateMinRecoveryPoint)
+        /* Quick exit if already known to be updated or cannot be updated */
+        if (record <= minRecoveryPoint || XLogRecPtrIsInvalid(minRecoveryPoint))
             return false;
 
         /*
@@ -3124,20 +3125,8 @@ XLogNeedsFlush(XLogRecPtr record)
         minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
         LWLockRelease(ControlFileLock);
 
-        /*
-         * An invalid minRecoveryPoint means that we need to recover all the
-         * WAL, i.e., we're doing crash recovery.  We never modify the control
-         * file's value in that case, so we can short-circuit future checks
-         * here too.
-         */
-        if (minRecoveryPoint == 0)
-            updateMinRecoveryPoint = false;
-
-        /* check again */
-        if (record <= minRecoveryPoint || !updateMinRecoveryPoint)
-            return false;
-        else
-            return true;
+        return record > minRecoveryPoint;
     }
 
     /* Quick exit if already known flushed */
@@ -6892,9 +6881,31 @@ StartupXLOG(void)
     /* No need to hold ControlFileLock yet, we aren't up far enough */
     UpdateControlFile();
 
-    /* initialize our local copy of minRecoveryPoint */
-    minRecoveryPoint = ControlFile->minRecoveryPoint;
-    minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+    /*
+     * Initialize our local copy of minRecoveryPoint.  When doing crash
+     * recovery we want to replay up to the end of WAL.  Particularly, in
+     * the case of a promoted standby minRecoveryPoint value in the
+     * control file is only updated after the first checkpoint.  However,
+     * if the instance crashes before the first post-recovery checkpoint
+     * is completed then recovery will use a stale location causing the
+     * startup process to think that there are still invalid page
+     * references when checking for data consistency.
+     */
+    if (InArchiveRecovery)
+    {
+        minRecoveryPoint = ControlFile->minRecoveryPoint;
+        minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+    }
+    else
+    {
+        /*
+         * We are to run crash recovery.  Set an invalid value to
+         * minRecoveryPoint, ignoring the shared variable, which inhibits
+         * updating the variable during crash recovery.
+         */
+        minRecoveryPoint = InvalidXLogRecPtr;
+        minRecoveryPointTLI = 0;
+    }
 
     /*
      * Reset pgstat data, because it may be invalid after recovery.
@@ -7861,6 +7872,8 @@ CheckRecoveryConsistency(void)
     if (XLogRecPtrIsInvalid(minRecoveryPoint))
         return;
 
+    Assert(InArchiveRecovery);
+
     /*
      * assume that we are called in the startup process, and hence don't need
      * a lock to read lastReplayedEndRecPtr
@@ -9265,6 +9278,7 @@ CreateRestartPoint(int flags)
          * no value in having the minimum recovery point any earlier than this
          * anyway, because redo will begin just after the checkpoint record.
          */
+        Assert(ControlFile->minRecoveryPoint != InvalidXLogRecPtr);
        if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr)
         {
             ControlFile->minRecoveryPoint = lastCheckPointEndPtr;
@@ -9949,14 +9963,21 @@ xlog_redo(XLogReaderState *record)
          * Update minRecoveryPoint to ensure that if recovery is aborted, we
          * recover back up to this point before allowing hot standby again.
          * This is important if the max_* settings are decreased, to ensure
-         * you don't run queries against the WAL preceding the change.
+         * you don't run queries against the WAL preceding the change.  Both
+         * local and shared variables cannot be updated and we expect all the
+         * WAL to be replayed while crash recovery is happening.
          */
-        minRecoveryPoint = ControlFile->minRecoveryPoint;
-        minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
-        if (minRecoveryPoint != 0 && minRecoveryPoint < lsn)
+        if (InArchiveRecovery)
         {
-            ControlFile->minRecoveryPoint = lsn;
-            ControlFile->minRecoveryPointTLI = ThisTimeLineID;
+            Assert(minRecoveryPoint != InvalidXLogRecPtr);
+            minRecoveryPoint = ControlFile->minRecoveryPoint;
+            minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+
+            if (minRecoveryPoint < lsn)
+            {
+                ControlFile->minRecoveryPoint = lsn;
+                ControlFile->minRecoveryPointTLI = ThisTimeLineID;
+            }
         }
 
         CommitTsParameterChange(xlrec.track_commit_timestamp,
On Mon, Jul 02, 2018 at 04:25:13PM +0900, Kyotaro HORIGUCHI wrote:
> When minRecoveryPoint is invalid, there are only two possible cases: we
> may be at the very beginning of archive recovery, or we may be running
> crash recovery. In the latter case, we have detected crash recovery
> before redo starts. So we can turn off updateMinRecoveryPoint
> immediately; no further check is needed, and it is (I think) easier to
> understand.

Er, you are missing the point that updateMinRecoveryPoint is also used by processes other than the startup process, like the checkpointer, which actually need to update minRecoveryPoint and rely on the default value of updateMinRecoveryPoint, which is true...

I am planning to finish wrapping this patch, hopefully on Wednesday JST time, or in the worst case on Thursday. I have had this problem on my mind for a couple of days now and I could not find a case where the approach taken could cause a problem. Opinions are welcome.

--
Michael
On Mon, Jul 02, 2018 at 10:41:05PM +0900, Michael Paquier wrote:
> I am planning to finish wrapping this patch, hopefully on Wednesday JST
> time, or in the worst case on Thursday. I have had this problem on my
> mind for a couple of days now and I could not find a case where the
> approach taken could cause a problem. Opinions are welcome.

Okay, pushed and back-patched. Thanks to all who participated in the thread!

--
Michael
On Thu, Jul 5, 2018 at 7:20 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Jul 02, 2018 at 10:41:05PM +0900, Michael Paquier wrote:
> I am planning to finish wrapping this patch, hopefully on Wednesday JST
> time, or in the worst case on Thursday. I have had this problem on my
> mind for a couple of days now and I could not find a case where the
> approach taken could cause a problem. Opinions are welcome.
Okay, pushed and back-patched. Thanks to all who participated in the
thread!
Many thanks, Michael, for doing the gruelling work of coming up with a more complete fix and verifying all the cases in the various back branches.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Jul 05, 2018 at 01:03:14PM +0530, Pavan Deolasee wrote:
> Many thanks, Michael, for doing the gruelling work of coming up with a
> more complete fix and verifying all the cases in the various back
> branches.

No problem. I hope I got the credits right. If there is anything wrong, please feel free to let me know.

--
Michael