On 2021-Aug-24, Bossart, Nathan wrote:
> I wonder if we need to move the call to RegisterSegmentBoundary() to
> somewhere before WALInsertLockRelease() for this to work as intended.
> Right now, boundary registration could take place after the flush
> pointer has been advanced, which means GetSafeFlushRecPtr() could
> still return an unsafe position.

Yeah, you're right -- that's a definite risk. I didn't try to reproduce
a problem with that, but it seems pretty obvious that it can happen.

I didn't have a lot of luck with a reliable reproducer script. I was
able to reproduce the problem starting with Ryo Matsumura's script and
attaching a replica; most of the time the replica would recover by
restarting from a streaming position earlier than where the problem
occurred, but a few times it would just get stuck with a WAL segment
containing a bogus record. With the patch applied, the problem no
longer occurs.

I attach the patch with the change you suggested.
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
Tom: There seems to be something broken here.
Teodor: I'm in sackcloth and ashes... Fixed.
http://archives.postgresql.org/message-id/482D1632.8010507@sigaev.ru