On 2020-Apr-08, Kyotaro Horiguchi wrote:
> I understand how it happens.
>
> The latch set by the checkpoint request from the CHECKPOINT command
> has been absorbed by ConditionVariableSleep() in
> InvalidateObsoleteReplicationSlots. The attached patch allows the
> checkpointer to use MyLatch for purposes other than checkpoint
> requests while a checkpoint is running.

Hmm, that explanation makes sense, but I couldn't reproduce it with the
steps you provided. Perhaps I'm missing something.

Anyway, I think this patch should fix it too -- instead of adding a new
flag, we just rely on the existing flags (since do_checkpoint must have
been set correctly from the flags earlier in that block).
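
To make the lost wakeup described above concrete, here is a toy model in
C. It is only a sketch, not the attached patch: it does not use the real
latch or condition variable code, and every name in it (ToyCheckpointer,
toy_request_checkpoint, toy_nested_wait, and so on) is made up for
illustration. It assumes a simplified loop in which a second request
arrives while a checkpoint is running and a nested wait then resets the
shared latch. Re-checking the request flags before sleeping is shown as
one possible way to avoid going to sleep with a request pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's. */
typedef struct ToyCheckpointer
{
    bool latch_set;      /* stands in for the process latch */
    int  pending_flags;  /* stands in for shared checkpoint request flags */
} ToyCheckpointer;

#define TOY_CHECKPOINT_REQUESTED 0x01

/* A backend asks for a checkpoint: record the request, then set the latch. */
static void
toy_request_checkpoint(ToyCheckpointer *cp)
{
    cp->pending_flags |= TOY_CHECKPOINT_REQUESTED;
    cp->latch_set = true;
}

/* A nested wait on the same latch: waking up resets it, whoever set it. */
static void
toy_nested_wait(ToyCheckpointer *cp)
{
    cp->latch_set = false;
}

/*
 * One pass of a simplified checkpointer loop.  A second request arrives
 * while the first checkpoint is running, and a nested wait then resets
 * the latch.  Returns true if the loop would go to sleep with a request
 * still pending (the lost wakeup).
 */
static bool
toy_sleeps_with_pending_request(bool recheck_flags_before_sleep)
{
    ToyCheckpointer cp = {false, 0};
    int             flags;

    toy_request_checkpoint(&cp);    /* first request */

    /* Top of loop: reset the latch and pick up the requested flags. */
    cp.latch_set = false;
    flags = cp.pending_flags;
    cp.pending_flags = 0;
    assert(flags & TOY_CHECKPOINT_REQUESTED);
    (void) flags;                   /* silence warnings under NDEBUG */

    /* Checkpoint in progress. */
    toy_request_checkpoint(&cp);    /* second request arrives */
    toy_nested_wait(&cp);           /* its latch wakeup is absorbed */

    /* Bottom of loop: decide whether to sleep. */
    if (recheck_flags_before_sleep && cp.pending_flags != 0)
        return false;               /* loop again right away */

    return !cp.latch_set && cp.pending_flags != 0;
}
```

In this model, toy_sleeps_with_pending_request(false) reports the lost
wakeup, while toy_sleeps_with_pending_request(true) does not, because the
pending request is noticed from the flags even though the latch was reset.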

I think it'd be worth verifying this bugfix with a new test. Would you
have time to produce that? I could try in a couple of days ...
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services