Discussion: Logical walsenders don't process XLOG_CHECKPOINT_SHUTDOWN

Logical walsenders don't process XLOG_CHECKPOINT_SHUTDOWN

From: Amit Kapila
Date: July 25, 2023, 12:01:00

Currently, we don't perform $SUBJECT at the time of shutdown of the
server. I think this currently has only a minor impact: after a
restart, subscribers will ask to start processing from before the
XLOG_CHECKPOINT_SHUTDOWN record, or after a switchover the old
publisher may have an extra WAL record. However, if we want to
support upgrading the publisher node such that the existing slots
are copied/created into the new cluster, we need to ensure that all the
changes generated on the publisher have been sent to and applied by the
subscriber. This is a hard requirement because we reset the WAL after
the upgrade, and any WAL that has not been sent by then will be lost.
Now, even a clean shutdown of the publisher node can't ensure that all
the WAL has been sent, because it is quite possible that the subscriber
node is down, in which case no walsenders will be available at shutdown
time to send the data. Similarly, there could be some logical slots
created via a backend which may not have processed all the data, and we
can't copy those slots as is during the upgrade.

To verify during the upgrade that all the data has been sent, we can
check that each logical slot's confirmed_flush_lsn (the position in the
WAL up to which the subscriber has confirmed that it has applied the
WAL) is the same as current_wal_insert_lsn. Now, because we don't send
XLOG_CHECKPOINT_SHUTDOWN even on a clean shutdown, confirmed_flush_lsn
will never be the same as current_wal_insert_lsn. One idea being
discussed in patch [1] (see 0003) is to ensure that each slot's LSN is
exactly one XLOG_CHECKPOINT_SHUTDOWN record behind. That probably has
some drawbacks: if we tomorrow add some other WAL in the shutdown
checkpoint path, or the size of the record changes, we would need to
modify the corresponding code in upgrade.
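
To make the two checks above concrete, here is a minimal sketch in
Python. It is only an illustration, not the code from patch [1]: the
function names, the shutdown_record_size parameter, and the sample LSNs
are all made up for this example. The only fact relied on is that a
textual LSN is two hexadecimal numbers, the high and low 32 bits of a
64-bit WAL position.

```python
def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/15D6A28' into a 64-bit byte position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slot_is_caught_up(confirmed_flush_lsn: str, insert_lsn: str) -> bool:
    """Strict check: the slot has confirmed everything up to the insert position."""
    return parse_lsn(confirmed_flush_lsn) == parse_lsn(insert_lsn)


def slot_is_one_record_behind(confirmed_flush_lsn: str, insert_lsn: str,
                              shutdown_record_size: int) -> bool:
    """Fixed-offset check: the slot lags by exactly one shutdown checkpoint record.

    shutdown_record_size is a caller-supplied assumption (hypothetical here).
    The check stops working if the record size changes or if another record
    is written in the shutdown checkpoint path.
    """
    gap = parse_lsn(insert_lsn) - parse_lsn(confirmed_flush_lsn)
    return gap == shutdown_record_size


if __name__ == "__main__":
    # Made-up LSNs that are 0x78 (120) bytes apart.
    slot_lsn, insert_lsn = "0/15D6A28", "0/15D6AA0"
    print(slot_is_caught_up(slot_lsn, insert_lsn))
    print(slot_is_one_record_behind(slot_lsn, insert_lsn, 120))
    # The same slot fails the check as soon as the assumed size is off.
    print(slot_is_one_record_behind(slot_lsn, insert_lsn, 128))
```

The last call shows the drawback mentioned above: the second check
depends entirely on a size that the upgrade code would have to keep in
sync with the server.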

The other possibility is to allow logical walsenders to process
XLOG_CHECKPOINT_SHUTDOWN before shutdown, after which
confirmed_flush_lsn will be the same as current_wal_insert_lsn during
the upgrade. AFAICU, the primary reason we don't allow this is that we
want to avoid writing any new WAL after the shutdown checkpoint (to
avoid any sort of PANIC, as discussed in thread [2]). Writing WAL is
possible during decoding because of hint bits, but it doesn't seem that
decoding XLOG_CHECKPOINT_SHUTDOWN can lead to any hint bit updates.
It seems we made these changes as part of commit c6c3334364 [3]. Note
that even if we can ensure that walsenders send all the WAL before
shutdown and bring the corresponding logical slots up to date so that
there is no pending data, it would still be possible that logical slots
created manually via backends won't consume all the WAL before
shutdown. I think those will be the responsibility of users, as they
created them.

We can also provide some guidelines to users similar to what we have
for physical standbys in the pg_upgrade docs [4] (see step 9, "Prepare
for standby server upgrades"). Something like: before upgrading, verify
that the subscriber is caught up with the publisher by comparing the
current WAL position on the publisher and
pg_stat_subscription.received_lsn on the subscriber.
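
As a rough illustration of such a guideline, the comparison could be
scripted along the following lines. This is a sketch under assumptions:
the helper names are hypothetical, the LSN strings are expected to be
fetched by the user beforehand, and using pg_current_wal_lsn() for the
"current WAL position" is one possible choice rather than something
settled in this thread.

```python
from typing import Dict, Optional


def lsn_to_int(text: str) -> int:
    """Convert a textual LSN ('high/low' in hex) into a 64-bit byte position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def subscriber_lag_bytes(publisher_lsn: str, received_lsn: str) -> int:
    """Bytes of WAL the subscriber has not received yet (0 means caught up)."""
    return max(0, lsn_to_int(publisher_lsn) - lsn_to_int(received_lsn))


def lagging_subscriptions(publisher_lsn: str,
                          received_by_subscription: Dict[str, Optional[str]]
                          ) -> Dict[str, Optional[int]]:
    """Return the subscriptions that are behind, with their lag in bytes.

    publisher_lsn: e.g. the result of SELECT pg_current_wal_lsn() on the
        publisher (assumed choice of function).
    received_by_subscription: subname -> received_lsn, e.g. taken from
        pg_stat_subscription on the subscriber. A value of None stands for
        a NULL received_lsn and is reported as lagging with an unknown lag.
    """
    lagging: Dict[str, Optional[int]] = {}
    for name, received in received_by_subscription.items():
        if received is None:
            lagging[name] = None
            continue
        lag = subscriber_lag_bytes(publisher_lsn, received)
        if lag > 0:
            lagging[name] = lag
    return lagging


if __name__ == "__main__":
    # Made-up values: sub_a is caught up, sub_b is 0x60 bytes behind.
    print(lagging_subscriptions("0/3000060",
                                {"sub_a": "0/3000060", "sub_b": "0/3000000"}))
```

An empty result would mean every listed subscription has received WAL
up to the publisher position that was sampled.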

Any better ideas or thoughts on the above?

[1] -
https://www.postgresql.org/message-id/TYAPR01MB586619721863B7FFDAC4369FF550A%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[2] - https://www.postgresql.org/message-id/CAHGQGwEsttg9P9LOOavoc9d6VB1zVmYgfBk%3DLjsk-UL9cEf-eA%40mail.gmail.com
[3] -
commit c6c333436491a292d56044ed6e167e2bdee015a2
Author: Andres Freund <andres@anarazel.de>
Date:   Mon Jun 5 18:53:41 2017 -0700

    Prevent possibility of panics during shutdown checkpoint.
[4] - https://www.postgresql.org/docs/devel/pgupgrade.html

-- 
With Regards,
Amit Kapila.

Re: Logical walsenders don't process XLOG_CHECKPOINT_SHUTDOWN

From: Andres Freund
Date: July 25, 2023, 20:03:19

Hi,

On 2023-07-25 14:31:00 +0530, Amit Kapila wrote:
> To verify during the upgrade that all the data has been sent, we can
> check that each logical slot's confirmed_flush_lsn (the position in the
> WAL up to which the subscriber has confirmed that it has applied the
> WAL) is the same as current_wal_insert_lsn. Now, because we don't send
> XLOG_CHECKPOINT_SHUTDOWN even on a clean shutdown, confirmed_flush_lsn
> will never be the same as current_wal_insert_lsn. One idea being
> discussed in patch [1] (see 0003) is to ensure that each slot's LSN is
> exactly one XLOG_CHECKPOINT_SHUTDOWN record behind. That probably has
> some drawbacks: if we tomorrow add some other WAL in the shutdown
> checkpoint path, or the size of the record changes, we would need to
> modify the corresponding code in upgrade.

Yea, that doesn't seem like a good path. But there is a variant that seems
better: We could just scan the end of the WAL for records that should have
been streamed out?

Greetings,

Andres Freund

Re: Logical walsenders don't process XLOG_CHECKPOINT_SHUTDOWN

From: Amit Kapila
Date: July 26, 2023, 07:13:39

On Tue, Jul 25, 2023 at 10:33 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-07-25 14:31:00 +0530, Amit Kapila wrote:
> > To verify during the upgrade that all the data has been sent, we can
> > check that each logical slot's confirmed_flush_lsn (the position in the
> > WAL up to which the subscriber has confirmed that it has applied the
> > WAL) is the same as current_wal_insert_lsn. Now, because we don't send
> > XLOG_CHECKPOINT_SHUTDOWN even on a clean shutdown, confirmed_flush_lsn
> > will never be the same as current_wal_insert_lsn. One idea being
> > discussed in patch [1] (see 0003) is to ensure that each slot's LSN is
> > exactly one XLOG_CHECKPOINT_SHUTDOWN record behind. That probably has
> > some drawbacks: if we tomorrow add some other WAL in the shutdown
> > checkpoint path, or the size of the record changes, we would need to
> > modify the corresponding code in upgrade.
>
> Yea, that doesn't seem like a good path. But there is a variant that seems
> better: We could just scan the end of the WAL for records that should have
> been streamed out?
>

This sounds like a better idea. So, one way to realize this is to
group slots by confirmed_flush_lsn and then scan based on that.
Once we ensure that the slot group with the highest
confirmed_flush_lsn is up-to-date (doesn't have any pending WAL
except for the shutdown checkpoint record), any slot group having a
lesser value of confirmed_flush_lsn would be considered a group with
pending data.
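
A small sketch of that grouping idea follows, using a toy in-memory
model of the WAL tail instead of a real WAL reader. Everything in it is
hypothetical: the WalRecord type, the record kind labels, the IGNORABLE
set, and the simplification that a record is pending for a slot when it
starts at or after the slot's confirmed_flush_lsn.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class WalRecord(NamedTuple):
    lsn: int    # start position of the record
    kind: str   # hypothetical label, e.g. "INSERT" or "CHECKPOINT_SHUTDOWN"


# Record kinds a slot may leave unconsumed at shutdown (assumed set).
IGNORABLE = {"CHECKPOINT_SHUTDOWN"}


def has_pending_wal(records: Iterable[WalRecord], confirmed_flush_lsn: int) -> bool:
    """True if a non-ignorable record starts at or after confirmed_flush_lsn."""
    return any(r.lsn >= confirmed_flush_lsn and r.kind not in IGNORABLE
               for r in records)


def classify_slots(slots: Dict[str, int],
                   records: List[WalRecord]) -> Dict[str, bool]:
    """Map each slot name to True if it is considered to have pending data.

    Slots are grouped by confirmed_flush_lsn. The WAL tail is scanned only
    for the group with the highest value; every group behind it is treated
    as pending without a further scan, as described above.
    """
    groups: Dict[int, List[str]] = defaultdict(list)
    for name, lsn in slots.items():
        groups[lsn].append(name)
    if not groups:
        return {}

    highest = max(groups)
    head_pending = has_pending_wal(records, highest)

    result: Dict[str, bool] = {}
    for lsn, names in groups.items():
        pending = head_pending if lsn == highest else True
        for name in names:
            result[name] = pending
    return result


if __name__ == "__main__":
    wal_tail = [WalRecord(100, "INSERT"), WalRecord(200, "COMMIT"),
                WalRecord(300, "CHECKPOINT_SHUTDOWN")]
    print(classify_slots({"sub1": 300, "sub2": 300, "manual": 200}, wal_tail))
```

With one scan per upgrade rather than one per slot, the cost stays
bounded even with many slots, at the price of treating every group
behind the highest one as pending.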

BTW, I think the main reason for not trying to send
XLOG_CHECKPOINT_SHUTDOWN from logical walsenders is that, even if today
there is no risk of any hint bit updates (or any other possibility of
generating WAL) while decoding XLOG_CHECKPOINT_SHUTDOWN, there is no
guarantee that this will remain true in the future. Is there anything
I am missing here?

--
With Regards,
Amit Kapila.
