Suppress generating WAL records during the upgrade

From: Hayato Kuroda (Fujitsu)
Subject: Suppress generating WAL records during the upgrade
Msg-id: TYAPR01MB58660273EACEFC5BF256B133F50DA@TYAPR01MB5866.jpnprd01.prod.outlook.com
List: pgsql-hackers
Dear hackers,
(CC: Julien, Sawada-san, Amit)

This thread is forked from "[PoC] pg_upgrade: allow to upgrade publisher node" [1].

# Background

[1] is a patch that allows logical replication slots to be carried over from the old node to the new one.
The rough steps are as follows:

1. Boot the old node in binary-upgrade mode
2. Check confirmed_flush_lsn of all the slots, and confirm that all WAL has been replicated downstream
3. Dump the slot info to an SQL file
4. Stop the old node
5. Boot the new node in binary-upgrade mode
...

Step 2 was introduced to avoid data loss. If any WAL records exist beyond
confirmed_flush_lsn, they would never be replicated after the upgrade, which may be dangerous.

So in the current patch, pg_upgrade fails if any record other than SHUTDOWN_CHECKPOINT
exists after the confirmed_flush_lsn of any slot.
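
The check described above can be modeled roughly as follows. This is a minimal Python sketch, not the pg_upgrade code: the `WalRecord` type, the integer LSNs, the function name, and the choice to treat a record starting exactly at confirmed_flush_lsn as unconfirmed are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    lsn: int        # start position of the record (hypothetical integer LSN)
    rec_type: str   # e.g. "SHUTDOWN_CHECKPOINT", "RUNNING_XACT"


def slot_is_caught_up(confirmed_flush_lsn: int, wal: list[WalRecord]) -> bool:
    """True if every record at or after confirmed_flush_lsn is a SHUTDOWN_CHECKPOINT."""
    return all(
        rec.rec_type == "SHUTDOWN_CHECKPOINT"
        for rec in wal
        if rec.lsn >= confirmed_flush_lsn
    )


def upgrade_allowed(slot_lsns: list[int], wal: list[WalRecord]) -> bool:
    """The upgrade is rejected as soon as one slot has unreplicated records."""
    return all(slot_is_caught_up(lsn, wal) for lsn in slot_lsns)
```

In this model, a single RUNNING_XACT record written after the shutdown checkpoint position is enough to make `upgrade_allowed` return False, which is the failure this mail is about.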

# Problem

We found that the following three record types might be generated during the upgrade.

* RUNNING_XACT
* CHECKPOINT_ONLINE
* XLOG_FPI_FOR_HINT

RUNNING_XACT might be written by the background writer. It is generated when:

a. 15 seconds have elapsed since the record was last written or since the process started, and either of the following holds:
b-1. The process has never written a RUNNING_XACT record, or
b-2. Some "important" WAL records have been written since the last RUNNING_XACT record
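
The conditions above can be written as a small predicate. This is only a Python model of the description in this mail, not the background writer's code; the function and parameter names are hypothetical.

```python
LOG_SNAPSHOT_INTERVAL_S = 15  # the 15-second interval from condition (a)


def bgwriter_logs_running_xact(
    seconds_since_last: float,
    ever_logged: bool,
    important_wal_since_last: bool,
) -> bool:
    """Model of conditions a, b-1 and b-2 as described in this mail."""
    if seconds_since_last < LOG_SNAPSHOT_INTERVAL_S:
        return False  # condition (a) is not met
    # b-1: nothing logged yet by this process, or b-2: important WAL since then
    return (not ever_logged) or important_wal_since_last
```

For example, `bgwriter_logs_running_xact(20, False, False)` is True: a node that stays up for more than 15 seconds during the upgrade satisfies b-1 even if nothing else was written.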


CHECKPOINT_ONLINE might be written by the checkpointer. It is generated when:

a. checkpoint_timeout seconds have elapsed since the record was last written or since the process started, and either of the following holds:
b-1. The process has never written a CHECKPOINT_ONLINE record, or
b-2. Some "important" WAL records have been written since the last CHECKPOINT record


XLOG_FPI_FOR_HINT, which was pointed out by Sawada-san, might be generated by backend processes.
It is generated when:

a. A backend process scanned any tuples (even those of a system catalog), and either of the following holds:
b-1. Data checksums are enabled, or
b-2. wal_log_hints is set to on
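
As a rough model, the XLOG_FPI_FOR_HINT conditions reduce to one boolean expression. The sketch below is Python with hypothetical names. It assumes that condition (a) must hold together with b-1 or b-2, and that the scan actually set a hint bit on the page.

```python
def hint_bit_update_emits_fpi(
    hint_bit_set_by_scan: bool,
    data_checksums: bool,
    wal_log_hints: bool,
) -> bool:
    """Model: a scan that sets a hint bit is WAL-logged only if
    checksums or wal_log_hints make hint-bit changes WAL-relevant."""
    return hint_bit_set_by_scan and (data_checksums or wal_log_hints)
```

With both settings off, the function returns False for any scan, so only clusters with data checksums or wal_log_hints enabled are affected in this model.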

# Solution

I would like to suppress WAL generation during the upgrade, for the reason described in "# Background".

For RUNNING_XACT and CHECKPOINT_ONLINE, removing condition b-1 might be enough.
The interval between process startup and the first {RUNNING_XACT|CHECKPOINT_ONLINE}
record becomes longer, but I could not find any negative impact from that.
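
To illustrate the effect of dropping b-1, here is a minimal Python model. It is a sketch and not the attached patch; the function name and the `drop_b1` flag are hypothetical and exist only to compare the two behaviors side by side.

```python
def should_log(
    interval_elapsed: bool,
    ever_logged: bool,
    important_wal_since_last: bool,
    *,
    drop_b1: bool,
) -> bool:
    """Decide whether the periodic record is written.

    drop_b1=False models the current conditions (a and (b-1 or b-2));
    drop_b1=True models the proposal (a and b-2 only).
    """
    if not interval_elapsed:
        return False
    if important_wal_since_last:  # b-2
        return True
    # b-1 applies only when the proposal is not in effect
    return (not ever_logged) and not drop_b1
```

On an otherwise idle node, `should_log(True, False, False, drop_b1=False)` is True while the same call with `drop_b1=True` is False. Once important WAL has been written, both variants return True, so only the very first record is delayed.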

As for XLOG_FPI_FOR_HINT, the simplest way I came up with is not to call
XLogSaveBufferForHint() during binary upgrade. I may not have considered everything,
but I have attached a patch for the fix. It passed CI on my repository.


Do you have any other thoughts on this?
An approach that adds "if (IsBinaryUpgrade)" to XLogInsertAllowed() was proposed in [2].
But I'm not sure it would really solve the issue: e.g., XLogInsertRecord() just
raises an ERROR if !XLogInsertAllowed().
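
The difference between the two approaches can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PostgreSQL code: the function names, the `WalInsertError` class, and the error message are all hypothetical.

```python
class WalInsertError(Exception):
    """Stands in for an ERROR raised by the WAL insertion path."""


def save_buffer_for_hint(is_binary_upgrade: bool, wal: list[str]) -> bool:
    """Model of skipping the hint-bit FPI: the caller simply does not log."""
    if is_binary_upgrade:
        return False  # nothing written, no error
    wal.append("XLOG_FPI_FOR_HINT")
    return True


def insert_record_guarded(is_binary_upgrade: bool, wal: list[str], rec: str) -> None:
    """Model of guarding inside the insert-allowed check: callers that still
    reach the insertion path fail instead of being silently suppressed."""
    if is_binary_upgrade:
        raise WalInsertError("WAL insertion is not allowed during binary upgrade")
    wal.append(rec)
```

In the first variant the record is quietly dropped at the call site. In the second, any code path that still tries to insert a record during the upgrade gets an error, which is the concern raised above.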

[1]: https://commitfest.postgresql.org/44/4273/
[2]: https://www.postgresql.org/message-id/flat/20210121152357.s6eflhqyh4g5e6dv%40dalibo.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

