Re: incremental backup mishandles XLOG_DBASE_CREATE_FILE_COPY

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: incremental backup mishandles XLOG_DBASE_CREATE_FILE_COPY
Дата
Msg-id 20240224043513.00.nmisch@google.com
обсуждение исходный текст
Ответ на incremental backup mishandles XLOG_DBASE_CREATE_FILE_COPY  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: incremental backup mishandles XLOG_DBASE_CREATE_FILE_COPY  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Fri, Feb 23, 2024 at 08:47:52PM +0530, Robert Haas wrote:
> If XLOG_DBASE_CREATE_FILE_COPY occurs between an incremental backup
> and its reference backup, every relation whose DB OID and tablespace
> OID match the corresponding values in that record should be backed up
> in full. Currently that's not happening, because the WAL summarizer
> doesn't see the XLOG_DBASE_CREATE_FILE_COPY as referencing any
> particular relfilenode and so basically ignores it. The same happens
> for XLOG_DBASE_CREATE_WAL_LOG, but that case is OK because that only
> covers creating the directory itself, not anything underneath it, and
> there will be separate WAL records telling us the relfilenodes created
> below the new directory and the pages modified therein.

XLOG_DBASE_CREATE_WAL_LOG creates PG_VERSION in addition to creating the
directory.  I see your patch covers it.

> I thought about whether there were any other WAL records that have
> similar problems to XLOG_DBASE_CREATE_FILE_COPY and didn't come up
> with anything. If anyone knows of any similar cases, please let me
> know.

Regarding records the summarizer potentially can't ignore that don't deal in
relfilenodes, these come to mind:

XLOG_DBASE_DROP - covered in this thread's patch
XLOG_RELMAP_UPDATE
XLOG_TBLSPC_CREATE
XLOG_TBLSPC_DROP
XLOG_XACT_PREPARE

Also, any record that writes XIDs needs to update nextXid; likewise for other
ID spaces.  See the comment at "XLOG stuff" in heap_lock_tuple().  Perhaps you
don't summarize past a checkpoint, making that irrelevant.

If walsummarizer.c handles any of the above, my brief look missed it.  I also
didn't find the string "clog" or "slru" anywhere in dc21234 "Add support for
incremental backup", 174c480 "Add a new WAL summarizer process.", or thread
https://postgr.es/m/flat/CA%2BTgmoYOYZfMCyOXFyC-P%2B-mdrZqm5pP2N7S-r0z3_402h9rsA%40mail.gmail.com
"trying again to get incremental backup".  I wouldn't be surprised if you
treat clog, pg_filenode.map, and/or 2PC state files as unconditionally
non-incremental, in which case some of the above doesn't need explicit
summarization code.  I stopped looking for that logic, though.

> --- a/src/backend/postmaster/walsummarizer.c
> +++ b/src/backend/postmaster/walsummarizer.c

> +     * Technically, this special handling is only needed in the case of
> +     * XLOG_DBASE_CREATE_FILE_COPY, because that can create a whole bunch
> +     * of relation files in a directory without logging anything
> +     * specific to each one. If we didn't mark the whole DB OID/TS OID
> +     * combination in some way, then a tablespace that was dropped after
s/tablespace/database/ I suspect.
> +     * the reference backup and recreated using the FILE_COPY method prior
> +     * to the incremental backup would look just like one that was never
> +     * touched at all, which would be catastrophic.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Why is subscription/t/031_column_list.pl failing so much?
Следующее
От: Noah Misch
Дата:
Сообщение: Re: Removing unneeded self joins