pgsql: Add info in WAL records in preparation for logical slot conflict

From Andres Freund
Subject pgsql: Add info in WAL records in preparation for logical slot conflict
Date
Msg-id E1pj41x-0013B3-80@gemulon.postgresql.org
List pgsql-committers
Add info in WAL records in preparation for logical slot conflict handling

This commit only implements one prerequisite for allowing logical
decoding on standbys. The commit message contains an explanation of the
overall design, which later commits will refer back to.

Overall design:

1. We want to enable logical decoding on standbys, but replay of WAL
from the primary might remove data that is needed by logical decoding,
causing error(s) on the standby. To prevent those errors, a new replication
conflict scenario needs to be addressed (much as hot standby already does).

2. Our chosen strategy for dealing with this type of replication slot
is to invalidate logical slots for which needed data has been removed.

3. To do this we need the latestRemovedXid for each change, just as we
do for physical replication conflicts, but we also need to know
whether any particular change was to data that logical replication
might access. That way, during WAL replay, we know when there is a risk of
conflict and, if so, whether there actually is a conflict.

4. We can't rely on the standby's relcache entries for this purpose in
any way, because the startup process can't access catalog contents.

5. Therefore every WAL record that potentially removes data from the
index or heap must carry a flag indicating whether or not it is one
that might be accessed during logical decoding.
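
Points 4 and 5 can be illustrated with a minimal C sketch. It is not the
code from this commit: DemoRelation, DemoPruneRecord and demo_build_record
are hypothetical stand-ins, and only the field names snapshotConflictHorizon
and isCatalogRel are taken from this message. The idea shown is that the
primary, which can consult its own relation metadata, stamps the flag into
the record so that replay never has to look anything up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the primary's in-memory relation metadata. */
typedef struct DemoRelation
{
    uint32_t relid;
    bool     accessible_in_decoding;  /* assumed to be known on the primary */
} DemoRelation;

/* Hypothetical WAL record payload carrying the two facts replay needs. */
typedef struct DemoPruneRecord
{
    uint32_t snapshotConflictHorizon; /* newest xid whose data was removed */
    bool     isCatalogRel;            /* might logical decoding read this? */
} DemoPruneRecord;

/*
 * Build the record on the primary. The flag is copied from the relation
 * metadata at this point because the startup process on the standby
 * cannot read catalog contents to recompute it.
 */
DemoPruneRecord
demo_build_record(const DemoRelation *rel, uint32_t horizon)
{
    DemoPruneRecord rec;

    rec.snapshotConflictHorizon = horizon;
    rec.isCatalogRel = rel->accessible_in_decoding;
    return rec;
}
```

With this shape, the standby only needs the record itself to decide whether
a removal is relevant to logical slots.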

Why do we need this for logical decoding on standby?

First, let's forget about logical decoding on standby and recall that
on a primary database, any catalog rows that may be needed by a logical
decoding replication slot are not removed.

This is done thanks to the catalog_xmin associated with the logical
replication slot.

With logical decoding on a standby, however, this protection is missing
in the following cases:

- hot_standby_feedback is off
- hot_standby_feedback is on but there is no physical slot between
  the primary and the standby. In that case hot_standby_feedback works,
  but only while the connection is alive (for example, a node restart
  would break it)

In these cases the primary may delete system catalog rows that could be
needed by logical decoding on the standby (as it does not know about the
catalog_xmin on the standby).

So, it’s mandatory to identify those rows and invalidate the slots
that may need them if any. Identifying those rows is the purpose of
this commit.

Implementation:

When WAL replay on a standby indicates that a catalog table tuple is
to be deleted by an xid that is greater than a logical slot's
catalog_xmin, then that means the slot's catalog_xmin conflicts with
the xid, and we need to handle the conflict. While subsequent commits
will do the actual conflict handling, this commit adds a new field
isCatalogRel in such WAL records (and a new bit set in the
xl_heap_visible flags field), that is true for catalog tables, so as to
arrange for conflict handling.
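
The check described above can be sketched as follows. This is only an
illustration under stated assumptions: the actual conflict handling is left
to later commits, and DemoConflictInfo, demo_xid_precedes and
demo_slot_conflicts are hypothetical names. The comparison is
wraparound-aware in the style of PostgreSQL's TransactionIdPrecedes, ignores
special xids other than the invalid one, and uses the strict "greater than"
wording of this message; the real code may draw the boundary differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INVALID_XID 0u   /* assumed marker for "no catalog_xmin set" */

/* Hypothetical view of the conflict-related fields of a replayed record. */
typedef struct DemoConflictInfo
{
    uint32_t snapshotConflictHorizon;
    bool     isCatalogRel;
} DemoConflictInfo;

/* True if xid a is logically older than xid b (modulo-2^32 comparison). */
bool
demo_xid_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * A logical slot is at risk only when the record touches a relation that
 * logical decoding might read, and it actually conflicts when the removed
 * data is newer than the slot's catalog_xmin.
 */
bool
demo_slot_conflicts(const DemoConflictInfo *rec, uint32_t slot_catalog_xmin)
{
    if (!rec->isCatalogRel)
        return false;           /* no risk: not visible to decoding */
    if (slot_catalog_xmin == DEMO_INVALID_XID)
        return false;           /* slot holds back nothing */
    return demo_xid_precedes(slot_catalog_xmin,
                             rec->snapshotConflictHorizon);
}
```

The two early returns mirror the two questions from the overall design:
first whether there is any risk of conflict, then whether a conflict exists.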

The affected WAL records are the ones that already contain the
snapshotConflictHorizon field, namely:

- gistxlogDelete
- gistxlogPageReuse
- xl_hash_vacuum_one_page
- xl_heap_prune
- xl_heap_freeze_page
- xl_heap_visible
- xl_btree_reuse_page
- xl_btree_delete
- spgxlogVacuumRedirect

Due to this new field being added, xl_hash_vacuum_one_page and
gistxlogDelete now contain the offsets to be deleted as a
FLEXIBLE_ARRAY_MEMBER. This is needed to ensure correct alignment.
It's not needed on the other structs where isCatalogRel has
been added.
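
The layout idea can be sketched in plain C99. The struct below is a
hypothetical stand-in (DemoVacuumOnePage and the demo_* helpers are not from
the commit), and the field widths are assumptions chosen for illustration.
Declaring the offsets as a flexible array member lets the compiler insert
whatever padding is needed after the one-byte flag, so that the 16-bit
offsets start at a properly aligned position.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: fixed header followed by a variable-length array. */
typedef struct DemoVacuumOnePage
{
    uint32_t snapshotConflictHorizon;
    uint16_t ntuples;
    bool     isCatalogRel;
    uint16_t offsets[];         /* flexible array member, padded as needed */
} DemoVacuumOnePage;

/* Header size is the offset of the array, which includes any padding. */
#define DEMO_HEADER_SIZE offsetof(DemoVacuumOnePage, offsets)

/* Total bytes needed for a record holding ntuples offsets. */
size_t
demo_record_size(uint16_t ntuples)
{
    return DEMO_HEADER_SIZE + (size_t) ntuples * sizeof(uint16_t);
}

/* Allocate and fill a record; the caller frees it. Returns NULL on OOM. */
DemoVacuumOnePage *
demo_make_record(uint32_t horizon, bool is_catalog,
                 const uint16_t *offs, uint16_t ntuples)
{
    DemoVacuumOnePage *rec = malloc(demo_record_size(ntuples));

    if (rec == NULL)
        return NULL;
    rec->snapshotConflictHorizon = horizon;
    rec->ntuples = ntuples;
    rec->isCatalogRel = is_catalog;
    for (uint16_t i = 0; i < ntuples; i++)
        rec->offsets[i] = offs[i];
    return rec;
}
```

Computing the header size with offsetof on the array member, rather than on
the last fixed field plus its size, is what keeps the padding accounted for.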

This commit just introduces the WAL format changes mentioned above. Handling
the actual conflicts will follow in future commits.

Bumps XLOG_PAGE_MAGIC as several WAL records are changed.

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de> (in an older version)
Author: Amit Khandekar <amitdkhan.pg@gmail.com>  (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6af1793954e8c5e753af83c3edb37ed3267dd179

Modified Files
--------------
src/backend/access/gist/gistxlog.c     | 12 ++++--------
src/backend/access/hash/hash_xlog.c    | 11 +++--------
src/backend/access/hash/hashinsert.c   |  1 +
src/backend/access/heap/heapam.c       | 11 ++++++++++-
src/backend/access/heap/pruneheap.c    |  1 +
src/backend/access/nbtree/nbtpage.c    |  2 ++
src/backend/access/spgist/spgvacuum.c  |  1 +
src/include/access/gistxlog.h          | 11 ++++++++---
src/include/access/hash_xlog.h         | 10 ++++++----
src/include/access/heapam_xlog.h       |  8 ++++++--
src/include/access/nbtxlog.h           |  8 ++++++--
src/include/access/spgxlog.h           |  2 ++
src/include/access/visibilitymapdefs.h |  9 +++++++++
src/include/access/xlog_internal.h     |  2 +-
src/include/utils/rel.h                |  1 +
15 files changed, 61 insertions(+), 29 deletions(-)

