Re: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От Bharath Rupireddy
Тема Re: Synchronizing slots from primary to standby
Дата
Msg-id CALj2ACW5z=QYNabSEW-iqJ0oZUyTNztPBXzCoAiDtWFWYMcuag@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Synchronizing slots from primary to standby  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Synchronizing slots from primary to standby  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Fri, Jun 16, 2023 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 17, 2023 at 7:37 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Please find attached V5 (a rebase of V4 posted up-thread).
> >
> > In addition to the "rebasing" work, the TAP test adds a test about conflict handling (logical slot invalidation)
> > relying on the work done in the "Minimal logical decoding on standby" patch series.
> >
> > I did not look more at the patch (than what's was needed for the rebase) but plan to do so.
> >
>
> Are you still planning to continue working on this? Some miscellaneous
> comments while going through this patch are as follows?
>
> 1. Can you please try to explain the functionality of the overall
> patch somewhere in the form of comments and or commit message?

IIUC, there are 2 core ideas of the feature:

1) It will never let the logical replication subscribers go ahead of
physical replication standbys specified in standby_slot_names. It
implements this by delaying decoding of commit records on the
walsenders corresponding to logical replication subscribers on the
primary until all the specified standbys confirm receiving the commit
LSN.

2) The physical replication standbys will synchronize data of the
specified logical replication slots (in synchronize_slot_names) from
the primary, creating the logical replication slots if necessary.
Since the logical replication subscribers will never go out of
physical replication standbys, the standbys can safely synchronize the
slots and keep the data necessary for subscribers to connect to it and
work seamlessly even after a failover.

If my understanding is right, I have few thoughts here:

1. All the logical walsenders are delayed on the primary - per
wait_for_standby_confirmation() despite the user being interested in
only a few of them via synchronize_slot_names. Shouldn't the delay be
for just the slots specified in synchronize_slot_names?
2. I think we can split the patch like this - 0001 can be the logical
walsenders delaying decoding on the primary unless standbys confirm,
0002 standby synchronizing the logical slots.
3. I think we need to change the GUC standby_slot_names to better
reflect what it is used for - wait_for_replication_slot_names or
wait_for_
4. It allows specifying logical slots in standby_slot_names, meaning,
it can disallow logical slots getting ahead of other logical slots
specified in standby_slot_names. Should we allow this case with the
thinking that if there's anyone using logical replication for failover
(well, will anybody do that in production?).
5. Similar to above, it allows specifying physical slots in
synchronize_slot_names. Should we disallow?

I'm attaching the v6 patch, a rebased version of v5.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: Fix last unitialized memory warning
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Autogenerate some wait events code and documentation