Re: Synchronizing slots from primary to standby

From Amit Kapila
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAA4eK1LbnX5Lu5-Zphv0=x8T+WnvSVoaDZHTY5RNk04dsymctA@mail.gmail.com
In reply to Re: Synchronizing slots from primary to standby  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses Re: Synchronizing slots from primary to standby  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
List pgsql-hackers
On Fri, Nov 10, 2023 at 12:50 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> On 11/10/23 6:41 AM, Amit Kapila wrote:
> > On Thu, Nov 9, 2023 at 7:29 PM Drouvot, Bertrand
> > <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Are you saying that we change the state of the already existing slot
> > on standby?
>
> Yes.
>
> > And, such a state would indicate that we are trying to
> > sync the slot with the same name from the primary. Is that what you
> > have in mind?
>
> Yes.
>
> > If so, it appears quite odd to me to have such a state
> > and also set it in some unrelated slot that just has the same name.
> >
>
> > I understand your point that we can allow other slots to proceed but
> > it is also important to not create any sort of inconsistency that can
> > surprise user after failover.
>
> But even if we ERROR out instead of emitting a WARNING, the user would still
> need to be notified/monitor such errors. I agree that then probably they will
> come to know earlier because the slot sync mechanism would be stopped but still
> it is not "guaranteed" (especially if there are no other "working" synced slots
> around).
>
> And if they do not, then there is still a risk to use this slot after a
> failover thinking this is a "synced" slot.
>

I think this is another reason why raising an ERROR probably gives the
user a better chance to notice before failover. If, knowing of such
errors, the user still proceeds with the failover, the onus is on her.
We can probably document this hazard along with the failover feature so
that users are aware that they either need to be careful while creating
slots on standby or consult ERROR logs. I guess we can even make it
visible in the view.

> Giving it more thought, what about using a dedicated/reserved naming convention for
> synced slots like synced_<primary_slot_name> or such and then:
>
> - prevent users from creating sync_<whatever> slots on standby
> - sync <slot> on primary to sync_<slot> on standby
> - during failover, rename sync_<slot> to <slot> and if <slot> exists then
> emit a WARNING and keep sync_<slot> in place.
>
> That way both slots are still in place (the manually created <slot> and
> the sync_<slot>) and one could decide what to do with them.
>

Hmm, I think after failover, users would need to rename all slots or we
would need to provide a way to rename them so that they can be used by
subscribers, which sounds like much more work.
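
To make the amount of work concrete, here is a minimal standalone sketch of
the reserved-prefix scheme as I understand it from the list above. It is a
toy model, not PostgreSQL code: SlotTable, user_create_slot(),
sync_slot_from_primary() and promote_synced_slots() are hypothetical names
invented for illustration, and the "sync_" prefix is just the one used in the
proposal.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SYNC_PREFIX "sync_"
#define MAX_SLOTS   8
#define NAME_LEN    64

/* Toy stand-in for the standby's slot list; not PostgreSQL's ReplicationSlot. */
typedef struct
{
    char names[MAX_SLOTS][NAME_LEN];
    int  count;
} SlotTable;

bool has_sync_prefix(const char *name)
{
    return strncmp(name, SYNC_PREFIX, strlen(SYNC_PREFIX)) == 0;
}

/* Returns the index of the named slot, or -1 if it does not exist. */
int find_slot(const SlotTable *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return i;
    return -1;
}

/* User-facing creation on the standby: the reserved prefix is rejected. */
bool user_create_slot(SlotTable *t, const char *name)
{
    if (has_sync_prefix(name) || strlen(name) >= NAME_LEN)
        return false;
    if (find_slot(t, name) >= 0 || t->count >= MAX_SLOTS)
        return false;
    strcpy(t->names[t->count++], name);
    return true;
}

/* Sync worker: primary slot <name> is stored locally as sync_<name>. */
bool sync_slot_from_primary(SlotTable *t, const char *primary_name)
{
    char local[NAME_LEN];
    int  n = snprintf(local, sizeof(local), "%s%s", SYNC_PREFIX, primary_name);

    if (n < 0 || n >= (int) sizeof(local))
        return false;               /* prefixed name would not fit */
    if (find_slot(t, local) >= 0)
        return true;                /* already synced earlier */
    if (t->count >= MAX_SLOTS)
        return false;
    strcpy(t->names[t->count++], local);
    return true;
}

/*
 * Failover: rename sync_<slot> to <slot>. If <slot> already exists, warn and
 * keep sync_<slot> in place. Returns the number of such conflicts.
 */
int promote_synced_slots(SlotTable *t)
{
    int conflicts = 0;

    for (int i = 0; i < t->count; i++)
    {
        if (!has_sync_prefix(t->names[i]))
            continue;

        const char *target = t->names[i] + strlen(SYNC_PREFIX);

        if (find_slot(t, target) >= 0)
        {
            fprintf(stderr, "WARNING: slot \"%s\" already exists; keeping \"%s\"\n",
                    target, t->names[i]);
            conflicts++;
        }
        else
            memmove(t->names[i], target, strlen(target) + 1);   /* in-place rename */
    }
    return conflicts;
}
```

Even in this toy form, every conflicting slot is left under a name that no
subscriber is configured to use, so someone still has to resolve it by hand
after failover.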

> > Also, the current coding doesn't ensure
> > we will always give WARNING. If we see the below code that deals with
> > this WARNING,
> >
> > +  /* User created slot with the same name exists, emit WARNING. */
> > +  else if (found && s->data.sync_state == SYNCSLOT_STATE_NONE)
> > +  {
> > +    ereport(WARNING,
> > +        errmsg("not synchronizing slot %s; it is a user created slot",
> > +             remote_slot->name));
> > +  }
> > +  /* Otherwise create the slot first. */
> > +  else
> > +  {
> > +    TransactionId xmin_horizon = InvalidTransactionId;
> > +    ReplicationSlot *slot;
> > +
> > +    ReplicationSlotCreate(remote_slot->name, true, RS_EPHEMERAL,
> > +                remote_slot->two_phase, false);
> >
> > I think this is not a solid check to ensure that the slot existed
> > before. Because it could be created as soon as the slot sync worker
> > invokes ReplicationSlotCreate() here.
>
> Agree.
>

So, a reliable check for emitting the WARNING would require some more
logic, which I don't think is worth adding just to handle this boundary
case.
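
To illustrate the window I am referring to, here is a small standalone
sketch. It is not the patch code: the types, lookup_slot(), create_slot() and
sync_one_slot() are hypothetical, and the "between" hook is only a
deterministic stand-in for another session creating a slot after the lookup
but before the create. I am also assuming that the create call fails when the
name is already taken.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { OWNER_USER, OWNER_SYNC } SlotOwner;

typedef struct
{
    char      name[64];
    SlotOwner owner;
} Slot;

typedef struct
{
    Slot slots[8];
    int  count;
} SlotTable;

typedef enum
{
    SYNC_CREATED,               /* slot sync worker created the slot */
    SYNC_ALREADY_SYNCED,        /* slot was created by an earlier sync cycle */
    SYNC_SKIPPED_USER_SLOT,     /* the WARNING path in the quoted code */
    SYNC_CREATE_FAILED          /* create hit an existing name: no WARNING */
} SyncResult;

/* Hook run between the lookup and the create, to model a concurrent session. */
typedef void (*RaceHook) (SlotTable *);

Slot *lookup_slot(SlotTable *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->slots[i].name, name) == 0)
            return &t->slots[i];
    return NULL;
}

/* Fails if the name is already taken (assumed behaviour of slot creation). */
bool create_slot(SlotTable *t, const char *name, SlotOwner owner)
{
    if (lookup_slot(t, name) != NULL || t->count >= 8 || strlen(name) >= 64)
        return false;
    strcpy(t->slots[t->count].name, name);
    t->slots[t->count].owner = owner;
    t->count++;
    return true;
}

/* Check-then-create, with the same shape as the quoted hunk. */
SyncResult sync_one_slot(SlotTable *t, const char *name, RaceHook between)
{
    Slot *s = lookup_slot(t, name);

    if (s != NULL && s->owner == OWNER_USER)
        return SYNC_SKIPPED_USER_SLOT;
    if (s != NULL)
        return SYNC_ALREADY_SYNCED;

    if (between != NULL)
        between(t);             /* the window: nothing protects it */

    if (!create_slot(t, name, OWNER_SYNC))
        return SYNC_CREATE_FAILED;
    return SYNC_CREATED;
}

/* Hypothetical concurrent session: a user creates "app_slot" in the window. */
void user_creates_app_slot(SlotTable *t)
{
    create_slot(t, "app_slot", OWNER_USER);
}
```

When the user's slot exists before the lookup, the sketch takes the
SYNC_SKIPPED_USER_SLOT path; when it appears inside the window, the same
conflict surfaces as SYNC_CREATE_FAILED instead, so the WARNING is never
reached. Closing that gap means doing the lookup and the create as one step,
which is the extra logic mentioned above.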

--
With Regards,
Amit Kapila.


