Re: Synchronizing slots from primary to standby

From: Drouvot, Bertrand
Subject: Re: Synchronizing slots from primary to standby
Msg-id: 538ddca6-cf74-4a9c-95d6-dd05af24070c@gmail.com
In response to: Re: Synchronizing slots from primary to standby  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Synchronizing slots from primary to standby  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers

Hi,

On 11/10/23 6:41 AM, Amit Kapila wrote:
> On Thu, Nov 9, 2023 at 7:29 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
> 
> Are you saying that we change the state of the already existing slot
> on standby? 

Yes.

> And, such a state would indicate that we are trying to
> sync the slot with the same name from the primary. Is that what you
> have in mind?

Yes.

> If so, it appears quite odd to me to have such a state
> and also set it in some unrelated slot that just has the same name.
> 

> I understand your point that we can allow other slots to proceed but
> it is also important to not create any sort of inconsistency that can
> surprise user after failover.

But even if we ERROR out instead of emitting a WARNING, the user would still
need to be notified of, or monitor for, such errors. I agree that they would
then probably find out earlier, because the slot sync mechanism would be stopped,
but that is still not "guaranteed" (especially if there are no other "working"
synced slots around). And if they do not notice, there is still a risk of using
this slot after a failover, thinking it is a "synced" slot.

Thinking about it more, what about using a dedicated/reserved naming convention for
synced slots, such as sync_<primary_slot_name>, and then:

- prevent users from creating sync_<whatever> slots on the standby
- sync <slot> on the primary to sync_<slot> on the standby
- during failover, rename sync_<slot> to <slot>, and if <slot> already exists then
emit a WARNING and keep sync_<slot> in place.

That way both slots are still in place (the manually created <slot> and
the sync_<slot>) and one could decide what to do with them.
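
To make the three steps above a bit more concrete, here is a minimal standalone
sketch in C. It is not patch code: the "sync_" prefix is just the name floated
above, and is_reserved_sync_name(), make_synced_name() and name_after_failover()
are hypothetical helpers invented for illustration. SLOT_NAME_MAX is assumed to
match the default NAMEDATALEN of 64.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reserved prefix; the thread has not settled on a name. */
#define SYNC_PREFIX   "sync_"
/* Assumed to match the default NAMEDATALEN (64), terminator included. */
#define SLOT_NAME_MAX 64

/* Step 1: would a user-supplied name fall into the reserved namespace? */
static bool
is_reserved_sync_name(const char *name)
{
    return strncmp(name, SYNC_PREFIX, strlen(SYNC_PREFIX)) == 0;
}

/*
 * Step 2: build the standby-side name for a slot on the primary.
 * Returns false if the prefixed name does not fit in the buffer.
 */
static bool
make_synced_name(const char *primary_name, char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "%s%s", SYNC_PREFIX, primary_name);

    return n >= 0 && (size_t) n < outlen;
}

/*
 * Step 3: pick the slot name to use after failover.
 * synced_name must start with SYNC_PREFIX. If a user-created slot already
 * owns the plain name, warn and keep the prefixed name (returns false);
 * otherwise strip the prefix (returns true).
 */
static bool
name_after_failover(const char *synced_name, bool user_slot_exists,
                    char *out, size_t outlen)
{
    const char *plain = synced_name + strlen(SYNC_PREFIX);

    if (user_slot_exists)
    {
        fprintf(stderr, "WARNING: slot \"%s\" already exists, keeping \"%s\"\n",
                plain, synced_name);
        snprintf(out, outlen, "%s", synced_name);
        return false;
    }

    snprintf(out, outlen, "%s", plain);
    return true;
}
```

One thing the sketch makes visible: the prefix eats into the available name
length, so a primary slot whose name is already close to the limit would not
fit on the standby and would need separate handling.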

I don't think we'd need to worry about sync_ slots that were already created
before we start "preventing" their creation. Indeed, I think slots would not survive
pg_upgrade prior to 17 -> 18 upgrades. So it looks like we'd be good as long as we
are able to prevent sync_ slot creation on 17.

Thoughts?

> Also, the current coding doesn't ensure
> we will always give WARNING. If we see the below code that deals with
> this WARNING,
> 
> +  /* User created slot with the same name exists, emit WARNING. */
> +  else if (found && s->data.sync_state == SYNCSLOT_STATE_NONE)
> +  {
> +    ereport(WARNING,
> +        errmsg("not synchronizing slot %s; it is a user created slot",
> +             remote_slot->name));
> +  }
> +  /* Otherwise create the slot first. */
> +  else
> +  {
> +    TransactionId xmin_horizon = InvalidTransactionId;
> +    ReplicationSlot *slot;
> +
> +    ReplicationSlotCreate(remote_slot->name, true, RS_EPHEMERAL,
> +                remote_slot->two_phase, false);
> 
> I think this is not a solid check to ensure that the slot existed
> before. Because it could be created as soon as the slot sync worker
> invokes ReplicationSlotCreate() here.

Agree.
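
To spell out the window, here is a toy model in C. It is not the PostgreSQL slot
API: ToySlot, toy_find_slot() and toy_create_slot() are made up for illustration.
The point is only that a "found" flag computed before the create call can be
stale, so the outcome of the create itself is what has to be checked.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_SLOTS 8
#define NAME_LEN  64

typedef enum
{
    SLOT_CREATED,
    SLOT_ALREADY_EXISTS,
    SLOT_TABLE_FULL
} CreateResult;

typedef struct
{
    bool in_use;
    bool synced;            /* true if created by the (toy) sync worker */
    char name[NAME_LEN];
} ToySlot;

static ToySlot toy_slots[MAX_SLOTS];

/* Lookup only: the answer may be stale by the time the caller acts on it. */
static ToySlot *
toy_find_slot(const char *name)
{
    for (int i = 0; i < MAX_SLOTS; i++)
        if (toy_slots[i].in_use && strcmp(toy_slots[i].name, name) == 0)
            return &toy_slots[i];
    return NULL;
}

/*
 * Create that re-checks the name itself and reports what it saw.
 * In a concurrent system the check and the insert below would have to
 * happen under a single lock; this toy is single-threaded.
 */
static CreateResult
toy_create_slot(const char *name, bool synced)
{
    if (toy_find_slot(name) != NULL)
        return SLOT_ALREADY_EXISTS;

    for (int i = 0; i < MAX_SLOTS; i++)
    {
        if (!toy_slots[i].in_use)
        {
            toy_slots[i].in_use = true;
            toy_slots[i].synced = synced;
            snprintf(toy_slots[i].name, NAME_LEN, "%s", name);
            return SLOT_CREATED;
        }
    }
    return SLOT_TABLE_FULL;
}
```

The problematic interleaving is: the worker calls toy_find_slot("s1") and gets
NULL, a user then creates "s1", and only afterwards does the worker call
toy_create_slot("s1", true). The earlier "not found" answer says nothing about
who owns the slot at that point; only the SLOT_ALREADY_EXISTS result does.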

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


