Re: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От Drouvot, Bertrand
Тема Re: Synchronizing slots from primary to standby
Дата
Msg-id dd9dbbaf-ca77-423a-8d62-bfc814626b47@gmail.com
обсуждение исходный текст
Ответ на Re: Synchronizing slots from primary to standby  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы RE: Synchronizing slots from primary to standby  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Список pgsql-hackers
Hi,

On 11/10/23 8:55 AM, Amit Kapila wrote:
> On Fri, Nov 10, 2023 at 12:50 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>
>> But even if we ERROR out instead of emitting a WARNING, the user would still
>> need to be notified/monitor such errors. I agree that then probably they will
>> come to know earlier because the slot sync mechanism would be stopped but still
>> it is not "guaranteed" (specially if there is no others "working" synced slots
>> around.)
> 
>>
>> And if they do not, then there is still a risk to use this slot after a
>> failover thinking this is a "synced" slot.
>>
> 
> I think this is another reason that probably giving ERROR has better
> chances for the user to notice before failover. IF knowing such errors
> user still proceeds with the failover, the onus is on her.

Agree. My concern is more when they don't know about the error.

> We can
> probably document this hazard along with the failover feature so that
> users are aware that they either need to be careful while creating
> slots on standby or consult ERROR logs. I guess we can even make it
> visible in the view also.

Yeah.

>> Giving more thoughts, what about using a dedicated/reserved naming convention for
>> synced slot like synced_<primary_slot_name> or such and then:
>>
>> - prevent user to create sync_<whatever> slots on standby
>> - sync <slot> on primary to sync_<slot> on standby
>> - during failover, rename  sync_<slot> to <slot> and if <slot> exists then
>> emit a WARNING and keep sync_<slot> in place.
>>
>> That way both slots are still in place (the manually created <slot> and
>> the sync_<slot<) and one could decide what to do with them.
>>
> 
> Hmm, I think after failover, users need to rename all slots or we need
> to provide a way to rename them so that they can be used by
> subscribers which sounds like much more work.

Agree that's much more work for the subscriber case. Maybe that's not worth
the extra work.

>>> Also, the current coding doesn't ensure
>>> we will always give WARNING. If we see the below code that deals with
>>> this WARNING,
>>>
>>> +  /* User created slot with the same name exists, emit WARNING. */
>>> +  else if (found && s->data.sync_state == SYNCSLOT_STATE_NONE)
>>> +  {
>>> +    ereport(WARNING,
>>> +        errmsg("not synchronizing slot %s; it is a user created slot",
>>> +             remote_slot->name));
>>> +  }
>>> +  /* Otherwise create the slot first. */
>>> +  else
>>> +  {
>>> +    TransactionId xmin_horizon = InvalidTransactionId;
>>> +    ReplicationSlot *slot;
>>> +
>>> +    ReplicationSlotCreate(remote_slot->name, true, RS_EPHEMERAL,
>>> +                remote_slot->two_phase, false);
>>>
>>> I think this is not a solid check to ensure that the slot existed
>>> before. Because it could be created as soon as the slot sync worker
>>> invokes ReplicationSlotCreate() here.
>>
>> Agree.
>>
> 
> So, having a concrete check to give WARNING would require some more
> logic which I don't think is a good idea to handle this boundary case.
> 

Yeah good point, agree to just error out in all the case then (if we discard
the sync_ reserved wording proposal, which seems to be the case as probably
not worth the extra work).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Hayato Kuroda (Fujitsu)"
Дата:
Сообщение: RE: MinGW compiler warnings in ecpg tests
Следующее
От: Nazir Bilal Yavuz
Дата:
Сообщение: Re: Failure during Building Postgres in Windows with Meson