Re: Synchronizing slots from primary to standby
От | Drouvot, Bertrand |
---|---|
Тема | Re: Synchronizing slots from primary to standby |
Дата | |
Msg-id | dd9dbbaf-ca77-423a-8d62-bfc814626b47@gmail.com обсуждение исходный текст |
Ответ на | Re: Synchronizing slots from primary to standby (Amit Kapila <amit.kapila16@gmail.com>) |
Ответы |
RE: Synchronizing slots from primary to standby
("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
|
Список | pgsql-hackers |
Hi, On 11/10/23 8:55 AM, Amit Kapila wrote: > On Fri, Nov 10, 2023 at 12:50 PM Drouvot, Bertrand > <bertranddrouvot.pg@gmail.com> wrote: >> >> But even if we ERROR out instead of emitting a WARNING, the user would still >> need to be notified/monitor such errors. I agree that then probably they will >> come to know earlier because the slot sync mechanism would be stopped but still >> it is not "guaranteed" (specially if there is no others "working" synced slots >> around.) > >> >> And if they do not, then there is still a risk to use this slot after a >> failover thinking this is a "synced" slot. >> > > I think this is another reason that probably giving ERROR has better > chances for the user to notice before failover. IF knowing such errors > user still proceeds with the failover, the onus is on her. Agree. My concern is more when they don't know about the error. > We can > probably document this hazard along with the failover feature so that > users are aware that they either need to be careful while creating > slots on standby or consult ERROR logs. I guess we can even make it > visible in the view also. Yeah. >> Giving more thoughts, what about using a dedicated/reserved naming convention for >> synced slot like synced_<primary_slot_name> or such and then: >> >> - prevent user to create sync_<whatever> slots on standby >> - sync <slot> on primary to sync_<slot> on standby >> - during failover, rename sync_<slot> to <slot> and if <slot> exists then >> emit a WARNING and keep sync_<slot> in place. >> >> That way both slots are still in place (the manually created <slot> and >> the sync_<slot<) and one could decide what to do with them. >> > > Hmm, I think after failover, users need to rename all slots or we need > to provide a way to rename them so that they can be used by > subscribers which sounds like much more work. Agree that's much more work for the subscriber case. Maybe that's not worth the extra work. >>> Also, the current coding doesn't ensure >>> we will always give WARNING. If we see the below code that deals with >>> this WARNING, >>> >>> + /* User created slot with the same name exists, emit WARNING. */ >>> + else if (found && s->data.sync_state == SYNCSLOT_STATE_NONE) >>> + { >>> + ereport(WARNING, >>> + errmsg("not synchronizing slot %s; it is a user created slot", >>> + remote_slot->name)); >>> + } >>> + /* Otherwise create the slot first. */ >>> + else >>> + { >>> + TransactionId xmin_horizon = InvalidTransactionId; >>> + ReplicationSlot *slot; >>> + >>> + ReplicationSlotCreate(remote_slot->name, true, RS_EPHEMERAL, >>> + remote_slot->two_phase, false); >>> >>> I think this is not a solid check to ensure that the slot existed >>> before. Because it could be created as soon as the slot sync worker >>> invokes ReplicationSlotCreate() here. >> >> Agree. >> > > So, having a concrete check to give WARNING would require some more > logic which I don't think is a good idea to handle this boundary case. > Yeah good point, agree to just error out in all the case then (if we discard the sync_ reserved wording proposal, which seems to be the case as probably not worth the extra work). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
В списке pgsql-hackers по дате отправления:
Следующее
От: Nazir Bilal YavuzДата:
Сообщение: Re: Failure during Building Postgres in Windows with Meson