RE: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От Zhijie Hou (Fujitsu)
Тема RE: Synchronizing slots from primary to standby
Дата
Msg-id OS0PR01MB57160CC95DAD0E3B6D60DBAB94B7A@OS0PR01MB5716.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: Synchronizing slots from primary to standby  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Ответы Re: Synchronizing slots from primary to standby  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Список pgsql-hackers
On Tuesday, November 14, 2023 10:27 PM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote:
> On 11/13/23 2:57 PM, Zhijie Hou (Fujitsu) wrote:
> > On Friday, November 10, 2023 4:16 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
> >> Yeah good point, agree to just error out in all the case then (if we
> >> discard the sync_ reserved wording proposal, which seems to be the
> >> case as probably not worth the extra work).
> >
> > Thanks for the discussion!
> >
> > Here is the V33 patch set which includes the following changes:
> 
> Thanks for working on it!
> 
> >
> > 1) Drop slots with state 'i' in promotion flow after we shut down WalReceiver.
> 
> @@ -3557,10 +3558,15 @@ WaitForWALToBecomeAvailable(XLogRecPtr
> RecPtr, bool randAccess,
>                       * this only after failure, so when you promote, we still
>                       * finish replaying as much as we can from archive and
>                       * pg_wal before failover.
> +                    *
> +                    * Drop the slots for which sync is initiated but not yet
> +                    * completed i.e. they are still waiting for the primary
> +                    * server to catch up.
>                       */
>                      if (StandbyMode && CheckForStandbyTrigger())
>                      {
>                          XLogShutdownWalRcv();
> +                       slotsync_drop_initiated_slots();
>                          return XLREAD_FAIL;
>                      }
> 
> I had a closer look and it seems this is not located at the right place.
> 
> Indeed, it's added here:
> 
> switch (currentSource)
> {
>     case XLOG_FROM_ARCHIVE:
>     case XLOG_FROM_PG_WAL:
> 
> While in our case we are in
> 
>     case XLOG_FROM_STREAM:
> 
> So I think we should move slotsync_drop_initiated_slots() in the
> XLOG_FROM_STREAM case. Maybe before shutting down the sync slot worker?
> (the TODO item number 2 you mentioned up-thread)

Thanks for the comment.

I feel the WaitForWALToBecomeAvailable may not be the best place to shutdown
slotsync worker and drop slots. There could be other reasons(other than
promotion) as mentioned in comments in case XLOG_FROM_STREAM to reach the code
there. I thought if the intention is to stop slotsync workers on promotion,
maybe FinishWalRecovery() is a better place to do it as it's indicating the end
of recovery and XLogShutdownWalRcv is also called in it.

And I feel we'd better drop the slots after shutting down the slotsync workers,
because otherwise the slotsync workers could create the dropped slot again in
rare cases.

Best Regards,
Hou zj



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jubilee Young
Дата:
Сообщение: Re: Hide exposed impl detail of wchar.c
Следующее
От: Richard Guo
Дата:
Сообщение: Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500