RE: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От Zhijie Hou (Fujitsu)
Тема RE: Synchronizing slots from primary to standby
Дата
Msg-id OS0PR01MB571633C23B2A4CAC5FB0371A944C2@OS0PR01MB5716.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: Synchronizing slots from primary to standby  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы RE: Synchronizing slots from primary to standby  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Re: Synchronizing slots from primary to standby  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Список pgsql-hackers
On Thursday, February 15, 2024 8:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

> 
> On Thu, Feb 15, 2024 at 5:46 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > On Thu, Feb 15, 2024 at 05:00:18PM +0530, Amit Kapila wrote:
> > > On Thu, Feb 15, 2024 at 4:29 PM Zhijie Hou (Fujitsu)
> > > <houzj.fnst@fujitsu.com> wrote:
> > > > Attach the v2 patch here.
> > > >
> > > > Apart from the new log message. I think we can add one more debug
> > > > message in reserve_wal_for_local_slot, this could be useful to analyze the
> failure.
> > >
> > > Yeah, that can also be helpful, but the added message looks naive to me.
> > > + elog(DEBUG1, "segno: %ld oldest_segno: %ld", oldest_segno, segno);
> > >
> > > Instead of the above, how about something like: "segno: %ld of
> > > purposed restart_lsn for the synced slot, oldest_segno: %ld
> > > available"?
> >
> > Looks good to me. I'm not sure if it would make more sense to elog
> > only if segno < oldest_segno means just before the
> XLogSegNoOffsetToRecPtr() call?
> >
> > But I'm fine with the proposed location too.
> >
> 
> I am also fine either way but the current location gives required information in
> more number of cases and could be helpful in debugging this new facility.
> 
> > >
> > > > And we
> > > > can also enable the DEBUG log in the 040 tap-test, I see we have
> > > > similar setting in 010_logical_decoding_timline and logging debug1
> > > > message doesn't increase noticable time on my machine. These are done
> in 0002.
> > > >
> > >
> > > I haven't tested it but I think this can help in debugging BF
> > > failures, if any. I am not sure if to keep it always like that but
> > > till the time these tests are stabilized, this sounds like a good
> > > idea. So, how, about just making test changes as a separate patch so
> > > that later if required we can revert/remove it easily? Bertrand, do
> > > you have any thoughts on this?
> >
> > +1 on having DEBUG log in the 040 tap-test until it's stabilized (I
> > +think we
> > took the same approach for 035_standby_logical_decoding.pl IIRC) and
> > then revert it back.
> >
> 
> Good to know!
> 
> > Also I was thinking: what about adding an output to
> pg_sync_replication_slots()?
> > The output could be the number of sync slots that have been created
> > and are not considered as sync-ready during the execution.
> >
> 
> Yeah, we can consider outputting some information via this function like how
> many slots are synced and persisted but not sure what would be appropriate
> here. Because one can anyway find that or more information by querying
> pg_replication_slots. I think we can keep discussing what makes more sense as a
> return value but let's commit the debug/log patches as they will be helpful to
> analyze BF failures or any other issues reported.

Agreed. Here is new patch set as suggested. I used debug2 in the 040 as it
could provide more information about communication between primary and standby.
This also doesn't increase noticeable testing time on my machine for debug
build.

Best Regards,
Hou zj

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Davis
Дата:
Сообщение: Re: Built-in CTYPE provider
Следующее
От: Tatsuo Ishii
Дата:
Сообщение: Re: When extended query protocol ends?