Re: BF animal malleefowl reported an failure in 001_password.pl

From Tom Lane
Subject Re: BF animal malleefowl reported an failure in 001_password.pl
Date
Msg-id 934208.1673682937@sss.pgh.pa.us
In reply to BF animal malleefowl reported an failure in 001_password.pl  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses Re: BF animal malleefowl reported an failure in 001_password.pl  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
"houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com> writes:
> I noticed one BF failure[1] when monitoring the BF for some other commit.
> [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=malleefowl&dt=2023-01-13%2009%3A54%3A51
> ...
> So it seems the connection happens before pg_ident.conf is actually reloaded?
> Not sure if we need to do something to make sure the reload happens, because it
> looks like a very rare failure which hasn't happened in the last 90 days.

That does look like a race condition between config reloading and
new-backend launching.  However, I can't help being suspicious about
the fact that we haven't seen this symptom before and now here it is
barely a day after 7389aad63 (Use WaitEventSet API for postmaster's
event loop).  It seems fairly plausible that that did something that
causes the postmaster to preferentially process connection-accept ahead
of SIGHUP.  I took a quick look through the code and did not see a
smoking gun, but I'm way too tired to be sure I didn't miss something.

In general, use of WaitEventSet instead of signals will tend to slot
the postmaster into non-temporally-ordered event responses in two
ways: (1) the latch.c code will report events happening at more-or-less
the same time in a specific order, and (2) the postmaster.c code will
react to signal-handler-set flags in a specific order.  AFAICS, both
of those code layers will prioritize latch events ahead of
connection-accept events, but did I misread it?
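To make the ordering question concrete, here is a toy model in Python. It is not PostgreSQL code: ToyPostmaster, LATCH, ACCEPT and run are hypothetical names invented for this sketch. It only assumes that a SIGHUP-driven reload and a new connection become ready at about the same time, and that the event loop services them in a fixed priority order.

```python
# Toy model only: none of these names exist in PostgreSQL.
from dataclasses import dataclass, field

LATCH = "latch"    # stands in for a latch wakeup caused by SIGHUP
ACCEPT = "accept"  # stands in for a readable listen socket


@dataclass
class ToyPostmaster:
    priority: tuple                  # order in which ready events are serviced
    loaded_config: str = "old"       # what the postmaster currently has loaded
    on_disk_config: str = "new"      # the file has already been rewritten
    backends: list = field(default_factory=list)

    def service(self, ready):
        # Events that became ready at about the same time arrive as an
        # unordered set; the loop imposes its own fixed order on them.
        for kind in self.priority:
            if kind not in ready:
                continue
            if kind == LATCH:
                # Reload: pick up the rewritten file.
                self.loaded_config = self.on_disk_config
            elif kind == ACCEPT:
                # A new backend inherits whatever config is loaded right now.
                self.backends.append(self.loaded_config)


def run(priority):
    """Service one simultaneous (reload, connect) pair under the given order."""
    pm = ToyPostmaster(priority=priority)
    pm.service({LATCH, ACCEPT})
    return pm.backends


print(run((LATCH, ACCEPT)))  # latch handled first: backend sees "new"
print(run((ACCEPT, LATCH)))  # accept handled first: backend sees "old"
```

If latch events really are prioritized at both layers, only the first case should occur for events that are ready together; the second case is the shape of the reported failure.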

Also it seems like the various platform-specific code paths in latch.c
could diverge as to the priority order of events, which could cause
annoying platform-specific behavior.  Not sure there's much to be
done there other than to be sensitive to not letting such divergence
happen.

            regards, tom lane


