Race conditions with checkpointer and shutdown

From Michael Paquier
Subject Race conditions with checkpointer and shutdown
Msg-id 20190416070119.GK2673@paquier.xyz
Responses Re: Race conditions with checkpointer and shutdown  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi all,

This is a continuation of the following thread, but I prefer spawning
a new thread for clarity:
https://www.postgresql.org/message-id/20190416064512.GJ2673@paquier.xyz

The buildfarm has reported two similar failures when shutting down a
node:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2019-03-23%2022%3A28%3A59
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2019-04-16%2006%3A14%3A01

In both cases, the instance cannot shut down: it times out waiting
for the shutdown checkpoint to finish, but I suspect that this
checkpoint actually never happens.

The first case involves piculet which has --disable-atomics, gcc 6 and
the recovery test 016_min_consistency where we trigger a checkpoint,
then issue a fast shutdown on a standby.  And at this point the test
waits forever.

The second case involves dragonet which has JIT enabled and clang.
The failure is on test 009_twophase.pl.  The failure happens after
test preparing transaction xact_009_11, where a *standby* gets
restarted.  Again, the test waits forever for the instance to shut
down.

The most recent commits which have touched checkpoints are 0dfe3d0e
and c6c9474a, which map roughly to the point where the failures
began to happen, so I suspect that something related to clean
shutdowns of standbys has been broken since.

Thanks,
--
Michael
