Re: BUG #17928: Standby fails to decode WAL on termination of primary

From Michael Paquier
Subject Re: BUG #17928: Standby fails to decode WAL on termination of primary
Msg-id ZQ5tIoarrcwAlbVZ@paquier.xyz
In response to Re: BUG #17928: Standby fails to decode WAL on termination of primary  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: BUG #17928: Standby fails to decode WAL on termination of primary  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-bugs

On Sat, Sep 23, 2023 at 02:02:02PM +1200, Thomas Munro wrote:
> Hmm, copperhead (riscv) showed an unusual failure, a segfault in
> suspiciously nearby code.  I don't immediately know what that's about,
> let's see if we get more clues...

Yep, something is going on with the prefetching code:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=copperhead&dt=2023-09-22%2023%3A16%3A33

Using host libthread_db library "/lib/riscv64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: paris: startup recovering 000000030000000000000003 '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  pg_comp_crc32c_sb8 (crc=1613114916, crc@entry=4294967295, data=data@entry=0x2af9e00d48, len=<optimized out>) at pg_crc32c_sb8.c:56
56	uint32 a = *p4++ ^ crc;
#0  pg_comp_crc32c_sb8 (crc=1613114916, crc@entry=4294967295, data=data@entry=0x2af9e00d48, len=<optimized out>) at pg_crc32c_sb8.c:56
#1  0x0000002ad59a1536 in ValidXLogRecord (state=0x2af9db1fc0, record=0x2af9e00d30, recptr=50520048) at xlogreader.c:1195
#2  0x0000002ad59a285a in XLogDecodeNextRecord (state=state@entry=0x2af9db1fc0, nonblocking=<optimized out>) at xlogreader.c:842
#3  0x0000002ad59a28c0 in XLogReadAhead (state=0x2af9db1fc0, nonblocking=nonblocking@entry=false) at xlogreader.c:969
#4  0x0000002ad59a0996 in XLogPrefetcherNextBlock (pgsr_private=184580836680, lsn=0x2af9e14618) at xlogprefetcher.c:496
#5  0x0000002ad59a11c8 in lrq_prefetch (lrq=<optimized out>) at xlogprefetcher.c:256
#6  lrq_complete_lsn (lsn=<optimized out>, lrq=0x2af9e145c8) at xlogprefetcher.c:294
#7  XLogPrefetcherReadRecord (prefetcher=prefetcher@entry=0x2af9e00d48, errmsg=errmsg@entry=0x3fec9a2bf0) at xlogprefetcher.c:1041

The stack may point to a different issue, but perhaps we are now
returning XLREAD_SUCCESS in a case where we previously returned
XLREAD_FAIL, so this code goes on to treat the block as valid when it
is not?
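
To make the suspicion concrete, here is a toy model of that failure
mode. It is only a sketch and not the xlogreader.c code: every name
prefixed with demo_ is invented for the example, and the checksum is a
stand-in rather than CRC-32C. The point is that a record's claimed
length has to be checked against the bytes actually available before
the checksum pass runs; if an earlier step reports success for a record
that should have failed, the checksum loop can read past the buffer,
which is one way a SIGSEGV like the one in frame #0 could come about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical outcomes, loosely modelled on the XLREAD_* names. */
typedef enum { DEMO_READ_SUCCESS, DEMO_READ_FAIL } DemoReadResult;

/* Hypothetical record header; not PostgreSQL's XLogRecord layout. */
typedef struct
{
    uint32_t tot_len;   /* header + payload, as claimed by the record */
    uint32_t checksum;  /* checksum of the payload bytes */
} DemoRecordHeader;

/* Toy checksum standing in for CRC-32C; not the real algorithm. */
static uint32_t
demo_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
        sum = (sum << 5) + sum + data[i];
    return sum;
}

/* Build a well-formed record in buf; returns its size, or 0 if it does not fit. */
static size_t
demo_make_record(unsigned char *buf, size_t cap,
                 const unsigned char *payload, size_t payload_len)
{
    DemoRecordHeader hdr;

    if (cap < sizeof(hdr) || payload_len > cap - sizeof(hdr))
        return 0;
    hdr.tot_len = (uint32_t) (sizeof(hdr) + payload_len);
    hdr.checksum = demo_checksum(payload, payload_len);
    memcpy(buf, &hdr, sizeof(hdr));
    memcpy(buf + sizeof(hdr), payload, payload_len);
    return sizeof(hdr) + payload_len;
}

/* Validate a record of which only 'avail' bytes are known to be readable. */
static DemoReadResult
demo_validate_record(const unsigned char *buf, size_t avail)
{
    DemoRecordHeader hdr;

    assert(buf != NULL);
    if (avail < sizeof(hdr))
        return DEMO_READ_FAIL;
    memcpy(&hdr, buf, sizeof(hdr));

    /*
     * Reject an impossible length before touching the payload.  Skipping
     * this check and trusting tot_len is what would let the checksum loop
     * run off the end of the buffer.
     */
    if (hdr.tot_len < sizeof(hdr) || hdr.tot_len > avail)
        return DEMO_READ_FAIL;

    if (demo_checksum(buf + sizeof(hdr), hdr.tot_len - sizeof(hdr)) != hdr.checksum)
        return DEMO_READ_FAIL;
    return DEMO_READ_SUCCESS;
}
```

In this model, a record built with demo_make_record() validates
successfully, while the same bytes with a truncated 'avail' or an
overwritten tot_len are reported as DEMO_READ_FAIL instead of being
handed to the checksum loop.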
--
Michael
