Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock

From: Mark Dilger
Subject: Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
Date:
Msg-id: ADF10716-5699-42BE-B462-6D0862D8D379@enterprisedb.com
In response to: hash_xlog_split_allocate_page: failed to acquire cleanup lock  (Andres Freund <andres@anarazel.de>)
Responses: Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock  (Thomas Munro <thomas.munro@gmail.com>)
Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers

> On Aug 9, 2022, at 7:26 PM, Andres Freund <andres@anarazel.de> wrote:
>
> The relevant code triggering it:
>
>     newbuf = XLogInitBufferForRedo(record, 1);
>     _hash_initbuf(newbuf, xlrec->new_bucket, xlrec->new_bucket,
>                   xlrec->new_bucket_flag, true);
>     if (!IsBufferCleanupOK(newbuf))
>         elog(PANIC, "hash_xlog_split_allocate_page: failed to acquire cleanup lock");
>
> Why do we just crash if we don't already have a cleanup lock? That can't be
> right. Or is there supposed to be a guarantee this can't happen?

Perhaps the code assumes that when the xl_hash_split_allocate_page record was written, the new_bucket field referred to an
unused page, and so during replay it should also refer to an unused page, and that, being unused, nobody will have it
pinned. But at least in heap we sometimes pin unused pages just long enough to examine them and see that they are
unused. Maybe something like that is happening here?

I'd be curious to see the count returned by BUF_STATE_GET_REFCOUNT(LockBufHdr(newbuf)) right before this panic.  If
it's just 1, then it's not another backend, but our own, and we'd want to debug why we're pinning the same page twice
(or more) while replaying WAL.  Otherwise, maybe it's a race condition with some other process that transiently pins a
buffer and occasionally causes this code to panic?
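
To make the two cases concrete, here is a small standalone model of that check. It is only a sketch, not PostgreSQL source: ModelBuffer, model_pin, and model_diagnose are hypothetical names, and it assumes that the shared refcount sits in the low 18 bits of the buffer state and that a process adds only one shared pin no matter how many times it pins the page locally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: shared pin count lives in the low 18 bits of the state word. */
#define MODEL_REFCOUNT_MASK ((1U << 18) - 1)

/* Hypothetical stand-in for a shared buffer header plus our private pin count. */
typedef struct ModelBuffer
{
    uint32_t state;      /* packed state; only the refcount bits are used here */
    uint32_t local_pins; /* pins held by the current (replaying) process */
} ModelBuffer;

typedef enum
{
    PIN_NOT_PINNED,    /* nobody holds a pin */
    PIN_CLEANUP_OK,    /* exactly one pin, and it is ours */
    PIN_SELF_DOUBLE,   /* shared count is 1, but we pinned more than once */
    PIN_OTHER_PROCESS  /* some other process also holds a pin */
} PinDiagnosis;

/* Extract the shared pin count from the packed state word. */
uint32_t
model_refcount(const ModelBuffer *buf)
{
    return buf->state & MODEL_REFCOUNT_MASK;
}

/* Pin the buffer, either as the current process or as some other process. */
void
model_pin(ModelBuffer *buf, bool by_us)
{
    if (by_us)
    {
        /* Only our first local pin is reflected in the shared count. */
        if (buf->local_pins == 0)
        {
            assert(model_refcount(buf) < MODEL_REFCOUNT_MASK);
            buf->state += 1;
        }
        buf->local_pins += 1;
    }
    else
    {
        assert(model_refcount(buf) < MODEL_REFCOUNT_MASK);
        buf->state += 1;
    }
}

/* Classify why a cleanup lock would or would not be available. */
PinDiagnosis
model_diagnose(const ModelBuffer *buf)
{
    uint32_t shared = model_refcount(buf);
    uint32_t ours = (buf->local_pins > 0) ? 1 : 0;

    if (shared > ours)
        return PIN_OTHER_PROCESS;
    if (buf->local_pins > 1)
        return PIN_SELF_DOUBLE;
    if (buf->local_pins == 1)
        return PIN_CLEANUP_OK;
    return PIN_NOT_PINNED;
}
```

In this model, a shared count of 1 with the cleanup check still failing points at PIN_SELF_DOUBLE (our own process pinned the page more than once), while a shared count above 1 points at PIN_OTHER_PROCESS, the transient-pin race.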

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company