Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition

From	tender wang
Subject	Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition
Date	28 December 2023, 06:40:31
Msg-id	CAHewXN=chu4kBxj=vtCOJJoOCAvipfJzKRuH26BMiyHSDhBk7g@mail.gmail.com
In reply to	Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition (tender wang <tndrwang@gmail.com>)
List	pgsql-bugs

I have been wondering why the error is reported only after the device has run out of space.

I ran some tests and may have found an answer. My test session is below:

----------------------------

postgres=# CREATE UNLOGGED TABLE filler(a int, b text STORAGE plain);
CREATE TABLE
postgres=# INSERT INTO filler SELECT g, repeat('x', 1000) FROM generate_series(1,50000) g;
INSERT 0 50000
postgres=# CREATE TEMP TABLE tbl(a int);
CREATE TABLE
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
ERROR: could not extend file "base/5/t3_16389": No space left on device
HINT: Check free disk space.
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
ERROR: could not extend file "base/5/t3_16389": No space left on device
HINT: Check free disk space.
postgres=# truncate tbl ;
TRUNCATE TABLE
postgres=# drop table filler ;
DROP TABLE
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
INSERT 0 200000
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
INSERT 0 200000
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
INSERT 0 200000
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
INSERT 0 200000

------------------------

No error was reported once I had truncated the temp table.

I found that the buf_state of the buffer found through the local hash table was not cleaned up when there was no space left on the device.

If I truncate the temp table, DropRelationLocalBuffers() is called and the buf_state is cleared, so the assertion failure no longer occurs.

tender wang <tndrwang@gmail.com> wrote on Wed, Dec 27, 2023 at 17:22:

While debugging ExtendBufferedRelLocal(), I found a repeated assignment to existing_hdr.
The attached v3 patch fixes this small issue together with the changes from the previous v2 patch.

tender wang <tndrwang@gmail.com> wrote on Wed, Dec 27, 2023 at 17:08:

Alexander Lakhin <exclusion@gmail.com> wrote on Wed, Dec 27, 2023 at 15:00:
Hello tender wang,

26.12.2023 19:55, tender wang wrote:
I tried to analyze the issue, and I found that it might be caused by this commit:
commit dad50f677c42de207168a3f08982ba23c9fc6720
bufmgr: Acquire and clean victim buffer separately

Thanks for looking into it!

...

With debug logging added in this code within ExtendBufferedRelLocal():
    if (found)
    {
        BufferDesc *existing_hdr = GetLocalBufferDescriptor(hresult->id);
        uint32      buf_state;

        UnpinLocalBuffer(BufferDescriptorGetBuffer(victim_buf_hdr));

        existing_hdr = GetLocalBufferDescriptor(hresult->id);
        PinLocalBuffer(existing_hdr, false);
        buffers[i] = BufferDescriptorGetBuffer(existing_hdr);

        buf_state = pg_atomic_read_u32(&existing_hdr->state);
        Assert(buf_state & BM_TAG_VALID);
        Assert(!(buf_state & BM_DIRTY));
        buf_state &= BM_VALID;
        pg_atomic_unlocked_write_u32(&existing_hdr->state, buf_state);
...
I see that this code is reached for the second INSERT (and NOSPC error) with
existing_hdr->state == 0x2040000, but for the third INSERT I observe
state == 0x0.
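
To make the two observed values easier to read, here is a minimal sketch that splits a buf_state word into its fields. The bit positions are an assumption: they are the ones I recall from buf_internals.h in PostgreSQL 16 (18 bits of refcount, 4 bits of usage count, flags above that) and should be checked against the tree being debugged. DecodedBufState, decode_buf_state() and example_second_insert() are hypothetical helpers written for this illustration, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: layout as recalled from PostgreSQL 16's buf_internals.h. */
#define BUF_REFCOUNT_MASK    ((1U << 18) - 1)
#define BUF_USAGECOUNT_MASK  0x003C0000U
#define BUF_USAGECOUNT_SHIFT 18
#define BM_DIRTY             (1U << 23)
#define BM_VALID             (1U << 24)
#define BM_TAG_VALID         (1U << 25)

/* Hypothetical helper type: the fields of one buf_state word. */
typedef struct DecodedBufState
{
    uint32_t refcount;
    uint32_t usage_count;
    bool     dirty;
    bool     valid;
    bool     tag_valid;
} DecodedBufState;

/* Split a raw state word into refcount, usage count and three flags. */
DecodedBufState
decode_buf_state(uint32_t state)
{
    DecodedBufState d;

    d.refcount = state & BUF_REFCOUNT_MASK;
    d.usage_count = (state & BUF_USAGECOUNT_MASK) >> BUF_USAGECOUNT_SHIFT;
    d.dirty = (state & BM_DIRTY) != 0;
    d.valid = (state & BM_VALID) != 0;
    d.tag_valid = (state & BM_TAG_VALID) != 0;
    return d;
}

/* Usage: decode the value reported for the second INSERT. */
void
example_second_insert(void)
{
    DecodedBufState d = decode_buf_state(0x2040000U);

    /* Under the assumed layout: tag valid, data not valid, not dirty. */
    assert(d.tag_valid && !d.valid && !d.dirty);
    assert(d.usage_count == 1 && d.refcount == 0);
}
```

Under that assumed layout, 0x2040000 reads as BM_TAG_VALID plus a usage count of one, which is consistent with the first Assert passing on the second INSERT, while 0x0 has no flags set at all.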

I wonder whether "buf_state &= BM_VALID" is a typo here; maybe it was supposed to be
"buf_state &= ~BM_VALID" as in ExtendBufferedRelShared()...

Yeah, that's true. I analyzed this issue again, and I think the root cause is the "buf_state &= BM_VALID".
In the case I reported, buf_state & BM_VALID is true, but buf_state & BM_TAG_VALID is false. This situation should be impossible:
it can't happen that the data in the local buffer pool is valid while LocalBufHash has no entry for it.
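
The difference between the two masks can be checked in isolation. The following is only a sketch: the flag values are an assumption (as I recall them from buf_internals.h in PostgreSQL 16), mask_as_quoted(), mask_as_suggested() and example_masks() are hypothetical names, and 0x3040000 is an illustrative input rather than a value observed in this thread. It does not claim to reproduce the attached patches.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: flag bits as recalled from PostgreSQL 16's buf_internals.h. */
#define BM_VALID     (1U << 24)
#define BM_TAG_VALID (1U << 25)

/* The statement as quoted from ExtendBufferedRelLocal(): keeps ONLY BM_VALID. */
uint32_t
mask_as_quoted(uint32_t buf_state)
{
    buf_state &= BM_VALID;
    return buf_state;
}

/* The suggested form: clears ONLY BM_VALID and keeps every other bit. */
uint32_t
mask_as_suggested(uint32_t buf_state)
{
    buf_state &= ~BM_VALID;
    return buf_state;
}

void
example_masks(void)
{
    uint32_t after;

    /* Value observed for the second INSERT: the quoted mask wipes it to 0x0. */
    assert(mask_as_quoted(0x2040000U) == 0x0U);
    assert(mask_as_suggested(0x2040000U) == 0x2040000U);

    /* Illustrative state with both BM_VALID and BM_TAG_VALID set. */
    after = mask_as_quoted(0x3040000U);
    assert((after & BM_VALID) && !(after & BM_TAG_VALID));  /* valid without tag */

    after = mask_as_suggested(0x3040000U);
    assert(!(after & BM_VALID) && (after & BM_TAG_VALID));  /* tag kept, data invalid */
}
```

With the quoted mask, a buffer that had both flags set ends up with BM_VALID but without BM_TAG_VALID, which is the combination described above as impossible; with the complemented mask the tag and usage count survive and only BM_VALID is cleared.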

I modified the v1 patch; the attached v2 patch should fix the above issues.

Best regards,
Alexander
