Discussion: race condition in pg_class
Hello, hackers!
We are running PG13.10 and recently encountered what appears to be a bug due to some race condition between ALTER TABLE ... ADD CONSTRAINT and some other catalog-writer, possibly ANALYZE.
The problem is that after successfully creating an index on a relation (which previously didn't have any indexes), its pg_class.relhasindex remains set to "false", which is illegal, I think.
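(As an aside, this kind of inconsistency can be surfaced with a catalog query along the following lines; an illustrative check, not part of the original report:)

-- find relations whose pg_class.relhasindex flag disagrees with pg_index
select c.oid::regclass
from pg_class c
where not c.relhasindex
  and exists (select 1 from pg_index i where i.indrelid = c.oid);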
Index was built using the following statement:
ALTER TABLE "example" ADD constraint "example_pkey" PRIMARY KEY (id);
PG_CLASS:
# select ctid,oid,xmin,xmax, * from pg_class where oid = 3566558198;
-[ RECORD 1 ]-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ctid | (12,49)
oid | 3566558198
xmin | 1202298791
xmax | 0
relname | example
relnamespace | 16479
reltype | 3566558200
reloftype | 0
relowner | 16386
relam | 2
relfilenode | 3566558198
reltablespace | 0
relpages | 152251
reltuples | 1.1565544e+07
relallvisible | 127837
reltoastrelid | 3566558203
relhasindex | f
relisshared | f
relpersistence | p
relkind | r
relnatts | 12
relchecks | 0
relhasrules | f
relhastriggers | f
relhassubclass | f
relrowsecurity | f
relforcerowsecurity | f
relispopulated | t
relreplident | d
relispartition | f
relrewrite | 0
relfrozenxid | 1201647807
relminmxid | 1
relacl |
reloptions |
relpartbound |
PG_INDEX:
# select ctid,xmin,xmax,indexrelid::regclass,indrelid::regclass, * from pg_index where indexrelid = 3569625749;
-[ RECORD 1 ]--+---------------------------------------------
ctid | (3,30)
xmin | 1202295045
xmax | 0
indexrelid | "example_pkey"
indrelid | "example"
indexrelid | 3569625749
indrelid | 3566558198
indnatts | 1
indnkeyatts | 1
indisunique | t
indisprimary | t
indisexclusion | f
indimmediate | t
indisclustered | f
indisvalid | t
indcheckxmin | f
indisready | t
indislive | t
indisreplident | f
indkey | 1
indcollation | 0
indclass | 3124
indoption | 0
indexprs |
indpred |
Looking into the WAL via waldump gives us the following picture (full waldump output is attached):
tx: 1202295045, lsn: AAB1/D38378D0, prev AAB1/D3837208, desc: FPI , blkref #0: rel 1663/16387/3569625749 blk 0 FPW
tx: 1202298790, lsn: AAB1/D3912EC0, prev AAB1/D3912E80, desc: NEW_CID rel 1663/16387/1259; tid 6/24; cmin: 0, cmax: 4294967295, combo: 4294967295
tx: 1202298790, lsn: AAB1/D3927580, prev AAB1/D3926988, desc: COMMIT 2023-10-04 22:41:23.863979 UTC
tx: 1202298791, lsn: AAB1/D393C230, prev AAB1/D393C1F0, desc: HOT_UPDATE off 24 xmax 1202298791 flags 0x20 ; new off 45 xmax 0, blkref #0: rel 1663/16387/1259 blk 6
tx: 1202298791, lsn: AAB1/D394ADA0, prev AAB1/D394AD60, desc: UPDATE off 45 xmax 1202298791 flags 0x00 ; new off 28 xmax 0, blkref #0: rel 1663/16387/1259 blk 5, blkref #1: rel 1663/16387/1259 blk 6
tx: 1202298791, lsn: AAB1/D3961088, prev AAB1/D3961048, desc: NEW_CID rel 1663/16387/1259; tid 12/49; cmin: 50, cmax: 4294967295, combo: 4294967295
tx: 1202295045, lsn: AAB1/D3962E78, prev AAB1/D3962E28, desc: INPLACE off 24, blkref #0: rel 1663/16387/1259 blk 6
tx: 1202295045, lsn: AAB1/D39632A0, prev AAB1/D3963250, desc: COMMIT 2023-10-04 22:41:23.878565 UTC
tx: 1202298791, lsn: AAB1/D3973420, prev AAB1/D39733D0, desc: COMMIT 2023-10-04 22:41:23.884951 UTC
1202295045 - create index statement
1202298790 and 1202298791 are some other concurrent operations; unfortunately I wasn't able to determine what they are.
So it looks like 1202295045 updated tuple (6,24) in pg_class INPLACE, but at that point its xmax had already been set by 1202298791 and a new tuple had been created at (12,49).
So after 1202298791 was committed, that inplace update was effectively lost.
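(For anyone retracing this analysis: the records above come from pg_waldump. An invocation along these lines, with a placeholder segment name, pulls out the records touching pg_class, whose relfilenode is 1259:)

pg_waldump 000000010000AAB1000000D3 | grep '1663/16387/1259'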
If we do an inclusive PITR with (recovery_target_xid = 1202295045), we can see the following picture (notice relhasindex and xmax):
# select ctid,oid, xmin,xmax,relhasindex,cmin,cmax from pg_class where oid = 3566558198;
-[ RECORD 1 ]-----------
ctid | (6,24)
oid | 3566558198
xmin | 1202298790
xmax | 1202298791
relhasindex | t
cmin | 0
cmax | 0
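(The PITR used settings along these lines; an illustrative PG13-style configuration, with a placeholder restore_command and a recovery.signal file present in the data directory:)

restore_command = 'cp /wal_archive/%f %p'
recovery_target_xid = '1202295045'
recovery_target_inclusive = on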
I've tried to reproduce this scenario with CREATE INDEX and various concurrent statements, but no luck.
Attached full waldump output for the relevant WAL segment.
> On 25 Oct 2023, at 13:39, Smolkin Grigory <smallkeen@gmail.com> wrote:
>
> We are running PG13.10 and recently encountered what appears to be a bug
> due to some race condition between ALTER TABLE ... ADD CONSTRAINT and some
> other catalog-writer, possibly ANALYZE.
> I've tried to reproduce this scenario with CREATE INDEX and various
> concurrent statements, but no luck.

Maybe it would be possible to reproduce by modifying the tests for concurrent index creation. For example, add "ANALYZE" here [0]. Keep in mind that for easier reproduction it would make sense to increase the transaction count radically.

Best regards, Andrey Borodin.

[0] https://github.com/postgres/postgres/blob/master/contrib/amcheck/t/002_cic.pl#L34
Smolkin Grigory <smallkeen@gmail.com> writes:
> We are running PG13.10 and recently encountered what appears to be
> a bug due to some race condition between ALTER TABLE ... ADD CONSTRAINT and
> some other catalog-writer, possibly ANALYZE.
> The problem is that after successfully creating an index on a relation (which
> previously didn't have any indexes), its pg_class.relhasindex remains set to
> "false", which is illegal, I think.
> Index was built using the following statement:
> ALTER TABLE "example" ADD constraint "example_pkey" PRIMARY KEY (id);

ALTER TABLE ADD CONSTRAINT would certainly have taken AccessExclusiveLock on the "example" table, which should be sufficient to prevent anything else from touching its pg_class row. The only mechanism I can think of that might bypass that is a manual UPDATE on pg_class, which would just manipulate the row as a row without concern for associated relation-level locks. Any chance that somebody was doing something like that?

			regards, tom lane
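(For clarity, the kind of manual catalog change being asked about here is plain DML against pg_class, which a superuser can issue directly; an illustrative example, not something from the report:)

update pg_class set relhasindex = false where oid = 'example'::regclass;

Such a statement updates the row like any user-table row and takes no lock on the "example" relation itself, which is why it can slip past AccessExclusiveLock.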
> ALTER TABLE ADD CONSTRAINT would certainly have taken
> AccessExclusiveLock on the "example" table, which should be sufficient
> to prevent anything else from touching its pg_class row. The only
> mechanism I can think of that might bypass that is a manual UPDATE on
> pg_class, which would just manipulate the row as a row without concern
> for associated relation-level locks. Any chance that somebody was
> doing something like that?
No chance. Our infrastructure doesn't do that, and users simply don't have the privileges to mess with pg_catalog.
On Wed, Oct 25, 2023 at 01:39:41PM +0300, Smolkin Grigory wrote:
> We are running PG13.10 and recently encountered what appears to be
> a bug due to some race condition between ALTER TABLE ... ADD CONSTRAINT and
> some other catalog-writer, possibly ANALYZE.
> The problem is that after successfully creating an index on a relation (which
> previously didn't have any indexes), its pg_class.relhasindex remains set to
> "false", which is illegal, I think.
> Index was built using the following statement:
> ALTER TABLE "example" ADD constraint "example_pkey" PRIMARY KEY (id);

This is going to be a problem with any operation that does a transactional pg_class update without taking a lock that conflicts with ShareLock. GRANT doesn't lock the table at all, so I can reproduce this in v17 as follows:

== session 1
create table t (c int);
begin;
grant select on t to public;

== session 2
alter table t add primary key (c);

== back in session 1
commit;

We'll likely need to change how we maintain relhasindex or perhaps take a lock in GRANT.

> Looking into the WAL via waldump gives us the following picture (full
> waldump output is attached):

> 1202295045 - create index statement
> 1202298790 and 1202298791 are some other concurrent operations;
> unfortunately I wasn't able to determine what they are

Can you explore that as follows?

- PITR to just before the COMMIT record.
- Save all rows of pg_class.
- PITR to just after the COMMIT record.
- Save all rows of pg_class.
- Diff the two sets of saved rows.

Which columns changed? The evidence you've shown would be consistent with a transaction doing GRANT or REVOKE on dozens of tables. If the changed column is something other than relacl, that would be great to know. On the off-chance it's relevant, what extensions do you have (\dx in psql)?
> This is going to be a problem with any operation that does a transactional
> pg_class update without taking a lock that conflicts with ShareLock. GRANT
> doesn't lock the table at all, so I can reproduce this in v17 as follows:
>
> == session 1
> create table t (c int);
> begin;
> grant select on t to public;
>
> == session 2
> alter table t add primary key (c);
>
> == back in session 1
> commit;
>
>
> We'll likely need to change how we maintain relhasindex or perhaps take a lock
> in GRANT.
Oh, that explains it. Thank you very much.
> Can you explore that as follows?
>
>- PITR to just before the COMMIT record.
>- Save all rows of pg_class.
>- PITR to just after the COMMIT record.
>- Save all rows of pg_class.
>- Diff the two sets of saved rows.
Sure, but it will take some time; it's a large db with lots of WAL segments to apply.
> extensions
extname | extversion
--------------------+------------
plpgsql | 1.0
pg_stat_statements | 1.8
pg_buffercache | 1.3
pgstattuple | 1.5
On Thu, Oct 26, 2023 at 09:44:04PM -0700, Noah Misch wrote:
> On Wed, Oct 25, 2023 at 01:39:41PM +0300, Smolkin Grigory wrote:
> > We are running PG13.10 and recently encountered what appears to be
> > a bug due to some race condition between ALTER TABLE ... ADD CONSTRAINT and
> > some other catalog-writer, possibly ANALYZE.
> > The problem is that after successfully creating an index on a relation (which
> > previously didn't have any indexes), its pg_class.relhasindex remains set to
> > "false", which is illegal, I think.

It's damaging. The table will behave like it has no indexes. If something adds an index later, old indexes will reappear, corrupt, having not received updates during the relhasindex=false era. ("pg_amcheck --heapallindexed" can detect this.)

> > Index was built using the following statement:
> > ALTER TABLE "example" ADD constraint "example_pkey" PRIMARY KEY (id);
>
> This is going to be a problem with any operation that does a transactional
> pg_class update without taking a lock that conflicts with ShareLock. GRANT
> doesn't lock the table at all, so I can reproduce this in v17 as follows:
>
> == session 1
> create table t (c int);
> begin;
> grant select on t to public;
>
> == session 2
> alter table t add primary key (c);
>
> == back in session 1
> commit;
>
> We'll likely need to change how we maintain relhasindex or perhaps take a lock
> in GRANT.

The main choice is accepting more DDL blocking vs. accepting inefficient relcache builds. Options I'm seeing:

=== "more DDL blocking" option family

B1. Take ShareUpdateExclusiveLock in GRANT, REVOKE, and anything that makes transactional pg_class updates without holding some stronger lock. New asserts could catch future commands failing to do this.

B2. Take some shorter-lived lock around pg_class tuple formation, such that GRANT blocks CREATE INDEX, but two CREATE INDEX don't block each other. Anything performing a transactional update of a pg_class row would acquire the lock in exclusive mode before fetching the old tuple and hold it till end of transaction. relhasindex=true in-place updates would acquire it the same way, but they would release it after the inplace update. I expect a new heavyweight lock type, e.g. LOCKTAG_RELATION_DEFINITION, with the same key as LOCKTAG_RELATION. This has less blocking than the previous option, but it's more complicated to explain to both users and developers.

B3. Have CREATE INDEX do an EvalPlanQual()-style thing to update all successor tuple versions. Like the previous option, this would require new locking, but the new lock would not need to persist till end of xact. It would be even more complicated to explain to users and developers. (If this is promising enough to warrant more detail, let me know.)

B4. Use transactional updates to set relhasindex=true. Two CREATE INDEX commands on the same table would block each other. If we did it the way most DDL does today, they'd get "tuple concurrently updated" failures after the blocking ends.

=== "inefficient relcache builds" option family

R1. Ignore relhasindex; possibly remove it in v17. Relcache builds et al. will issue more superfluous queries.

R2. As a weird variant of the previous option, keep relhasindex and make all transactional updates of pg_class set relhasindex=true pessimistically. (VACUUM will set it back to false.)

=== other

O1. This is another case where the sometimes-discussed "pg_class_nt" for nontransactional columns would help. I'm ruling that out as too hard to back-patch.

Are there other options important to consider? I currently like (B1) the most, followed closely by (R1) and (B2). A key unknown is the prevalence of index-free tables. Low prevalence would argue in favor of (R1). In my limited experience, they've been rare. That said, I assume relcache builds happen a lot more than GRANTs, so it's harder to bound the damage from (R1) compared to the damage from (B1). Thoughts on this decision?

Thanks,
nm
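(To make (B1) concrete, the GRANT-side change might look like the sketch below; a minimal illustration assuming ExecGrant_Relation() as the call site, not a finished patch:)

/*
 * Sketch of (B1): before fetching the pg_class row that GRANT will
 * update transactionally, take a self-conflicting lock that also
 * conflicts with ShareLock.  CREATE INDEX (ShareLock on the table)
 * then serializes against GRANT, and two concurrent GRANTs block
 * rather than failing with "tuple concurrently updated".
 */
LockRelationOid(relOid, ShareUpdateExclusiveLock);
tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relOid));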
Noah Misch <noah@leadboat.com> writes: > On Thu, Oct 26, 2023 at 09:44:04PM -0700, Noah Misch wrote: >> We'll likely need to change how we maintain relhasindex or perhaps take a lock >> in GRANT. > The main choice is accepting more DDL blocking vs. accepting inefficient > relcache builds. Options I'm seeing: It looks to me like you're only thinking about relhasindex, but it seems to me that any call of heap_inplace_update brings some risk of this kind. Excluding the bootstrap-mode-only usage in create_toast_table, I see four callers: * index_update_stats updating a pg_class tuple's relhasindex, relpages, reltuples, relallvisible * vac_update_relstats updating a pg_class tuple's relpages, reltuples, relallvisible, relhasindex, relhasrules, relhastriggers, relfrozenxid, relminmxid * vac_update_datfrozenxid updating a pg_database tuple's datfrozenxid, datminmxid * dropdb updating a pg_database tuple's datconnlimit So we have just as much of a problem with GRANTs on databases as GRANTs on relations. Also, it looks like we can lose knowledge of the presence of rules and triggers, which seems nearly as bad as forgetting about indexes. The rest of these updates might not be correctness-critical, although I wonder how bollixed things could get if we forget an advancement of relfrozenxid or datfrozenxid (especially if the calling transaction goes on to make other changes that assume that the update happened). BTW, vac_update_datfrozenxid believes (correctly I think) that it cannot use the syscache copy of a tuple as the basis for in-place update, because syscache will have detoasted any toastable fields. These other callers are ignoring that, which seems like it should result in heap_inplace_update failing with "wrong tuple length". I wonder how come we're not seeing reports of that from the field. I'm inclined to propose that heap_inplace_update should check to make sure that it's operating on the latest version of the tuple (including, I guess, waiting for an uncommitted update?) and throw error if not. I think this is what your B3 option is, but maybe I misinterpreted. It might be better to throw error immediately instead of waiting to see if the other updater commits. regards, tom lane
On Fri, Oct 27, 2023 at 03:32:26PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Thu, Oct 26, 2023 at 09:44:04PM -0700, Noah Misch wrote:
> >> We'll likely need to change how we maintain relhasindex or perhaps take a lock
> >> in GRANT.
>
> > The main choice is accepting more DDL blocking vs. accepting inefficient
> > relcache builds. Options I'm seeing:
>
> It looks to me like you're only thinking about relhasindex, but it
> seems to me that any call of heap_inplace_update brings some
> risk of this kind. Excluding the bootstrap-mode-only usage in
> create_toast_table, I see four callers:
>
> * index_update_stats updating a pg_class tuple's
>   relhasindex, relpages, reltuples, relallvisible
>
> * vac_update_relstats updating a pg_class tuple's
>   relpages, reltuples, relallvisible, relhasindex, relhasrules,
>   relhastriggers, relfrozenxid, relminmxid
>
> * vac_update_datfrozenxid updating a pg_database tuple's
>   datfrozenxid, datminmxid
>
> * dropdb updating a pg_database tuple's datconnlimit
>
> So we have just as much of a problem with GRANTs on databases
> as GRANTs on relations. Also, it looks like we can lose
> knowledge of the presence of rules and triggers, which seems
> nearly as bad as forgetting about indexes. The rest of these
> updates might not be correctness-critical, although I wonder
> how bollixed things could get if we forget an advancement of
> relfrozenxid or datfrozenxid (especially if the calling
> transaction goes on to make other changes that assume that
> the update happened).

Thanks for researching that. Let's treat frozenxid stuff as critical; I wouldn't want to advance XID limits based on a datfrozenxid that later gets rolled back. I agree relhasrules and relhastriggers are also critical. The "inefficient relcache builds" option family can't solve cases like relfrozenxid and datconnlimit, so that leaves us with the "more DDL blocking" option family.

> BTW, vac_update_datfrozenxid believes (correctly I think) that
> it cannot use the syscache copy of a tuple as the basis for in-place
> update, because syscache will have detoasted any toastable fields.
> These other callers are ignoring that, which seems like it should
> result in heap_inplace_update failing with "wrong tuple length".
> I wonder how come we're not seeing reports of that from the field.

Good question. Perhaps we'll need some test cases that exercise each inplace update against a row having a toast pointer. It's too easy to go a long time without encountering those in the field.

> I'm inclined to propose that heap_inplace_update should check to
> make sure that it's operating on the latest version of the tuple
> (including, I guess, waiting for an uncommitted update?) and throw
> error if not. I think this is what your B3 option is, but maybe
> I misinterpreted. It might be better to throw error immediately
> instead of waiting to see if the other updater commits.

That's perhaps closer to B2. To be pedantic, B3 was about not failing or waiting for GRANT to commit but instead inplace-updating every member of the update chain. For B2, I was thinking we don't need to error. There are two problematic orders of events. The easy one is heap_inplace_update() mutating a tuple that already has an xmax. That's the one in the test case upthread, and detecting it is trivial. The harder one is heap_inplace_update() mutating a tuple after GRANT fetches the old tuple, before GRANT enters heap_update().

I anticipate a new locktag per catalog that can receive inplace updates, i.e. LOCKTAG_RELATION_DEFINITION and LOCKTAG_DATABASE_DEFINITION. Here's a walk-through for the pg_database case. GRANT will use the following sequence of events:

- acquire LOCKTAG_DATABASE_DEFINITION in exclusive mode
- fetch latest pg_database tuple
- heap_update()
- COMMIT, releasing LOCKTAG_DATABASE_DEFINITION

vac_update_datfrozenxid() sequence of events:

- acquire LOCKTAG_DATABASE_DEFINITION in exclusive mode
- (now, all GRANTs on the given database have committed or aborted)
- fetch latest pg_database tuple
- heap_inplace_update()
- release LOCKTAG_DATABASE_DEFINITION, even if xact not ending
- continue with other steps, e.g. vac_truncate_clog()

How does that compare to what you envisioned? vac_update_datfrozenxid() could further use xmax as a best-efforts thing to catch conflict with manual UPDATE statements, but it wouldn't solve the case where the UPDATE had fetched the tuple but not yet heap_update()'d it.
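(In code form, both sequences might look like the sketch below. SET_LOCKTAG_DATABASE_DEFINITION is a hypothetical macro; no such locktag exists in core today, while LockAcquire() and LockRelease() are the existing lmgr entry points:)

LOCKTAG		tag;

SET_LOCKTAG_DATABASE_DEFINITION(tag, db_id);	/* hypothetical macro */
(void) LockAcquire(&tag, ExclusiveLock, false, false);
/* fetch the latest pg_database tuple, then heap_update() for GRANT,
 * or heap_inplace_update() for vac_update_datfrozenxid() */
LockRelease(&tag, ExclusiveLock, false);	/* inplace updater releases early;
						 * GRANT would hold till COMMIT */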
Noah Misch <noah@leadboat.com> writes:
> On Fri, Oct 27, 2023 at 03:32:26PM -0400, Tom Lane wrote:
>> I'm inclined to propose that heap_inplace_update should check to
>> make sure that it's operating on the latest version of the tuple
>> (including, I guess, waiting for an uncommitted update?) and throw
>> error if not. I think this is what your B3 option is, but maybe
>> I misinterpreted. It might be better to throw error immediately
>> instead of waiting to see if the other updater commits.

> That's perhaps closer to B2. To be pedantic, B3 was about not failing or
> waiting for GRANT to commit but instead inplace-updating every member of the
> update chain. For B2, I was thinking we don't need to error. There are two
> problematic orders of events. The easy one is heap_inplace_update() mutating
> a tuple that already has an xmax. That's the one in the test case upthread,
> and detecting it is trivial. The harder one is heap_inplace_update() mutating
> a tuple after GRANT fetches the old tuple, before GRANT enters heap_update().

Ugh ... you're right, what I was imagining would not catch that last case.

> I anticipate a new locktag per catalog that can receive inplace updates,
> i.e. LOCKTAG_RELATION_DEFINITION and LOCKTAG_DATABASE_DEFINITION.

We could perhaps make this work by using the existing tuple-lock infrastructure, rather than inventing new locktags (a choice that spills to a lot of places including clients that examine pg_locks).

I would prefer though to find a solution that only depends on making heap_inplace_update protect itself, without high-level cooperation from the possibly-interfering updater. This is basically because I'm still afraid that we're defining the problem too narrowly. For one thing, I have nearly zero confidence that GRANT et al are the only problematic source of conflicting transactional updates. For another, I'm worried that some extension may be using heap_inplace_update against a catalog we're not considering here. I'd also like to find a solution that fixes the case of a conflicting manual UPDATE (although certainly that's a stretch goal we may not be able to reach).

I wonder if there's a way for heap_inplace_update to mark the tuple header as just-updated in a way that regular heap_update could recognize. (For standard catalog updates, we'd then end up erroring in simple_heap_update, which I think is fine.) We can't update xmin, because the VACUUM callers don't have an XID; but maybe there's some other way? I'm speculating about putting a funny value into xmax, or something like that, and having heap_update check that what it sees in xmax matches what was in the tuple the update started with.

Or we could try to get rid of in-place updates, but that seems like a mighty big lift. All of the existing callers, except maybe the johnny-come-lately dropdb usage, have solid documented reasons to do it that way.

			regards, tom lane
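(The existing tuple-lock infrastructure referred to here is the heavyweight-lock API in storage/lmgr.h, keyed by relation and TID. A minimal sketch of how an inplace updater could use it, assuming plain ExclusiveLock as the mode; an illustration rather than a settled design:)

LockTuple(relation, &oldtup->t_self, ExclusiveLock);
/* re-fetch the latest tuple version, apply and WAL-log the inplace
 * change; a cooperating transactional updater such as GRANT would
 * take the same tuple lock before fetching its old tuple */
UnlockTuple(relation, &oldtup->t_self, ExclusiveLock);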
On Fri, Oct 27, 2023 at 06:40:55PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Fri, Oct 27, 2023 at 03:32:26PM -0400, Tom Lane wrote:
> >> I'm inclined to propose that heap_inplace_update should check to
> >> make sure that it's operating on the latest version of the tuple
> >> (including, I guess, waiting for an uncommitted update?) and throw
> >> error if not. I think this is what your B3 option is, but maybe
> >> I misinterpreted. It might be better to throw error immediately
> >> instead of waiting to see if the other updater commits.
>
> > That's perhaps closer to B2. To be pedantic, B3 was about not failing or
> > waiting for GRANT to commit but instead inplace-updating every member of the
> > update chain. For B2, I was thinking we don't need to error. There are two
> > problematic orders of events. The easy one is heap_inplace_update() mutating
> > a tuple that already has an xmax. That's the one in the test case upthread,
> > and detecting it is trivial. The harder one is heap_inplace_update() mutating
> > a tuple after GRANT fetches the old tuple, before GRANT enters heap_update().
>
> Ugh ... you're right, what I was imagining would not catch that last case.
>
> > I anticipate a new locktag per catalog that can receive inplace updates,
> > i.e. LOCKTAG_RELATION_DEFINITION and LOCKTAG_DATABASE_DEFINITION.
>
> We could perhaps make this work by using the existing tuple-lock
> infrastructure, rather than inventing new locktags (a choice that
> spills to a lot of places including clients that examine pg_locks).

That could be okay. It would be weird to reuse a short-term lock like that one as something held till end of transaction. But the alternative of new locktags ain't perfect, as you say.

> I would prefer though to find a solution that only depends on making
> heap_inplace_update protect itself, without high-level cooperation
> from the possibly-interfering updater. This is basically because
> I'm still afraid that we're defining the problem too narrowly.
> For one thing, I have nearly zero confidence that GRANT et al are
> the only problematic source of conflicting transactional updates.

Likewise here, but I have fair confidence that an assertion would flush out the rest. heap_inplace_update() would assert that the backend holds one of the acceptable locks. It could even be an elog; heap_inplace_update() can tolerate that cost.

> For another, I'm worried that some extension may be using
> heap_inplace_update against a catalog we're not considering here.

A pgxn search finds "citus" using heap_inplace_update().

> I'd also like to find a solution that fixes the case of a conflicting
> manual UPDATE (although certainly that's a stretch goal we may not be
> able to reach).

It would be nice.

> I wonder if there's a way for heap_inplace_update to mark the tuple
> header as just-updated in a way that regular heap_update could
> recognize. (For standard catalog updates, we'd then end up erroring
> in simple_heap_update, which I think is fine.) We can't update xmin,
> because the VACUUM callers don't have an XID; but maybe there's some
> other way? I'm speculating about putting a funny value into xmax,
> or something like that, and having heap_update check that what it
> sees in xmax matches what was in the tuple the update started with.

Hmmm. Achieving it without an XID would be the real trick. (With an XID, we could use xl_heap_lock like heap_update() does.) Thinking out loud, what if heap_inplace_update() sets HEAP_XMAX_INVALID and xmax = TransactionIdAdvance(xmax)? Or change t_ctid in a similar way. Then regular heap_update() could complain if the field changed vs. last seen value. This feels like something to regret later in terms of limiting our ability to harness those fields for more-valuable ends or compact them away in a future page format. I can't pinpoint a specific loss, so the idea might have legs.

Nontransactional data in separate tables or in new metapages smells like the right long-term state. A project wanting to reuse the tuple header bits could introduce such storage to unblock its own bit reuse.

> Or we could try to get rid of in-place updates, but that seems like
> a mighty big lift. All of the existing callers, except maybe
> the johnny-come-lately dropdb usage, have solid documented reasons
> to do it that way.

Yes, removing that smells problematic.
I prototyped two ways, one with a special t_ctid and one with LockTuple().

On Fri, Oct 27, 2023 at 04:26:12PM -0700, Noah Misch wrote:
> On Fri, Oct 27, 2023 at 06:40:55PM -0400, Tom Lane wrote:
> > Noah Misch <noah@leadboat.com> writes:
> > > On Fri, Oct 27, 2023 at 03:32:26PM -0400, Tom Lane wrote:
> > > I anticipate a new locktag per catalog that can receive inplace updates,
> > > i.e. LOCKTAG_RELATION_DEFINITION and LOCKTAG_DATABASE_DEFINITION.
> >
> > We could perhaps make this work by using the existing tuple-lock
> > infrastructure, rather than inventing new locktags (a choice that
> > spills to a lot of places including clients that examine pg_locks).
>
> That could be okay. It would be weird to reuse a short-term lock like that
> one as something held till end of transaction. But the alternative of new
> locktags ain't perfect, as you say.

That worked.

> > I would prefer though to find a solution that only depends on making
> > heap_inplace_update protect itself, without high-level cooperation
> > from the possibly-interfering updater. This is basically because
> > I'm still afraid that we're defining the problem too narrowly.
> > For one thing, I have nearly zero confidence that GRANT et al are
> > the only problematic source of conflicting transactional updates.
>
> Likewise here, but I have fair confidence that an assertion would flush out
> the rest. heap_inplace_update() would assert that the backend holds one of
> the acceptable locks. It could even be an elog; heap_inplace_update() can
> tolerate that cost.

That check would fall in both heap_inplace_update() and heap_update(). After all, a heap_inplace_update() check won't detect an omission in GRANT.

> > For another, I'm worried that some extension may be using
> > heap_inplace_update against a catalog we're not considering here.
>
> A pgxn search finds "citus" using heap_inplace_update().
>
> > I'd also like to find a solution that fixes the case of a conflicting
> > manual UPDATE (although certainly that's a stretch goal we may not be
> > able to reach).
>
> It would be nice.

I expect most approaches could get there by having ExecModifyTable() arrange for the expected locking or other actions. That's analogous to how heap_update() takes care of sinval even for a manual UPDATE.

> > I wonder if there's a way for heap_inplace_update to mark the tuple
> > header as just-updated in a way that regular heap_update could
> > recognize. (For standard catalog updates, we'd then end up erroring
> > in simple_heap_update, which I think is fine.) We can't update xmin,
> > because the VACUUM callers don't have an XID; but maybe there's some
> > other way? I'm speculating about putting a funny value into xmax,
> > or something like that, and having heap_update check that what it
> > sees in xmax matches what was in the tuple the update started with.
>
> Hmmm. Achieving it without an XID would be the real trick. (With an XID, we
> could use xl_heap_lock like heap_update() does.) Thinking out loud, what if
> heap_inplace_update() sets HEAP_XMAX_INVALID and xmax =
> TransactionIdAdvance(xmax)? Or change t_ctid in a similar way. Then regular
> heap_update() could complain if the field changed vs. last seen value. This
> feels like something to regret later in terms of limiting our ability to
> harness those fields for more-valuable ends or compact them away in a future
> page format. I can't pinpoint a specific loss, so the idea might have legs.
>
> Nontransactional data in separate tables or in new metapages smells like the
> right long-term state. A project wanting to reuse the tuple header bits could
> introduce such storage to unblock its own bit reuse.

heap_update() does not have the pre-modification xmax today, so I used t_ctid. heap_modify_tuple() preserves t_ctid, so heap_update() already has the pre-modification t_ctid in key cases. For details of how the prototype uses t_ctid, see comment at "#define InplaceCanaryOffsetNumber". The prototype doesn't prevent corruption in the following scenario, because the aborted ALTER TABLE RENAME overwrites the special t_ctid:

== session 1
drop table t;
create table t (c int);
begin;
-- in gdb, set breakpoint on heap_modify_tuple
grant select on t to public;

== session 2
alter table t add primary key (c);
begin;
alter table t rename to t2;
rollback;

== back in session 1
-- release breakpoint
-- want error (would get it w/o the begin;alter;rollback)
commit;

I'm missing how to mark the tuple in a fashion accessible to a second heap_update() after a rolled-back heap_update(). The mark needs enough bits "N" so it's implausible for 2^N inplace updates to happen between GRANT fetching the old tuple and GRANT completing heap_update(). Looking for bits that could persist across a rolled-back heap_update(), we have 3 in t_ctid, 2 in t_infomask2, and 0 in xmax. I definitely don't want to paint us into a corner by spending the t_infomask2 bits on this. Even if I did, 2^(3+2)=32 wouldn't clearly be enough inplace updates.

Is there a way to salvage the goal of fixing the bug without modifying code like ExecGrant_common()? If not, I'm inclined to pick from one of the following designs:

- Acquire ShareUpdateExclusiveLock in GRANT ((B1) from previous list). It does
  make GRANT more intrusive; e.g. GRANT will cancel autovacuum. I'm leaning
  toward this one for two reasons. First, it doesn't slow heap_update()
  outside of assert builds. Second, it makes the GRANT experience more like
  the rest of our DDL, in that concurrent DDL will make GRANT block, not fail.

- GRANT passes to heapam the fixed-size portion of the pre-modification tuple.
  heap_update() compares those bytes to the oldtup in shared buffers to see if
  an inplace update happened. (HEAD could get the bytes from a new
  heap_update() parameter, while back branches would need a different passing
  approach.)

- LockTuple(), as seen in its attached prototype. I like this least at the
  moment, because it changes pg_locks content without having a clear advantage
  over the previous option. Also, the prototype has enough FIXME markers that
  I expect this to get hairy before it's done. I might change my preference
  after further prototypes.

Does anyone have a strong preference between those? Briefly, I did consider these additional alternatives:

- Just accept the yet-rarer chance of corruption from this message's test
  procedure.
- Hold a buffer lock long enough to solve things.
- Remember the tuples where we overwrote a special t_ctid, and reverse the
  overwrite during abort processing. But I/O in the abort path sounds bad.

Thanks,
nm
I'm attaching patches implementing the LockTuple() design. It turns out we don't just lose inplace updates. We also overwrite unrelated tuples, reproduced at inplace.spec. Good starting points are README.tuplock and the heap_inplace_update_scan() header comment.

On Wed, Nov 01, 2023 at 08:09:15PM -0700, Noah Misch wrote:
> On Fri, Oct 27, 2023 at 04:26:12PM -0700, Noah Misch wrote:
> > On Fri, Oct 27, 2023 at 06:40:55PM -0400, Tom Lane wrote:
> > > We could perhaps make this work by using the existing tuple-lock
> > > infrastructure, rather than inventing new locktags (a choice that
> > > spills to a lot of places including clients that examine pg_locks).

> > > I'd also like to find a solution that fixes the case of a conflicting
> > > manual UPDATE (although certainly that's a stretch goal we may not be
> > > able to reach).

I implemented that; search for ri_needLockTagTuple.

> - GRANT passes to heapam the fixed-size portion of the pre-modification tuple.
>   heap_update() compares those bytes to the oldtup in shared buffers to see if
>   an inplace update happened. (HEAD could get the bytes from a new
>   heap_update() parameter, while back branches would need a different passing
>   approach.)

This could have been fine, but ...

> - LockTuple(), as seen in its attached prototype. I like this least at the
>   moment, because it changes pg_locks content without having a clear advantage
>   over the previous option.

... I settled on the LockTuple() design for these reasons:

- Solves more conflicts by waiting, instead of by ERROR or by retry loops.
- Extensions wanting inplace updates don't have a big disadvantage over core
  code inplace updates.
- One could use this to stop "tuple concurrently updated" for pg_class rows,
  by using SearchSysCacheLocked1() for all pg_class DDL and making that
  function wait for any existing xmax like inplace_xmax_lock() does. I don't
  expect to write that, but it's a nice option to have.
- pg_locks shows the new lock acquisitions.

Separable, nontrivial things not fixed in the attached patch stack:

- Inplace update uses transactional CacheInvalidateHeapTuple(). ROLLBACK of
  CREATE INDEX wrongly discards the inval, leading to the relhasindex=t loss
  still seen in inplace-inval.spec. CacheInvalidateRelmap() does this right.

- AtEOXact_Inval(true) is outside the RecordTransactionCommit() critical
  section, but it is critical. We must not commit transactional DDL without
  other backends receiving an inval. (When the inplace inval becomes
  nontransactional, it will face the same threat.)

- Trouble is possible, I bet, if the system crashes between the
  inplace-update memcpy() and XLogInsert(). See the new XXX comment below the
  memcpy(). Might solve this by inplace update setting DELAY_CHKPT, writing
  WAL, and finally issuing memcpy() into the buffer.

- [consequences limited to transient failure] Since a PROC_IN_VACUUM backend's
  xmin does not stop pruning, an MVCC scan in that backend can find zero
  tuples when one is live. This is like what all backends got in the days of
  SnapshotNow catalog scans. See the pgbench test suite addition. (Perhaps
  the fix is to make VACUUM do its MVCC scans outside of PROC_IN_VACUUM,
  setting that flag later and unsetting it earlier.)

If you find decisions in this thread's patches are tied to any of those such that I should not separate those, let's discuss that.

Topics in the patches that I feel are most fruitful for debate:

- This makes inplace update block if the tuple has an updater. It's like one
  GRANT blocking another, except an inplace updater won't get "ERROR: tuple
  concurrently updated" like one of the GRANTs would. I had implemented
  versions that avoided this blocking by mutating each tuple in the updated
  tuple chain. That worked, but it had corner cases bad for maintainability,
  listed in the inplace_xmax_lock() header comment. I'd rather accept the
  blocking, so hackers can rule out those corner cases. A long-running GRANT
  already hurts VACUUM progress more just by keeping an XID running.

- Pre-checks could make heap_inplace_update_cancel() calls rarer. Avoiding
  one of those avoids an exclusive buffer lock, and it avoids waiting on
  concurrent heap_update() if any. We'd pre-check the syscache tuple.
  EventTriggerOnLogin() does it that way, because the code was already in
  that form. I expect only vac_update_datfrozenxid() concludes !dirty enough
  to matter. I didn't bother with the optimization, but it would be simple.

- If any citus extension user feels like porting its heap_inplace_update()
  call to this, I'd value hearing about your experience.

- I paid more than my usual attention to test coverage, considering the patch
  stack's intensity compared to most back-patch bug fixes.

I've kept all the above topics brief; feel free to ask for more details.

Thanks,
nm
Attachments:
- inplace005-UNEXPECTEDPASS-tap-meson-v1.patch
- inplace010-tests-v1.patch
- inplace040-waitfuncs-v1.patch
- inplace050-tests-inj-v1.patch
- inplace060-nodeModifyTable-comments-v1.patch
- inplace070-rel-locks-missing-v1.patch
- inplace080-catcache-detoast-inplace-stale-v1.patch
- inplace090-LOCKTAG_TUPLE-eoxact-v1.patch
- inplace110-successors-v1.patch
- inplace120-locktag-v1.patch
On Sun, May 12, 2024 at 04:29:23PM -0700, Noah Misch wrote:
> I'm attaching patches implementing the LockTuple() design. It turns out we
> don't just lose inplace updates. We also overwrite unrelated tuples,
> reproduced at inplace.spec. Good starting points are README.tuplock and the
> heap_inplace_update_scan() header comment.

About inplace050-tests-inj-v1.patch.

+	/* Check if blocked_pid is in injection_wait(). */
+	proc = BackendPidGetProc(blocked_pid);
+	if (proc == NULL)
+		PG_RETURN_BOOL(false);	/* session gone: definitely unblocked */
+	wait_event =
+		pgstat_get_wait_event(UINT32_ACCESS_ONCE(proc->wait_event_info));
+	if (wait_event && strncmp("INJECTION_POINT(",
+				  wait_event,
+				  strlen("INJECTION_POINT(")) == 0)
+		PG_RETURN_BOOL(true);

Hmm. I am not sure that this is the right interface for the job, because this is not only related to injection points but to the monitoring of one or more wait events when running a permutation step. Perhaps this is something that should be linked to the spec files with some property area listing the wait events we're expected to wait on instead when running a step that we know will wait?
--
Michael
On Mon, May 13, 2024 at 04:59:59PM +0900, Michael Paquier wrote:
> About inplace050-tests-inj-v1.patch.
>
> +	/* Check if blocked_pid is in injection_wait(). */
> +	proc = BackendPidGetProc(blocked_pid);
> +	if (proc == NULL)
> +		PG_RETURN_BOOL(false);	/* session gone: definitely unblocked */
> +	wait_event =
> +		pgstat_get_wait_event(UINT32_ACCESS_ONCE(proc->wait_event_info));
> +	if (wait_event && strncmp("INJECTION_POINT(",
> +				  wait_event,
> +				  strlen("INJECTION_POINT(")) == 0)
> +		PG_RETURN_BOOL(true);
>
> Hmm. I am not sure that this is the right interface for the job,
> because this is not only related to injection points but to the
> monitoring of one or more wait events when running a permutation
> step.

Could you say more about that? Permutation steps don't monitor wait events today. This patch would be the first instance of that.

> Perhaps this is something that should be linked to the spec
> files with some property area listing the wait events we're expected
> to wait on instead when running a step that we know will wait?

The spec syntax doesn't distinguish contention types at all. The isolation tester's needs are limited to distinguishing:

(a) process is waiting on another test session
(b) process is waiting on automatic background activity (autovacuum, mainly)

Automatic background activity doesn't make a process enter or leave injection_wait(), so all injection point wait events fall in (a). (The tester ignores (b), since those clear up without intervention. Failing to ignore them, as the tester did long ago, made output unstable.)
On Sun, May 12, 2024 at 7:29 PM Noah Misch <noah@leadboat.com> wrote:
> - [consequences limited to transient failure] Since a PROC_IN_VACUUM backend's
>   xmin does not stop pruning, an MVCC scan in that backend can find zero
>   tuples when one is live. This is like what all backends got in the days of
>   SnapshotNow catalog scans. See the pgbench test suite addition. (Perhaps
>   the fix is to make VACUUM do its MVCC scans outside of PROC_IN_VACUUM,
>   setting that flag later and unsetting it earlier.)

Are you saying that this is a problem already, or that the patch causes it to start happening? If it's the former, that's horrible. If it's the latter, I'd say that is a fatal defect.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, May 13, 2024 at 03:53:08PM -0400, Robert Haas wrote:
> On Sun, May 12, 2024 at 7:29 PM Noah Misch <noah@leadboat.com> wrote:
> > - [consequences limited to transient failure] Since a PROC_IN_VACUUM backend's
> >   xmin does not stop pruning, an MVCC scan in that backend can find zero
> >   tuples when one is live. This is like what all backends got in the days of
> >   SnapshotNow catalog scans. See the pgbench test suite addition. (Perhaps
> >   the fix is to make VACUUM do its MVCC scans outside of PROC_IN_VACUUM,
> >   setting that flag later and unsetting it earlier.)
>
> Are you saying that this is a problem already, or that the patch
> causes it to start happening? If it's the former, that's horrible.

The former.
On Sun, May 12, 2024 at 04:29:23PM -0700, Noah Misch wrote:
> I'm attaching patches implementing the LockTuple() design.

Starting 2024-06-10, I plan to push the first seven of the ten patches:

inplace005-UNEXPECTEDPASS-tap-meson-v1.patch
inplace010-tests-v1.patch
inplace040-waitfuncs-v1.patch
inplace050-tests-inj-v1.patch
inplace060-nodeModifyTable-comments-v1.patch
  Those five just deal in tests, test infrastructure, and comments.
inplace070-rel-locks-missing-v1.patch
  Main risk is new DDL deadlocks.
inplace080-catcache-detoast-inplace-stale-v1.patch
  If it fails to fix the bug it targets, I expect it's a no-op rather than
  breaking things.

I'll leave the last three of the ten needing review. Those three are beyond my skill to self-certify.
On Wed, Jun 5, 2024 at 2:17 PM Noah Misch <noah@leadboat.com> wrote:
> Starting 2024-06-10, I plan to push the first seven of the ten patches:
>
> inplace005-UNEXPECTEDPASS-tap-meson-v1.patch
> inplace010-tests-v1.patch
> inplace040-waitfuncs-v1.patch
> inplace050-tests-inj-v1.patch
> inplace060-nodeModifyTable-comments-v1.patch
>   Those five just deal in tests, test infrastructure, and comments.
> inplace070-rel-locks-missing-v1.patch
>   Main risk is new DDL deadlocks.
> inplace080-catcache-detoast-inplace-stale-v1.patch
>   If it fails to fix the bug it targets, I expect it's a no-op rather than
>   breaking things.
>
> I'll leave the last three of the ten needing review. Those three are beyond
> my skill to self-certify.

It's not this patch set's fault, but I'm not very pleased to see that the injection point wait events have been shoehorned into the "Extension" category - which they are not - instead of being a new wait_event_type. That would have avoided the ugly wait-event naming pattern, inconsistent with everything else, introduced by inplace050-tests-inj-v1.patch.

I think that the comments and commit messages in this patch set could, in some places, use improvement. For instance, inplace060-nodeModifyTable-comments-v1.patch reflows a bunch of comments, which makes it hard to see what actually changed, and the commit message doesn't tell you, either. A good bit of it seems to be changing "a view" to "a view INSTEAD OF trigger" or "a view having an INSTEAD OF trigger," but the reasoning behind that change is not spelled out anywhere. The reader is left to guess what the other case is and why the same principles don't apply to it. I don't doubt that the new comments are more correct than the old ones, but I expect future patch authors to have difficulty maintaining that state of affairs.

Similarly, inplace070-rel-locks-missing-v1.patch adds no comments. IMHO, the commit message also isn't very informative. It disclaims knowledge of what bug it's fixing, while at the same time leaving the reader to figure out for themselves how the behavior has changed. Consequently, I expect writing the release notes for a release including this patch to be difficult: "We added some locks that block ... something ... in some circumstances ... to prevent ... something." It's not really the job of the release note author to fill in those blanks, but rather of the patch author or committer. I don't want to overburden the act of fixing bugs, but I just feel like more explanation is needed here.

When I see for example that we're adding a lock acquisition to the end of heap_create(), I can't help but wonder if it's really true that we don't take a lock on a just-created relation today. I'm certainly under the impression that we lock newly-created, uncommitted relations, and a quick test seems to confirm that. I don't quite know whether that happens, but evidently this call is guarding against something more subtle than a categorical failure to lock a relation on creation, so I think there should be a comment explaining what that thing is.

It's also quite surprising that SetRelationHasSubclass() says "take X lock before calling" and 2 of 4 callers just don't. I guess that's how it is. But shouldn't we then have an assertion inside that function to guard against future mistakes? If the reason why we failed to add this initially is discernible from the commit messages that introduced the bug, it would be nice to mention what it seems to have been; if not, it would at least be nice to mention the offending commit(s).

I'm also a bit worried that this is going to cause deadlocks, but I suppose if it does, that's still better than the status quo.

IsInplaceUpdateOid's header comment says IsInplaceUpdateRelation instead of IsInplaceUpdateOid.

inplace080-catcache-detoast-inplace-stale-v1.patch seems like another place where spelling out the rationale in more detail would be helpful to future readers; for instance, the commit message says that PgDatabaseToastTable is the only one affected, but it doesn't say why the others are not, or why this one is. The lengthy comment in CatalogCacheCreateEntry is also difficult to correlate with the code which follows. I can't guess whether the two cases called out in the comment always needed to be handled and were handled save only for in-place updates, and thus the comment changes were simply taking the opportunity to elaborate on the existing comments; or whether one of those cases is preexisting and the other arises from the desire to handle inplace updates. It could be helpful to mention relevant identifiers from the code in the comment text, e.g. "systable_recheck_tuple detects ordinary updates by noting changes to the tuple's visibility information, while the equalTuple() case detects inplace updates."

IMHO, this patch set underscores the desirability of removing in-place update altogether. That sounds difficult and not back-patchable, but I can't classify what this patch set does as anything better than grotty hacks to work around serious design deficiencies. That is not a vote against these patches: I see no better way forward. Nonetheless, I dislike the lack of better options.

I have done only cursory review of the last two patches and don't feel I'm in a place to certify them, at least not now.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Jun 06, 2024 at 09:48:51AM -0400, Robert Haas wrote:
> It's not this patch set's fault, but I'm not very pleased to see that
> the injection point wait events have been shoehorned into the
> "Extension" category - which they are not - instead of being a new
> wait_event_type. That would have avoided the ugly wait-event naming
> pattern, inconsistent with everything else, introduced by
> inplace050-tests-inj-v1.patch.

I'm not sure I agree with that. The set of core backend APIs supporting injection points has nothing to do with wait events. The library attached to one or more injection points *may* decide to use a wait event like what the wait/wakeup calls in modules/injection_points do, but that's entirely optional. These rely on custom wait events, plugged into the Extension category as the code run is itself in an extension. I am not arguing against the point that it may be interesting to plug in custom wait event categories, but the current design of wait events makes that much harder than what core is currently able to handle, and I am not sure that this brings much at the end as long as the wait event strings can be customized.

I've voiced upthread concerns over the naming enforced by the patch and the way it plugs the namings into the isolation functions, by the way.
--
Michael
On Thu, Jun 6, 2024 at 7:20 PM Michael Paquier <michael@paquier.xyz> wrote:
> I'm not sure I agree with that. The set of core backend APIs supporting
> injection points has nothing to do with wait events. The library
> attached to one or more injection points *may* decide to use a wait
> event like what the wait/wakeup calls in modules/injection_points do,
> but that's entirely optional. These rely on custom wait events,
> plugged into the Extension category as the code run is itself in an
> extension. I am not arguing against the point that it may be
> interesting to plug in custom wait event categories, but the current
> design of wait events makes that much harder than what core is
> currently able to handle, and I am not sure that this brings much at
> the end as long as the wait event strings can be customized.
>
> I've voiced upthread concerns over the naming enforced by the patch
> and the way it plugs the namings into the isolation functions, by the
> way.

I think the core code should provide an "Injection Point" wait event type and let extensions add specific wait events there, just like you did for "Extension". Then this ugly naming would go away. As I see it, "Extension" is only supposed to be used as a catch-all when we have no other information, but here we do. If we refuse to use the wait_event_type field to categorize waits, then people are going to have to find some other way to get that data into the system, as Noah has done.

--
Robert Haas
EDB: http://www.enterprisedb.com