BUG #18369: logical decoding core on AssertTXNLsnOrder()

From: PG Bug reporting form
Subject: BUG #18369: logical decoding core on AssertTXNLsnOrder()
Date:
Msg-id: 18369-ad61699bf91c5bc0@postgresql.org
List: pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18369
Logged by:          haiyang li
Email address:      ocean_li_996@163.com
PostgreSQL version: 14.11
Operating system:   centos7 5.10.84 x86_64
Description:

While testing the logical replication module, we encountered a core dump.
The stack trace from the core file is:

##
[New LWP 113877]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres(5432): normal_user dml_full0 11.164.97.22[37210]SELECT               ".
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fe2074a9277 in raise () from /lib64/libc.so.6
#0  0x00007fe2074a9277 in raise () from /lib64/libc.so.6
#1  0x00007fe2074aa968 in abort () from /lib64/libc.so.6
#2  0x00000000010f8d67 in ExceptionalCondition (conditionName=0x17edb50 "!(prev_first_lsn < cur_txn->first_lsn)", errorType=0x17ed93c "FailedAssertion", fileName=0x17ed990 "reorderbuffer.c", lineNumber=762) at assert.c:46
#3  0x0000000000e6145c in AssertTXNLsnOrder (rb=0x4558060) at reorderbuffer.c:762
#4  0x0000000000e60ead in ReorderBufferTXNByXid (rb=0x4558060, xid=19937, create=true, is_new=0x0, lsn=12640708096, create_as_top=true) at reorderbuffer.c:610
#5  0x0000000000e6415c in ReorderBufferXidSetCatalogChanges (rb=0x4558060, xid=19937, lsn=12640708096) at reorderbuffer.c:2298
#6  0x0000000000e6bb4b in SnapBuildXidSetCatalogChanges (builder=0x456e160, xid=19933, subxcnt=17, subxacts=0x44e08d8, lsn=12640708096) at snapbuild.c:2172
#7  0x0000000000e54fb3 in DecodeCommit (ctx=0x452bc10, buf=0x7ffe4f03af70, parsed=0x7ffe4f03ae20, xid=19933) at decode.c:631
#8  0x0000000000e54556 in DecodeXactOp (ctx=0x452bc10, buf=0x7ffe4f03af70) at decode.c:268
#9  0x0000000000e54124 in LogicalDecodingProcessRecord (ctx=0x452bc10, record=0x452bed0) at decode.c:120
#10 0x0000000000e5adc6 in pg_logical_slot_get_changes_guts (fcinfo=0x7ffe4f03b2d0, confirm=true, binary=false) at logicalfuncs.c:329
#11 0x0000000000e5af76 in pg_logical_slot_get_changes (fcinfo=0x7ffe4f03b2d0) at logicalfuncs.c:393
...
#33 0x0000000000f1c0b9 in exec_simple_query (query_string=0x425bb80 "SELECT * FROM pg_logical_slot_get_changes("test_logical_decode_slot_0", NULL, NULL)") at postgres.c:1570
...
(gdb) f 3
#3  0x0000000000e6145c in AssertTXNLsnOrder (rb=0x4558060) at reorderbuffer.c:762
(gdb) p /x MyReplicationSlot->data.restart_lsn
$1 = 0x2f171b8a8
(gdb) p /x cur_txn->first_lsn
$2 = 0x2f171e600
(gdb) p NInitialRunningXacts
$3 = 1
(gdb) p *InitialRunningXacts
$4 = 19933
##
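
The trace mixes decimal LSNs (the lsn argument in frames #4 to #6) with hexadecimal ones (the gdb prints). As a reading aid, the sketch below renders all three values in PostgreSQL's textual X/X form. The helper name format_lsn is made up for illustration; only the numbers come from the gdb session. It shows that the decimal lsn argument and cur_txn->first_lsn are the same WAL position, and how far that position lies past restart_lsn. It does not try to explain why the assertion failed.

```python
def format_lsn(lsn: int) -> str:
    """Render a 64-bit WAL position as high/low 32-bit halves in hex (X/X)."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


# Values copied from the gdb session; nothing else is assumed.
restart_lsn = 0x2F171B8A8   # MyReplicationSlot->data.restart_lsn
first_lsn = 0x2F171E600     # cur_txn->first_lsn
frame_lsn = 12640708096     # lsn argument in frames #4 to #6 (decimal)

print("restart_lsn:", format_lsn(restart_lsn))
print("first_lsn:  ", format_lsn(first_lsn))
print("frame lsn:  ", format_lsn(frame_lsn))
print("frame lsn == first_lsn:", frame_lsn == first_lsn)
print("bytes from restart_lsn to first_lsn:", first_lsn - restart_lsn)
```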

As the trace shows, the assertion failed in AssertTXNLsnOrder().
Moreover, NInitialRunningXacts != 0 indicates that the failure occurred when
pg_logical_slot_get_changes() was called again.
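
For readers unfamiliar with the failed condition, here is a minimal sketch of the ordering check named in the assertion message, "!(prev_first_lsn < cur_txn->first_lsn)". It is a simplified model, not the PostgreSQL source: the Txn class, the function name, and all xids and LSNs below are hypothetical, and it models only this one condition. The point is that the comparison is strict, so an entry whose first_lsn equals or precedes its predecessor's counts as a violation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Txn:
    xid: int
    first_lsn: int


def first_order_violation(toplevel: List[Txn]) -> Optional[Txn]:
    """Return the first entry for which prev_first_lsn < cur.first_lsn
    does not hold, or None if the list is strictly increasing."""
    prev_first_lsn = None
    for cur in toplevel:
        if prev_first_lsn is not None and not (prev_first_lsn < cur.first_lsn):
            return cur
        prev_first_lsn = cur.first_lsn
    return None


# Hypothetical list contents, chosen only to exercise the check.
ordered = [Txn(xid=100, first_lsn=0x1000), Txn(xid=101, first_lsn=0x2000)]
broken = ordered + [Txn(xid=102, first_lsn=0x2000)]  # equal LSN, not strictly greater

print(first_order_violation(ordered))
print(first_order_violation(broken))
```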

The following will be posted later in this thread:
1) the WAL records from restart_lsn up to the LSN at which the issue
occurred,
2) my analysis of the problem,
3) the steps to reproduce the issue,
4) my proposed solution.

