Discussion: [HACKERS] Issues with logical replication
I have faced two issues with logical replication. Reproducing scenario:

1. Start two Postgres instances (I start both on the local machine).
2. Initialize pgbench tables at both instances:
   pgbench -i -s 10 postgres
3. Create a publication of the pgbench_accounts table at one node:
   create publication pub for table pgbench_accounts;
4. Create a subscription at the other node:
   delete from pgbench_accounts;
   CREATE SUBSCRIPTION sub connection 'dbname=postgres host=localhost port=5432 sslmode=disable' publication pub;
   CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
   BEGIN
       -- RETURN NEW;
   END $$ LANGUAGE plpgsql;
   CREATE TRIGGER replication_trigger BEFORE INSERT OR UPDATE OR DELETE ON pgbench_accounts
       FOR EACH ROW EXECUTE PROCEDURE replication_trigger_proc();
   ALTER TABLE pgbench_accounts ENABLE ALWAYS TRIGGER replication_trigger;
5. Start pgbench at the primary node:
   pgbench -T 1000 -P 2 -c 10 postgres

Please notice the commented-out "RETURN NEW" statement. Invocation of this function causes an error, and in the log we see repeated messages:

2017-10-02 16:47:46.764 MSK [32129] LOG: logical replication table synchronization worker for subscription "sub", table "pgbench_accounts" has started
2017-10-02 16:47:46.771 MSK [32129] ERROR: control reached end of trigger procedure without RETURN
2017-10-02 16:47:46.771 MSK [32129] CONTEXT: PL/pgSQL function replication_trigger_proc()
COPY pgbench_accounts, line 1: "1 1 0 "
2017-10-02 16:47:46.772 MSK [28020] LOG: worker process: logical replication worker for subscription 17264 sync 17251 (PID 32129) exited with exit code 1
...

After a few minutes of work, the primary node (with the publication) crashes with the following stack trace:

Program terminated with signal SIGABRT, Aborted.
#0 0x00007f3608f8ec37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f3608f8ec37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f3608f92028 in __GI_abort () at abort.c:89
#2 0x00000000009f5740 in ExceptionalCondition (conditionName=0xbf6b30 "!(((xid) != ((TransactionId) 0)))", errorType=0xbf69af "FailedAssertion", fileName=0xbf69a8 "lmgr.c", lineNumber=582) at assert.c:54
#3 0x000000000086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0, oper=XLTW_None) at lmgr.c:582
#4 0x000000000081c9f7 in SnapBuildWaitSnapshot (running=0x2749198, cutoff=898498) at snapbuild.c:1400
#5 0x000000000081c7c7 in SnapBuildFindSnapshot (builder=0x2807910, lsn=798477760, running=0x2749198) at snapbuild.c:1311
#6 0x000000000081c339 in SnapBuildProcessRunningXacts (builder=0x2807910, lsn=798477760, running=0x2749198) at snapbuild.c:1106
#7 0x000000000080a1c7 in DecodeStandbyOp (ctx=0x27ef870, buf=0x7ffd301858d0) at decode.c:314
#8 0x0000000000809ce1 in LogicalDecodingProcessRecord (ctx=0x27ef870, record=0x27efb30) at decode.c:117
#9 0x000000000080ddf9 in DecodingContextFindStartpoint (ctx=0x27ef870) at logical.c:458
#10 0x0000000000823968 in CreateReplicationSlot (cmd=0x27483a8) at walsender.c:934
#11 0x00000000008246ee in exec_replication_command (cmd_string=0x27b9520 "CREATE_REPLICATION_SLOT \"sub_17264_sync_17251\" TEMPORARY LOGICAL pgoutput USE_SNAPSHOT") at walsender.c:1511
#12 0x000000000088eb44 in PostgresMain (argc=1, argv=0x275b738, dbname=0x275b578 "postgres", username=0x272b800 "knizhnik") at postgres.c:4086
#13 0x00000000007ef9a1 in BackendRun (port=0x27532a0) at postmaster.c:4357
#14 0x00000000007ef0cb in BackendStartup (port=0x27532a0) at postmaster.c:4029
#15 0x00000000007eb68b in ServerLoop () at postmaster.c:1753
#16 0x00000000007eac77 in PostmasterMain (argc=3, argv=0x2729670) at postmaster.c:1361
#17 0x0000000000728552 in main (argc=3, argv=0x2729670) at main.c:228

Now fix the trigger function:

CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
BEGIN
    RETURN NEW;
END $$ LANGUAGE plpgsql;

And manually perform two updates inside one transaction at the master:

postgres=# begin;
BEGIN
postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
UPDATE 1
postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
UPDATE 1
postgres=# commit;
<hangs>

and in the replica log we see:

2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply worker for subscription "sub" has started
2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible tuple
2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical replication worker for subscription 16403 (PID 2954) exited with exit code 1

The error happens in trigger.c:

#3 0x000000000069bddb in GetTupleForTrigger (estate=0x2e36b50, epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac, lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at trigger.c:3103
#4 0x000000000069b259 in ExecBRUpdateTriggers (estate=0x2e36b50, epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac, fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
#5 0x00000000006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50, epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240) at execReplication.c:461
#6 0x0000000000820894 in apply_handle_update (s=0x7ffc442163b0) at worker.c:736
#7 0x0000000000820d83 in apply_dispatch (s=0x7ffc442163b0) at worker.c:892
#8 0x000000000082121b in LogicalRepApplyLoop (last_received=2225693736) at worker.c:1095
#9 0x000000000082219d in ApplyWorkerMain (main_arg=0) at worker.c:1647
#10 0x00000000007dd496 in StartBackgroundWorker () at bgworker.c:835
#11 0x00000000007f0889 in do_start_bgworker (rw=0x2d75f50) at postmaster.c:5680
#12 0x00000000007f0bc3 in maybe_start_bgworkers () at postmaster.c:5884
#13 0x00000000007efbeb in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5073
#14 <signal handler called>
#15 0x00007fbe83a054a3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#16 0x00000000007eb517 in ServerLoop () at postmaster.c:1717
#17 0x00000000007eac48 in PostmasterMain (argc=3, argv=0x2d4e660) at postmaster.c:1361
#18 0x0000000000728523 in main (argc=3, argv=0x2d4e660) at main.c:228

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 02/10/17 17:49, Konstantin Knizhnik wrote:
> I have faced two issues with logical replication.
> Reproducing scenario:
>
> 1. start two Postgres instances (I start both at local machine).
> 2. Initialize pgbench tables at both instances:
> pgbench -i -s 10 postgres
> 3. Create publication of pgbench_accounts table at one node:
> create publication pub for table pgbench_accounts;
> 4. Create subscription at another node:
> delete from pgbench_accounts;
> CREATE SUBSCRIPTION sub connection 'dbname=postgres host=localhost
> port=5432 sslmode=disable' publication pub;
> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS
> TRIGGER AS $$
> BEGIN
> -- RETURN NEW;
> END $$ LANGUAGE plpgsql;
> CREATE TRIGGER replication_trigger BEFORE INSERT OR UPDATE OR DELETE
> ON pgbench_accounts FOR EACH ROW EXECUTE PROCEDURE
> replication_trigger_proc();
> ALTER TABLE pgbench_accounts ENABLE ALWAYS TRIGGER replication_trigger;
> 5. Start pgbench at primary node:
> pgbench -T 1000 -P 2 -c 10 postgres
>
> Please notice commented "return new" statement. Invocation of this
> function cause error and in log we see repeated messages:
>
> 2017-10-02 16:47:46.764 MSK [32129] LOG: logical replication table
> synchronization worker for subscription "sub", table "pgbench_accounts"
> has started
> 2017-10-02 16:47:46.771 MSK [32129] ERROR: control reached end of
> trigger procedure without RETURN
> 2017-10-02 16:47:46.771 MSK [32129] CONTEXT: PL/pgSQL function
> replication_trigger_proc()
> COPY pgbench_accounts, line 1: "1 1 0 "
> 2017-10-02 16:47:46.772 MSK [28020] LOG: worker process: logical
> replication worker for subscription 17264 sync 17251 (PID 32129) exited
> with exit code 1
> ...
>
> After few minutes of work primary node (with publication) is crashed
> with the following stack trace:
>
> Program terminated with signal SIGABRT, Aborted.
> #0 0x00007f3608f8ec37 in __GI_raise (sig=sig@entry=6) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0 0x00007f3608f8ec37 in __GI_raise (sig=sig@entry=6) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1 0x00007f3608f92028 in __GI_abort () at abort.c:89
> #2 0x00000000009f5740 in ExceptionalCondition (conditionName=0xbf6b30
> "!(((xid) != ((TransactionId) 0)))",
> errorType=0xbf69af "FailedAssertion", fileName=0xbf69a8 "lmgr.c",
> lineNumber=582) at assert.c:54
> #3 0x000000000086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0,
> oper=XLTW_None) at lmgr.c:582
> #4 0x000000000081c9f7 in SnapBuildWaitSnapshot (running=0x2749198,
> cutoff=898498) at snapbuild.c:1400
> #5 0x000000000081c7c7 in SnapBuildFindSnapshot (builder=0x2807910,
> lsn=798477760, running=0x2749198) at snapbuild.c:1311
> #6 0x000000000081c339 in SnapBuildProcessRunningXacts
> (builder=0x2807910, lsn=798477760, running=0x2749198) at snapbuild.c:1106

Hmm, this would suggest that the XLOG_RUNNING_XACTS record contains invalid
transaction ids, but we specifically skip those in
GetRunningTransactionData(). Can you check the pg_waldump output for that
record (the lsn is shown in the backtrace)?

> Now fix the trigger function:
> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
> BEGIN
> RETURN NEW;
> END $$ LANGUAGE plpgsql;
>
> And manually perform at master two updates inside one transaction:
>
> postgres=# begin;
> BEGIN
> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
> UPDATE 1
> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
> UPDATE 1
> postgres=# commit;
> <hangs>
>
> and in replica log we see:
> 2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply
> worker for subscription "sub" has started
> 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible
> tuple
> 2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical
> replication worker for subscription 16403 (PID 2954) exited with exit
> code 1
>
> Error happens in trigger.c:
>
> #3 0x000000000069bddb in GetTupleForTrigger (estate=0x2e36b50,
> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac,
> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at
> trigger.c:3103
> #4 0x000000000069b259 in ExecBRUpdateTriggers (estate=0x2e36b50,
> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac,
> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
> #5 0x00000000006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50,
> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240)
> at execReplication.c:461
> #6 0x0000000000820894 in apply_handle_update (s=0x7ffc442163b0) at
> worker.c:736

We have locked the same tuple in RelationFindReplTupleByIndex() just before
this gets called and didn't get the same error. I guess we do something
wrong with snapshots. Will need to investigate more.

-- 
Petr Jelinek
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 02/10/17 18:59, Petr Jelinek wrote:
>>
>> Now fix the trigger function:
>> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
>> BEGIN
>> RETURN NEW;
>> END $$ LANGUAGE plpgsql;
>>
>> And manually perform at master two updates inside one transaction:
>>
>> postgres=# begin;
>> BEGIN
>> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
>> UPDATE 1
>> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
>> UPDATE 1
>> postgres=# commit;
>> <hangs>
>>
>> and in replica log we see:
>> 2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply
>> worker for subscription "sub" has started
>> 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible
>> tuple
>> 2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical
>> replication worker for subscription 16403 (PID 2954) exited with exit
>> code 1
>>
>> Error happens in trigger.c:
>>
>> #3 0x000000000069bddb in GetTupleForTrigger (estate=0x2e36b50,
>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac,
>> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at
>> trigger.c:3103
>> #4 0x000000000069b259 in ExecBRUpdateTriggers (estate=0x2e36b50,
>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac,
>> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
>> #5 0x00000000006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50,
>> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240)
>> at execReplication.c:461
>> #6 0x0000000000820894 in apply_handle_update (s=0x7ffc442163b0) at
>> worker.c:736
>
> We have locked the same tuple in RelationFindReplTupleByIndex() just
> before this gets called and didn't get the same error. I guess we do
> something wrong with snapshots. Will need to investigate more.
>

Okay, so it's not the snapshot. It's the fact that we don't set
es_output_cid in the replication worker, which GetTupleForTrigger uses when
locking the tuple. The attached one-liner fixes it.

-- 
Petr Jelinek
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments
On Wed, Oct 4, 2017 at 8:35 AM, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
> On 02/10/17 18:59, Petr Jelinek wrote:
>>>
>>> Now fix the trigger function:
>>> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
>>> BEGIN
>>> RETURN NEW;
>>> END $$ LANGUAGE plpgsql;
>>>
>>> And manually perform at master two updates inside one transaction:
>>>
>>> postgres=# begin;
>>> BEGIN
>>> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
>>> UPDATE 1
>>> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
>>> UPDATE 1
>>> postgres=# commit;
>>> <hangs>
>>>
>>> and in replica log we see:
>>> 2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply
>>> worker for subscription "sub" has started
>>> 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible
>>> tuple
>>> 2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical
>>> replication worker for subscription 16403 (PID 2954) exited with exit
>>> code 1
>>>
>>> Error happens in trigger.c:
>>>
>>> #3 0x000000000069bddb in GetTupleForTrigger (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac,
>>> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at
>>> trigger.c:3103
>>> #4 0x000000000069b259 in ExecBRUpdateTriggers (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac,
>>> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
>>> #5 0x00000000006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240)
>>> at execReplication.c:461
>>> #6 0x0000000000820894 in apply_handle_update (s=0x7ffc442163b0) at
>>> worker.c:736
>>
>> We have locked the same tuple in RelationFindReplTupleByIndex() just
>> before this gets called and didn't get the same error. I guess we do
>> something wrong with snapshots. Will need to investigate more.
>>
>
> Okay, so it's not snapshot. It's the fact that we don't set the
> es_output_cid in replication worker which GetTupleForTrigger is using
> when locking the tuple. Attached one-liner fixes it.
>

Thank you for the patch. This bug can happen even without the trigger, and
I confirmed that the patch fixes it. I think the patch fixes it properly.

Regards,

-- 
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
> On 2 Oct 2017, at 19:59, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
>
>>
>> Program terminated with signal SIGABRT, Aborted.
>> #0 0x00007f3608f8ec37 in __GI_raise (sig=sig@entry=6) at
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>> (gdb) bt
>> #0 0x00007f3608f8ec37 in __GI_raise (sig=sig@entry=6) at
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> #1 0x00007f3608f92028 in __GI_abort () at abort.c:89
>> #2 0x00000000009f5740 in ExceptionalCondition (conditionName=0xbf6b30
>> "!(((xid) != ((TransactionId) 0)))",
>> errorType=0xbf69af "FailedAssertion", fileName=0xbf69a8 "lmgr.c",
>> lineNumber=582) at assert.c:54
>> #3 0x000000000086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0,
>> oper=XLTW_None) at lmgr.c:582
>> #4 0x000000000081c9f7 in SnapBuildWaitSnapshot (running=0x2749198,
>> cutoff=898498) at snapbuild.c:1400
>> #5 0x000000000081c7c7 in SnapBuildFindSnapshot (builder=0x2807910,
>> lsn=798477760, running=0x2749198) at snapbuild.c:1311
>> #6 0x000000000081c339 in SnapBuildProcessRunningXacts
>> (builder=0x2807910, lsn=798477760, running=0x2749198) at snapbuild.c:1106
>
> Hmm this would suggest that XLOG_RUNNING_XACTS record contains invalid
> transaction ids but we specifically skip those in
> GetRunningTransactionData(). Can you check pg_waldump output for that
> record (lsn is shown in the backtrace)?

I investigated this case, and it seems that XactLockTableWait() in
SnapBuildWaitSnapshot() does not always work as expected.
XactLockTableWait() waits on LockAcquire() for the xid to be completed, and
if we finally get this lock but the transaction is still in progress, then
such an xid is assumed to be a subxid. However, the LockAcquire/LockRelease
cycle can happen after the transaction has set its xid, but before
XactLockTableInsert().

Namely, the following history happened for xid 5225 and led to the crash:

[backend] LOG: AssignTransactionId: XactTopTransactionId = 5225
[walsender] LOG: LogCurrentRunningXacts: Wrote RUNNING_XACTS xctn=1, xid[0]=5225
[walsender] LOG: XactLockTableWait: LockAcquire 5225
[walsender] LOG: XactLockTableWait: LockRelease 5225
[backend] LOG: AssignTransactionId: LockAcquire ExclusiveLock 5225
[walsender] LOG: TransactionIdIsInProgress: SVC->latestCompletedXid=5224 < xid=5225 => true
[backend] LOG: CommitTransaction: ProcArrayEndTransaction xid=5225, ipw=0
[backend] LOG: CommitTransaction: ResourceOwnerRelease locks xid=5225

I've quickly checked the other usages of XactLockTableWait(), and it seems
it is mostly used with xids read from the heap, so the transaction has
definitely taken its lock somewhere in the past.

I am not sure what the best approach is to handle that. Maybe write out
running xacts only if they have already taken their lock?

I am also attaching a pgbench script that can reproduce the failure. I run
it over the usual pgbench database in 10 threads. It takes several minutes
to crash.

Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments
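The race Stas traces above can be reproduced outside PostgreSQL with ordinary locks. The following is a hedged sketch in plain Python, not PostgreSQL code; `xid_lock`, `backend`, and `waiter` are stand-ins for the per-xid heavyweight lock, AssignTransactionId(), and XactLockTableWait(). It shows that an acquire/release cycle performed in the window between xid assignment and XactLockTableInsert() proves nothing about the transaction's state:

```python
import threading
import time

xid_lock = threading.Lock()        # ~ the per-xid heavyweight lock
xid_published = threading.Event()  # ~ the xid becoming visible to others
backend_state = {'in_progress': True}
waiter_result = {}

def backend():
    """Models AssignTransactionId(): the xid becomes visible first, and
    only after a window does the backend take its per-xid lock."""
    xid_published.set()
    time.sleep(0.2)                # the race window
    with xid_lock:                 # ~ XactLockTableInsert()
        time.sleep(0.2)            # transaction still doing work
    backend_state['in_progress'] = False  # ~ commit

def waiter():
    """Models XactLockTableWait(): acquire and release the xid lock, then
    assume the transaction has completed (or that the xid is a subxid)."""
    xid_published.wait()
    xid_lock.acquire()             # succeeds at once: lock not taken yet!
    xid_lock.release()
    # The "wait" finished, yet the transaction is in fact still running:
    waiter_result['xact_still_running'] = backend_state['in_progress']

b = threading.Thread(target=backend)
w = threading.Thread(target=waiter)
b.start(); w.start()
b.join(); w.join()
print(waiter_result['xact_still_running'])  # True: the wait proved nothing
```

The 0.2-second sleeps make the interleaving from the log above deterministic enough to demonstrate; in PostgreSQL the window is only a few instructions wide, which is why it takes minutes of pgbench load to hit.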
On Mon, Oct 9, 2017 at 9:19 PM, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:
> I investigated this case and it seems that XactLockTableWait() in SnapBuildWaitSnapshot()
> not always work as expected. XactLockTableWait() waits on LockAcquire() for xid to be
> completed and if we finally got this lock but transaction is still in progress then such xid
> assumed to be a subxid. However LockAcquire/LockRelease cycle can happen after transaction
> set xid, but before XactLockTableInsert().
>
> Namely following history happened for xid 5225 and lead to crash:
>
> [backend] LOG: AssignTransactionId: XactTopTransactionId = 5225
> [walsender] LOG: LogCurrentRunningXacts: Wrote RUNNING_XACTS xctn=1, xid[0]=5225
> [walsender] LOG: XactLockTableWait: LockAcquire 5225
> [walsender] LOG: XactLockTableWait: LockRelease 5225
> [backend] LOG: AssignTransactionId: LockAcquire ExclusiveLock 5225
> [walsender] LOG: TransactionIdIsInProgress: SVC->latestCompletedXid=5224 < xid=5225 => true
> [backend] LOG: CommitTransaction: ProcArrayEndTransaction xid=5225, ipw=0
> [backend] LOG: CommitTransaction: ResourceOwnerRelease locks xid=5225

Ouch. This seems like a bug that needs to be fixed, but do you think it's
related to Petr's proposed fix to set es_output_cid? That fix looks
reasonable, since we shouldn't try to lock tuples without a valid
CommandId.

Now, having said that, I understand how the lack of that fix could cause:

2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible tuple

But I do not understand how it could cause:

#3 0x000000000086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0,
oper=XLTW_None) at lmgr.c:582

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> On 15 Nov 2017, at 23:09, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Ouch. This seems like a bug that needs to be fixed, but do you think
> it's related to to Petr's proposed fix to set es_output_cid? That fix
> looks reasonable, since we shouldn't try to lock tuples without a
> valid CommandId.
>
> Now, having said that, I understand how the lack of that fix could cause:
>
> 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible tuple
>
> But I do not understand how it could cause:
>
> #3 0x000000000086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0,
> oper=XLTW_None) at lmgr.c:582
>

I think those are two separate bugs. The second one is due to a race
between XactLockTableWait() and the acquisition of the heavyweight
transaction lock in AssignTransactionId(). XactLockTableWait() assumes that
the transaction has already taken its heavyweight transaction lock, but
with the current machinery of RunningXacts and snapshot building we can
start such a wait before the transaction takes its heavyweight lock.

I can see two ways out of this, and both of them are quite unattractive:

1. Create a flag in PGPROC indicating that the heavyweight lock has been
acquired, and include in RunningXacts only transactions with this flag set.
That's probably quite expensive, since we need to hold one more exclusive
lwlock in AssignTransactionId().

2. Rewrite XactLockTableWait() in a way that makes it aware of the fact
that a valid xid can be a topmost transaction that does not yet have a
heavyweight lock entry. In such a situation we can just wait and try to
lock again. A good stop condition for this loop is
TransactionIdIsInProgress(xid) becoming false.

I did a sketch of the first approach just to confirm that it solves the
problem. But there I hold ProcArrayLock during the update of the flag.
Since the only reader is GetRunningTransactionData(), it is possible to
have a custom lock there. In that case GetRunningTransactionData() will
hold three locks simultaneously, since it already holds ProcArrayLock and
XidGenLock =)

Any better ideas?

-- 
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments
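Option 2 above could be structured roughly as follows. This is a hypothetical Python rendering of the control flow only, not PostgreSQL code: `lock_table` and `transaction_in_progress` stand in for the heavyweight lock table and TransactionIdIsInProgress(), and the tiny demo backend is invented for illustration:

```python
import threading
import time

def xact_lock_table_wait_retry(xid, lock_table, transaction_in_progress):
    """Sketch of option 2: tolerate the window in which the owning backend
    has published its xid but not yet inserted its heavyweight lock entry.
    The in-progress check is the authoritative stop condition."""
    while transaction_in_progress(xid):
        lock = lock_table.get(xid)
        if lock is None:
            # xid is visible but its lock entry is not inserted yet: this
            # is exactly the race window, so wait briefly and re-check
            # instead of concluding "subxid".
            time.sleep(0.001)
            continue
        lock.acquire()   # blocks until the transaction releases it at end
        lock.release()
        # loop around and re-check in-progress state

# Tiny demo: a "backend" for xid 5225 that inserts its lock entry late.
lock_table = {}
state = {'running': True}

def backend():
    time.sleep(0.05)             # window: xid known, no lock entry yet
    lock = threading.Lock()
    lock.acquire()
    lock_table[5225] = lock      # ~ XactLockTableInsert()
    time.sleep(0.05)
    state['running'] = False     # ~ commit...
    lock.release()               # ...and release of the xid lock

t = threading.Thread(target=backend)
t.start()
xact_lock_table_wait_retry(5225, lock_table, lambda xid: state['running'])
t.join()
wait_returned_after_commit = not state['running']
```

Unlike the current XactLockTableWait(), this loop can only exit once the in-progress check goes false, so a lock-table miss during the race window costs a retry rather than a wrong answer; the price is polling inside the window.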
On Wed, Nov 15, 2017 at 8:20 PM, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:
> I did a sketch of first approach just to confirm that it solves the problem.
> But there I hold ProcArrayLock during update of flag. Since only reader is
> GetRunningTransactionData it possible to have a custom lock there. In
> this case GetRunningTransactionData will hold three locks simultaneously,
> since it already holds ProcArrayLock and XidGenLock =)

To me, it seems like SnapBuildWaitSnapshot() is fundamentally misdesigned,
and ideally Petr (who wrote the patch) or Andres (who committed it) ought
to get involved here and help fix this problem. My own first inclination
would be to rewrite this as a loop: if the transaction ID precedes the
oldest running XID, then continue; else if TransactionIdDidCommit() ||
TransactionIdDidAbort(), then conclude that we don't need to wait; else
XactLockTableWait(), then loop. That way, if you hit the race condition,
you'll just busy-wait instead of doing the wrong thing. Maybe insert a
sleep(1) if we retry more than once. That sucks, of course, but it seems
like a better idea than trying to redesign XactLockTableWait() or the
procarray, which could affect an awful lot of other things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
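The loop proposed above can be sketched like this. This is hedged pseudocode in Python, not a patch: `oldest_running_xid`, `did_commit`, `did_abort`, and `xact_lock_wait` are hypothetical stand-ins for the oldest-running-XID check, TransactionIdDidCommit(), TransactionIdDidAbort(), and XactLockTableWait(), and the fake primitives in the demo are invented for illustration:

```python
import time

def wait_for_xid(xid, oldest_running_xid, did_commit, did_abort, xact_lock_wait):
    """Re-check transaction state on every iteration, so a too-early return
    from the lock wait only costs a retry, never a wrong conclusion.
    Backs off with sleep(1) after the first retry."""
    attempts = 0
    while True:
        if xid < oldest_running_xid():
            return               # older than every running xact: done
        if did_commit(xid) or did_abort(xid):
            return               # transaction already finished: no wait
        xact_lock_wait(xid)      # may return too early in the race window
        attempts += 1
        if attempts > 1:
            time.sleep(1)        # throttle the busy-wait on retries

# Tiny demo with fake primitives: the transaction "commits" while we are
# inside the (fake) lock wait, so the loop exits on its second check.
committed = {'v': False}
calls = {'n': 0}

def fake_lock_wait(xid):
    calls['n'] += 1
    committed['v'] = True        # pretend the transaction committed meanwhile

wait_for_xid(5225,
             oldest_running_xid=lambda: 100,
             did_commit=lambda xid: committed['v'],
             did_abort=lambda xid: False,
             xact_lock_wait=fake_lock_wait)
```

The key property is that the lock wait is treated as purely advisory: correctness comes from the commit/abort checks at the top of each iteration, which is what makes the race window harmless.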
On 2017-11-16 10:36:40 -0500, Robert Haas wrote:
> On Wed, Nov 15, 2017 at 8:20 PM, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:
> > I did a sketch of first approach just to confirm that it solves the problem.
> > But there I hold ProcArrayLock during update of flag. Since only reader is
> > GetRunningTransactionData it possible to have a custom lock there. In
> > this case GetRunningTransactionData will hold three locks simultaneously,
> > since it already holds ProcArrayLock and XidGenLock =)
>
> To me, it seems like SnapBuildWaitSnapshot() is fundamentally
> misdesigned

Maybe I'm confused, but why is it fundamentally misdesigned? It's not such
an absurd idea to wait for an xid in a WAL record. I get that there's a
race condition here, which is obviously bad, but I don't really see it as
evidence of the above claim.

I actually think this code used to be safe because ProcArrayLock used to be
held while generating and logging the running snapshots record. That was
removed when fixing some other bug, but perhaps that shouldn't have been
done...

> and ideally Petr (who wrote the patch) or Andres (who committed it)
> ought to get involved here and help fix this problem. My own first
> inclination would be to rewrite this as a loop: if the transaction ID
> precedes the oldest running XID, then continue; else if
> TransactionIdDidCommit() || TransactionIdDidAbort() then conclude that
> we don't need to wait; else XactLockTableWait() then loop. That way,
> if you hit the race condition, you'll just busy-wait instead of doing
> the wrong thing. Maybe insert a sleep(1) if we retry more than once.
> That sucks, of course, but it seems like a better idea than trying to
> redesign XactLockTableWait() or the procarray, which could affect an
> awful lot of other things.

Hm. Thinking about this.

Greetings,

Andres Freund
On Thu, Nov 16, 2017 at 2:41 PM, Andres Freund <andres@anarazel.de> wrote:
>> To me, it seems like SnapBuildWaitSnapshot() is fundamentally
>> misdesigned
>
> Maybe I'm confused, but why is it fundamentally misdesigned? It's not
> such an absurd idea to wait for an xid in a WAL record. I get that
> there's a race condition here, which obviously bad, but I don't really
> see as evidence of the above claim.
>
> I actually think this code used to be safe because ProcArrayLock used to
> be held while generating and logging the running snapshots record. That
> was removed when fixing some other bug, but perhaps that shouldn't have
> been done...

OK. Well, I might be overstating the case. My comment about fundamental
misdesign was really just based on the assumption that XactLockTableWait()
could be used to wait for an XID the instant it was generated. That was
never gonna work, and there's no obvious clean workaround for the problem.
Getting snapshot building to work properly seems to be Hard (TM).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Nov 17, 2017 at 5:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Nov 16, 2017 at 2:41 PM, Andres Freund <andres@anarazel.de> wrote:
>>> To me, it seems like SnapBuildWaitSnapshot() is fundamentally
>>> misdesigned
>>
>> Maybe I'm confused, but why is it fundamentally misdesigned? It's not
>> such an absurd idea to wait for an xid in a WAL record. I get that
>> there's a race condition here, which obviously bad, but I don't really
>> see as evidence of the above claim.
>>
>> I actually think this code used to be safe because ProcArrayLock used to
>> be held while generating and logging the running snapshots record. That
>> was removed when fixing some other bug, but perhaps that shouldn't have
>> been done...
>
> OK. Well, I might be overstating the case. My comment about
> fundamental misdesign was really just based on the assumption that
> XactLockTableWait() could be used to wait for an XID the instant it
> was generated. That was never gonna work and there's no obvious clean
> workaround for the problem. Getting snapshot building to work
> properly seems to be Hard (TM).

The patches discussed here deserve tracking, so please note that I have
added an entry in the CF app:
https://commitfest.postgresql.org/16/1381/

-- 
Michael
On 4 October 2017 at 07:35, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
> On 02/10/17 18:59, Petr Jelinek wrote:
>>>
>>> Now fix the trigger function:
>>> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
>>> BEGIN
>>> RETURN NEW;
>>> END $$ LANGUAGE plpgsql;
>>>
>>> And manually perform at master two updates inside one transaction:
>>>
>>> postgres=# begin;
>>> BEGIN
>>> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
>>> UPDATE 1
>>> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
>>> UPDATE 1
>>> postgres=# commit;
>>> <hangs>
>>>
>>> and in replica log we see:
>>> 2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply
>>> worker for subscription "sub" has started
>>> 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible
>>> tuple
>>> 2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical
>>> replication worker for subscription 16403 (PID 2954) exited with exit
>>> code 1
>>>
>>> Error happens in trigger.c:
>>>
>>> #3 0x000000000069bddb in GetTupleForTrigger (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac,
>>> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at
>>> trigger.c:3103
>>> #4 0x000000000069b259 in ExecBRUpdateTriggers (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac,
>>> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
>>> #5 0x00000000006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240)
>>> at execReplication.c:461
>>> #6 0x0000000000820894 in apply_handle_update (s=0x7ffc442163b0) at
>>> worker.c:736
>>
>> We have locked the same tuple in RelationFindReplTupleByIndex() just
>> before this gets called and didn't get the same error. I guess we do
>> something wrong with snapshots. Will need to investigate more.
>>
>
> Okay, so it's not snapshot. It's the fact that we don't set the
> es_output_cid in replication worker which GetTupleForTrigger is using
> when locking the tuple. Attached one-liner fixes it.

This seems like a clear-cut bug with a simple fix.

Let's get this committed, so we don't lose it. The rest of the thread is
going off into the weeds a bit, into issues unrelated to the original
problem.
On 4 October 2017 at 10:35, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: > On 02/10/17 18:59, Petr Jelinek wrote: >>> >>> Now fix the trigger function: >>> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$ >>> BEGIN >>> RETURN NEW; >>> END $$ LANGUAGE plpgsql; >>> >>> And manually perform at master two updates inside one transaction: >>> >>> postgres=# begin; >>> BEGIN >>> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26; >>> UPDATE 1 >>> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26; >>> UPDATE 1 >>> postgres=# commit; >>> <hangs> >>> >>> and in replica log we see: >>> 2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply >>> worker for subscription "sub" has started >>> 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible >>> tuple >>> 2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical >>> replication worker for subscription 16403 (PID 2954) exited with exit >>> code 1 >>> >>> Error happens in trigger.c: >>> >>> #3 0x000000000069bddb in GetTupleForTrigger (estate=0x2e36b50, >>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac, >>> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at >>> trigger.c:3103 >>> #4 0x000000000069b259 in ExecBRUpdateTriggers (estate=0x2e36b50, >>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac, >>> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748 >>> #5 0x00000000006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50, >>> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240) >>> at execReplication.c:461 >>> #6 0x0000000000820894 in apply_handle_update (s=0x7ffc442163b0) at >>> worker.c:736 >> >> We have locked the same tuple in RelationFindReplTupleByIndex() just >> before this gets called and didn't get the same error. I guess we do >> something wrong with snapshots. Will need to investigate more. >> > > Okay, so it's not snapshot. 
It's the fact that we don't set the es_output_cid in the replication worker, which GetTupleForTrigger is using when locking the tuple. Attached one-liner fixes it.

Applied, thanks

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, (sorry for not being active here, I am still catching up after being away for some family issues) On 16/11/17 21:12, Robert Haas wrote: > On Thu, Nov 16, 2017 at 2:41 PM, Andres Freund <andres@anarazel.de> wrote: >>> To me, it seems like SnapBuildWaitSnapshot() is fundamentally >>> misdesigned >> >> Maybe I'm confused, but why is it fundamentally misdesigned? It's not >> such an absurd idea to wait for an xid in a WAL record. I get that >> there's a race condition here, which is obviously bad, but I don't really >> see it as evidence of the above claim. >> >> I actually think this code used to be safe because ProcArrayLock used to >> be held while generating and logging the running snapshots record. That >> was removed when fixing some other bug, but perhaps that shouldn't have >> been done... > > OK. Well, I might be overstating the case. My comment about > fundamental misdesign was really just based on the assumption that > XactLockTableWait() could be used to wait for an XID the instant it > was generated. That was never gonna work and there's no obvious clean > workaround for the problem. Getting snapshot building to work > properly seems to be Hard (TM).

No kidding, we've been at it since 9.4. But anyway, while sure, we have a race condition because XactLockTableWait() finishes too early, all that should mean is that we call LogStandbySnapshot() too early, and as a result taking the snapshot (and hence slot creation) will take longer, as we'll wait for the next call of LogStandbySnapshot() from some other caller (because the tx we care about will still be running). The whole SnapBuildWaitSnapshot() is just an optimization to make slot creation take less time (and also to be able to write tests). What I don't understand is how it leads to a crash (and I could not reproduce it using the pgbench file attached in this thread either) and moreover how it leads to a 0 xid being logged.
The only explanation I can come up with is that some similar race has to exist in LogStandbySnapshot(), but we explicitly check for a 0 xid value there. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
> On 29 Nov 2017, at 18:46, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: > > What I don't understand is how it leads to crash (and I could not > reproduce it using the pgbench file attached in this thread either) and > moreover how it leads to 0 xid being logged. The only explanation I can > come up is that some kind of similar race has to be in > LogStandbySnapshot() but we explicitly check for 0 xid value there. >

Zero xid isn’t logged. The loop in XactLockTableWait() does the following:

for (;;)
{
    Assert(TransactionIdIsValid(xid));
    Assert(!TransactionIdEquals(xid, GetTopTransactionIdIfAny()));

    <...>

    xid = SubTransGetParent(xid);
}

So if the last statement is reached for a top-level transaction, the next iteration will crash in the first assert. And it will be reached if this whole loop happens before the transaction has acquired its heavyweight lock. The probability of that crash can be significantly increased by adding a sleep between xid generation and lock insertion in AssignTransactionId().

Stas Kelvich Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachments
On 29/11/17 20:11, Stas Kelvich wrote: > >> On 29 Nov 2017, at 18:46, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: >> >> What I don't understand is how it leads to crash (and I could not >> reproduce it using the pgbench file attached in this thread either) and >> moreover how it leads to 0 xid being logged. The only explanation I can >> come up is that some kind of similar race has to be in >> LogStandbySnapshot() but we explicitly check for 0 xid value there. >> > > Zero xid isn’t logged. Loop in XactLockTableWait() does following: > > for (;;) > { > Assert(TransactionIdIsValid(xid)); > Assert(!TransactionIdEquals(xid, GetTopTransactionIdIfAny())); > > <...> > > xid = SubTransGetParent(xid); > } > > So if last statement is reached for top transaction then next iteration > will crash in first assert. And it will be reached if whole this loop > happens before transaction acquired heavyweight lock. > > Probability of that crash can be significantly increased be adding sleep > between xid generation and lock insertion in AssignTransactionId(). > Yes that helps thanks. Now that I reproduced it I understand, I was confused by the backtrace that said xid was 0 on the input to XactLockTableWait() but that's not the case, it's what xid is changed to in the inner loop. So what happens is that we manage to do LogStandbySnapshot(), decode the logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in between GetNewTransactionId() and XactLockTableInsert() calls in AssignTransactionId() for that same transaction. I guess the probability of this happening is increased by the fact that GetRunningTransactionData() acquires XidGenLock so if there is GetNewTransactionId() running in parallel it will wait for it to finish and we'll log immediately after that. Hmm that means that Robert's loop idea will not help and ProcArrayLock will not save us either. 
Maybe we could either rewrite XactLockTableWait() or create another version of it with the SubTransGetParent() call replaced by SubTransGetTopmostTransaction(), as that will return the same top-level xid in case the input xid wasn't a subxact. That would make it safe to call on transactions that haven't acquired a lock on themselves yet. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2017-11-30 00:25:58 +0100, Petr Jelinek wrote: > Yes that helps thanks. Now that I reproduced it I understand, I was > confused by the backtrace that said xid was 0 on the input to > XactLockTableWait() but that's not the case, it's what xid is changed to > in the inner loop. > So what happens is that we manage to do LogStandbySnapshot(), decode the > logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in > between GetNewTransactionId() and XactLockTableInsert() calls in > AssignTransactionId() for that same transaction. > > I guess the probability of this happening is increased by the fact that > GetRunningTransactionData() acquires XidGenLock so if there is > GetNewTransactionId() running in parallel it will wait for it to finish > and we'll log immediately after that. > > Hmm that means that Robert's loop idea will not help and ProcArrayLock > will not save us either.

I've not really looked into this deeply, but afair we can just make this code accept that edge case and be done with it. As the comment says:

 * Iterate through xids in record, wait for all older than the cutoff to
 * finish. Then, if possible, log a new xl_running_xacts record.
 *
 * This isn't required for the correctness of decoding, but to:
 * a) allow isolationtester to notice that we're currently waiting for
 *    something.
 * b) log a new xl_running_xacts record where it'd be helpful, without having
 *    to write for bgwriter or checkpointer.

Greetings, Andres Freund
On 30/11/17 00:40, Andres Freund wrote: > On 2017-11-30 00:25:58 +0100, Petr Jelinek wrote: >> Yes that helps thanks. Now that I reproduced it I understand, I was >> confused by the backtrace that said xid was 0 on the input to >> XactLockTableWait() but that's not the case, it's what xid is changed to >> in the inner loop. > >> So what happens is that we manage to do LogStandbySnapshot(), decode the >> logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in >> between GetNewTransactionId() and XactLockTableInsert() calls in >> AssignTransactionId() for that same transaction. >> >> I guess the probability of this happening is increased by the fact that >> GetRunningTransactionData() acquires XidGenLock so if there is >> GetNewTransactionId() running in parallel it will wait for it to finish >> and we'll log immediately after that. >> >> Hmm that means that Robert's loop idea will not help and ProcArrayLock >> will not save us either. Maybe we could either rewrite XactLockTableWait >> or create another version of it with SubTransGetParent() call replaced >> by SubTransGetTopmostTransaction() as that will return the same top >> level xid in case the input xid wasn't a subxact. That would make it >> safe to be called on transactions that didn't acquire lock on themselves >> yet. > > I've not really looked into this deeply, but afair we can just make this > code accept that edge case and be done with it. As the comment says:
>
> * Iterate through xids in record, wait for all older than the cutoff to
> * finish. Then, if possible, log a new xl_running_xacts record.
> *
> * This isn't required for the correctness of decoding, but to:
> * a) allow isolationtester to notice that we're currently waiting for
> *    something.
> * b) log a new xl_running_xacts record where it'd be helpful, without having
> *    to write for bgwriter or checkpointer.

I don't understand.
I mean, sure, SnapBuildWaitSnapshot() can live with it, but the problematic logic happens inside XactLockTableWait(), and SnapBuildWaitSnapshot() has no way of detecting the situation short of reimplementing XactLockTableWait() instead of calling it. The problem is that SubTransGetParent() returns InvalidTransactionId when the race happens, and SubTransGetParent() is called because XactLockTableWait() assumes that if the transaction lock was acquired and the xid is still in progress, the input must have been the xid of a subtransaction. This is why I suggested replacing that call with SubTransGetTopmostTransaction(), which will just return the same xid it got on input, and we'll simply retry taking the lock instead of crashing. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: > I don't understand. I mean sure the SnapBuildWaitSnapshot() can live > with it, but the problematic logic happens inside the > XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of > detecting the situation short of reimplementing the > XactLockTableInsert() instead of calling it. Right. But we fairly trivially can change that. I'm remarking on it because other people's, not yours, suggestions aimed at making this bulletproof. I just wanted to make clear that I don't think that's necessary at all. Greetings, Andres Freund
On 30/11/17 00:47, Andres Freund wrote: > On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: >> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live >> with it, but the problematic logic happens inside the >> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of >> detecting the situation short of reimplementing the >> XactLockTableInsert() instead of calling it. > > Right. But we fairly trivially can change that. I'm remarking on it > because other people's, not yours, suggestions aimed at making this > bulletproof. I just wanted to make clear that I don't think that's > necessary at all. > Okay, then I guess we are in agreement. I can confirm that the attached fixes the issue in my tests. Using SubTransGetTopmostTransaction() instead of SubTransGetParent() means 3 more ifs in terms of extra CPU cost for other callers. I don't think it's worth worrying about given we are waiting for heavyweight lock, but if we did we can just inline the code directly into SnapBuildWaitSnapshot(). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachments
> On 30 Nov 2017, at 03:30, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: > > On 30/11/17 00:47, Andres Freund wrote: >> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: >>> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live >>> with it, but the problematic logic happens inside the >>> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of >>> detecting the situation short of reimplementing the >>> XactLockTableInsert() instead of calling it. >> >> Right. But we fairly trivially can change that. I'm remarking on it >> because other people's, not yours, suggestions aimed at making this >> bulletproof. I just wanted to make clear that I don't think that's >> necessary at all. >> > > Okay, then I guess we are in agreement. I can confirm that the attached > fixes the issue in my tests. Using SubTransGetTopmostTransaction() > instead of SubTransGetParent() means 3 more ifs in terms of extra CPU > cost for other callers. I don't think it's worth worrying about given we > are waiting for heavyweight lock, but if we did we can just inline the > code directly into SnapBuildWaitSnapshot(). > > -- > Petr Jelinek http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > <0001-Make-XactLockTableWait-work-for-transactions-that-ar.patch>

+1

Seems that having a busy loop is the best idea out of the several discussed. I thought about a small sleep at the bottom of that loop if we reached the topmost transaction, but taking into account the low probability of that event, maybe it is faster to just do a busy wait. Also, some clarifying comment in the code would be nice.

Stas Kelvich Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 30 November 2017 at 11:30, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: > On 30/11/17 00:47, Andres Freund wrote: >> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: >>> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live >>> with it, but the problematic logic happens inside the >>> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of >>> detecting the situation short of reimplementing the >>> XactLockTableInsert() instead of calling it. >> >> Right. But we fairly trivially can change that. I'm remarking on it >> because other people's, not yours, suggestions aimed at making this >> bulletproof. I just wanted to make clear that I don't think that's >> necessary at all. >> > > Okay, then I guess we are in agreement. I can confirm that the attached > fixes the issue in my tests. Using SubTransGetTopmostTransaction() > instead of SubTransGetParent() means 3 more ifs in terms of extra CPU > cost for other callers. I don't think it's worth worrying about given we > are waiting for heavyweight lock, but if we did we can just inline the > code directly into SnapBuildWaitSnapshot(). This will still fail an Assert in TransactionIdIsInProgress() when snapshots are overflowed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 30/11/17 11:48, Simon Riggs wrote: > On 30 November 2017 at 11:30, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: >> On 30/11/17 00:47, Andres Freund wrote: >>> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: >>>> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live >>>> with it, but the problematic logic happens inside the >>>> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of >>>> detecting the situation short of reimplementing the >>>> XactLockTableInsert() instead of calling it. >>> >>> Right. But we fairly trivially can change that. I'm remarking on it >>> because other people's, not yours, suggestions aimed at making this >>> bulletproof. I just wanted to make clear that I don't think that's >>> necessary at all. >>> >> >> Okay, then I guess we are in agreement. I can confirm that the attached >> fixes the issue in my tests. Using SubTransGetTopmostTransaction() >> instead of SubTransGetParent() means 3 more ifs in terms of extra CPU >> cost for other callers. I don't think it's worth worrying about given we >> are waiting for heavyweight lock, but if we did we can just inline the >> code directly into SnapBuildWaitSnapshot(). > > This will still fail an Assert in TransactionIdIsInProgress() when > snapshots are overflowed. >

Hmm, which one, why? I see 2 Asserts there, one is:

> Assert(nxids == 0);

Which is inside the RecoveryInProgress() branch; surely on standbys there will still be no PGXACTs with assigned xids, so that should be fine. The other one is:

> Assert(TransactionIdIsValid(topxid));

Which should again be fine: the toplevel xid of a toplevel xid is the same xid, which is a valid one. So I think we should be fine there. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Stas Kelvich wrote: > Seems that having busy loop is the best idea out of several discussed. > > I thought about small sleep at the bottom of that loop if we reached topmost > transaction, but taking into account low probability of that event may be > it is faster to do just busy wait. In other locations where we do similar things we have 1ms sleeps. I agree with the need for a comment here. Proposed patch attached. I tried your reload.pgb test case in 9.4 (after changing pgoutput to test_decoding and removing the 3rd arg to a function call) and the crash takes about 3 seconds without the patch on my machine. No crash with this patch. Will push this shortly after lunch. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments
Alvaro Herrera wrote: > Will push this shortly after lunch. Pushed. Will you (Konstantin, Stas, Masahiko) please verify that after this commit all the problems reported with logical replication are fixed? Thanks -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On 3 Jan 2018, at 23:35, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Pushed. Will you (Konstantin, Stas, Masahiko) please verify that after > this commit all the problems reported with logical replication are > fixed? Checked that with and without extra sleep in AssignTransactionId(). In both cases patch works. Thanks! -- Stas Kelvich Postgres Professional: http://www.postgrespro.com The Russian Postgres Company