Discussion: We've broken something in error recovery
In a somewhat misguided attempt to test something else, I did this in
CVS HEAD:

do $$
begin
  for i in 1 .. 10000 loop
    execute 'create table t' || i::text || ' (f1 int primary key)';
  end loop;
end$$;

This ran for awhile and then ran out of lock table space, which was not
surprising in hindsight:

ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.

But what was surprising was what happened next: the autovac launcher
immediately crashed.

TRAP: FailedAssertion("!(nestLevel > 0 && nestLevel <= GUCNestLevel)", File: "guc.c", Line: 3907)
LOG:  autovacuum launcher process (PID 25220) was terminated by signal 6

Stack trace looks like

#4  0x4e85b4 in ExceptionalCondition (
    conditionName=0x1ac4ac "!(nestLevel > 0 && nestLevel <= GUCNestLevel)",
    errorType=0x1abf44 "FailedAssertion", fileName=0x1abee4 "guc.c",
    lineNumber=3907) at assert.c:57
#5  0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
#6  0x20618c in AbortTransaction () at xact.c:2194
#7  0x20688c in AbortCurrentTransaction () at xact.c:2568
#8  0x3b0f84 in AutoVacLauncherMain (argc=2063670312, argv=0x7b03b94c)
    at autovacuum.c:491
#9  0x3b0bd8 in StartAutoVacLauncher () at autovacuum.c:371

Haven't dug any deeper yet --- who's touched this code lately?

			regards, tom lane

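A rough sanity check of why the loop above exhausts the lock table, offered as a sketch under assumptions the message does not state: the DO block runs as a single transaction, every lock is held until that transaction ends, each CREATE TABLE with a primary key locks at least two relations (the table and its index), and the server runs with max_locks_per_transaction = 64, max_connections = 100 and max_prepared_transactions = 0. The capacity formula follows the PostgreSQL documentation for max_locks_per_transaction; the function names are hypothetical.

```c
#include <assert.h>

/*
 * Nominal number of lockable objects the shared lock table is sized for:
 * max_locks_per_transaction * (max_connections + max_prepared_transactions).
 * The real table can stretch somewhat into spare shared memory, so treat
 * this as an estimate, not a hard limit.
 */
static long
lock_table_capacity(long max_locks_per_xact, long max_connections,
                    long max_prepared_xacts)
{
    return max_locks_per_xact * (max_connections + max_prepared_xacts);
}

/*
 * Locks still held at the end of one transaction that created n_tables
 * tables, assuming locks_per_table lockable relations per table
 * (2 here: the table plus its primary key index; an assumption).
 */
static long
locks_needed(long n_tables, long locks_per_table)
{
    return n_tables * locks_per_table;
}
```

With the assumed settings, lock_table_capacity(64, 100, 0) is 6400 while locks_needed(10000, 2) is 20000, so the transaction needs roughly three times the nominal capacity and the "out of shared memory" error is expected.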
Tom Lane wrote:
> #4  0x4e85b4 in ExceptionalCondition (
>     conditionName=0x1ac4ac "!(nestLevel > 0 && nestLevel <= GUCNestLevel)",
>     errorType=0x1abf44 "FailedAssertion", fileName=0x1abee4 "guc.c",
>     lineNumber=3907) at assert.c:57
> #5  0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
> #6  0x20618c in AbortTransaction () at xact.c:2194

This looks like maybe a corrupted stack - the args to AtEOXact_GUC at that
location in xact.c are hardwired.

cheers

andrew

Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> #5  0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907

> This looks like maybe a corrupted stack - the args to AtEOXact_GUC at
> that location in xact.c are hardwired.

No, that's just a fairly typical behavior of debugging with -O greater
than zero --- the registers holding those parameter values got recycled
for something else. This is a rather old version of gdb and it doesn't
always print <<value optimized away>> when it should.

			regards, tom lane

I wrote:
> #4  0x4e85b4 in ExceptionalCondition (
>     conditionName=0x1ac4ac "!(nestLevel > 0 && nestLevel <= GUCNestLevel)",
>     errorType=0x1abf44 "FailedAssertion", fileName=0x1abee4 "guc.c",
>     lineNumber=3907) at assert.c:57
> #5  0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
> #6  0x20618c in AbortTransaction () at xact.c:2194
> #7  0x20688c in AbortCurrentTransaction () at xact.c:2568
> #8  0x3b0f84 in AutoVacLauncherMain (argc=2063670312, argv=0x7b03b94c)
>     at autovacuum.c:491

On investigation I think that Assert may just be overenthusiastic.
The problem is that StartTransaction is failing at
VirtualXactLockTableInsert, for lack of any shared memory to acquire
the lock with; and then we try to do AbortTransaction and GUC is unhappy
because it's not been initialized yet. So this isn't a new bug at all,
it's been there awhile ...

			regards, tom lane
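The failure sequence described above can be sketched with a toy model. This is not PostgreSQL source: the names toy_start_transaction, toy_abort_strict_ok, toy_abort_tolerant_ok, guc_nest_level and locks_free are hypothetical stand-ins, and the ordering (the lock is acquired before GUC nesting is set up) is assumed from the diagnosis in the message. The tolerant variant only illustrates one way such a check could be relaxed; the thread does not say how it was actually fixed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for backend state; not PostgreSQL code. */
static int guc_nest_level = 0;  /* plays the role of GUCNestLevel */
static int locks_free = 0;      /* free slots in the shared lock table */

/*
 * Models transaction start: the lock is taken first, and only then is the
 * GUC nesting level initialized. Returns false where the real code would
 * raise "out of shared memory" partway through startup.
 */
static bool
toy_start_transaction(void)
{
    if (locks_free <= 0)
        return false;           /* failed before GUC state was set up */
    locks_free--;
    guc_nest_level = 1;
    return true;
}

/* Strict check, mirroring the assertion condition from the trace. */
static bool
toy_abort_strict_ok(int nest_level)
{
    return nest_level > 0 && nest_level <= guc_nest_level;
}

/*
 * Tolerant check (an assumption, for illustration only): if GUC nesting
 * was never initialized there is nothing to unwind, so abort is a no-op.
 */
static bool
toy_abort_tolerant_ok(int nest_level)
{
    if (guc_nest_level == 0)
        return true;
    return toy_abort_strict_ok(nest_level);
}
```

With locks_free at 0, toy_start_transaction() fails and leaves guc_nest_level at 0, so the strict check rejects the hardwired abort argument of 1 while the tolerant check accepts it. Once a lock slot is available, startup completes and both checks pass.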