Discussion: Recovery failed on a backup with "lock AccessShareLock on object 16477/244169/0 is already held"
Hi,

I hit an issue running PG 8.2.3 with the continuous archiving feature where I was unable to recover from the backup. I was wondering whether this may be related to bug #3245.

These are the steps that occurred before I saw this problem:

1. Prepare transaction.
2. A base backup of the database was taken to a warm standby system.
3. Commit prepared.

The commit prepared never finished, as it hit a PANIC:

2008-06-17 23:53:53.206 Local time zone must be set--see zic manual page PANIC: failed to re-find shared lock object
2008-06-17 23:53:53.207 Local time zone must be set--see zic manual page STATEMENT: commit prepared '148969' ;

I believe this panic is probably bug #3245, based on the description of that bug:
http://archives.postgresql.org/pgsql-bugs/2007-04/msg00075.php

At this point I attempted a recovery using the continuous archive backup on the warm standby system. Instead of recovering correctly, it encountered this FATAL error, where an AccessShareLock was already held:

2008-06-18 00:05:34.045 Local time zone must be set--see zic manual page LOG: database system was interrupted at 2008-06-17 23:53:16 Local time zone must be set--see zic manual page
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual page LOG: checkpoint record is at 70/E600DC18
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual page LOG: redo record is at 70/E600DC18; undo record is at 0/0; shutdown FALSE
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual page LOG: next transaction ID: 0/1099178; next OID: 413234
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual page LOG: next MultiXactId: 1; next MultiXactOffset: 0
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual page LOG: database system was not properly shut down; automatic recovery in progress
2008-06-18 00:05:34.105 Local time zone must be set--see zic manual page LOG: redo starts at 70/E600DC68
2008-06-18 00:05:34.106 Local time zone must be set--see zic manual page LOG: could not open file "pg_xlog/0000000100000070000000E7" (log file 112, segment 231): No such file or directory
2008-06-18 00:05:34.106 Local time zone must be set--see zic manual page LOG: redo done at 70/E600DC68
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual page LOG: recovering prepared transaction 1099169
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual page LOG: recovering prepared transaction 1099156
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual page LOG: recovering prepared transaction 1099157
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual page LOG: recovering prepared transaction 1099161
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual page LOG: recovering prepared transaction 1099164
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual page LOG: recovering prepared transaction 1099162
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual page LOG: recovering prepared transaction 1099166
2008-06-18 00:05:34.294 Local time zone must be set--see zic manual page LOG: recovering prepared transaction 1099131
2008-06-18 00:05:34.298 Local time zone must be set--see zic manual page FATAL: lock AccessShareLock on object 16477/244169/0 is already held
2008-06-18 00:05:34.299 Local time zone must be set--see zic manual page LOG: startup process (PID 17377) exited with exit code 1
2008-06-18 00:05:34.299 Local time zone must be set--see zic manual page LOG: aborting startup due to startup process failure

Is this FATAL error seen on recovery a different bug, or is it just a direct result of bug #3245? Unfortunately I do not have a way to deterministically reproduce this problem, but I have seen it 3 times so far.

thanks,
John
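For readers following the recovery log above, the WAL file name and the "log file 112, segment 231" annotation can be cross-checked by hand. The sketch below is only an illustration: it assumes the 8.2-era naming scheme (three 8-digit hexadecimal fields: timeline, log file id, segment) and the default 16 MB segment size, and the function names are hypothetical helpers, not PostgreSQL APIs.

```python
# Assumption: default WAL segment size of 16 MB (not stated in the thread).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_wal_filename(name):
    """Split a 24-hex-digit WAL file name into (timeline, log_id, segment)."""
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    segment = int(name[16:24], 16)
    return timeline, log_id, segment


def segment_for_location(location):
    """Map a WAL location such as '70/E600DC68' to (log_id, segment)."""
    log_hex, offset_hex = location.split("/")
    return int(log_hex, 16), int(offset_hex, 16) // WAL_SEGMENT_SIZE


def wal_filename(timeline, log_id, segment):
    """Build the WAL file name for a timeline, log file id and segment."""
    return "%08X%08X%08X" % (timeline, log_id, segment)


# The file the startup process could not open:
print(parse_wal_filename("0000000100000070000000E7"))
# The segment that holds the "redo starts at" location:
log_id, segment = segment_for_location("70/E600DC68")
print(wal_filename(1, log_id, segment))
```

Under these assumptions, the redo start location falls in segment E6 of log file 70, and the file that was reported missing is the next segment, E7, which is consistent with replay simply reaching the end of the WAL available in pg_xlog before the prepared transactions were recovered.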
"John Smith" <sodgodofall@gmail.com> writes: > 2008-06-17 23:53:53.206 Local time zone must be set--see zic manual > page PANIC: failed to re-find shared lock object > 2008-06-17 23:53:53.207 Local time zone must be set--see zic manual > page STATEMENT: commit prepared '148969' ; > I believe this panic is probably bug #3245 based on the description of > that bug - http://archives.postgresql.org/pgsql-bugs/2007-04/msg00075.php Yeah, looks like it to me too. > At this point I attempted to do a recovery using the continuous > archive backup on the warm standby system. Instead of recovering > correctly it encountered this FATAL error where a AccessSharedLock was > already held. > 2008-06-18 00:05:34.298 Local time zone must be set--see zic manual > page FATAL: lock AccessShareLock on object 16477/244169/0 is already > held > 2008-06-18 00:05:34.299 Local time zone must be set--see zic manual > page LOG: startup process (PID 17377) exited with exit code 1 > 2008-06-18 00:05:34.299 Local time zone must be set--see zic manual > page LOG: aborting startup due to startup process failure > Is this FATAL error seen on recovery a different bug or is it just a > direct result of bug #3245? It probably is the same bug. The underlying cause of that bug is explained here: http://archives.postgresql.org/pgsql-bugs/2007-04/msg00129.php I think what you are seeing is just a variant case caused by the same lock being written out to the twophase file twice. In any case there's probably little point in digging further until you've updated to a version with that fix --- if you still see the problem afterward, we can look closer. BTW, what's with the bizarre "Local time zone must be set--see zic manual" where the timezone should be? Are you intentionally selecting the "Factory" zone? regards, tom lane
Thanks for the quick reply, Tom. I'll be updating my PG version to one with a fix for bug #3245, so hopefully we won't see this anymore.

On Mon, Jun 30, 2008 at 12:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> BTW, what's with the bizarre "Local time zone must be set--see zic
> manual" where the timezone should be? Are you intentionally selecting
> the "Factory" zone?

I don't think I've put the correct timezone file in /etc/localtime, so it is using some default file from the Gentoo install.

John
"John Smith" <sodgodofall@gmail.com> writes: >> BTW, what's with the bizarre "Local time zone must be set--see zic >> manual" where the timezone should be? Are you intentionally selecting >> the "Factory" zone? > I don't think I've put the correct timezone file in /etc/localtime so > it is using some default file from the Gentoo install. Ah, yes, I was able to duplicate that behavior by overwriting /etc/localtime with /usr/share/zoneinfo/Factory. I guess the Gentoo folks failed in their intention to annoy you enough to make you set the zone correctly ;-) regards, tom lane