Discussion: BUG #6167: pg_dump fails on table lock
The following bug has been logged online:

Bug reference:      6167
Logged by:          Jesper Engman
Email address:      jesper@engman.net
PostgreSQL version: 8.3.10
Operating system:   Linux
Description:        pg_dump fails on table lock
Details:

I have tables that exist for short time periods, sometimes for as short as
5 min. pg_dump is starting to fail because it cannot lock these tables:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: relation
"vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
pg_dump: The command was: LOCK TABLE
vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
MODE
Backup failed: PGPASSWORD=xxxxx && export PGPASSWORD && export PGOPTIONS="-c
statement_timeout=0 -c maintenance_work_mem=2147483647" && /usr/bin/pg_dump
-h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
/vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
Account: xxxxxxxx Backup failed

How is this possible - pg_dump runs as a serializable transaction? It
doesn't seem to be tripped up by some other backend function, since it
actually fails on the lock itself.
On Thu, Aug 18, 2011 at 2:05 AM, Jesper Engman <jesper@engman.net> wrote:
>
> I have tables that exist for short time periods, sometimes for as short as
> 5 min. pg_dump is starting to fail because it cannot lock these tables:
> [...]
> How is this possible - pg_dump is a serializable transaction? It doesn't
> seem to be tripped up by some other backend function since this actually
> fails on the lock.

Well, it's not a bug.

You've asked to dump a table and then dropped the table concurrently
with the attempt to dump it.

Exclude the tables you don't wish to have dumped using command-line options.

I don't think we will put in an option to silently exclude missing
tables, not least because it would be technically difficult.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
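Simon's suggestion can be scripted. A minimal sketch of building such a dump command from a wrapper script, using pg_dump's real `-T`/`--exclude-table` option (which accepts wildcard patterns); the host, user, and database names here are placeholders, not anything from the report:

```python
import shlex

def build_pg_dump_cmd(host, user, dbname, exclude_patterns):
    """Build a pg_dump argv that skips tables matching the given patterns."""
    cmd = ["/usr/bin/pg_dump", "-h", host, "-U", user, "-Fc", "-Z", "6"]
    for pattern in exclude_patterns:
        cmd += ["-T", pattern]   # --exclude-table; patterns may use wildcards
    cmd.append(dbname)
    return cmd

# Exclude every short-lived partition in the vehicle_change_partitions schema.
cmd = build_pg_dump_cmd("xxx.xxx.xxx.xxx", "postgres", "mydb",
                        ["vehicle_change_partitions.*"])
print(shlex.join(cmd))
```

As the rest of the thread shows, the reporter considers an exclusion-based dump an incomplete backup, so this only helps when the transient tables genuinely don't need to be preserved.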
DDL is not excluded from MVCC, right? This kind of concurrency should be
handled, and it's generally managed just fine. I just did a lab test
attempting to drop a table while pg_dump is running on the same db, and the
DROP simply waits until pg_dump is done. That is the expected behavior.

There is some discussion about concurrency and DDL changes in:
http://archives.postgresql.org/pgsql-bugs/2010-02/msg00187.php

But in that case, some specialized backend functions like pg_get_indexdef
use committed state, and that doesn't seem to apply here.

I wonder if there is a small time span between when pg_dump starts and when
all locks have been acquired that may be the problem (if a table is dropped
during that window). Is there such a small window of vulnerability?

The database in question does not have a ton of tables like this - about 10
tables, and each exists for about 5 min. But this runs on many databases
(more than a thousand). So if there is a window of vulnerability, even a
small one, chances are we're hitting it.

Excluding tables from the dump is not an option - that would be an
incomplete backup.

On Thu, Aug 18, 2011 at 4:03 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, Aug 18, 2011 at 2:05 AM, Jesper Engman <jesper@engman.net> wrote:
>> [quoted bug report trimmed]
>
> Well, it's not a bug.
>
> You've asked to dump a table and then dropped the table concurrently
> with the attempt to dump it.
>
> Exclude the tables you don't wish to have dumped using command-line options.
>
> I don't think we will put in an option to silently exclude missing
> tables, not least because it would be technically difficult.
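The window Jesper is asking about can be illustrated with a toy model (plain Python, not PostgreSQL itself): one step snapshots a "catalog" of table names, and only afterwards locks them one by one. A drop that lands between those two steps produces exactly the kind of "does not exist" failure in the report. All names here are illustrative:

```python
# Toy model of pg_dump's list-then-lock sequence.
catalog = {"events_p1", "events_p2", "events_p3"}   # stand-in for pg_class

def dump_tables():
    snapshot = sorted(catalog)          # step 1: read the table list
    # --- a concurrent session drops a table inside the window ---
    catalog.discard("events_p2")        # simulated concurrent DROP TABLE
    locked = []
    for name in snapshot:               # step 2: LOCK TABLE, one by one
        if name not in catalog:
            raise LookupError(f'relation "{name}" does not exist')
        locked.append(name)
    return locked

try:
    dump_tables()
except LookupError as e:
    print(e)    # mirrors the ERROR pg_dump reported
```

Once a real lock is acquired, a concurrent DROP blocks until the dump finishes (which is what the lab test observed); the vulnerability exists only before the lock is taken.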
Jesper Engman <jesper@engman.net> writes:
> I wonder if there is a small time span between when pg_dump starts and
> when all locks have been acquired that may be the problem (if a table
> is dropped during that time span). Is there such a small time of
> vulnerability?

Certainly. pg_dump has to read pg_class to get the names of the tables,
and then try to lock each one. If you drop a table during that window,
the lock command will fail.

The window is actually a bit longer than necessary in existing releases,
because pg_dump was doing some other stuff before it got around to
acquiring the locks. I fixed that recently:
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=68d977a73
but that patch hasn't made it into any released versions yet. In any case
there's still a nonzero window.

> Excluding tables from the dump is not an option - that will be an
> incomplete backup.

Um ... if you know it's a transient table, why do you care about backing
it up?

			regards, tom lane
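Since the window is nonzero even after the fix, one pragmatic client-side workaround is to retry the whole dump when it fails because a transient table vanished mid-dump. This is a hedged sketch of such a wrapper: the retry policy and the error-text matching are assumptions for illustration, not anything pg_dump itself provides, and `dump_once` stands in for however the caller invokes pg_dump and surfaces its stderr:

```python
import time

def run_with_retry(dump_once, attempts=3, delay=1.0):
    """Retry a dump that failed because a table vanished mid-dump.

    dump_once() is expected to raise RuntimeError carrying the server's
    error text (e.g. the stderr of a failed pg_dump invocation).
    """
    last_err = None
    for _ in range(attempts):
        try:
            return dump_once()
        except RuntimeError as e:
            if "does not exist" not in str(e):
                raise               # unrelated failure: don't mask it
            last_err = e
            time.sleep(delay)       # let the transient table finish dying
    raise last_err

# Stub illustrating the behavior: fails once with a vanished-table error,
# then succeeds on the retry.
calls = {"n": 0}
def flaky_dump():
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError('ERROR: relation "x" does not exist')
    return "dump complete"

print(run_with_retry(flaky_dump, attempts=3, delay=0))  # → dump complete
```

Note this only narrows the practical exposure; with more than a thousand databases and ~10 transient tables each, a retry can still land inside another window, so the retry count is a tuning knob, not a guarantee.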