Обсуждение: BUG #6167: pg_dump fails on table lock

Поиск
Список
Период
Сортировка

BUG #6167: pg_dump fails on table lock

От
"Jesper Engman"
Дата:
The following bug has been logged online:

Bug reference:      6167
Logged by:          Jesper Engman
Email address:      jesper@engman.net
PostgreSQL version: 8.3.10
Operating system:   Linux
Description:        pg_dump fails on table lock
Details:

I have tables that exists for short time periods, sometimes for as short as
5 min. pg_dump is starting to fail due to a problem to lock these tables:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: relation
"vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
pg_dump: The command was: LOCK TABLE
vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
MODE
Backup failed: PGPASSWORD=xxxxx && export PGPASSWORD && export PGOPTIONS="-c
statement_timeout=0 -c maintenance_work_mem=2147483647" && /usr/bin/pg_dump
-h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
/vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
Account: xxxxxxxx Backup failed

How is this possible - pg_dump is a serializable transaction? It doesn't
seem to be tripped up by some other backend function since this actually
fails on the lock.

Re: BUG #6167: pg_dump fails on table lock

От
Simon Riggs
Дата:
On Thu, Aug 18, 2011 at 2:05 AM, Jesper Engman <jesper@engman.net> wrote:
>
> The following bug has been logged online:
>
> Bug reference: =A0 =A0 =A06167
> Logged by: =A0 =A0 =A0 =A0 =A0Jesper Engman
> Email address: =A0 =A0 =A0jesper@engman.net
> PostgreSQL version: 8.3.10
> Operating system: =A0 Linux
> Description: =A0 =A0 =A0 =A0pg_dump fails on table lock
> Details:
>
> I have tables that exists for short time periods, sometimes for as short =
as
> 5 min. pg_dump is starting to fail due to a problem to lock these tables:
>
> pg_dump: SQL command failed
> pg_dump: Error message from server: ERROR: relation
> "vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
> pg_dump: The command was: LOCK TABLE
> vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
> MODE
> Backup failed: PGPASSWORD=3Dxxxxx && export PGPASSWORD && export PGOPTION=
S=3D"-c
> statement_timeout=3D0 -c maintenance_work_mem=3D2147483647" && /usr/bin/p=
g_dump
> -h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
> /vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
> Account: xxxxxxxx Backup failed
>
> How is this possible - pg_dump is a serializable transaction? It doesn't
> seem to be tripped up by some other backend function since this actually
> fails on the lock.


Well, its not a bug.

You've asked to dump a table and then dropped the table concurrently
with the attempt to dump the table.

Exclude the tables you don't wish to have dumped using command line options.

I don't think we will put in an option to silently exclude missing
tables, not least because it would be technically difficult.

--=20
=A0Simon Riggs=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 http:/=
/www.2ndQuadrant.com/
=A0PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #6167: pg_dump fails on table lock

От
Jesper Engman
Дата:
DDL is not excluded from MVCC, right? This kind of concurrency should
be handled and it's generally managed just fine. I just did a lab test
attempting to drop a table while pg_dump is running on the same db and
it simply waits to drop the table until pg_dump is done. That is the
expected behavior.

There is some discussion about concurrency and DDL changes in:

http://archives.postgresql.org/pgsql-bugs/2010-02/msg00187.php

But in that case, some specialized backend functions like
pg_get_indexdef is using committed state and that doesn't seem to
occur in this case.

I wonder if there is a small time span between when pg_dump starts and
when all locks have been acquired that may be the problem (if a table
is dropped during that time span). Is there such a small time of
vulnerability?

The database in question does not have a ton of tables like this -
about 10 tables and the tables exists for about 5 min. But this runs
on many databases (more than a thousand). So, if there is a window of
vulnerability (if only small) - chances are we're hitting it.

Excluding tables from the dump is not an option - that will be an
incomplete backup.

On Thu, Aug 18, 2011 at 4:03 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, Aug 18, 2011 at 2:05 AM, Jesper Engman <jesper@engman.net> wrote:
>>
>> The following bug has been logged online:
>>
>> Bug reference: =A0 =A0 =A06167
>> Logged by: =A0 =A0 =A0 =A0 =A0Jesper Engman
>> Email address: =A0 =A0 =A0jesper@engman.net
>> PostgreSQL version: 8.3.10
>> Operating system: =A0 Linux
>> Description: =A0 =A0 =A0 =A0pg_dump fails on table lock
>> Details:
>>
>> I have tables that exists for short time periods, sometimes for as short=
 as
>> 5 min. pg_dump is starting to fail due to a problem to lock these tables:
>>
>> pg_dump: SQL command failed
>> pg_dump: Error message from server: ERROR: relation
>> "vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
>> pg_dump: The command was: LOCK TABLE
>> vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
>> MODE
>> Backup failed: PGPASSWORD=3Dxxxxx && export PGPASSWORD && export PGOPTIO=
NS=3D"-c
>> statement_timeout=3D0 -c maintenance_work_mem=3D2147483647" && /usr/bin/=
pg_dump
>> -h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
>> /vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
>> Account: xxxxxxxx Backup failed
>>
>> How is this possible - pg_dump is a serializable transaction? It doesn't
>> seem to be tripped up by some other backend function since this actually
>> fails on the lock.
>
>
> Well, its not a bug.
>
> You've asked to dump a table and then dropped the table concurrently
> with the attempt to dump the table.
>
> Exclude the tables you don't wish to have dumped using command line optio=
ns.
>
> I don't think we will put in an option to silently exclude missing
> tables, not least because it would be technically difficult.
>
> --
> =A0Simon Riggs=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 http=
://www.2ndQuadrant.com/
> =A0PostgreSQL Development, 24x7 Support, Training & Services
>

Re: BUG #6167: pg_dump fails on table lock

От
Tom Lane
Дата:
Jesper Engman <jesper@engman.net> writes:
> I wonder if there is a small time span between when pg_dump starts and
> when all locks have been acquired that may be the problem (if a table
> is dropped during that time span). Is there such a small time of
> vulnerability?

Certainly.  pg_dump has to read pg_class to get the names of the tables,
and then try to lock each one.  If you drop a table during that window,
the lock command will fail.

The window is actually a bit longer than necessary in existing releases,
because pg_dump was doing some other stuff before it got around to
acquiring the locks.  I fixed that recently
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=68d977a73
but that patch hasn't made it to any released versions yet.  In any case
there's still a nonzero window.

> Excluding tables from the dump is not an option - that will be an
> incomplete backup.

Um ... if you know it's a transient table, why do you care about backing
it up?

            regards, tom lane