Обсуждение: pg_dump after transaction id wraparound failure

Поиск
Список
Период
Сортировка

pg_dump after transaction id wraparound failure

От
"Michael L. Boscia"
Дата:
Hello-

I currently have a 1.4 TB database (pg version 7.4.8) that is experiencing transaction id wraparound.  I plan to pg_dump the database and initdb and reload.  I may reload into an 8.1 database at this time. 

I am curious how to make sure that I indeed get all the data out of the tables from a pg_dump.  I want to ensure that there would not be anything "hidden" by the transaction id wraparound failure.

My current plan is selectively dump a portion of the data (most current, organized in tables by date ex xxx_2006_may) and reload.  I plan to vaccuum each table before dumping.  Please warn me if this is not enough.  Any assistance is greatly appreciated.

Mike
--
Mike Boscia
mikeboscia at gmail dot com

Re: pg_dump after transaction id wraparound failure

От
Martijn van Oosterhout
Дата:
On Mon, May 15, 2006 at 11:44:01AM -0700, Michael L. Boscia wrote:
> Hello-
>
> I currently have a 1.4 TB database (pg version 7.4.8) that is experiencing
> transaction id wraparound.  I plan to pg_dump the database and initdb and
> reload.  I may reload into an 8.1 database at this time.
>
> I am curious how to make sure that I indeed get all the data out of the
> tables from a pg_dump.  I want to ensure that there would not be anything
> "hidden" by the transaction id wraparound failure.

pg_dump obviously won't see anything that's hidden by wraparound
failure.

Vacuum will make anything that disappeared by wraparound in the last
billion transactions reappear, so a databasewide vacuum should solve
all your problems, no need to dump...

> My current plan is selectively dump a portion of the data (most current,
> organized in tables by date ex xxx_2006_may) and reload.  I plan to vaccuum
> each table before dumping.  Please warn me if this is not enough.  Any
> assistance is greatly appreciated.

I'd say do the vacuum first, dumping the data isn't really helping much
in this case I think (although for backups it's obviously important).

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: pg_dump after transaction id wraparound failure

От
Scott Marlowe
Дата:
On Mon, 2006-05-15 at 13:48, Martijn van Oosterhout wrote:
> On Mon, May 15, 2006 at 11:44:01AM -0700, Michael L. Boscia wrote:
> > Hello-
> >
> > I currently have a 1.4 TB database (pg version 7.4.8) that is experiencing
> > transaction id wraparound.  I plan to pg_dump the database and initdb and
> > reload.  I may reload into an 8.1 database at this time.
> >
> > I am curious how to make sure that I indeed get all the data out of the
> > tables from a pg_dump.  I want to ensure that there would not be anything
> > "hidden" by the transaction id wraparound failure.
>
> pg_dump obviously won't see anything that's hidden by wraparound
> failure.
>
> Vacuum will make anything that disappeared by wraparound in the last
> billion transactions reappear, so a databasewide vacuum should solve
> all your problems, no need to dump...
>
> > My current plan is selectively dump a portion of the data (most current,
> > organized in tables by date ex xxx_2006_may) and reload.  I plan to vaccuum
> > each table before dumping.  Please warn me if this is not enough.  Any
> > assistance is greatly appreciated.
>
> I'd say do the vacuum first, dumping the data isn't really helping much
> in this case I think (although for backups it's obviously important).

This is one of those rare instances when I'd shut down the server and
take a file system backup if I could.  However, at 1.4 TB, that might be
a bit difficult... :)

Re: pg_dump after transaction id wraparound failure

От
Martijn van Oosterhout
Дата:
On Mon, May 15, 2006 at 08:48:54PM +0200, Martijn van Oosterhout wrote:
> Vacuum will make anything that disappeared by wraparound in the last
> billion transactions reappear, so a databasewide vacuum should solve
> all your problems, no need to dump...

Also, for future reference. If you have a lot of read-only tables (and
judging by the names of your tables it looks like you might), then
running VACUUM FREEZE on a table will put it in a state where you never
have to vacuum it again (until you do an insert/update).

Put another way, anything in a table at the point you run a VACUUM
FREZE over it is protected from wraparound. Over time all tables will
convert to this state, just an explicit freeze makes it quicker, for a
1.4TB data this might be important.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: pg_dump after transaction id wraparound failure

От
Tom Lane
Дата:
"Michael L. Boscia" <mikeboscia@gmail.com> writes:
> My current plan is selectively dump a portion of the data (most current,
> organized in tables by date ex xxx_2006_may) and reload.  I plan to vaccuum
> each table before dumping.  Please warn me if this is not enough.  Any
> assistance is greatly appreciated.

Do NOT vacuum.  There is zero potential upside and it might delete some
of your data.

It might be worthwhile to try to set the transaction counter back ---
depends on whether you are more worried about losing the oldest data or
newest data.  Or you could take two sets of dumps and compare.

            regards, tom lane

Re: pg_dump after transaction id wraparound failure

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> Vacuum will make anything that disappeared by wraparound in the last
> billion transactions reappear, so a databasewide vacuum should solve
> all your problems, no need to dump...

Oh, I had forgotten about that.  Did we establish that 7.4.* works that
way?  I think so, but obviously I've been programming too many hours in
a row :-(

            regards, tom lane

Re: pg_dump after transaction id wraparound failure

От
"Michael L. Boscia"
Дата:
On 5/15/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> On Mon, 2006-05-15 at 13:48, Martijn van Oosterhout wrote:
> > On Mon, May 15, 2006 at 11:44:01AM -0700, Michael L. Boscia wrote:
> > > Hello-
> > >
> > > I currently have a 1.4 TB database (pg version 7.4.8) that is experiencing
> > > transaction id wraparound.  I plan to pg_dump the database and initdb and
> > > reload.  I may reload into an 8.1 database at this time.
> > >
> > > I am curious how to make sure that I indeed get all the data out of the
> > > tables from a pg_dump.  I want to ensure that there would not be anything
> > > "hidden" by the transaction id wraparound failure.
> >
> > pg_dump obviously won't see anything that's hidden by wraparound
> > failure.
Yes- this does make sense.
> >
> > Vacuum will make anything that disappeared by wraparound in the last
> > billion transactions reappear, so a databasewide vacuum should solve
> > all your problems, no need to dump...

Are we positive that a database wide vacuum will solve this? I am
concerned that while some transactions may "appear" some might still
be affected by wraparound failure

> > > My current plan is selectively dump a portion of the data (most current,
> > > organized in tables by date ex xxx_2006_may) and reload.  I plan to vaccuum
> > > each table before dumping.  Please warn me if this is not enough.  Any
> > > assistance is greatly appreciated.
> >
> > I'd say do the vacuum first, dumping the data isn't really helping much
> > in this case I think (although for backups it's obviously important).

The only reason I would like to dump is to rid myself of a possibly
corrupt pg_catalog tablespace and any issues that may arise out of
this.  This database was not cared for properly, in that vacuums were
not properly scheduled, and I would like to get all the important data
out and start over with a new initdb.  I am just not sure that I would
be getting all the data out.

> This is one of those rare instances when I'd shut down the server and
> take a file system backup if I could.  However, at 1.4 TB, that might be
> a bit difficult... :)

What do I do with a filesystem backup after its done?  If I copy and
restore somewhere won't I experience all the same issues?

Thanks so much for your help
--
Mike Boscia
mikeboscia at gmail dot com

Re: pg_dump after transaction id wraparound failure

От
Scott Marlowe
Дата:
On Mon, 2006-05-15 at 14:07, Michael L. Boscia wrote:
> On 5/15/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote:

> > This is one of those rare instances when I'd shut down the server and
> > take a file system backup if I could.  However, at 1.4 TB, that might be
> > a bit difficult... :)
>
> What do I do with a filesystem backup after its done?  If I copy and
> restore somewhere won't I experience all the same issues?

Well, imagine that things get worse (like that catalog corruption you
mentioned)...  If you have a file system backup you can at least get
back to where you are right now.  Better to have a huge, useless backup
you don't need laying about than to be thinking 5 minutes after the
vacuum "Man, I wish I had a file system backup..."

Re: pg_dump after transaction id wraparound failure

От
Martijn van Oosterhout
Дата:
On Mon, May 15, 2006 at 03:04:47PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Vacuum will make anything that disappeared by wraparound in the last
> > billion transactions reappear, so a databasewide vacuum should solve
> > all your problems, no need to dump...
>
> Oh, I had forgotten about that.  Did we establish that 7.4.* works that
> way?  I think so, but obviously I've been programming too many hours in
> a row :-(

Well, I demonstrated it for 8.1beta here:

http://archives.postgresql.org/pgsql-general/2005-11/msg00097.php

And I just demonstrated it on a test 7.4.7 database here (Debian
7.4.7-6 to be precise).

test=# select ctid, xmin, xmax, cmin, cmax, * from test;
 ctid  |    xmin    | xmax | cmin | cmax | x
-------+------------+------+------+------+---
 (0,1) |        541 |    0 |    0 |    0 | 1
 (0,2) | 2147484153 |    0 |    0 |    0 | 2
 (0,3) | 2147484183 |    0 |    0 |    0 | 3
(3 rows)

### Row is still there...

test=# select ctid, xmin, xmax, cmin, cmax, * from test;
 ctid  |    xmin    | xmax | cmin | cmax | x
-------+------------+------+------+------+---
 (0,2) | 2147484153 |    0 |    0 |    0 | 2
 (0,3) | 2147484183 |    0 |    0 |    0 | 3
(2 rows)

### OMG, row is gone

test=# vacuum test;
VACUUM
test=# select ctid, xmin, xmax, cmin, cmax, * from test;
 ctid  |    xmin    | xmax | cmin | cmax | x
-------+------------+------+------+------+---
 (0,1) |          2 |    0 |    0 |    0 | 1
 (0,2) | 2147484153 |    0 |    0 |    0 | 2
 (0,3) | 2147484183 |    0 |    0 |    0 | 3
(3 rows)

### Yay row is back
test=# select version();
                                                version
-------------------------------------------------------------------------------------------------------
 PostgreSQL 7.4.7 on i386-pc-linux-gnu, compiled by GCC i386-linux-gcc (GCC) 3.3.5 (Debian 1:3.3.5-12)
(1 row)

Have a ncie day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: pg_dump after transaction id wraparound failure

От
"Michael L. Boscia"
Дата:
> And I just demonstrated it on a test 7.4.7 database here (Debian
> 7.4.7-6 to be precise).
>
> test=# select ctid, xmin, xmax, cmin, cmax, * from test;
>  ctid  |    xmin    | xmax | cmin | cmax | x
> -------+------------+------+------+------+---
>  (0,1) |        541 |    0 |    0 |    0 | 1
>  (0,2) | 2147484153 |    0 |    0 |    0 | 2
>  (0,3) | 2147484183 |    0 |    0 |    0 | 3
> (3 rows)
>
> ### Row is still there...
>
> test=# select ctid, xmin, xmax, cmin, cmax, * from test;
>  ctid  |    xmin    | xmax | cmin | cmax | x
> -------+------------+------+------+------+---
>  (0,2) | 2147484153 |    0 |    0 |    0 | 2
>  (0,3) | 2147484183 |    0 |    0 |    0 | 3
> (2 rows)
>
> ### OMG, row is gone
>
> test=# vacuum test;
> VACUUM
> test=# select ctid, xmin, xmax, cmin, cmax, * from test;
>  ctid  |    xmin    | xmax | cmin | cmax | x
> -------+------------+------+------+------+---
>  (0,1) |          2 |    0 |    0 |    0 | 1
>  (0,2) | 2147484153 |    0 |    0 |    0 | 2
>  (0,3) | 2147484183 |    0 |    0 |    0 | 3
> (3 rows)
>
> ### Yay row is back
> test=# select version();
>                                                 version
> -------------------------------------------------------------------------------------------------------
>  PostgreSQL 7.4.7 on i386-pc-linux-gnu, compiled by GCC i386-linux-gcc (GCC) 3.3.5 (Debian 1:3.3.5-12)
> (1 row)
>
> Have a ncie day,
> --
To Clarify - So the consensus is to vacuum the entire database?  This
should take care of any possible loss of data.  Then after this I
should pg_dump?

Thanks again
--Mike