Обсуждение: Database OID xxxxx now seems to belong to "foo"

Поиск
Список
Период
Сортировка

Database OID xxxxx now seems to belong to "foo"

От
"Gauthier, Dave"
Дата:

Hi:

 

Trying (and failing) to attach to my DBs.  Getting...     

 

database “foo_standby” has disappeared form pg_database

DETAIL: Database OID 2323523 now seems to belong to “foo”

 

 I have 2 dbs... foo and foo_standby.  Every midnight, I truncate foo_standby and load it up with all the current data from mu various sources.  If that’s successful, I then rename foo to foo_tmp, then rename foo_standby to foo, then rename foo_tmp to foo_standby.  This results in a minimal “down time” of the foo database ebcause the renames happen fast. And foo_standby is now my backup from the previous day, something I can switch back to if something goes wrong with the new foo db.

 

Anyway, “psql –l” does show my dbs.  There are other dbs for other projects and I can connect to those OK.  I just can’t connect to foo or foo_standby.

 

Q: What might have caused this problem and how can I avoid/prevent it in the future

Q: Can I salvage my foo and foo_standby DBs?

 

Thanks

-dave

 

 

 

 

 

 

Re: Database OID xxxxx now seems to belong to "foo"

От
"Gauthier, Dave"
Дата:

I tried vacuum pg_database, no luck.   I shut down and restarted the DB and it seems to have fixed the problem.  Still, not sure why it happened in the first place or how to prevent it in the future.

 


From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Gauthier, Dave
Sent: Tuesday, March 11, 2008 9:31 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Database OID xxxxx now seems to belong to "foo"

 

Hi:

 

Trying (and failing) to attach to my DBs.  Getting...     

 

database “foo_standby” has disappeared form pg_database

DETAIL: Database OID 2323523 now seems to belong to “foo”

 

 I have 2 dbs... foo and foo_standby.  Every midnight, I truncate foo_standby and load it up with all the current data from mu various sources.  If that’s successful, I then rename foo to foo_tmp, then rename foo_standby to foo, then rename foo_tmp to foo_standby.  This results in a minimal “down time” of the foo database ebcause the renames happen fast. And foo_standby is now my backup from the previous day, something I can switch back to if something goes wrong with the new foo db.

 

Anyway, “psql –l” does show my dbs.  There are other dbs for other projects and I can connect to those OK.  I just can’t connect to foo or foo_standby.

 

Q: What might have caused this problem and how can I avoid/prevent it in the future

Q: Can I salvage my foo and foo_standby DBs?

 

Thanks

-dave

 

 

 

 

 

 

Re: Database OID xxxxx now seems to belong to "foo"

От
Tom Lane
Дата:
"Gauthier, Dave" <dave.gauthier@intel.com> writes:
> I tried vacuum pg_database, no luck.   I shut down and restarted the DB
> and it seems to have fixed the problem.  Still, not sure why it happened
> in the first place or how to prevent it in the future.

My recollection is that that's a symptom of lack of vacuuming, in PG
versions that are too old to have automatic defenses against such
DBA misfeasance.  Check whether your routine vacuuming procedures
cover the system catalogs...

            regards, tom lane

Re: Database OID xxxxx now seems to belong to "foo"

От
Richard Huxton
Дата:
Gauthier, Dave wrote:
> I tried vacuum pg_database, no luck.   I shut down and restarted the DB
> and it seems to have fixed the problem.  Still, not sure why it happened
> in the first place or how to prevent it in the future.

 > Trying (and failing) to attach to my DBs.  Getting...
 >
 > database "foo_standby" has disappeared form pg_database
 >
 > DETAIL: Database OID 2323523 now seems to belong to "foo"

Hmm - if a shutdown + restart fixed it, I'm wondering if it wasn't just
a long-lived connection remembering where 2323523 used to point to.
You've not got a connection pool between you and the db or some such?

--
   Richard Huxton
   Archonet Ltd

Re: Database OID xxxxx now seems to belong to "foo"

От
"Gauthier, Dave"
Дата:
I'm running 8.2.0 on Linux.

The postgres.conf has...

   #autovacuum = off

(commented out) which lead me to believe that by default, it's on.  And
I'm assuming that it's vacumming the system tables.  Are these
assumptions right?

Thanks
-dave


-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Tuesday, March 11, 2008 11:10 AM
To: Gauthier, Dave
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Database OID xxxxx now seems to belong to "foo"

"Gauthier, Dave" <dave.gauthier@intel.com> writes:
> I tried vacuum pg_database, no luck.   I shut down and restarted the
DB
> and it seems to have fixed the problem.  Still, not sure why it
happened
> in the first place or how to prevent it in the future.

My recollection is that that's a symptom of lack of vacuuming, in PG
versions that are too old to have automatic defenses against such
DBA misfeasance.  Check whether your routine vacuuming procedures
cover the system catalogs...

            regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Database OID xxxxx now seems to belong to "foo"

От
Tom Lane
Дата:
Richard Huxton <dev@archonet.com> writes:
> Gauthier, Dave wrote:
>>> Trying (and failing) to attach to my DBs.  Getting...
>>> database "foo_standby" has disappeared form pg_database
>>> DETAIL: Database OID 2323523 now seems to belong to "foo"

> Hmm - if a shutdown + restart fixed it, I'm wondering if it wasn't just
> a long-lived connection remembering where 2323523 used to point to.

No, it's the "flat file" copy of pg_database that's supplying that
number, and the reason the restart fixed it is that the flat file
is forcibly rebuilt during a restart.  What's not quite clear is
why the flat file was wrong.

We've seen this type of failure reported from the field before,
and as far as I recall the triggering condition was transaction ID
wraparound due to lack of vacuuming ... but haven't consumed enough
caffeine this morning to remember details.

            regards, tom lane

Re: Database OID xxxxx now seems to belong to "foo"

От
Alvaro Herrera
Дата:
Gauthier, Dave wrote:
> I'm running 8.2.0 on Linux.

You should be running 8.2.6.

> The postgres.conf has...
>
>    #autovacuum = off
>
> (commented out) which lead me to believe that by default, it's on.  And
> I'm assuming that it's vacumming the system tables.  Are these
> assumptions right?

It's not turned on by default (and it's not on 8.2), so it's probably
not vacuuming anything.  On 8.2 there are enough protections that this
shouldn't be the actual problem though -- as soon as you get anywhere
near a failure, the system shuts itself down (but autovac gets a chance
to fix the problem for you before that happens, even if it's turned
off).

Perhaps it's a bug fixed between 8.2.0 and 8.2.6 (which is why you
should upgrade), but perhaps it hasn't been discovered yet.  So please
upgrade and let us know if you can reproduce it on 8.2.6.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Database OID xxxxx now seems to belong to "foo"

От
Richard Huxton
Дата:
Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
>> Gauthier, Dave wrote:
>>>> Trying (and failing) to attach to my DBs.  Getting...
>>>> database "foo_standby" has disappeared form pg_database
>>>> DETAIL: Database OID 2323523 now seems to belong to "foo"
>
>> Hmm - if a shutdown + restart fixed it, I'm wondering if it wasn't just
>> a long-lived connection remembering where 2323523 used to point to.
>
> No, it's the "flat file" copy of pg_database that's supplying that
> number, and the reason the restart fixed it is that the flat file
> is forcibly rebuilt during a restart.  What's not quite clear is
> why the flat file was wrong.

Ah, that makes sense (well, the first part).

> We've seen this type of failure reported from the field before,
> and as far as I recall the triggering condition was transaction ID
> wraparound due to lack of vacuuming ... but haven't consumed enough
> caffeine this morning to remember details.

Be interesting to find out - I can't quite imagine how a transaction ID
problem would cause this.

--
   Richard Huxton
   Archonet Ltd

Re: Database OID xxxxx now seems to belong to "foo"

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Gauthier, Dave wrote:
>> I'm running 8.2.0 on Linux.

> It's not turned on by default (and it's not on 8.2), so it's probably
> not vacuuming anything.  On 8.2 there are enough protections that this
> shouldn't be the actual problem though -- as soon as you get anywhere
> near a failure, the system shuts itself down (but autovac gets a chance
> to fix the problem for you before that happens, even if it's turned
> off).

Yeah, if it's 8.2.anything then XID wraparound should be an impossible
situation to get into; so we need a new theory.

Right offhand, the only way I can see for there to be a problem with
the flat pg_database file being out of sync with the real catalog
is if a CREATE/DROP/RENAME DATABASE aborts during transaction commit,
after having already updated the flat file.  This is certainly
conceivable but it would take some rather hairy error.  Dave, did
you have any recent database-manipulating commands go wacko on you?

(Alvaro's point about it possibly being an already-fixed bug is
definitely valid.  I'm too lazy to go trawl the CVS history for
bugs near transaction commit, though.)

            regards, tom lane

Re: Database OID xxxxx now seems to belong to "foo"

От
"Gauthier, Dave"
Дата:
The renames go very fast.  It's one of the main reasons I like it.  I
run them manually, so I can say for certain that they have not failed
"mid stream" where "midstream" seems to take under a second.

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, March 11, 2008 12:51 PM
To: Alvaro Herrera
Cc: Gauthier, Dave; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Database OID xxxxx now seems to belong to "foo"

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Gauthier, Dave wrote:
>> I'm running 8.2.0 on Linux.

> It's not turned on by default (and it's not on 8.2), so it's probably
> not vacuuming anything.  On 8.2 there are enough protections that this
> shouldn't be the actual problem though -- as soon as you get anywhere
> near a failure, the system shuts itself down (but autovac gets a
chance
> to fix the problem for you before that happens, even if it's turned
> off).

Yeah, if it's 8.2.anything then XID wraparound should be an impossible
situation to get into; so we need a new theory.

Right offhand, the only way I can see for there to be a problem with
the flat pg_database file being out of sync with the real catalog
is if a CREATE/DROP/RENAME DATABASE aborts during transaction commit,
after having already updated the flat file.  This is certainly
conceivable but it would take some rather hairy error.  Dave, did
you have any recent database-manipulating commands go wacko on you?

(Alvaro's point about it possibly being an already-fixed bug is
definitely valid.  I'm too lazy to go trawl the CVS history for
bugs near transaction commit, though.)

            regards, tom lane