Обсуждение: Unable to determine what has a particular OID

Поиск
Список
Период
Сортировка

Unable to determine what has a particular OID

От
Andy Dale
Дата:
Hi,

We currently have a DB (8.3.7) in production that seem to be taking up more space on the HDD than was anticipated.  

After looking into the the data directory a particular file/oid is around 21 GB (the oid has 21 files), this OID can then be found in the pg_class table, the explanation for the size being that the table has around 12.5 Million rows.  During the inspection of the data directory I observed another OID that is around 8.5GB, but I cannot locate the corresponding entry in the pg_class table (even after dumping the table into a file and grep-ing for the OID in question).

Any other ideas for finding what the mystery OID is ?

Thanks in advance,

Andy

Re: Unable to determine what has a particular OID

От
Tom Lane
Дата:
Andy Dale <andy.dale@gmail.com> writes:
> After looking into the the data directory a particular file/oid is around 21
> GB (the oid has 21 files), this OID can then be found in the pg_class table,
> the explanation for the size being that the table has around 12.5 Million
> rows.  During the inspection of the data directory I observed another OID
> that is around 8.5GB, but I cannot locate the corresponding entry in the
> pg_class table (even after dumping the table into a file and grep-ing for
> the OID in question).

You should be looking at pg_class.relfilenode, not OID.  See
http://www.postgresql.org/docs/8.3/static/storage.html

            regards, tom lane

Re: Unable to determine what has a particular OID

От
Andy Dale
Дата:
On 18 June 2010 16:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Dale <andy.dale@gmail.com> writes:
> After looking into the the data directory a particular file/oid is around 21
> GB (the oid has 21 files), this OID can then be found in the pg_class table,
> the explanation for the size being that the table has around 12.5 Million
> rows.  During the inspection of the data directory I observed another OID
> that is around 8.5GB, but I cannot locate the corresponding entry in the
> pg_class table (even after dumping the table into a file and grep-ing for
> the OID in question).

You should be looking at pg_class.relfilenode, not OID.  See
http://www.postgresql.org/docs/8.3/static/storage.html

                       regards, tom lane

Ok, but when I have dumped the pg_class table into a file like so:

-[ RECORD 1 ]--+----------------------------------------------------------------
relname        | pg_type
relnamespace   | 11
reltype        | 71
relowner       | 10
relam          | 0
relfilenode    | 1247
reltablespace  | 0
relpages       | 19
reltuples      | 796
reltoastrelid  | 0
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relkind        | r
relnatts       | 26
relchecks      | 0
reltriggers    | 0
relukeys       | 0
relfkeys       | 0
relrefs        | 0
relhasoids     | t
relhaspkey     | f
relhasrules    | f
relhassubclass | f
relfrozenxid   | 379
relacl         | {=r/postgres}
reloptions     | 
-[ RECORD 2 ]--+----------------------------------------------------------------
relname        | triggered_update_columns
relnamespace   | 11313
reltype        | 11477
relowner       | 10
relam          | 0
relfilenode    | 11476
reltablespace  | 0
relpages       | 0
reltuples      | 0
reltoastrelid  | 0

Grep-ing  for the correct oid/filenode (110660 in my case), nothing is returned.

Cheers,

Andy 

Re: Unable to determine what has a particular OID

От
Adrian Klaver
Дата:
On Friday 18 June 2010 7:15:48 am Andy Dale wrote:
> On 18 June 2010 16:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andy Dale <andy.dale@gmail.com> writes:
> > > After looking into the the data directory a particular file/oid is
> > > around
> >
> > 21
> >
> > > GB (the oid has 21 files), this OID can then be found in the pg_class
> >
> > table,
> >
> > > the explanation for the size being that the table has around 12.5
> > > Million rows.  During the inspection of the data directory I observed
> > > another OID that is around 8.5GB, but I cannot locate the corresponding
> > > entry in the pg_class table (even after dumping the table into a file
> > > and grep-ing for the OID in question).
> >
> > You should be looking at pg_class.relfilenode, not OID.  See
> > http://www.postgresql.org/docs/8.3/static/storage.html
> >
> >                        regards, tom lane
>
> Ok, but when I have dumped the pg_class table into a file like so:
>
> -[ RECORD 1
> ]--+----------------------------------------------------------------
> relname        | pg_type
> relnamespace   | 11
> reltype        | 71
> relowner       | 10
> relam          | 0
> relfilenode    | 1247
> reltablespace  | 0
> relpages       | 19
> reltuples      | 796
> reltoastrelid  | 0
> reltoastidxid  | 0
> relhasindex    | t
> relisshared    | f
> relkind        | r
> relnatts       | 26
> relchecks      | 0
> reltriggers    | 0
> relukeys       | 0
> relfkeys       | 0
> relrefs        | 0
> relhasoids     | t
> relhaspkey     | f
> relhasrules    | f
> relhassubclass | f
> relfrozenxid   | 379
> relacl         | {=r/postgres}
> reloptions     |
> -[ RECORD 2
> ]--+----------------------------------------------------------------
> relname        | triggered_update_columns
> relnamespace   | 11313
> reltype        | 11477
> relowner       | 10
> relam          | 0
> relfilenode    | 11476
> reltablespace  | 0
> relpages       | 0
> reltuples      | 0
> reltoastrelid  | 0
>
> Grep-ing  for the correct oid/filenode (110660 in my case), nothing is
> returned.
>
> Cheers,
>
> Andy

Try:
SELECT oid,* from pg_database ;

I suspect the relfilnode you are looking at is another database in the cluster.

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unable to determine what has a particular OID

От
Tom Lane
Дата:
Andy Dale <andy.dale@gmail.com> writes:
> On 18 June 2010 16:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> You should be looking at pg_class.relfilenode, not OID.  See
>> http://www.postgresql.org/docs/8.3/static/storage.html

> Ok, but when I have dumped the pg_class table into a file like so:
> ...
> Grep-ing  for the correct oid/filenode (110660 in my case), nothing is
> returned.

Hm.  You're sure you're looking at pg_class in the right database?

If there really isn't a pg_class entry matching that relfilenode,
then the files are orphans and can be removed.  There are some scenarios
where orphan files can be left behind, but they generally involve
database crashes ... have you had any of those?

            regards, tom lane

Re: Unable to determine what has a particular OID

От
Andy Dale
Дата:


Try:
SELECT oid,* from pg_database ;

I suspect the relfilnode you are looking at is another database in the cluster.

--
Adrian Klaver
adrian.klaver@gmail.com

Maybe I am possibly doing that.  I did a du -h --max-depth=1 in the base/ within the data directory.  This showed a single folder that contained ~34 GB.  On performing a ls in this, it showed 2 oids/filenodes that had more than 1 GB (multiple files with .<number> at the end).  The large table I mentioned in the original post was in this folder, so I assumed that the rest of the files in the same directory belonged to the same DB.

Have I made an incorrect assumption ? 

Re: Unable to determine what has a particular OID

От
Andy Dale
Дата:
On 18 June 2010 16:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Dale <andy.dale@gmail.com> writes:
> On 18 June 2010 16:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> You should be looking at pg_class.relfilenode, not OID.  See
>> http://www.postgresql.org/docs/8.3/static/storage.html

> Ok, but when I have dumped the pg_class table into a file like so:
> ...
> Grep-ing  for the correct oid/filenode (110660 in my case), nothing is
> returned.

Hm.  You're sure you're looking at pg_class in the right database?

If there really isn't a pg_class entry matching that relfilenode,
then the files are orphans and can be removed.  There are some scenarios
where orphan files can be left behind, but they generally involve
database crashes ... have you had any of those?

                       regards, tom lane

To the best of my knowledge we have not had any DB crashes as yet.  I am also replicating the DB using slony, could this be the cause of these weird orphans ? (it is neither of the sl_log_ tables)

Re: Unable to determine what has a particular OID

От
Tom Lane
Дата:
Andy Dale <andy.dale@gmail.com> writes:
>> Try:
>> SELECT oid,* from pg_database ;
>>
>> I suspect the relfilnode you are looking at is another database in the
>> cluster.

> Maybe I am possibly doing that.

There isn't any "maybe" involved here.  Follow Adrian's advice and
determine for sure exactly which database that subdirectory represents.
The subdirectory name will match the OID in the pg_database row.

It's possible for the same table OID to be in use in more than one
database, so the fact that you got a match to the other table doesn't
in itself prove you're looking at the right database.

            regards, tom lane

Re: Unable to determine what has a particular OID

От
Andy Dale
Дата:

 
To the best of my knowledge we have not had any DB crashes as yet.  I am also replicating the DB using slony, could this be the cause of these weird orphans ? (it is neither of the sl_log_ tables)

Also looking at the timestamps of the files, they all seem to have been created within a very short period of time on the 6th of April

-rw------- 1 postgres postgres      57344 2010-04-06 03:10 110653
-rw------- 1 postgres postgres          0 2010-04-06 03:08 110654
-rw------- 1 postgres postgres      40960 2010-04-06 03:08 110657
-rw------- 1 postgres postgres       8192 2010-04-06 03:08 110658
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:10 110660
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:12 110660.1
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:13 110660.2
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:15 110660.3
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:16 110660.4
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:18 110660.5
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:20 110660.6
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:21 110660.7
-rw------- 1 postgres postgres   26566656 2010-04-06 03:21 110660.8
-rw------- 1 postgres postgres     180224 2010-04-06 03:21 110661
-rw------- 1 postgres postgres       8192 2010-04-06 03:08 110662
-rw------- 1 postgres postgres       8192 2010-04-06 03:08 110663
-rw------- 1 postgres postgres      16384 2010-04-06 03:21 110664
-rw------- 1 postgres postgres  113254400 2010-04-06 03:25 110763
-rw------- 1 postgres postgres          0 2010-04-06 03:25 110765
-rw------- 1 postgres postgres 1073741824 2010-06-03 22:17 112806

Re: Unable to determine what has a particular OID

От
Andy Dale
Дата:


On 18 June 2010 17:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Dale <andy.dale@gmail.com> writes:
>> Try:
>> SELECT oid,* from pg_database ;
>>
>> I suspect the relfilnode you are looking at is another database in the
>> cluster.

> Maybe I am possibly doing that.

There isn't any "maybe" involved here.  Follow Adrian's advice and
determine for sure exactly which database that subdirectory represents.
The subdirectory name will match the OID in the pg_database row.

It's possible for the same table OID to be in use in more than one
database, so the fact that you got a match to the other table doesn't
in itself prove you're looking at the right database.

                       regards, tom lane

OK, I have just run the query, and both are in the same database. 

Re: Unable to determine what has a particular OID

От
Tom Lane
Дата:
Andy Dale <andy.dale@gmail.com> writes:
> OK, I have just run the query, and both are in the same database.

OK.  Given the narrow range of timestamps on the files, the most
likely bet here is that you're looking at the leftovers from a
"CREATE TABLE AS" or similar command that failed partway through and
for some reason didn't clean up the files it'd created.  As I mentioned,
this'd be unsurprising if there'd been a database crash or kill -9
or something similar involved.  Anyway I think you've done enough to
establish that "rm'ing" those files will be safe.  (But if you're
paranoid you might want to save them someplace first.)

            regards, tom lane

Re: Unable to determine what has a particular OID

От
Andy Dale
Дата:
On 18 June 2010 17:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Dale <andy.dale@gmail.com> writes:
> OK, I have just run the query, and both are in the same database.

OK.  Given the narrow range of timestamps on the files, the most
likely bet here is that you're looking at the leftovers from a
"CREATE TABLE AS" or similar command that failed partway through and
for some reason didn't clean up the files it'd created.  As I mentioned,
this'd be unsurprising if there'd been a database crash or kill -9
or something similar involved.  Anyway I think you've done enough to
establish that "rm'ing" those files will be safe.  (But if you're
paranoid you might want to save them someplace first.)

                       regards, tom lane

Thanks for confirming this, I will however save/backup the files before removing them (just in case anything goes wrong).

Re: Unable to determine what has a particular OID

От
Greg Smith
Дата:
Andy Dale wrote:
> I will however save/backup the files before removing them (just in
> case anything goes wrong).

If you do go through with this, I would recommend doing a complete
database dump afterwards using pg_dump/pg_dumpall.  That should make it
immediately obvious if you broke something with the change.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: Unable to determine what has a particular OID

От
Tom Lane
Дата:
Greg Smith <greg@2ndquadrant.com> writes:
> Andy Dale wrote:
>> I will however save/backup the files before removing them (just in
>> case anything goes wrong).

> If you do go through with this, I would recommend doing a complete
> database dump afterwards using pg_dump/pg_dumpall.  That should make it
> immediately obvious if you broke something with the change.

Doing pg_dump would expose the mistake if you'd removed an actual
table's files.  But I'm not sure it would expose it if you removed
an index ...

            regards, tom lane

Re: Unable to determine what has a particular OID

От
Greg Smith
Дата:
Tom Lane wrote:
> Doing pg_dump would expose the mistake if you'd removed an actual
> table's files.  But I'm not sure it would expose it if you removed
> an index ...
>

Right, but in theory if you screwed up and accidentally deleted a file
holding an index, you could recover from that in the possibly distant
future by rebuilding it, with some pain but no expected loss.  Whereas
if you deleted some data by removing a file, you really want to know
that's what you did immediately, so you can put it back before you
forget where it all was at.

Sometimes people who have fully setup replication for backup purposes
ask me if they should continue saving pg_dump output somewhere.  I think
it's reasonable to generate a dump using it periodically whether or not
you intend to save the result permanently, just as a paranoid sanity
check that you can still read everything.  I don't trust disks and
filesystems that much.

(If you're reading this and feel the need to write a pro-ZFS essay at
this point, consider yourself trolled)

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us