Обсуждение: Problem with dropping a tablespace

Поиск
Список
Период
Сортировка

Problem with dropping a tablespace

От
Oliver Siegmar
Дата:
Hello,

I'm unable to drop a recently created tablespace (the partition on which the
tablespace relies on, run out of space - postgresql seems to have a problem,
now):


template1=# drop tablespace disk1;
ERROR:  tablespace "disk1" is not empty


Huh? Okay...lets dig into...


template1=# select oid from pg_tablespace where spcname = 'disk1';
    oid
-----------
 594611987
(1 row)



template1=# select * from pg_tablespace_databases(594611987);
 pg_tablespace_databases
-------------------------
 595675173
(1 row)



Aha...so there seems to be a database associated to this tablespace - lets see
which one -

template1=# select * from pg_database where oid = 595675173;
 datname | datdba | encoding | datistemplate | datallowconn | datlastsysoid |
datvacuumxid | datfrozenxid | dattablespace | datconfig | datacl

---------+--------+----------+---------------+--------------+---------------+--------------+--------------+---------------+-----------+--------
(0 rows)



Mhh. Nothing.



How may I delete the tablespace manually?



Best regards

Oliver

Re: Problem with dropping a tablespace

От
Michael Fuhr
Дата:
On Mon, Aug 01, 2005 at 09:28:07PM +0200, Oliver Siegmar wrote:
>
> template1=# select * from pg_tablespace_databases(594611987);
>  pg_tablespace_databases
> -------------------------
>  595675173
> (1 row)
>
> Aha...so there seems to be a database associated to this tablespace - lets see
> which one -
>
> template1=# select * from pg_database where oid = 595675173;
>  datname | datdba | encoding | datistemplate | datallowconn | datlastsysoid |
> datvacuumxid | datfrozenxid | dattablespace | datconfig | datacl
>
---------+--------+----------+---------------+--------------+---------------+--------------+--------------+---------------+-----------+--------
> (0 rows)
>
> Mhh. Nothing.
>
> How may I delete the tablespace manually?

Perhaps a better question to ask is: why does pg_tablespace_databases()
think that database 595675173 has objects in the tablespace?  What
platform are you using?  If some flavor of Unix, what's the output
of the following command?

ls -alR /path/to/tablespace/directory

(Replace the path with the tablespace's directory; you'll probably
need to be the directory owner or root to run "ls" without getting
a "Permission denied" error).

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Problem with dropping a tablespace

От
Oliver Siegmar
Дата:
On Monday 01 August 2005 22:15, Michael Fuhr wrote:
> On Mon, Aug 01, 2005 at 09:28:07PM +0200, Oliver Siegmar wrote:
> > template1=# select * from pg_tablespace_databases(594611987);
> >  pg_tablespace_databases
> > -------------------------
> >  595675173
> > (1 row)
> >
> > Aha...so there seems to be a database associated to this tablespace -
> > lets see which one -
> >
> > template1=# select * from pg_database where oid = 595675173;
> >  datname | datdba | encoding | datistemplate | datallowconn |
> > datlastsysoid | datvacuumxid | datfrozenxid | dattablespace | datconfig |
> > datacl
> > ---------+--------+----------+---------------+--------------+------------
> >---+--------------+--------------+---------------+-----------+-------- (0
> > rows)
> >
> > Mhh. Nothing.
> >
> > How may I delete the tablespace manually?
>
> Perhaps a better question to ask is: why does pg_tablespace_databases()
> think that database 595675173 has objects in the tablespace?  What
> platform are you using?  If some flavor of Unix, what's the output
> of the following command?
>
> ls -alR /path/to/tablespace/directory

This directory indeed contains a subdirectory named 595675173 (the ghost's
database oid ;-))

Is it save to shutdown the postmaster, remove this directory and restart the
postmaster again? Are there any system tables that need updates after
removing the database manually?


Best

Oliver

Re: Problem with dropping a tablespace

От
Michael Fuhr
Дата:
On Tue, Aug 02, 2005 at 08:00:28AM +0200, Oliver Siegmar wrote:
> On Monday 01 August 2005 22:15, Michael Fuhr wrote:
> > On Mon, Aug 01, 2005 at 09:28:07PM +0200, Oliver Siegmar wrote:
> > > How may I delete the tablespace manually?
> >
> > Perhaps a better question to ask is: why does pg_tablespace_databases()
> > think that database 595675173 has objects in the tablespace?  What
> > platform are you using?  If some flavor of Unix, what's the output
> > of the following command?
> >
> > ls -alR /path/to/tablespace/directory
>
> This directory indeed contains a subdirectory named 595675173 (the ghost's
> database oid ;-))

Does that subdirectory contain anything?  That's part of why I asked
for the "ls -alR" output; I was also curious to see if there were
any surprises in ownership and permissions.

> Is it save to shutdown the postmaster, remove this directory and restart the
> postmaster again? Are there any system tables that need updates after
> removing the database manually?

I'll defer to the developers to answer that, at least until I've
done more research.  But the question that needs answering is: why
does that directory still exist if the database has been dropped?
Maybe one of the developers will have some ideas, especially if
we can see the directory's contents.  Please post the output of the
"ls" command shown above, i.e., a recursive long listing of all files
under the tablespace directory.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Problem with dropping a tablespace

От
Oliver Siegmar
Дата:
On Tuesday 02 August 2005 15:16, Michael Fuhr wrote:
> > This directory indeed contains a subdirectory named 595675173 (the
> > ghost's database oid ;-))
>
> Does that subdirectory contain anything?  That's part of why I asked
> for the "ls -alR" output; I was also curious to see if there were
> any surprises in ownership and permissions.

Yeah. There are a lot of files in it. Owner and group is postgres (the
postmaster user). Permissions are set to 0600 - the directory itself has
0700.

> we can see the directory's contents.  Please post the output of the
> "ls" command shown above, i.e., a recursive long listing of all files
> under the tablespace directory.

I didn't want to "spam" the list... See the attatched file...


Thanks

Oliver

Вложения

Re: Problem with dropping a tablespace

От
Michael Fuhr
Дата:
On Tue, Aug 02, 2005 at 03:25:53PM +0200, Oliver Siegmar wrote:
> On Tuesday 02 August 2005 15:16, Michael Fuhr wrote:
> > > This directory indeed contains a subdirectory named 595675173 (the
> > > ghost's database oid ;-))
> >
> > Does that subdirectory contain anything?  That's part of why I asked
> > for the "ls -alR" output; I was also curious to see if there were
> > any surprises in ownership and permissions.
>
> Yeah. There are a lot of files in it. Owner and group is postgres (the
> postmaster user). Permissions are set to 0600 - the directory itself has
> 0700.

Hmmm...based on the file names, it looks like the directory contains
only system tables.  Do you know what database this was?  Did you
explicitly drop it, or is its disappearance (or appearance) a
mystery?  Didn't you say that the tablespace's filesystem ran out
of space?  When did that happen in relation to this database's
existence (during create, during drop, etc.)?  Are there any unusual
messages in the PostgreSQL logs?

If any of the developers are following this thread, could the
tablespace's filesystem running out of space have resulted in this
phantom directory, perhaps due to a create or drop that didn't
complete?  Is there any danger in Oliver removing the directory if
pg_database doesn't know about the database oid?  Should he shut
down the postmaster before removing it, or shouldn't that matter?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Problem with dropping a tablespace

От
Oliver Siegmar
Дата:
On Tuesday 02 August 2005 17:01, Michael Fuhr wrote:
> Hmmm...based on the file names, it looks like the directory contains
> only system tables.  Do you know what database this was?

Yes, a business database - not a system one. The database were created using
template0.

> Did you
> explicitly drop it, or is its disappearance (or appearance) a
> mystery?

I dropped the database with 'DROP DATABASE xxx;' without any problems (after
the tablespace run out of space).

BTW: The 'default' tablespace (pg_default and pg_global) where all the system
stuff relies on, didn't ran out of space - only a temporarily created
tablespace on another partition.

> Didn't you say that the tablespace's filesystem ran out
> of space?  When did that happen in relation to this database's
> existence (during create, during drop, etc.)?

It happened while performing a 'pg_dump -F p business_db | pg_restore -d
disk1'.

> Are there any unusual
> messages in the PostgreSQL logs?

Well, the only unusual messages were when the space ran out - tons of entries
like this:

ERROR:  could not write block 13632 of relation 594611987/597873915/736358396:
No space left on device



Thanks

Oliver

Re: Problem with dropping a tablespace

От
Tom Lane
Дата:
Oliver Siegmar <o.siegmar@vitrado.de> writes:
> On Tuesday 02 August 2005 17:01, Michael Fuhr wrote:
>> Hmmm...based on the file names, it looks like the directory contains
>> only system tables.  Do you know what database this was?

> Yes, a business database - not a system one. The database were created using
> template0.

>> Did you explicitly drop it, or is its disappearance (or appearance) a
>> mystery?

> I dropped the database with 'DROP DATABASE xxx;' without any problems (after
> the tablespace run out of space).

How exactly do you know that OID 595675173 is the database you dropped,
and not that of some other DB?

I'm theorizing that the scenario went like this:

    CREATE DATABASE starts to create a database, for which it
    assigns the OID 595675173.

    Copying the template database goes fine.  (If we'd run out of
    space in this step, we'd have removed the partially copied
    directories before reporting failure.)

    While trying to make the pg_database entry for the new database,
    we run out of space and fail.

There isn't any provision for removing the copied data if we fail at
late stages of the CREATE process :-(.  I'm not sure it's possible to
plug this hole completely, but we could at least catch any error up
to the end of createdb() by wrapping all that code in a
PG_TRY/PG_RECOVER block.  With the recent shared dependency patch,
our exposure to an out-of-space failure at this stage has gone up by
orders of magnitude because so many more catalog rows need to be added.

            regards, tom lane

Re: Problem with dropping a tablespace

От
Tom Lane
Дата:
Michael Fuhr <mike@fuhr.org> writes:
> ... Is there any danger in Oliver removing the directory if
> pg_database doesn't know about the database oid?

No, but I'd counsel not doing so until we're certain we can't learn any
more about what happened.

> Should he shut down the postmaster before removing it, or shouldn't
> that matter?

Shouldn't be necessary.

            regards, tom lane

Re: Problem with dropping a tablespace

От
Oliver Siegmar
Дата:
On Tuesday 02 August 2005 18:42, Tom Lane wrote:
> > I dropped the database with 'DROP DATABASE xxx;' without any problems
> > (after the tablespace run out of space).
>
> How exactly do you know that OID 595675173 is the database you dropped,
> and not that of some other DB?

I don't know that for sure, but I can't remember having created and dropped an
other database within this tablespace.

> I'm theorizing that the scenario went like this:
>
>     CREATE DATABASE starts to create a database, for which it
>     assigns the OID 595675173.
>
>     Copying the template database goes fine.  (If we'd run out of
>     space in this step, we'd have removed the partially copied
>     directories before reporting failure.)
>
>     While trying to make the pg_database entry for the new database,
>     we run out of space and fail.

At which stage this pg_database entry gets created? I'm very sure, that there
was plenty of free disk space at the time of 'CREATE TABLE' statement. The
space ran out gigabytes after starting pg_restore.

I did a hexdump on the files within the tabelspace directory...no business
data at all, only postgres internals (I saw a lot of function names and
datatypes).


Let me know if I shall dig more information...otherwise I'm happy if I can
safely remove the directory.


Best

Oliver

Re: Problem with dropping a tablespace

От
Tom Lane
Дата:
Oliver Siegmar <o.siegmar@vitrado.de> writes:
> I did a hexdump on the files within the tabelspace directory...no business
> data at all, only postgres internals (I saw a lot of function names and
> datatypes).

Would you look at pg_class in particular (file 1259) and confirm that it
contains only names of Postgres system catalogs and indexes, no tables
of your own?

If so, I think the evidence is pretty strong that this database was in
process of being created.  A DB that had actually gotten used at all
would have more entries in pg_class.

            regards, tom lane

Re: Problem with dropping a tablespace

От
Oliver Siegmar
Дата:
On Tuesday 02 August 2005 19:49, Tom Lane wrote:
> Oliver Siegmar <o.siegmar@vitrado.de> writes:
> > I did a hexdump on the files within the tabelspace directory...no
> > business data at all, only postgres internals (I saw a lot of function
> > names and datatypes).
>
> Would you look at pg_class in particular (file 1259) and confirm that it
> contains only names of Postgres system catalogs and indexes, no tables
> of your own?

Everything in this file seems to be postgres related. Not a single table of
our own.

I removed the database directory and dropped the tablespace (by DROP
TABLESPACE) successfully.

Michael, Tom - thanks a bunch your help!


Best

Oliver

Re: Problem with dropping a tablespace

От
Tom Lane
Дата:
Oliver Siegmar <o.siegmar@vitrado.de> writes:
> On Tuesday 02 August 2005 19:49, Tom Lane wrote:
>> Would you look at pg_class in particular (file 1259) and confirm that it
>> contains only names of Postgres system catalogs and indexes, no tables
>> of your own?

> Everything in this file seems to be postgres related. Not a single table of
> our own.

OK.  I think that's sufficient evidence for my hypothesis (CREATE
DATABASE failed partway through).  I'm working on cleaning that code up.

            regards, tom lane