Обсуждение: pg_basebackup issues

Поиск
Список
Период
Сортировка

pg_basebackup issues

От
Lonni J Friedman
Дата:
Greetings,
I'm running postgresql-9.1.3 on a Linux-x86_64 (Fedora16, if it
matters) system.  I noticed the existence of pg_basebackup starting in
9.1, and figured I'd try it out and see if it would simplify our
backup & management processes.  I setup a test system (same OS &
postgresql version as production) with a fairly recent snapshot of our
production database, invoked it, and saw the following output:
######
# pg_basebackup -P -v -D backups -Ft -z -U postgres
135717206/135717230 kB (100%), 1/1 tablespace
pg_basebackup: could not get WAL end position from server
######

I wasn't sure what that error meant, so after googling a bit, turns
out that it really means that there were one or more files not owned
by the postgres user (see
http://serverfault.com/questions/312205/pg-basebackup-could-not-get-wal-end-position-from-server
).  Sure enough, the file that wasn't owned by the postgres user was
the backup tarball that pg_basebackup was creating, since I had been
running it as root.  That error is rather cryptic, and it would be
helpful if it was improved to suggest the real cause of the failure.
Anyway, lesson learned, I need to either invoke pg_basebackup as the
same user that runs the database (or is specified with the -U
parameter ?), or write the backup somewhere outside of the directory
structure that is being backed up.

I eventually also found the following entries in the postgresql server log:
FATAL:  could not open directory "./backups": Permission denied
FATAL:  archive member "backups/base.tar.gz" too large for tar format

What concerns me is the 2nd fatal error.  The tarball that
pg_basebackup created before erroring out is about 12GB:
12393094165  base.tar.gz

I wasn't aware of any 12GB file size limit for tar, so this is a bit
of a mystery to me.  Regardless, I'd be happy to try some other
archiving strategy, but the man page for pg_basebackup suggests that
there are only two formats, tar and basically just copying the
filesystem.  If I copied the filesystem, I'd still have to find some
way to archive them for easy management (copying elsewhere, etc).  Has
anyone come up with a good strategy on how to deal with it?

thanks

Re: pg_basebackup issues

От
Magnus Hagander
Дата:
On Fri, Apr 20, 2012 at 19:51, Lonni J Friedman <netllama@gmail.com> wrote:
> Greetings,
> I'm running postgresql-9.1.3 on a Linux-x86_64 (Fedora16, if it
> matters) system.  I noticed the existence of pg_basebackup starting in
> 9.1, and figured I'd try it out and see if it would simplify our
> backup & management processes.  I setup a test system (same OS &
> postgresql version as production) with a fairly recent snapshot of our
> production database, invoked it, and saw the following output:
> ######
> # pg_basebackup -P -v -D backups -Ft -z -U postgres
> 135717206/135717230 kB (100%), 1/1 tablespace
> pg_basebackup: could not get WAL end position from server
> ######
>
> I wasn't sure what that error meant, so after googling a bit, turns
> out that it really means that there were one or more files not owned
> by the postgres user (see
> http://serverfault.com/questions/312205/pg-basebackup-could-not-get-wal-end-position-from-server
> ).  Sure enough, the file that wasn't owned by the postgres user was
> the backup tarball that pg_basebackup was creating, since I had been
> running it as root.  That error is rather cryptic, and it would be
> helpful if it was improved to suggest the real cause of the failure.

Yeah, the error message comes from the fact that the backend gives up,
and the real message is in the backend log. We should try to do
something about that.


> Anyway, lesson learned, I need to either invoke pg_basebackup as the
> same user that runs the database (or is specified with the -U
> parameter ?), or write the backup somewhere outside of the directory
> structure that is being backed up.
>
> I eventually also found the following entries in the postgresql server log:
> FATAL:  could not open directory "./backups": Permission denied
> FATAL:  archive member "backups/base.tar.gz" too large for tar format
>
> What concerns me is the 2nd fatal error.  The tarball that
> pg_basebackup created before erroring out is about 12GB:
> 12393094165  base.tar.gz

Are you actually storing your backup files *inside* the data
directory? You really shouldn't do that, you're creating a cyclic
dependency where each new backup will include the old one inside it...
You should store the resulting backup file somewhere outside the data
directory.

> I wasn't aware of any 12GB file size limit for tar, so this is a bit
> of a mystery to me.  Regardless, I'd be happy to try some other
> archiving strategy, but the man page for pg_basebackup suggests that
> there are only two formats, tar and basically just copying the
> filesystem.  If I copied the filesystem, I'd still have to find some
> way to archive them for easy management (copying elsewhere, etc).  Has
> anyone come up with a good strategy on how to deal with it?

The max file size of a single flie inside a standard tar file is 8Gb,
see e.g. http://en.wikipedia.org/wiki/Tar_(file_format).

I think there are extensions that let you store bigger files, but
since PostgreSQL will never create files that  big it's not
implemented in the basebackup system. Because again, the root of your
problem seems to be that you are trying to store the resulting backup
inside the data directory.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: pg_basebackup issues

От
Lonni J Friedman
Дата:
On Fri, Apr 20, 2012 at 12:31 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Fri, Apr 20, 2012 at 19:51, Lonni J Friedman <netllama@gmail.com> wrote:
>> Anyway, lesson learned, I need to either invoke pg_basebackup as the
>> same user that runs the database (or is specified with the -U
>> parameter ?), or write the backup somewhere outside of the directory
>> structure that is being backed up.
>>
>> I eventually also found the following entries in the postgresql server log:
>> FATAL:  could not open directory "./backups": Permission denied
>> FATAL:  archive member "backups/base.tar.gz" too large for tar format
>>
>> What concerns me is the 2nd fatal error.  The tarball that
>> pg_basebackup created before erroring out is about 12GB:
>> 12393094165  base.tar.gz
>
> Are you actually storing your backup files *inside* the data
> directory? You really shouldn't do that, you're creating a cyclic
> dependency where each new backup will include the old one inside it...
> You should store the resulting backup file somewhere outside the data
> directory.

yea, in hindsight that was silly.  i just saw a convenient 'backups'
subdirectory and figured that'd be a nice logical place.

>
>> I wasn't aware of any 12GB file size limit for tar, so this is a bit
>> of a mystery to me.  Regardless, I'd be happy to try some other
>> archiving strategy, but the man page for pg_basebackup suggests that
>> there are only two formats, tar and basically just copying the
>> filesystem.  If I copied the filesystem, I'd still have to find some
>> way to archive them for easy management (copying elsewhere, etc).  Has
>> anyone come up with a good strategy on how to deal with it?
>
> The max file size of a single flie inside a standard tar file is 8Gb,
> see e.g. http://en.wikipedia.org/wiki/Tar_(file_format).
>
> I think there are extensions that let you store bigger files, but
> since PostgreSQL will never create files that  big it's not
> implemented in the basebackup system. Because again, the root of your
> problem seems to be that you are trying to store the resulting backup
> inside the data directory.

You're right, that was indeed the issue.  I've redone the process
using a location external to $PGDATA, and it completed successfully:
$ pg_basebackup -P -v -D /tmp/backup -x -Ft -z -U postgres
xlog start point: C6/64000020
135733616/135733616 kB (100%), 1/1 tablespace
xlog end point: C6/640000A0
pg_basebackup: base backup completed

So after running through this, I tried to use (restore) the backup
that was generated.  While everything appears to be working ok from a
functional perspective, in the server log I saw the following:
######
LOG:  creating missing WAL directory "pg_xlog/archive_status"
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at C6/66000078
LOG:  could not open file "pg_xlog/00000001000000C600000067" (log file
198, segment 103): No such file or directory
LOG:  redo done at C6/660000A0
FATAL:  the database system is starting up
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
#####

Just to be clear, here's what I did after pg_basebackup had completed
successfully:
0) shutdown postgresql gracefully, and verified that it was fully shutdown
1) moved $PGDATA to $PGDATA.old
2) created $PGDATA as postgres user
3) extracted the basebackup tarball as postgres user
cd $PGDATA && tar xzvpf /tmp/backup/base.tar.gz
4) started postgresql up

I would have expected that I wouldn't have gotten the 'not properly
shutdown' warning, or the 'could not open file' warning by following
this process.  Am I doing something wrong?

thanks