Discussion: a few questions on backup

a few questions on backup

From: Marco Colombo
Date: 14.05.2007 16:54
Hello,
I have a few questions on backing up a PostgreSQL server (let's say
any 8.x.x). I've read "Continuous Archiving and Point-In-Time
Recovery (PITR)" in the manual, but I'm still missing something... well,
actually I think I'm not, but I've been debating this with a friend
for a while, and there's something we don't seem to agree on, so I need
a bit of clarification. :)

OK, let's say what I really need is "poor man's" PITR. That is, I
perform only one daily full backup, and I need to be able to restore the
database contents as they were at backup time. I don't even need to know
the exact time (yes, that means I don't really care which
transactions turn out to be committed and which don't). Mmm, OK, maybe that
can't rightfully be called PITR at all.

Am I right in assuming that the following procedure is ok?

1) issue pg_start_backup();
2) copy (or tar or cpio) the data dir, w/o pg_xlog/
3) issue pg_stop_backup();
4) copy (or tar or cpio) pg_xlog/ contents.

That's all. Please note that I'm doing NO WAL archiving.
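
In shell terms, what I have in mind is roughly this (paths, the backup
label and the tar options are just an example, GNU tar assumed, and of
course this is exactly the part I'm not sure is correct):

psql -U postgres -c "SELECT pg_start_backup('daily');"
tar -cf /backup/data.tar -C /var/lib/pgsql --exclude=pg_xlog data
psql -U postgres -c "SELECT pg_stop_backup();"
tar -cf /backup/pg_xlog.tar -C /var/lib/pgsql/data pg_xlog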

Whether this is going to work or not depends on what exactly
pg_start_backup() does. According to my friend, it may simply
prevent PostgreSQL from writing to the data files until pg_stop_backup() is
issued, thus in effect enforcing a snapshot of the data files themselves.
Changes go to the WAL only, and they are "replayed" onto the data files only
after pg_stop_backup(). Hmm, I don't really like that. But if so, I wouldn't
even need step 4), since the tar backup would be consistent: I could restore
from that alone, and the data would be as it was at the time of pg_start_backup().

Or (this is my take), pg_start_backup() only ensures that full_page_writes is
enabled during the backup phase, so that any change to the data files can be
undone/redone based on the contents of the WAL segments. The tar archive
contains potentially inconsistent data, which is why I need the WAL segments
too. At restore time, PG performs a crash recovery, and the data
is as it was at the time of step 4) (which is really an interval, some time
between the start and the end of the copy, but I don't need to be able
to tell the exact time anyway).

BTW, I see a vulnerability in the above procedure... if there's enough
write activity that PostgreSQL recycles WAL segments between 1) and
4), some changes may be lost. At step 4) I need to save _all_ the WAL
segments that were produced during the backup. I'm assuming that
activity at backup time is low enough that this is "extremely
unlikely". Another doubt I have is about WAL checkpointing... is it
possible that a checkpoint happens during step 2), so that the tar archive
gets data files both from before and from after the checkpoint (which I
think is bad), or does pg_start_backup() prevent checkpoints, too?

Finally, if I'm missing something and the above is wrong, I think that
the only way to perform a full backup of a live database at the filesystem
level is to enable WAL archiving as the first step of the backup
procedure (assuming it's not normally on, of course), and later save all
the WAL segments that were _archived_ during that time, including the
one written at pg_stop_backup(), on the same backup medium.

If I understand the documentation right, nothing breaks if the
archive_command just returns OK w/o saving anything during regular
operation, and starts saving segments only during the backup window... I
mean, that prevents me from doing arbitrary PITR after the backup, but
all I want is a full backup, without any ability to do partial recovery
after it. If so, I could write a script that does nothing most of the
time, and archives WAL segments only so they can be included in the
full backup, as sketched below.
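
Something like this is what I'm picturing for the archive_command script
(a sketch only: the script name, the flag file and the archive directory
are made up):

#!/bin/sh
# Hypothetical /usr/local/bin/pg_wal_save.sh, configured as
#   archive_command = '/usr/local/bin/pg_wal_save.sh %p %f'
# $1 is the path of the segment to archive (%p), $2 is its file name (%f).
# Save segments only while a "backup in progress" flag file exists;
# the rest of the time, claim success so the segment can be recycled.
if [ -f /var/lib/pgsql/backup_in_progress ]; then
    cp "$1" "/u1/pg_wal_archive/$2" || exit 1
fi
exit 0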

TIA, (and sorry if the above sounds a bit messy)
.TM.

Re: a few questions on backup

From: Tom Lane
Date:
Marco Colombo <pgsql@esiway.net> writes:
> Am I right in assuming that the following procedure is ok?

> 1) issue pg_start_backup();
> 2) copy (or tar or cpio) the data dir, w/o pg_xlog/
> 3) issue pg_stop_backup();
> 4) copy (or tar or cpio) pg_xlog/ contents.

No.  You have to have an actual archive_command script copying the WAL
segments somewhere else when told to.  An asynchronous copy of the xlog
directory will be nothing but garbage, because we recycle WAL segments
as fast as we can (ie, as soon as the archive_command claims to have
saved the data).

1) 2) and 3) are OK, but you need to use archive_command to collect the
xlog segments.

Actually ... given your low requirements, I wonder why you don't just
stop the postmaster, tar the datadir, start the postmaster.
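
A minimal sketch of that approach (run as the postgres user; the data
directory and backup path are only examples, and your platform's service
scripts may do the start/stop instead):

pg_ctl stop -D /var/lib/pgsql/data -m fast
tar -czf /backup/pgdata-$(date +%Y%m%d).tar.gz -C /var/lib/pgsql data
pg_ctl start -D /var/lib/pgsql/data -l /var/lib/pgsql/pgstartup.log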

            regards, tom lane

Re: a few questions on backup

From: Hannes Dorbath
Date:
On 14.05.2007 16:54, Marco Colombo wrote:
> [...]
> Am I right in assuming that the following procedure is ok?
>
> 1) issue pg_start_backup();
> 2) copy (or tar or cpio) the data dir, w/o pg_xlog/
> 3) issue pg_stop_backup();
> 4) copy (or tar or cpio) pg_xlog/ contents.
>
> That's all. Please note that I'm doing NO WAL archiving.
> [...]

Are file system snapshots not the best approach for your requirements?

lvcreate -s -L5G -nbackup /dev/foo/postgresql
mount /dev/foo/backup /mnt/backup-snap
tar jcpvf pg-backup-<time_stamp>.bz2 /mnt/backup-snap

You can't do much wrong with that; it's fast and easy to use.
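
Once the tar has finished, you would normally also drop the snapshot
again, roughly like this (device and mount point as above):

umount /mnt/backup-snap
lvremove -f /dev/foo/backup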

Just my 2 cents...


--
Regards,
Hannes Dorbath

Re: a few questions on backup

From: Marco Colombo
Date:
Tom Lane wrote:
> No.  You have to have an actual archive_command script copying the WAL
> segments somewhere else when told to.  An asynchronous copy of the xlog
> directory will be nothing but garbage, because we recycle WAL segments
> as fast as we can (ie, as soon as the archive_command claims to have
> saved the data).

Mmm, sorry, I'm not sure I'm following here. Maybe I should provide some
background. In my pg_xlog directory I see five files, WAL segments I
suppose. Only one (as I expected) is currently being used; the others
are old (one a couple of days old).

When PG switches from one segment to another (I assume it
recycles the oldest available), does it archive the recycled one (before
starting to use it, of course) or the just-filled one? If it's the one
being recycled, it means that in my setup it would take two days to
archive a segment after it stopped being used. Am I missing something?

> 1) 2) and 3) are OK, but you need to use archive_command to collect the
> xlog segments.
>
> Actually ... given your low requirements, I wonder why you don't just
> stop the postmaster, tar the datadir, start the postmaster.

Well, currently we do a pg_dump. The database mainly supports dynamic
websites. It's very unlikely they get updated at the time the backup
runs, and overall there is little updating even during the day, but I
don't like stopping the postmaster because, even if the write load is
negligible, the read load might not be. It's still small enough that a
tar (to disk) might take only a minute or two to complete, but it's
still a minute of downtime for the web sites. If I can avoid that, why not?

I'm not unsatisfied with pg_dump, and I agree that with my requirements
the whole issue is academic. I just wanted to learn how it works
exactly; such knowledge could prove useful for doing the Right Thing
in case of trouble. Maybe it's the right time for me to have a look at
the source...


Hannes Dorbath wrote:
> lvcreate -s -L5G -nbackup /dev/foo/postgresql
> mount /dev/foo/backup /mnt/backup-snap
> tar jcpvf pg-backup-<time_stamp>.bz2 /mnt/backup-snap
>
> You can't do much wrong with that, it's fast and easy to use.

Been there, done that. In my environment (Fedora Core 6) it's fast and
easy, but not reliable, unfortunately. Sometimes the snapshot won't get
created, sometimes it won't get removed after the backup is done.

.TM.

Re: a few questions on backup

From: Richard Huxton
Date:
Marco Colombo wrote:
> Mmm, sorry, I'm not sure I'm following here. Maybe I should provide some
> background. In my pg_xlog directory I see five files, WAL segments I
> suppose. Only one (as I expected) is currently being used; the others
> are old (one a couple of days old).
>
> When PG switches from one segment to another (I assume it
> recycles the oldest available), does it archive the recycled one (before
> starting to use it, of course) or the just-filled one? If it's the one
> being recycled, it means that in my setup it would take two days to
> archive a segment after it stopped being used. Am I missing something?

It calls archive_command on the just-filled one.

>> 1) 2) and 3) are OK, but you need to use archive_command to collect the
>> xlog segments.
>>
>> Actually ... given your low requirements, I wonder why you don't just
>> stop the postmaster, tar the datadir, start the postmaster.
>
> Well, currently we do a pg_dump. The database mainly supports dynamic
> websites. It's very unlikely they get updated at the time the backup
> runs,

You do know that pg_dump gives you a guaranteed accurate snapshot of the
database? It doesn't matter if it's in use.
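
For example (database name and destination are made up), something as
simple as this gives a consistent dump even while clients are connected:

pg_dump -U postgres mydb | gzip > /backup/mydb-$(date +%Y%m%d).sql.gz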

--
   Richard Huxton
   Archonet Ltd

Re: a few questions on backup

From: "Prashant Ranjalkar"
Date:
Hi,
 
The procedure you described is for online (hot) backups. Such backups are useless unless you set archive_command in your postgresql.conf file. This command copies each filled transaction log (WAL segment) to the directory you specify in archive_command. PG won't recycle a filled transaction log until it has been copied successfully. These copied transaction logs are used during the recovery process to roll forward.
 
If you don't set archive_command, then you have to go for cold backups, where the database is stopped and the data directory is backed up.
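
For example, a minimal setup in postgresql.conf could look like this (the
archive directory is only an example; %p is the path of the segment to
archive and %f is its file name):

archive_command = 'test ! -f /u1/pg_wal_archive/%f && cp %p /u1/pg_wal_archive/%f'
# PostgreSQL runs this for each filled segment; a non-zero exit status
# makes it keep the segment and retry later.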
 
Hope this helps.
 
Regards,
Prashant Ranjalkar
EnterpriseDB Corporation

 
On 5/14/07, Marco Colombo <pgsql@esiway.net> wrote:
[...]
Am I right in assuming that the following procedure is ok?

1) issue pg_start_backup();
2) copy (or tar or cpio) the data dir, w/o pg_xlog/
3) issue pg_stop_backup();
4) copy (or tar or cpio) pg_xlog/ contents.

That's all. Please note that I'm doing NO WAL archiving.
[...]

Re: a few questions on backup

From: Marco Colombo
Date:
Richard Huxton wrote:
> It calls archive_command on the just-filled one.

Good to know, thanks. I think I'll experiment a bit with
archive_command. My point was that since I know (or rather, assume) that
old segments are going to stay in my pg_xlog for *days* before getting
recycled, just copying them all after the call to pg_stop_backup() should
be enough in my case. It's more than I need, even.

> You do know that pg_dump gives you a guaranteed accurate snapshot of the
> database? It doesn't matter if it's in use.

Yes, I know, that's why I'm using it. A dump is also useful in that you
can more easily restore it on a different system. A text dump allows you
to make changes before restoring, even with sed. I'm comfortable with it.

The reasons for playing with WAL segments for backups are:
1) learning how it works;
2) everything else is backed up with cpio;
3) at restore time I need to have PG running already, and there may be
something different at database initialization.

Reason 1) is the driving one, by far. I can handle the rest (i.e.
the compressed dump is saved as part of the cpio archive).

.TM.

Re: a few questions on backup

From: Tom Lane
Date:
Marco Colombo <pgsql@esiway.net> writes:
> Good to know, thanks. I think I'll experiment a bit with
> archive_command. My point was that since I know (or rather, assume) that
> old segments are going to stay in my pg_xlog for *days* before getting
> recycled,

On what do you base that assumption?  Once the system thinks they're not
needed anymore, they'll be recycled immediately.

            regards, tom lane

Re: a few questions on backup

From: Marco Colombo
Date:
Tom Lane wrote:
> Marco Colombo <pgsql@esiway.net> writes:
>> Good to know, thanks. I think I'll experiment a bit with
>> archive_command. My point was that since I know (or rather, assume) that
>> old segments are going to stay in my pg_xlog for *days* before getting
>> recycled,
>
> On what do you base that assumption?  Once the system thinks they're not
> needed anymore, they'll be recycled immediately.
>
>             regards, tom lane

Well, now that you make me think about it, I am making some assumptions. One
is that only one file in pg_xlog is the active segment. The other is that I
can trust modification times (so that a file inside pg_xlog that looks
old really is old... and since postgresql does not run as root, it
couldn't cheat on that even if it tried to).

The best thing I can do is to configure archiving and see exactly what gets
archived. I'm making assumptions there too. I expect to find, for each
file in pg_xlog, a copy in the archive directory (say archiving
is done with cp), with one exception: the segment currently being
written to. For that one, there will be a file with the same name but
different contents (and an older modification time).

I'll try that out. Maybe my ideas are so far from the truth that I'm
having a hard time explaining them to people who actually know how
things work. I'll be back with results. Meanwhile, thanks for your time.

.TM.

Re: a few questions on backup

From: Marco Colombo
Date:
Marco Colombo wrote:
> I'll try that out. Maybe my ideas are so far from the truth that I'm
> having a hard time explaining them to people who actually know how
> things work. I'll be back with results. Meanwhile, thanks for your time.

I think I finally got it.

Segment 34 in my pg_xlog got archived and recycled. It became segment
39, but PG is still working on segment 35, after some hours. Now pg_xlog
contains 5 segments, named 35 to 39, with 35 being the most
recently modified. 39 won't be used for another couple of days or so.

Now I see what you mean by "recycled": I thought it meant "marked free
for later use", but it means "renamed for future use". My mistake was
assuming that the rename happens lazily when PG starts using the
file. Instead, it happens right after archiving (if any).

That makes the strategy in my original post pretty much unfeasible. Still, I
was not completely wrong:

# cmp /var/lib/pgsql/data/pg_xlog/000000010000000100000039 \
      /u1/pg_wal_archive/000000010000000100000034 && echo "Yes"
Yes

They do contain the same data, that of segment 34, and the *39 file will
stay there, untouched, for quite a while after the backup. So the WAL
segment I need *is* there, just with a different name.

The only problem is figuring out which segment that data actually belongs
to. I only know because I can compare it with the archived one. Now, I
could still make some educated guesses by looking at modification
times, but a guessing game is definitely not something you want to play
when restoring your precious data. :) Archiving the WAL segments and
letting the recovery procedure handle them at restore time is easier anyway.

Again, thanks a lot.

.TM.
--
      ____/  ____/   /
     /      /       /            Marco Colombo
    ___/  ___  /   /              Technical Manager
   /          /   /             ESI s.r.l.
 _____/ _____/  _/               Colombo@ESI.it