Обсуждение: "Resurrected" data files - problem?
We use a tape backup software that does "incremental backups" as follows: - In a full backup, all files are backed up. - In an incremental backup, only the files with modification date after the last backup are backed up. Now when such a backup is restored, you first have to restore the full backup, and then the incremental backup. The problem is that files which were deleted between the full and the incremental backup will get "resurrected" after such a restore. So if we perform our database backups with incremental backups as described above, we could end up with additional files after the restore, because PostgreSQL files can get deleted (e.g. during DROP TABLE or TRUNCATE TABLE). My question is: Could such "resurrected" files (data files, files in pg_xlog, pg_clog or elsewhere) cause a problem for the database (other than the obvious one that there may be unnecessary files about that consume disk space)? Yours, Laurenz Albe
On 08/11/2007, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
We use a tape backup software that does "incremental backups"
as follows:
- In a full backup, all files are backed up.
- In an incremental backup, only the files with modification
date after the last backup are backed up.
Now when such a backup is restored, you first have to restore
the full backup, and then the incremental backup.
The problem is that files which were deleted between the full
and the incremental backup will get "resurrected" after such a
restore.
So if we perform our database backups with incremental
backups as described above, we could end up with additional
files after the restore, because PostgreSQL files can get
deleted (e.g. during DROP TABLE or TRUNCATE TABLE).
My question is:
Could such "resurrected" files (data files, files in
pg_xlog, pg_clog or elsewhere) cause a problem for the database
(other than the obvious one that there may be unnecessary files
about that consume disk space)?
This will not work at all.
Try re-reading the instructions on backup in the manual.
oh and always, always, always test your backup works before you actually need it!
Peter Childs
"Peter Childs" <peterachilds@gmail.com> writes: > On 08/11/2007, Albe Laurenz <laurenz.albe@wien.gv.at> wrote: >> So if we perform our database backups with incremental >> backups as described above, we could end up with additional >> files after the restore, because PostgreSQL files can get >> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). >> >> Could such "resurrected" files (data files, files in >> pg_xlog, pg_clog or elsewhere) cause a problem for the database >> (other than the obvious one that there may be unnecessary files >> about that consume disk space)? > This will not work at all. To be more specific: the resurrected files aren't the problem; offhand I see no reason they'd create any issue beyond wasted disk space. The problem is version skew between files that were backed up at slightly different times, leading to inconsistency. You could make this work if you shut down Postgres whenever you are taking a backup, but as a means for backing up a live database it indeed won't work at all. regards, tom lane
Peter Childs wrote: >> We use a tape backup software that does "incremental backups" [...] >> So if we perform our database backups with incremental >> backups as described above, we could end up with additional >> files after the restore, because PostgreSQL files can get >> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). >> >> My question is: >> >> Could such "resurrected" files (data files, files in >> pg_xlog, pg_clog or elsewhere) cause a problem for the database >> (other than the obvious one that there may be unnecessary files >> about that consume disk space)? > > This will not work at all. Can you give me a good reason why? > Try re-reading the instructions on backup in the manual. I know them well. That is why I ask if this questionable procedure could lead to damage. Yours, Laurenz Albe
On Nov 8, 2007 9:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Peter Childs" <peterachilds@gmail.com> writes: > > On 08/11/2007, Albe Laurenz <laurenz.albe@wien.gv.at> wrote: > >> So if we perform our database backups with incremental > >> backups as described above, we could end up with additional > >> files after the restore, because PostgreSQL files can get > >> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). > >> > >> Could such "resurrected" files (data files, files in > >> pg_xlog, pg_clog or elsewhere) cause a problem for the database > >> (other than the obvious one that there may be unnecessary files > >> about that consume disk space)? > > > This will not work at all. > > To be more specific: the resurrected files aren't the problem; > offhand I see no reason they'd create any issue beyond wasted > disk space. The problem is version skew between files that were > backed up at slightly different times, leading to inconsistency. > > You could make this work if you shut down Postgres whenever you > are taking a backup, but as a means for backing up a live database > it indeed won't work at all. I think if you had real snapshotting file systems you could use the snapshots to create your backups. But this seems like a lot of work to avoid implementing PITR to me.
On Thursday 08 November 2007, "Albe Laurenz" <laurenz.albe@wien.gv.at> wrote: > Can you give me a good reason why? > > > Try re-reading the instructions on backup in the manual. > > I know them well. That is why I ask if this questionable procedure > could lead to damage. You cannot backup a live database with a filesystem backup and expect it to work afterwards, unless you take all the steps required to produce a PITR base backup. It's not even "questionable". It's a fundamental misunderstanding of what is required to backup a database. -- Peak Oil is now accepted as inevitable, and the debate only becomes as to when - James Schlesinger, former US Secretary of Energy
Tom Lane wrote: >>> So if we perform our database backups with incremental >>> backups as described above, we could end up with additional >>> files after the restore, because PostgreSQL files can get >>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). >>> >>> Could such "resurrected" files (data files, files in >>> pg_xlog, pg_clog or elsewhere) cause a problem for the database >>> (other than the obvious one that there may be unnecessary files >>> about that consume disk space)? >> >> This will not work at all. > > To be more specific: the resurrected files aren't the problem; > offhand I see no reason they'd create any issue beyond wasted > disk space. The problem is version skew between files that were > backed up at slightly different times, leading to inconsistency. I should have mentioned that before the (incremental) backup there would be a pg_start_backup() and a pg_stop_backup() afterwards, and we would use PITR. So there could only be three kinds of files: - Files that did not change since the full backup, restored from there. They should therefore look exactly as if the online backup were performed in the normal way. - Files that have changed or are new, restored from the incremental backup. These will also be ok, because they were backed up between pg_start_backup() and pg_stop_backup(). - Files that have been deleted between full and incremental backup and have been resurrected. This third group is the only one which might be problematic, as far as I can see, because PostgreSQL will no expect them to be there. The version skew between files backed up at slightly different times should be taken care of by PITR, shouldn't it? Yours, Laurenz Albe
On Thu, 2007-11-08 at 17:11 +0100, Albe Laurenz wrote: > Tom Lane wrote: > >>> So if we perform our database backups with incremental > >>> backups as described above, we could end up with additional > >>> files after the restore, because PostgreSQL files can get > >>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). > >>> > >>> Could such "resurrected" files (data files, files in > >>> pg_xlog, pg_clog or elsewhere) cause a problem for the database > >>> (other than the obvious one that there may be unnecessary files > >>> about that consume disk space)? > >> > >> This will not work at all. > > > > To be more specific: the resurrected files aren't the problem; > > offhand I see no reason they'd create any issue beyond wasted > > disk space. The problem is version skew between files that were > > backed up at slightly different times, leading to inconsistency. > > I should have mentioned that before the (incremental) backup > there would be a pg_start_backup() and a pg_stop_backup() > afterwards, and we would use PITR. > > So there could only be three kinds of files: > - Files that did not change since the full backup, restored > from there. They should therefore look exactly as if the > online backup were performed in the normal way. > - Files that have changed or are new, restored from the > incremental backup. These will also be ok, because > they were backed up between pg_start_backup() and > pg_stop_backup(). > - Files that have been deleted between full and incremental > backup and have been resurrected. > > This third group is the only one which might be problematic, > as far as I can see, because PostgreSQL will no expect them to > be there. > > The version skew between files backed up at slightly different > times should be taken care of by PITR, shouldn't it? The backup is not instantaneous, so there is no single time at which the hot backup takes place. So deciding whether a file has changed based upon a comparison of two file timestamps cannot work. You would need to take timestamps for the file both before the pg_start_backup() and after the pg_stop_backup() of the file for both full and incremental backups. If all four timestamps are equivalent, then you are safe. The relfilenode ids are potentially reused after a period of time, so that could cause errors if not catered for on the incremental restore. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
>>>>> So if we perform our database backups with incremental >>>>> backups as described above, we could end up with additional >>>>> files after the restore, because PostgreSQL files can get >>>>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). >>>>> >>>>> Could such "resurrected" files (data files, files in >>>>> pg_xlog, pg_clog or elsewhere) cause a problem for the database >>>>> (other than the obvious one that there may be unnecessary files >>>>> about that consume disk space)? >>>> >>>> This will not work at all. >>> >>> To be more specific: the resurrected files aren't the problem; >>> offhand I see no reason they'd create any issue beyond wasted >>> disk space. The problem is version skew between files that were >>> backed up at slightly different times, leading to inconsistency. >> >> I should have mentioned that before the (incremental) backup >> there would be a pg_start_backup() and a pg_stop_backup() >> afterwards, and we would use PITR. >> >> So there could only be three kinds of files: >> - Files that did not change since the full backup, restored >> from there. They should therefore look exactly as if the >> online backup were performed in the normal way. >> - Files that have changed or are new, restored from the >> incremental backup. These will also be ok, because >> they were backed up between pg_start_backup() and >> pg_stop_backup(). >> - Files that have been deleted between full and incremental >> backup and have been resurrected. >> >> This third group is the only one which might be problematic, >> as far as I can see, because PostgreSQL will no expect them to >> be there. >> >> The version skew between files backed up at slightly different >> times should be taken care of by PITR, shouldn't it? > > The backup is not instantaneous, so there is no single time > at which the hot backup takes place. So deciding whether > a file has changed based upon a comparison of two file timestamps > cannot work. You would need to take timestamps for the file both > before the pg_start_backup() and after the pg_stop_backup() > of the file for both full and incremental backups. > If all four timestamps are equivalent, then you are safe. I am afraid that there is still a misunderstanding. The procedure would be as follows: The backup: - pg_start_backup() - full backup of the PostgreSQL files - pg_stop_backup() Next day: - pg_start_backup() - backup of the files that have changed since the last backup - pg_stop_backup() The recovery: - restore files from the full backup - restore files from the incremental backup - create recovery.conf and start the server With a normal online backup, the backup also does not take place at a single time. Why is that no problem there, but is a problem here? > The relfilenode ids are potentially reused after a period of time, so > that could cause errors if not catered for on the incremental restore. That was my original fear. What happens if PostgreSQL tries to reuse a relfilenode and the file is already present? Will the database crash or report an error? Yours, Laurenz Albe
On Fri, 2007-11-09 at 09:24 +0100, Albe Laurenz wrote: > >>>>> So if we perform our database backups with incremental > >>>>> backups as described above, we could end up with additional > >>>>> files after the restore, because PostgreSQL files can get > >>>>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). > >>>>> > >>>>> Could such "resurrected" files (data files, files in > >>>>> pg_xlog, pg_clog or elsewhere) cause a problem for the database > >>>>> (other than the obvious one that there may be unnecessary files > >>>>> about that consume disk space)? > >>>> > >>>> This will not work at all. > >>> > >>> To be more specific: the resurrected files aren't the problem; > >>> offhand I see no reason they'd create any issue beyond wasted > >>> disk space. The problem is version skew between files that were > >>> backed up at slightly different times, leading to inconsistency. > >> > >> I should have mentioned that before the (incremental) backup > >> there would be a pg_start_backup() and a pg_stop_backup() > >> afterwards, and we would use PITR. > >> > >> So there could only be three kinds of files: > >> - Files that did not change since the full backup, restored > >> from there. They should therefore look exactly as if the > >> online backup were performed in the normal way. > >> - Files that have changed or are new, restored from the > >> incremental backup. These will also be ok, because > >> they were backed up between pg_start_backup() and > >> pg_stop_backup(). > >> - Files that have been deleted between full and incremental > >> backup and have been resurrected. > >> > >> This third group is the only one which might be problematic, > >> as far as I can see, because PostgreSQL will no expect them to > >> be there. > >> > >> The version skew between files backed up at slightly different > >> times should be taken care of by PITR, shouldn't it? > > > > The backup is not instantaneous, so there is no single time > > at which the hot backup takes place. So deciding whether > > a file has changed based upon a comparison of two file timestamps > > cannot work. You would need to take timestamps for the file both > > before the pg_start_backup() and after the pg_stop_backup() > > of the file for both full and incremental backups. > > If all four timestamps are equivalent, then you are safe. > > I am afraid that there is still a misunderstanding. The > procedure would be as follows: > > The backup: > time1 > - pg_start_backup() > - full backup of the PostgreSQL files > - pg_stop_backup() time2 > > Next day: > time3 > - pg_start_backup() > - backup of the files that have changed since the last backup > - pg_stop_backup() time4 > The recovery: > > - restore files from the full backup > - restore files from the incremental backup > - create recovery.conf and start the server Yes, I understood that. > With a normal online backup, the backup also does not take > place at a single time. ;-) > Why is that no problem there, but is > a problem here? I'm not sure where "there" and "here" are, but it seems you have misunderstood. There *is* a problem with normal online backup, as many people have now explained, though I have shown you a way to do this. Please reread. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs wrote: >>>>>>> So if we perform our database backups with incremental >>>>>>> backups as described above, we could end up with additional >>>>>>> files after the restore, because PostgreSQL files can get >>>>>>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). >>>>>>> >>>>>>> Could such "resurrected" files (data files, files in >>>>>>> pg_xlog, pg_clog or elsewhere) cause a problem for >>>>>>> the database? >>> >>> The backup is not instantaneous, so there is no single time >>> at which the hot backup takes place. So deciding whether >>> a file has changed based upon a comparison of two file timestamps >>> cannot work. You would need to take timestamps for the file both >>> before the pg_start_backup() and after the pg_stop_backup() >>> of the file for both full and incremental backups. >>> If all four timestamps are equivalent, then you are safe. I think that understanding is finally dawning here. The problem you see is that the backup software might decide that the file has not been changed, skip it and go on backing up other files, but the file can still be modified before pg_stop_backup(), correct? Sorry for being such a blockhead, I can see now why this is a bad idea. Thanks for the patience, Laurenz Albe
On Fri, 2007-11-09 at 10:28 +0100, Albe Laurenz wrote: > Simon Riggs wrote: > >>>>>>> So if we perform our database backups with incremental > >>>>>>> backups as described above, we could end up with additional > >>>>>>> files after the restore, because PostgreSQL files can get > >>>>>>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). > >>>>>>> > >>>>>>> Could such "resurrected" files (data files, files in > >>>>>>> pg_xlog, pg_clog or elsewhere) cause a problem for > >>>>>>> the database? > >>> > >>> The backup is not instantaneous, so there is no single time > >>> at which the hot backup takes place. So deciding whether > >>> a file has changed based upon a comparison of two file timestamps > >>> cannot work. You would need to take timestamps for the file both > >>> before the pg_start_backup() and after the pg_stop_backup() > >>> of the file for both full and incremental backups. > >>> If all four timestamps are equivalent, then you are safe. > > I think that understanding is finally dawning here. > > The problem you see is that the backup software might decide > that the file has not been changed, skip it and go on backing > up other files, but the file can still be modified before > pg_stop_backup(), correct? Correct. I doubt whether the backup software performs these checks, but if you have time please bug your vendor for details. Note that the check must be performed *after* the backup, to see whether the backup itself is valid, which is a little strange. That only makes sense when using frozen snapshot techniques. Any information about specific vendors interface is always welcome. Speaking to them directly hasn't been very effective, but when there is a paying customer they are much more responsive. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs <simon@2ndquadrant.com> writes: > On Fri, 2007-11-09 at 10:28 +0100, Albe Laurenz wrote: >> I think that understanding is finally dawning here. >> >> The problem you see is that the backup software might decide >> that the file has not been changed, skip it and go on backing >> up other files, but the file can still be modified before >> pg_stop_backup(), correct? > Correct. Surely that's nonsense --- otherwise a time-extended base backup could not work either. What is required of the filesystem backup process is that each 8K page of each file be restored to a state that it had at some time between pg_start_backup and pg_stop_backup. The exact time can be different for different pages. I don't see a reason to think that a base+incremental backup method can't meet that requirement. regards, tom lane
Tom Lane wrote: >>> The problem you see is that the backup software might decide >>> that the file has not been changed, skip it and go on backing >>> up other files, but the file can still be modified before >>> pg_stop_backup(), correct? > >> Correct. > > Surely that's nonsense --- otherwise a time-extended base backup > could not work either. > > What is required of the filesystem backup process is that each 8K page > of each file be restored to a state that it had at some time between > pg_start_backup and pg_stop_backup. The exact time can be > different for different pages. I don't see a reason to think that a > base+incremental backup method can't meet that requirement. Ho hum, you're right. In the scenario on top, the file that was not backed up during incremental backup will - after restore of full and incremental backup - be in a state that was valid at some time between pg_start_backup and pg_stop_backup. So that should work. What remains are the resurrected files. You said that they shouldn't be a problem, right? What happens if PostgreSQL wants to create a new data file and that file already exists? Could it lead to problems in other directories like pg_clog? Thank you all for your replies, BTW. Yours, Laurenz Albe
On Fri, 2007-11-09 at 10:59 -0500, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > On Fri, 2007-11-09 at 10:28 +0100, Albe Laurenz wrote: > >> I think that understanding is finally dawning here. > >> > >> The problem you see is that the backup software might decide > >> that the file has not been changed, skip it and go on backing > >> up other files, but the file can still be modified before > >> pg_stop_backup(), correct? > > > Correct. > > Surely that's nonsense --- otherwise a time-extended base backup > could not work either. > > What is required of the filesystem backup process is that each 8K page > of each file be restored to a state that it had at some time between > pg_start_backup and pg_stop_backup. The exact time can be different for > different pages. I don't see a reason to think that a base+incremental > backup method can't meet that requirement. Hmm, OK, I do think we can improve on what I said before. What I said was safe though, not nonsense, plus I think the difference between the two is a hair's breadth in practice. Most of the time we're taking about excluding historical data partitions, designed specifically to minimise backup windows and data maintenance. The question is which timestamps do you compare in order to arrive at the maximal set of files that don't need to be backed up twice? I'm assuming we're talking about starting recovery at the second pg_start_backup(). Any file that changes during the first backup cannot be a candidate, so timestamp before first backup must equal timestamp before second backup. So we can effectively reduce things to just 2 timestamps, not 4 as I had originally said. So we have: pg_start_backup() timestamp1 full backup pg_stop_backup() ... pg_start_backup() --- WAL chain starts here timestamp2 incremental backup pg_stop_backup() Any file for which ts1 == ts2 can be skipped during the incremental backup. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com