Discussion: missing history file
My warm standby server is looking for a history file when booting up. It is looking for 00000001.history file to be exact. Since my *live* server doesn’t produce such file, I create an empty 00000001.history file in the archive directory and that seems to satisfy this requirement allowing the standby server to come up in the recovery mode.
It would be nice to know though how such file is created. I know that the *live* server creates *.backup file as a result of pg_stop_backup() command and this file is accurately archived on the standby server. Should I be renaming this file, or is there some other mechanism to tell my standby server what the history file is?
Thanks in advance,
~george
On Fri, 2007-06-29 at 07:55 -0400, George Wilk wrote:
> My warm standby server is looking for a history file when booting up. It is looking for 00000001.history file to be exact.

Just ignore 00000001. Recovery will work fine even if absent. Don't ignore all history files though, just that one.

Hmmm, come to think of it, why is it requesting it at all? We should just skip that request.

> Since my *live* server doesn’t produce such file, I create an empty 00000001.history file in the archive directory and that seems to satisfy this requirement allowing the standby server to come up in the recovery mode.

Well, that should at least generate a message that says "history file empty, using targetTLI".

> It would be nice to know though how such file is created. I know that the *live* server creates *.backup file as a result of pg_stop_backup() command and this file is accurately archived on the standby server. Should I be renaming this file, or is there some other mechanism to tell my standby server what the history file is?

The timeline history file is only required when you do a recovery of a system that has itself already undergone a PITR.

Have a look at pg_standby, accessible via CVS in contrib/pg_standby.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Thanks, Simon. I will ignore the request for the history file in my recovery_command from now on.

Is the timeline history file needed when trying to put the standby server back into the recovery mode, after it assumed the primary role? (i.e. standby server goes *live*, and is subsequently restarted in the recovery mode). Is this a valid scenario at all, or should I be taking a new base backup and starting over? I am running into some problems when attempting this.

~george
On Fri, 2007-06-29 at 09:57 -0400, George Wilk wrote:
> Thanks, Simon. I will ignore the request for the history file in my recovery_command from now on.
>
> Is the timeline history file needed when trying to put the standby server back into the recovery mode, after it assumed the primary role? (i.e. standby server goes *live*, and is subsequently restarted in the recovery mode).

No, not needed.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > Just ignore 00000001. Recovery will work fine even if absent. Don't > ignore all history files though, just that one. Hmmm, come to think of > it, why is it requesting it at all? We should just skip that request. No, because then people would misdesign their recovery scripts to not be able to deal with not finding a history file. As things are, they will certainly be exposed to that case in any testing they do. If we optimize this call away, then they won't see the case until they're in very deep doo-doo. regards, tom lane
Tom,

When and by what process is the history file being created? My standby server seems to be looking for it when put back in the recovery mode, after functioning as primary for a while.

How should I handle missing history file in my script?

Cheers,
~george
>>> On Fri, Jun 29, 2007 at 9:52 AM, in message <12838.1183128750@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>> Just ignore 00000001. Recovery will work fine even if absent. Don't ignore all history files though, just that one. Hmmm, come to think of it, why is it requesting it at all? We should just skip that request.
>
> No, because then people would misdesign their recovery scripts to not be able to deal with not finding a history file. As things are, they will certainly be exposed to that case in any testing they do. If we optimize this call away, then they won't see the case until they're in very deep doo-doo.

We certainly were exposed to the case. We weren't able to turn up any documentation on it, so we added these lines to our recovery script:

    if [[ $1 == *.history ]] ; then
        exit 1
    fi

Our warm standbys have apparently been working fine since. Is there documentation of this that we missed? Are our warm standby databases useful at this point, or have we wandered into very deep doo-doo already?

Based on Simon's email, I went and modified one line of our script. I'll paste the current form below my "signature". Please let me know if we're off base.

-Kevin

    #! /bin/bash
    # Pick out county name from the back of the path.
    # The value of $PWD will be: /var/pgsql/data/county/<countyName>/data
    countyName=`dirname $PWD`
    countyName=`basename $countyName`
    while [ ! -f /var/pgsql/data/county/$countyName/wal-files/$1.gz \
            -a ! -f /var/pgsql/data/county/$countyName/DONE \
            -o -f /var/pgsql/data/county/$countyName/wal-files/rsync-in-progress ]
    do
        if [ $1 == 00000001.history ] ; then
            exit 1
        fi
        sleep 10 # /* wait for ~10 sec */
    done
    gunzip < /var/pgsql/data/county/$countyName/wal-files/$1.gz > "$2"
"George Wilk" <gwilk@ellacoya.com> writes: > When and by what process is the history file being created? My standby > server seems to be looking for it when put back in the recovery mode, after > functioning as primary for a while. > How should I handle missing history file in my script? History files are only created when you do a PITR recovery that stops short of the end of WAL (ie, you gave it an explicit stopping point criterion). So basically they never appear except by manual intervention on the primary server. A standby script should probably handle requests for them by looking to see if they're available, and returning 'em if so, but not waiting if they are not. Offhand I would recommend the same strategy for any requested filename that's not a plain WAL segment file (ie, all hex digits). regards, tom lane
>>> On Fri, Jun 29, 2007 at 11:47 AM, in message <5332.1183135679@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> History files are only created when you do a PITR recovery that stops short of the end of WAL (ie, you gave it an explicit stopping point criterion). So basically they never appear except by manual intervention on the primary server. A standby script should probably handle requests for them by looking to see if they're available, and returning 'em if so, but not waiting if they are not.
>
> Offhand I would recommend the same strategy for any requested filename that's not a plain WAL segment file (ie, all hex digits).

I suspect that it's worth waiting for something like this, too?:

000000010000000A000000CF.0000E744.backup
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Offhand I would recommend the same strategy for any requested filename >> that's not a plain WAL segment file (ie, all hex digits). > > I suspect that it's worth waiting for something like this, too?: > 000000010000000A000000CF.0000E744.backup No, I don't think so. AFAICS the slave server would only ask for one of those during its initial cold start from a base backup, and it'll be looking for the one that should have been generated at completion of that base backup. If it ain't there, it's unlikely to appear later. regards, tom lane
>>> On Fri, Jun 29, 2007 at 12:29 PM, in message <5750.1183138146@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Offhand I would recommend the same strategy for any requested filename that's not a plain WAL segment file (ie, all hex digits).
>>
>> I suspect that it's worth waiting for something like this, too?:
>> 000000010000000A000000CF.0000E744.backup
>
> No, I don't think so. AFAICS the slave server would only ask for one of those during its initial cold start from a base backup, and it'll be looking for the one that should have been generated at completion of that base backup. If it ain't there, it's unlikely to appear later.

Fair enough. It would have saved us some time if this was mentioned in the warm standby documentation. I'll try to put a doc patch together.

-Kevin
On Fri, 2007-06-29 at 10:52 -0400, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > Just ignore 00000001. Recovery will work fine even if absent. Don't ignore all history files though, just that one. Hmmm, come to think of it, why is it requesting it at all? We should just skip that request.
>
> No, because then people would misdesign their recovery scripts to not be able to deal with not finding a history file. As things are, they will certainly be exposed to that case in any testing they do. If we optimize this call away, then they won't see the case until they're in very deep doo-doo.

Main reason for suggesting this is that there is already code to optimize the call away in one place, but not in another, which seems strange either way.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com