Thread: pg_xlog and standby

pg_xlog and standby

From
"Roberto Scattini"
Date:
hello everybody:

i'm trying to reconfigure a warm-standby server. the problem is that
for some reason, one day the standby server stopped recovering the
archives. this led to a full disk on that server, so i turned off
(commented out) the archive_command on the main server.
i want to restart the procedure described in
http://www.postgresql.org/docs/8.1/interactive/backup-online.html#BACKUP-PITR-RECOVERY
but i don't know how to "safely clean" the main server's $DATA/pg_xlog/
dir.
by "safely clean" i mean: how do i know which archives i can delete
(or move somewhere) without disrupting the normal operation of the
server?

i'm using postgres 8.2.5 built from source on debian etch.

thanks in advance!


--
Roberto Scattini
 ___     _
 ))_) __ )L __
((__)(('(( ((_)

Re: pg_xlog and standby

From
Erik Jones
Date:
On Jan 23, 2008, at 9:28 AM, Roberto Scattini wrote:

> hello everybody:
>
> i'm trying to reconfigure a warm-standby server. the problem is that
> for some reason, one day the standby server stopped recovering the
> archives. this led to a full disk on that server, so i turned off
> (commented out) the archive_command on the main server.
> i want to restart the procedure described in
> http://www.postgresql.org/docs/8.1/interactive/backup-online.html#BACKUP-PITR-RECOVERY
> but i don't know how to "safely clean" the main server's $DATA/pg_xlog/
> dir.
> by "safely clean" i mean: how do i know which archives i can delete
> (or move somewhere) without disrupting the normal operation of the
> server?
>
> i'm using postgres 8.2.5 built from source on debian etch.
>
> thanks in advance!

You don't.  The main server should not be keeping archived WAL files
directly in pg_xlog/.  As it queues WAL files to be archived, it puts
them in pg_xlog/archive_status/ with file names suffixed with .ready.
Once they are archived, that suffix changes to .done, after which, at
some point (I'm not sure how long/many), they are removed.

Now, if you took your standby server offline but didn't disable your
archive_command, then you've basically been accumulating WALs with
the .ready suffix in the archive_status directory that, if you're
going to start from scratch with your standby, you can safely
delete.  Just make sure you have a couple of WAL files successfully
archived (suffix has changed to .done in the archive_status dir and
you've verified that they've reached whatever directory your standby
expects them to be in) before calling pg_start_backup() and starting
your new base backup.
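
As an illustration of the check described above (not part of the original
message), here is a minimal Python sketch that counts the .ready and .done
entries in archive_status. The function name archive_status_summary and the
directory path you pass in are assumptions for the example; the sketch only
reads the directory and changes nothing.

```python
import os
from collections import Counter


def archive_status_summary(status_dir):
    """Count .ready and .done entries in a pg_xlog/archive_status directory.

    status_dir is whatever path your installation uses (an assumption here),
    for example $PGDATA/pg_xlog/archive_status. Read-only: nothing is deleted.
    """
    counts = Counter()
    for name in os.listdir(status_dir):
        _, ext = os.path.splitext(name)
        if ext in (".ready", ".done"):
            counts[ext] += 1
    return {"ready": counts[".ready"], "done": counts[".done"]}
```

A non-zero "done" count only says the archive_command reported success; it is
still worth confirming that the files reached the directory the standby reads
from.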

IMO, the most important point to be had here is DO NOT delete WALs
that sit directly under pg_xlog/.  Mistakes with the rest can be
worked around, but you could run into serious problems with your
primary if you delete WALs directly under pg_xlog/.

Also, do you know why your standby stopped recovering?  I'd say you
should make sure you know why and how, otherwise you run the risk of
the same thing happening again.

Erik Jones

DBA | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com




Re: pg_xlog and standby

From
"Roberto Scattini"
Date:
On Jan 23, 2008 2:28 PM, Erik Jones <erik@myemma.com> wrote:
>
> You don't.  The main server should not be keeping archived WAL files
> directly in pg_xlog/.  As it queues WAL files to be archived it puts
> them in pg_xlog/archive_status/ with file names suffixed with .ready,
> once they are archived that suffix changes to .done after which, at
> some point (I'm not sure how long/many) they are removed.
>

mmmmmmmm, ok. the problem i'm having is that i have A LOT of
archive files in the pg_xlog dir, and that's because the archive_command
keeps failing (the standby server had filled its disk with archives
received but not processed), so now i don't know how i can remove
those files and start again...

> Now, if you took your standby server offline but didn't disable your
> archive_command, then you've basically been accumulating WALs with
> the .ready suffix in the archive_status directory that, if you're
> going to start from scratch with your standby, you can safely
> delete.  Just make sure you have a couple of WAL files successfully
> archived (suffix has changed to .done in the archive_status dir and
> you've verified that they've reached whatever directory your standby
> expects them to be in) before calling pg_start_backup() and starting
> your new base backup.
>
> IMO, the most important point to be had here is DO NOT delete WALs
> that sit directly under pg_xlog/.  Mistakes with the rest can be
> worked with, you could run into serious problems with your primary
> when deleting WALs directly under pg_xlog/.
>

yeah, i agree. but now i have approx. 40GB of archive files in the pg_xlog
dir on the production server.  :S

> Also, do you know why your standby stopped recovering?  I'd say you
> should make sure you know why and how, otherwise you run the risk of
> the same thing happening again.

i don't know exactly, but it was quite possibly an unfinished server
re-config.

>
> Erik Jones

thanks for your help!

--
Roberto Scattini

Re: pg_xlog and standby

From
Erik Jones
Date:
On Jan 23, 2008, at 2:18 PM, Roberto Scattini wrote:

> On Jan 23, 2008 2:28 PM, Erik Jones <erik@myemma.com> wrote:
>>
>> You don't.  The main server should not be keeping archived WAL files
>> directly in pg_xlog/.  As it queues WAL files to be archived it puts
>> them in pg_xlog/archive_status/ with file names suffixed with .ready,
>> once they are archived that suffix changes to .done after which, at
>> some point (I'm not sure how long/many) they are removed.
>>
>
> mmmmmmmm, ok. the problem i'm having is that i have A LOT of
> archive files in the pg_xlog dir, and that's because the archive_command
> keeps failing (the standby server had filled its disk with archives
> received but not processed), so now i don't know how i can remove
> those files and start again...
>
>> Now, if you took your standby server offline but didn't disable your
>> archive_command, then you've basically been accumulating WALs with
>> the .ready suffix in the archive_status directory that, if you're
>> going to start from scratch with your standby, you can safely
>> delete.  Just make sure you have a couple of WAL files successfully
>> archived (suffix has changed to .done in the archive_status dir and
>> you've verified that they've reached whatever directory your standby
>> expects them to be in) before calling pg_start_backup() and starting
>> your new base backup.
>>
>> IMO, the most important point to be had here is DO NOT delete WALs
>> that sit directly under pg_xlog/.  Mistakes with the rest can be
>> worked with, you could run into serious problems with your primary
>> when deleting WALs directly under pg_xlog/.
>>
>
> yeah, i agree. but now i have approx. 40GB of archive files in the pg_xlog
> dir on the production server.  :S

Watch your directory terminology.  The WALs that have backed up
should be in $PGDATA/pg_xlog/archive_status/, not $PGDATA/pg_xlog/.
Since you are going to start from scratch with your standby, you're
free to delete all of the WAL files in
$PGDATA/pg_xlog/archive_status/, but leave any files directly under
$PGDATA/pg_xlog alone.


Erik Jones





Re: pg_xlog and standby

From
Simon Riggs
Date:
On Wed, 2008-01-23 at 18:18 -0200, Roberto Scattini wrote:
> the standby server had filled its disk with archives
> received but not processed

Sounds like your standby has fallen badly behind. You should always
monitor the lag between primary and standby.

You will need to take steps to ensure the lag is reduced, or you will
continue to have problems with this technique. All asynchronous
replication systems have a potential for falling behind the master.
Fully synchronous replication techniques don't: they force the master to
slow down to a manageable pace.
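
One rough way to watch that lag (an illustration, not part of the original
message) is to count the WAL files still sitting in the directory the standby
restores from and to look at the age of the oldest one. The sketch below
assumes the standby's restore process removes files once they are replayed,
so whatever remains approximates the backlog; the function name
standby_backlog and the directory path are hypothetical.

```python
import os
import re
import time

# WAL segment file names are 24 uppercase hex digits.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def standby_backlog(archive_dir, now=None):
    """Return how many WAL files wait in archive_dir and the oldest one's age.

    Assumption: replayed files are removed from archive_dir, so the files
    left behind are the ones the standby has not processed yet.
    """
    now = time.time() if now is None else now
    mtimes = [
        os.path.getmtime(os.path.join(archive_dir, name))
        for name in os.listdir(archive_dir)
        if WAL_NAME.match(name)
    ]
    if not mtimes:
        return {"segments": 0, "oldest_age_seconds": 0.0}
    return {
        "segments": len(mtimes),
        "oldest_age_seconds": max(0.0, now - min(mtimes)),
    }
```

Run from cron or a monitoring agent, a steadily growing segment count or
oldest-file age would be the signal that the standby is falling behind.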

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: pg_xlog and standby

From
Greg Smith
Date:
On Wed, 23 Jan 2008, Roberto Scattini wrote:

> the problem i'm having is that i have A LOT of
> archive files in the pg_xlog dir, and that's because the archive_command
> keeps failing (the standby server had filled its disk with archives
> received but not processed), so now i don't know how i can remove
> those files and start again...

Under normal operation the checkpoint process will look at the number of
already created WAL files, keep around up to (2*checkpoint_segments+1)
of them for future use, and delete the rest of them.  You never delete
them yourself; the server will take care of that automatically once it
gets to where it makes that decision.  If you set checkpoint_segments to
some very high number they can end up taking many GB worth of storage, so
increasing that parameter has at least two costs associated with it (disk
space, and a longer recovery time).
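
To put numbers on that formula (an illustration, not part of the original
message), the sketch below computes the steady-state size of pg_xlog for a
given checkpoint_segments and converts a disk footprint back into a segment
count. It assumes the default 16 MB WAL segment size, which is a build-time
setting; the function names are hypothetical.

```python
# Assumption: default WAL segment size of 16 MB (a build-time setting).
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def max_kept_segments(checkpoint_segments):
    """Upper bound on segments kept for reuse: 2 * checkpoint_segments + 1."""
    return 2 * checkpoint_segments + 1


def steady_state_bytes(checkpoint_segments, segment_bytes=WAL_SEGMENT_BYTES):
    """Disk space those kept segments occupy."""
    return max_kept_segments(checkpoint_segments) * segment_bytes


def segments_in(total_bytes, segment_bytes=WAL_SEGMENT_BYTES):
    """How many whole segments fit in total_bytes."""
    return total_bytes // segment_bytes


if __name__ == "__main__":
    mb = 1024 * 1024
    gb = 1024 * mb
    print(steady_state_bytes(3) // mb, "MB kept with checkpoint_segments = 3")
    print(segments_in(40 * gb), "segments in roughly 40 GB of pg_xlog")
```

With checkpoint_segments = 3 this works out to 7 segments, or 112 MB, while
40 GB corresponds to 2560 segments, far more than the formula would keep
unless archiving is holding them back.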

Managing old archive logs on the backup server is your problem, and related
tools like pg_standby help deal with that.  Managing them on the primary
server is that server's problem and you shouldn't touch them.  You can
execute a manual CHECKPOINT at the psql prompt if you want to force this
reclamation to happen.  There has to have been some activity since the
last checkpoint for this to work, which doesn't sound like a problem on
your server.
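
To see whether a checkpoint actually reclaimed anything (an illustration, not
part of the original message), one could count the segment files directly
under pg_xlog before and after running CHECKPOINT. The sketch below is
read-only; the function name count_wal_segments and the path are assumptions.

```python
import os
import re

# WAL segment file names are 24 uppercase hex digits; this skips the
# archive_status subdirectory and files such as *.backup or *.history.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def count_wal_segments(pg_xlog_dir):
    """Return (segment_count, total_bytes) for files directly in pg_xlog_dir.

    Read-only: files are only listed and sized, never modified or removed.
    """
    count = 0
    total = 0
    for name in os.listdir(pg_xlog_dir):
        path = os.path.join(pg_xlog_dir, name)
        if WAL_NAME.match(name) and os.path.isfile(path):
            count += 1
            total += os.path.getsize(path)
    return count, total
```

Calling it once before and once after the CHECKPOINT and comparing the two
results shows how many segments the server removed on its own.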

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD