I am setting up a postgresql server (duh) and am using archive_mode=on.
The archive command that I am using sends data to an enterprise
backup server across the network, and I must be able to handle outages
of that server without taking down the postgresql server.
Short outages are fine because the archive_command will return a nonzero
result to postgresql and it will be retried every minute until successful.
If the backup server is out for a longer time, new WAL files will be created
by postgresql. This will eventually fill the pg_xlog filesystem and bad things
happen :-( To protect the production database functionality, once the pg_xlog
filesystem reaches some percentage full (we chose 90%), the archive_command
starts reporting success (a return of zero) even though it was not able to
archive the xlog files.
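For concreteness, here is a minimal sketch of the fallback logic described above. It is written in Python purely as an assumption; it is not the actual script in use. The name send_to_backup_server is a hypothetical placeholder for whatever command ships a WAL file to the backup server, and the helper names are made up for illustration. The only value taken from the description above is the 90% threshold.

```python
#!/usr/bin/env python3
"""Sketch of an archive_command wrapper that reports success when pg_xlog is nearly full."""
import os
import shutil
import subprocess
import sys

FULL_THRESHOLD = 0.90                   # the 90% figure chosen in the post
TRANSFER_CMD = "send_to_backup_server"  # hypothetical placeholder command


def fraction_used(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def decide_exit_code(archive_rc, used_fraction, threshold=FULL_THRESHOLD):
    """Pick the exit code to hand back to postgresql."""
    if archive_rc == 0:
        return 0   # the file really was archived
    if used_fraction >= threshold:
        return 0   # NOT archived, but report success so the segment can be removed
    return 1       # genuine failure; postgresql will retry later


def main(wal_path, wal_name):
    """wal_path and wal_name correspond to %p and %f in archive_command."""
    try:
        rc = subprocess.call([TRANSFER_CMD, wal_path, wal_name])
    except OSError:
        rc = 1     # the transfer command could not be started at all
    # %p is relative to the data directory, so its parent is pg_xlog
    xlog_dir = os.path.dirname(wal_path) or "."
    return decide_exit_code(rc, fraction_used(xlog_dir))


# Only act when called with both arguments, as archive_command would do
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

Assuming the script were saved somewhere like /path/to/archive_wrapper.py (a made-up path), it would be wired in with archive_command = 'python3 /path/to/archive_wrapper.py %p %f'. Keeping the decision in decide_exit_code, separate from the transfer itself, makes the "lie at 90%" rule easy to test without a backup server.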
I understand that this prevents me from doing a disaster recovery AND prevents
me from doing a point-in-time restore, but in our opinion it is better than letting
the database crash.
Now to the question.
Once the archive_command starts lying about its success, postgresql deletes
a number of the xlog files that it has been told have been successfully archived.
Why does it do this? Can I control it? Can I turn it off?
--
Evan Rempel, Senior Systems Administrator
University of Victoria