Обсуждение: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
От
l@lrowe.co.uk
Дата:
The following bug has been logged on the website: Bug reference: 13685 Logged by: Laurence Rwoe Email address: l@lrowe.co.uk PostgreSQL version: 9.4.5 Operating system: Mac OS X 10.10 Description: I'm seeing Postgres 9.4.5 archive while idle every archive_timeout when I set ``wal_level hot_standby``: $ initdb testpg $ cat << 'EOF' >> testpg/postgresql.conf > wal_level = hot_standby > archive_mode = on > archive_timeout = 10 > checkpoint_timeout = 1h > archive_command = 'echo $(date) archive %p' > log_checkpoints = on > EOF $ postgres -D testpg LOG: database system was shut down at 2015-10-13 11:58:45 PDT LOG: MultiXact member wraparound protections are now enabled LOG: autovacuum launcher started LOG: database system is ready to accept connections Tue Oct 13 12:00:47 PDT 2015 archive pg_xlog/000000010000000000000001 Tue Oct 13 12:00:57 PDT 2015 archive pg_xlog/000000010000000000000002 LOG: checkpoint starting: xlog Tue Oct 13 12:01:07 PDT 2015 archive pg_xlog/000000010000000000000003 LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.026 s; sync files=0, longest=0.000 s, average=0.000 s Tue Oct 13 12:01:17 PDT 2015 archive pg_xlog/000000010000000000000004 Tue Oct 13 12:01:27 PDT 2015 archive pg_xlog/000000010000000000000005 LOG: checkpoint starting: xlog Tue Oct 13 12:01:38 PDT 2015 archive pg_xlog/000000010000000000000006 LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=0.000 s, sync=0.000 s, total=0.027 s; sync files=0, longest=0.000 s, average=0.000 s Tue Oct 13 12:01:48 PDT 2015 archive pg_xlog/000000010000000000000007 Tue Oct 13 12:01:58 PDT 2015 archive pg_xlog/000000010000000000000008 LOG: checkpoint starting: xlog LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=0.000 s, sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s At ``wal_level archive`` I only see archiving every checkpoint_timeout (that it archives every checkpoint_timeout is a known limitation, see http://www.postgresql.org/message-id/1407389876762-5813999.post@n5.nabble.com): $ initdb testpg $ cat << 'EOF' >> testpg/postgresql.conf > wal_level = archive > archive_mode = on > archive_timeout = 10 > checkpoint_timeout = 60 > archive_command = 'echo $(date) archive %p' > log_checkpoints = on > EOF # lrowe@Laurences-iMac ~/scratch $ postgres -D testpg LOG: database system was shut down at 2015-10-13 12:25:38 PDT LOG: MultiXact member wraparound protections are now enabled LOG: autovacuum launcher started LOG: database system is ready to accept connections Tue Oct 13 12:25:49 PDT 2015 archive pg_xlog/000000010000000000000001 LOG: checkpoint starting: time LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.029 s; sync files=0, longest=0.000 s, average=0.000 s Tue Oct 13 12:26:39 PDT 2015 archive pg_xlog/000000010000000000000002 LOG: checkpoint starting: time LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.000 s, sync=0.000 s, total=0.030 s; sync files=0, longest=0.000 s, average=0.000 s Tue Oct 13 12:27:39 PDT 2015 archive pg_xlog/000000010000000000000003 I think this additional archiving at wal_level hot_standby is a bug.
Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
От
Michael Paquier
Дата:
On Sat, Oct 17, 2015 at 5:30 AM, <l@lrowe.co.uk> wrote:
> I'm seeing Postgres 9.4.5 archive while idle every archive_timeout when I
> set ``wal_level hot_standby``.
> At ``wal_level archive`` I only see archiving every checkpoint_timeout (that
> it archives every checkpoint_timeout is a known limitation, see
> http://www.postgresql.org/message-id/1407389876762-5813999.post@n5.nabble.com):
> I think this additional archiving at wal_level hot_standby is a bug.
Agreed. There is indeed a difference between the way 9.3 and 9.4 behave. When wal_level = hot_standby, with 9.4 a segment will be archived depending on archive_timeout as you mention, and that's not the case of 9.3. There is definitely a regression here: we should not archive a segment if there is no activity.
If I look at the contents of the segments with 9.4 when there is no activity, I am seeing that actually a record XLOG_RUNNING_XACTS is generated all the time after switching a segment, leading to the archiving of this segment because server thinks that there is new data, and actually there is, so the segment will be archived... Here is for example the output of pg_xlogdump in this case:
$ pg_xlogdump 000000010000000000000018
rmgr: Standby len (rec/tot): 24/ 56, tx: 0, lsn: 0/18000028, prev 0/17000060, bkp: 0000, desc: running xacts: nextXid 1001 latestCompletedXid 1000 oldestRunningXid 1001
rmgr: XLOG len (rec/tot): 0/ 32, tx: 0, lsn: 0/18000060, prev 0/18000028, bkp: 0000, desc: xlog switch
--
Michael
Michael
Вложения
Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
От
Michael Paquier
Дата:
On Sat, Oct 17, 2015 at 11:10 PM, Michael Paquier <michael.paquier@gmail.com>wrote: > I think that a simple idea would be to not call LogStandbySnapshot() when > we are still at the beginning of a new segment. We know that the first page > of a WAL segment has a size of SizeOfXLogLongPHD, so just having a check on > that sounds enough to me. Per se the patch attached that should be applied > down to 9.4. This fixes the regression reported by Laurence for me. > This bug fix is registered in next CF so as we do not lose track of it; https://commitfest.postgresql.org/7/398/ -- Michael
Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
От
Michael Paquier
Дата:
On Tue, Oct 20, 2015 at 2:21 AM, Michael Paquier wrote: > On Sat, Oct 17, 2015 at 11:10 PM, Michael Paquier wrote: >> I think that a simple idea would be to not call LogStandbySnapshot() when >> we are still at the beginning of a new segment. We know that the first page >> of a WAL segment has a size of SizeOfXLogLongPHD, so just having a check on >> that sounds enough to me. Per se the patch attached that should be applied >> down to 9.4. This fixes the regression reported by Laurence for me. > > This bug fix is registered in next CF so as we do not lose track of it; > https://commitfest.postgresql.org/7/398/ Andres has mentioned me during Postgres Europe that instead of looking at if the last inserted record was at the beginning of a new segment, we had better check if some progress has been done since the last checkpoint that happened. Attached is a patch adding some logic to track the LSN position of the last checkpoint record created, and log the standby activity only if some progress has been done since. It seems that we had better be sure that the last checkpoint record is neither the last inserted record, nor the one previously inserted before logging the standby activity, as a XLOG_STANDBY_LOCK record may be added in between. Regards, -- Michael