Re: Replication failure, slave requesting old segments

From: Adrian Klaver
Subject: Re: Replication failure, slave requesting old segments
Msg-id: 444ada2d-8896-cd74-57dd-531999190182@aklaver.com
In reply to: Re: Replication failure, slave requesting old segments  ("Phil Endecott" <spam_from_pgsql_lists@chezphil.org>)
List: pgsql-general
On 08/12/2018 12:53 PM, Phil Endecott wrote:
> Phil Endecott wrote:
>> On the master, I have:
>>
>> wal_level = replica
>> archive_mode = on
>> archive_command = 'ssh backup test ! -f backup/postgresql/archivedir/%f &&
>>                     scp %p backup:backup/postgresql/archivedir/%f'
>>
>> On the slave I have:
>>
>> standby_mode = 'on'
>> primary_conninfo = 'user=postgres host=master port=5432'
>> restore_command = 'scp backup:backup/postgresql/archivedir/%f %p'
>>
>> hot_standby = on
> 
>> 2018-08-11 00:05:50.364 UTC [615] LOG:  restored log file "0000000100000007000000D0" from archive
>> scp: backup/postgresql/archivedir/0000000100000007000000D1: No such file or directory
>> 2018-08-11 00:05:51.325 UTC [7208] LOG:  started streaming WAL from primary at 7/D0000000 on timeline 1
>> 2018-08-11 00:05:51.325 UTC [7208] FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 0000000100000007000000D0 has already been removed
 
> 
> 
> I am wondering if I need to set wal_keep_segments to at least 1 or 2 for
> this to work.  I currently have it unset and I believe the default is 0.

Given that WAL segments are only 16 MB each, I would probably bump it up to
be on the safe side, or use replication slots:

https://www.postgresql.org/docs/9.6/static/warm-standby.html

26.2.6. Replication Slots

Note, though, that replication slots do not limit how much WAL is retained,
so a long outage of the slave could result in WAL piling up on the master.
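
As a rough sketch of the two options (the value 32, the value 1 and the slot
name standby1 are illustrative assumptions, not taken from your setup; the
setting names are the 9.6 ones):

```
# --- Option 1: master, postgresql.conf ---
# Keep extra finished segments in pg_xlog for the slave to stream.
# 32 is an arbitrary example; at 16 MB per segment that is about 512 MB.
wal_keep_segments = 32

# --- Option 2: replication slot ---
# Master, postgresql.conf (the 9.6 default is 0; changing it needs a restart).
max_replication_slots = 1

# Then, once, in psql on the master ('standby1' is a hypothetical name):
#   SELECT pg_create_physical_replication_slot('standby1');

# Slave, recovery.conf, alongside the existing primary_conninfo:
primary_slot_name = 'standby1'
```

With option 2 the master holds on to WAL until the slave has received it, so
it is worth keeping an eye on pg_replication_slots and dropping the slot with
pg_drop_replication_slot('standby1') if the slave is ever retired.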

> 
> My understanding was that when using archive_command/restore_command to copy
> WAL segments it would not be necessary to use wal_keep_segments to retain
> files in pg_xlog on the server; the slave can get everything using a
> combination of copying files using the restore_command and streaming.
> But these lines from the log:
> 
> 2018-08-11 00:12:15.797 UTC [7954] LOG: redo starts at 7/D0F956C0
> 2018-08-11 00:12:16.068 UTC [7954] LOG: consistent recovery state reached at 7/D0FFF088
> 
> make me think that there is an issue when the slave reaches the end of the
> copied WAL file.  I speculate that the useful content of this WAL segment
> ends at FFF088, which is followed by an empty gap due to record sizes.  But
> the slave tries to start streaming from this point, D0FFF088, not D1000000.
> If the master still had a copy of segment D0 then it would be able to stream
> this gap followed by the real content in the current segment D1.
> 
> Does that make any sense at all?
> 
> 
> Regards, Phil.
> 
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com

