Discussion: Standby node using replication slot not visible in pg_stat_replication while catching up

Standby node using replication slot not visible in pg_stat_replication while catching up

From: Michael Paquier

Hi all,

I have been playing a bit with replication slots, and I noticed a
weird behavior in the following scenario:
1) Create a master/slave cluster, and have the slave use a replication slot
2) Stop the slave
3) Generate a certain amount of WAL on the master; during my tests I played with 4~5GB of WAL
4) Restart the slave; it catches up with the WAL that the master has
retained in pg_xlog.
I noticed that while the standby using the replication slot catches
up, it is not visible in pg_stat_replication on master. This makes
monitoring of the replication lag difficult to follow, particularly in
the case where the standby disconnects from the master. Once the
standby has caught up, it reappears once again in pg_stat_replication.
I didn't have a look at the code to see what is happening, but is this
behavior expected?
Regards,
-- 
Michael
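
For readers who want to follow the lag of a slot-using standby even while it has no row in pg_stat_replication, here is a minimal sketch of the arithmetic involved. It assumes the two positions have already been fetched on the master as text, for example the current WAL position from pg_current_xlog_location() and the slot's restart_lsn from pg_replication_slots. The helper names lsn_to_bytes and retained_wal_bytes are hypothetical and not part of PostgreSQL.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to an absolute byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(master_lsn: str, slot_restart_lsn: str) -> int:
    """Bytes of WAL between the slot's restart position and the master's
    current position (hypothetical helper, for illustration only)."""
    return lsn_to_bytes(master_lsn) - lsn_to_bytes(slot_restart_lsn)
```

For example, retained_wal_bytes("1/40000000", "0/40000000") evaluates to 4294967296, that is 4 GiB of WAL, roughly the amount used in the test described above.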



Re: Standby node using replication slot not visible in pg_stat_replication while catching up

From: Andres Freund

Hi,

On 2014-03-10 21:06:53 +0900, Michael Paquier wrote:
> I have been playing a bit with the replication slots, and I noticed a
> weird behavior in such a scenario:
> 1) Create a master/slave cluster, and have slave use a replication slot
> 2) Stop the slave
> 3) Create a certain amount of WAL, during my tests I played with 4~5GB of WAL
> 4) Restart the slave, it catches up with the WALs that master has
> retained in pg_xlog.
> I noticed that while the standby using the replication slot catches
> up, it is not visible in pg_stat_replication on master. This makes
> monitoring of the replication lag difficult to follow, particularly in
> the case where the standby disconnects from the master. Once the
> standby has caught up, it reappears once again in pg_stat_replication.
> I didn't have a look at the code to see what is happening, but is this
> behavior expected?

Does the use of replication slots actually alter the behaviour? I don't
see how the slot code could influence things to that degree here. Could
it be that it's just restoring WAL from the standby's pg_xlog or using
restore_command?

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Standby node using replication slot not visible in pg_stat_replication while catching up

From: Michael Paquier

On Mon, Mar 10, 2014 at 9:24 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> On 2014-03-10 21:06:53 +0900, Michael Paquier wrote:
>> I have been playing a bit with the replication slots, and I noticed a
>> weird behavior in such a scenario:
>> 1) Create a master/slave cluster, and have slave use a replication slot
>> 2) Stop the slave
>> 3) Create a certain amount of WAL, during my tests I played with 4~5GB of WAL
>> 4) Restart the slave, it catches up with the WALs that master has
>> retained in pg_xlog.
>> I noticed that while the standby using the replication slot catches
>> up, it is not visible in pg_stat_replication on master. This makes
>> monitoring of the replication lag difficult to follow, particularly in
>> the case where the standby disconnects from the master. Once the
>> standby has caught up, it reappears once again in pg_stat_replication.
>> I didn't have a look at the code to see what is happening, but is this
>> behavior expected?
>
> Does the use of replication slots actually alter the behaviour? I don't
> see how the slot code could influence things to that degree here. Could
> it be that it's just restoring WAL from the standby's pg_xlog or using
> restore_command?

Sorry for the noise, I'm feeling stupid. Yes, the standby was using a
restore_command, so it recovered the WAL from the archives before
reporting activity back to the master.
-- 
Michael
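
The explanation above means that a standby replaying WAL through restore_command has no streaming connection yet, so the master shows no row for it in pg_stat_replication and its slot is reported as not active. A minimal sketch of a master-side check, assuming the rows of pg_replication_slots have already been fetched into dictionaries with slot_name and active keys (the function name and the input shape are hypothetical):

```python
from typing import Dict, Iterable, List


def slots_not_streaming(slots: Iterable[Dict[str, object]]) -> List[str]:
    """Return the names of slots that no connection is currently using.

    Assumption: each dict mirrors one row of pg_replication_slots and has
    at least the keys 'slot_name' and 'active'.
    """
    return [str(slot["slot_name"]) for slot in slots if not slot["active"]]
```

An inactive slot only says that the standby is not streaming. Seen from the master, a standby that is replaying from the archive and a standby that is down look the same, so this check is a hint for monitoring rather than a diagnosis.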