Обсуждение: pg_stat_replication log positions vs base backups

Поиск
Список
Период
Сортировка

pg_stat_replication log positions vs base backups

От
Magnus Hagander
Дата:
Are the values for the log locations really relevant for backup connections? And if they are, we need to document it :) ISTM we are just more or less leaking random data out there?

I'm talking about the actual state=backup connection - not the connection if we're using -x with pg_basebackup. Where we have output like:

state            | backup
sent_location    | 0/0
write_location   | 2/76CE0000
flush_location   | 2/76CC0000
replay_location  | 2/76CBF938

I'm thinking those fields should probably all be NULL for state=backup?

--

Re: pg_stat_replication log positions vs base backups

От
Magnus Hagander
Дата:
On Wed, Nov 25, 2015 at 10:19 AM, Magnus Hagander <magnus@hagander.net> wrote:
Are the values for the log locations really relevant for backup connections? And if they are, we need to document it :) ISTM we are just more or less leaking random data out there?

I'm talking about the actual state=backup connection - not the connection if we're using -x with pg_basebackup. Where we have output like:

state            | backup
sent_location    | 0/0
write_location   | 2/76CE0000
flush_location   | 2/76CC0000
replay_location  | 2/76CBF938

I'm thinking those fields should probably all be NULL for state=backup?


In particular, it seems that in InitWalSenderSlot, we only initialize the sent location. Perhaps this is needed? 

--
Вложения

Re: pg_stat_replication log positions vs base backups

От
Michael Paquier
Дата:


On Wed, Nov 25, 2015 at 7:18 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Nov 25, 2015 at 10:19 AM, Magnus Hagander <magnus@hagander.net> wrote:
Are the values for the log locations really relevant for backup connections? And if they are, we need to document it :) ISTM we are just more or less leaking random data out there?

I'm talking about the actual state=backup connection - not the connection if we're using -x with pg_basebackup. Where we have output like:

state            | backup
sent_location    | 0/0
write_location   | 2/76CE0000
flush_location   | 2/76CC0000
replay_location  | 2/76CBF938

I'm thinking those fields should probably all be NULL for state=backup?

Hm. You would basically get that when a backup WAL sender is reusing the sender of another node that is not here anymore.
 
In particular, it seems that in InitWalSenderSlot, we only initialize the sent location. Perhaps this is needed?

Yes, that would be nice to start with a clean state. At the same time, I am noticing that pg_stat_get_wal_senders() is comparing flush, apply and write directly with 0. I think those should be InvalidXLogRecPtr. Combined with your patch it gives the attached.
--
Michael
Вложения

Re: pg_stat_replication log positions vs base backups

От
Magnus Hagander
Дата:
On Wed, Nov 25, 2015 at 3:03 PM, Michael Paquier <michael.paquier@gmail.com> wrote:


On Wed, Nov 25, 2015 at 7:18 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Nov 25, 2015 at 10:19 AM, Magnus Hagander <magnus@hagander.net> wrote:
Are the values for the log locations really relevant for backup connections? And if they are, we need to document it :) ISTM we are just more or less leaking random data out there?

I'm talking about the actual state=backup connection - not the connection if we're using -x with pg_basebackup. Where we have output like:

state            | backup
sent_location    | 0/0
write_location   | 2/76CE0000
flush_location   | 2/76CC0000
replay_location  | 2/76CBF938

I'm thinking those fields should probably all be NULL for state=backup?

Hm. You would basically get that when a backup WAL sender is reusing the sender of another node that is not here anymore.

Yeah - and that's obviously incorrect.

 
In particular, it seems that in InitWalSenderSlot, we only initialize the sent location. Perhaps this is needed?

Yes, that would be nice to start with a clean state. At the same time, I am noticing that pg_stat_get_wal_senders() is comparing flush, apply and write directly with 0. I think those should be InvalidXLogRecPtr. Combined with your patch it gives the attached.

Good point.

Another thought - why do we leave 0/0 in sent location, but turn the other three fields to NULL if it's invalid? That seems inconsistent. Is there a reason for that, or should we fix that as well while we're at it?

--

Re: pg_stat_replication log positions vs base backups

От
Michael Paquier
Дата:


On Wed, Nov 25, 2015 at 11:08 PM, Magnus Hagander <magnus@hagander.net> wrote:
On Wed, Nov 25, 2015 at 3:03 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Nov 25, 2015 at 7:18 PM, Magnus Hagander <magnus@hagander.net> wrote:
In particular, it seems that in InitWalSenderSlot, we only initialize the sent location. Perhaps this is needed?

Yes, that would be nice to start with a clean state. At the same time, I am noticing that pg_stat_get_wal_senders() is comparing flush, apply and write directly with 0. I think those should be InvalidXLogRecPtr. Combined with your patch it gives the attached.

Good point.

Another thought - why do we leave 0/0 in sent location, but turn the other three fields to NULL if it's invalid? That seems inconsistent. Is there a reason for that, or should we fix that as well while we're at it?

In some code paths, values are expected to be equal to 0 as XLogRecPtr values are doing direct comparisons with each other, so you can think that 0 for a XLogRecPtr is slightly different from InvalidXLogRecPtr that could be expected to be invalid but *not* minimal, imagine for example that we decide at some point to redefine it as INT64_MAX. See for example receivedUpto in xlog.c or WalSndWaitForWal in walsender.c. Honestly, I think that it would be nice to make all the logic consistent under a unique InvalidXLogRecPtr banner that is defined at 0 once and for all, adding at the same time a comment in xlogdefs.h mentioning that this should not be redefined to a higher value because comparison logic of XlogRecPtr's depend on that. We have actually a similar constraint with TimeLineID = 0 that is equivalent to infinite. Patch attached following those lines, which should be backpatched for correctness.

Btw, I found at the same time a portion of NOT_USED code setting up a XLogRecPtr incorrectly in GetOldestWALSendPointer.

FWIW, the last time I poked at this code, introducing some kind of MinimalXLogRecPtr logic different to distinguish from InvalidXLogRecPtr I hit a wall, others did not like that much:
http://www.postgresql.org/message-id/CAB7nPqRUaFyTJ024W-HiihW4PwfcadZN9HFMsJQfOZSQtZtDUA@mail.gmail.com

Regards,
--
Michael
Вложения

Re: pg_stat_replication log positions vs base backups

От
Magnus Hagander
Дата:
On Thu, Nov 26, 2015 at 8:17 AM, Michael Paquier <michael.paquier@gmail.com> wrote:


On Wed, Nov 25, 2015 at 11:08 PM, Magnus Hagander <magnus@hagander.net> wrote:
On Wed, Nov 25, 2015 at 3:03 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Nov 25, 2015 at 7:18 PM, Magnus Hagander <magnus@hagander.net> wrote:
In particular, it seems that in InitWalSenderSlot, we only initialize the sent location. Perhaps this is needed?

Yes, that would be nice to start with a clean state. At the same time, I am noticing that pg_stat_get_wal_senders() is comparing flush, apply and write directly with 0. I think those should be InvalidXLogRecPtr. Combined with your patch it gives the attached.

Good point.

Another thought - why do we leave 0/0 in sent location, but turn the other three fields to NULL if it's invalid? That seems inconsistent. Is there a reason for that, or should we fix that as well while we're at it?

In some code paths, values are expected to be equal to 0 as XLogRecPtr values are doing direct comparisons with each other, so you can think that 0 for a XLogRecPtr is slightly different from InvalidXLogRecPtr that could be expected to be invalid but *not* minimal, imagine for example that we decide at some point to redefine it as INT64_MAX. See for example receivedUpto in xlog.c or WalSndWaitForWal in walsender.c. Honestly, I think that it would be nice to make all the logic consistent under a unique InvalidXLogRecPtr banner that is defined at 0 once and for all, adding at the same time a comment in xlogdefs.h mentioning that this should not be redefined to a higher value because comparison logic of XlogRecPtr's depend on that. We have actually a similar constraint with TimeLineID = 0 that is equivalent to infinite. Patch attached following those lines, which should be backpatched for correctness.

I'm only talking about the actual value in pg_stat_replication here, not what we are using internally. These are two different things of course - let's keep them separate for now. In pg_stat_replication, we explicitly check for InvalidXLogRecPtr and then explicitly set the resulting value to NULL in the SQL return.

--

Re: pg_stat_replication log positions vs base backups

От
Michael Paquier
Дата:
On Thu, Nov 26, 2015 at 6:45 PM, Magnus Hagander wrote:
> I'm only talking about the actual value in pg_stat_replication here, not
> what we are using internally. These are two different things of course -
> let's keep them separate for now. In pg_stat_replication, we explicitly
> check for InvalidXLogRecPtr and then explicitly set the resulting value to
> NULL in the SQL return.

No objections from here. I guess you already have a patch?
-- 
Michael



Re: pg_stat_replication log positions vs base backups

От
Magnus Hagander
Дата:
On Thu, Nov 26, 2015 at 1:03 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Nov 26, 2015 at 6:45 PM, Magnus Hagander wrote:
> I'm only talking about the actual value in pg_stat_replication here, not
> what we are using internally. These are two different things of course -
> let's keep them separate for now. In pg_stat_replication, we explicitly
> check for InvalidXLogRecPtr and then explicitly set the resulting value to
> NULL in the SQL return.

No objections from here. I guess you already have a patch?

Well, no, because I haven't figured out which way is the logical one - make them all return NULL or make them all return 0/0... 


--

Re: pg_stat_replication log positions vs base backups

От
Michael Paquier
Дата:
On Thu, Nov 26, 2015 at 10:53 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Nov 26, 2015 at 1:03 PM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>>
>> On Thu, Nov 26, 2015 at 6:45 PM, Magnus Hagander wrote:
>> > I'm only talking about the actual value in pg_stat_replication here, not
>> > what we are using internally. These are two different things of course -
>> > let's keep them separate for now. In pg_stat_replication, we explicitly
>> > check for InvalidXLogRecPtr and then explicitly set the resulting value
>> > to
>> > NULL in the SQL return.
>>
>> No objections from here. I guess you already have a patch?
>
> Well, no, because I haven't figured out which way is the logical one - make
> them all return NULL or make them all return 0/0...

It seems to me that NULL is the logical one. We want to define a value
from the user prospective where things are in an undefined state.
That's my logic flow, other opinions are of course welcome.
-- 
Michael



Re: pg_stat_replication log positions vs base backups

От
Magnus Hagander
Дата:


On Fri, Nov 27, 2015 at 6:07 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Nov 26, 2015 at 10:53 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Nov 26, 2015 at 1:03 PM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>>
>> On Thu, Nov 26, 2015 at 6:45 PM, Magnus Hagander wrote:
>> > I'm only talking about the actual value in pg_stat_replication here, not
>> > what we are using internally. These are two different things of course -
>> > let's keep them separate for now. In pg_stat_replication, we explicitly
>> > check for InvalidXLogRecPtr and then explicitly set the resulting value
>> > to
>> > NULL in the SQL return.
>>
>> No objections from here. I guess you already have a patch?
>
> Well, no, because I haven't figured out which way is the logical one - make
> them all return NULL or make them all return 0/0...

It seems to me that NULL is the logical one. We want to define a value
from the user prospective where things are in an undefined state.
That's my logic flow, other opinions are of course welcome.

I've applied these two patches now.

The one that fixes the initialization backpatched to 9.3 which is the oldest one that has it, and the one that changes the actual 0-vs-NULL output to 9.5 only as it's a behaviour change. 

--

Re: pg_stat_replication log positions vs base backups

От
Michael Paquier
Дата:
On Mon, Dec 14, 2015 at 1:01 AM, Magnus Hagander <magnus@hagander.net> wrote:
> I've applied these two patches now.
>
> The one that fixes the initialization backpatched to 9.3 which is the oldest
> one that has it, and the one that changes the actual 0-vs-NULL output to 9.5
> only as it's a behaviour change.

Thanks!
-- 
Michael



Re: pg_stat_replication log positions vs base backups

От
Michael Paquier
Дата:
On Mon, Dec 14, 2015 at 8:59 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Dec 14, 2015 at 1:01 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> I've applied these two patches now.
>>
>> The one that fixes the initialization backpatched to 9.3 which is the oldest
>> one that has it, and the one that changes the actual 0-vs-NULL output to 9.5
>> only as it's a behaviour change.
>
> Thanks!

Interesting. I got just today a bug report that is actually a symptom
that people should be careful about: it is possible that
pg_stat_replication reports 1/potential for sync_priority/sync_state
in the case of a WAL sender in "backup" state: a base backup just
needs to reuse the shared memory slot of a standby that was previously
connected. Commit 61c7bee of Magnus fixes the issue, just let's be
careful if there are similar reports that do not include this fix.
-- 
Michael



Re: pg_stat_replication log positions vs base backups

От
Magnus Hagander
Дата:
On Wed, Dec 16, 2015 at 8:34 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Mon, Dec 14, 2015 at 8:59 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Dec 14, 2015 at 1:01 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> I've applied these two patches now.
>>
>> The one that fixes the initialization backpatched to 9.3 which is the oldest
>> one that has it, and the one that changes the actual 0-vs-NULL output to 9.5
>> only as it's a behaviour change.
>
> Thanks!

Interesting. I got just today a bug report that is actually a symptom
that people should be careful about: it is possible that
pg_stat_replication reports 1/potential for sync_priority/sync_state
in the case of a WAL sender in "backup" state: a base backup just
needs to reuse the shared memory slot of a standby that was previously
connected. Commit 61c7bee of Magnus fixes the issue, just let's be
careful if there are similar reports that do not include this fix.

Hmm. With the fix, it returns "async", right?

Perhaps it should return either "backup" or NULL, to be even more clear? And with priority set to NULL? 


--

Re: pg_stat_replication log positions vs base backups

От
Michael Paquier
Дата:
On Wed, Dec 16, 2015 at 7:14 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Wed, Dec 16, 2015 at 8:34 AM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>> Interesting. I got just today a bug report that is actually a symptom
>> that people should be careful about: it is possible that
>> pg_stat_replication reports 1/potential for sync_priority/sync_state
>> in the case of a WAL sender in "backup" state: a base backup just
>> needs to reuse the shared memory slot of a standby that was previously
>> connected. Commit 61c7bee of Magnus fixes the issue, just let's be
>> careful if there are similar reports that do not include this fix.
>
>
> Hmm. With the fix, it returns "async", right?

Yes, it returns async with priority set at 0 after your commit. That's
a bit better than the old behavior, still..

> Perhaps it should return either "backup" or NULL, to be even more clear? And
> with priority set to NULL?

I'd vote for just NULL for both fields. This async/sync status has no
real sense for a backup. Thoughts?
-- 
Michael



Re: pg_stat_replication log positions vs base backups

От
Michael Paquier
Дата:
On Wed, Dec 16, 2015 at 9:35 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Dec 16, 2015 at 7:14 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Wed, Dec 16, 2015 at 8:34 AM, Michael Paquier <michael.paquier@gmail.com>
>> wrote:
>>> Interesting. I got just today a bug report that is actually a symptom
>>> that people should be careful about: it is possible that
>>> pg_stat_replication reports 1/potential for sync_priority/sync_state
>>> in the case of a WAL sender in "backup" state: a base backup just
>>> needs to reuse the shared memory slot of a standby that was previously
>>> connected. Commit 61c7bee of Magnus fixes the issue, just let's be
>>> careful if there are similar reports that do not include this fix.
>>
>>
>> Hmm. With the fix, it returns "async", right?
>
> Yes, it returns async with priority set at 0 after your commit. That's
> a bit better than the old behavior, still..
>
>> Perhaps it should return either "backup" or NULL, to be even more clear? And
>> with priority set to NULL?
>
> I'd vote for just NULL for both fields. This async/sync status has no
> real sense for a backup. Thoughts?

While looking again at that, considering a WAL sender having an
invalid WAL flush position as always asynchronous like a base backup
one is directly mentioned in walsender.c, per 0c4b468:
+                       /*
+                        * Treat a standby such as a pg_basebackup
background process
+                        * which always returns an invalid flush location, as an
+                        * asynchronous standby.
+                        */
The point is just to be sure that such a node won't be selected as a
sync standby, while it is a surprising behavior, but after this many
years it may not be worth switching to NULL values.

Fujii-san, thoughts? You introduced this behavior after all.
-- 
Michael