A failure of standby to follow timeline switch

From
Kyotaro Horiguchi
Date:
Hello.

We found a behavioral change (which seems to be a bug) in recovery at
PG13.

The following steps might seem somewhat strange, but the replication
code deliberately copes with this case.  This is a sequence seen while
operating an HA cluster using Pacemaker.

- Run initdb to create a primary.
- Set archive_mode=on on the primary.
- Start the primary.

- Create a standby using pg_basebackup from the primary.
- Stop the standby.
- Stop the primary.

- Put standby.signal in the primary's data directory, then start it.
- Promote the primary.

- Start the standby.
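
For reference, the same sequence in TAP-framework form (a sketch of
essentially what the attached test does; node names are illustrative):

  my $primary = get_new_node('primary');
  $primary->init(allows_streaming => 1);
  $primary->enable_archiving;   # needed to create the .partial segment
  $primary->start;
  $primary->backup('bkp');

  my $standby = get_new_node('standby');
  $standby->init_from_backup($primary, 'bkp', has_streaming => 1);

  $primary->stop;
  $primary->set_standby_mode;   # the TAP equivalent of placing standby.signal
  $primary->start;
  $primary->promote;            # the old primary switches to a new timeline

  $standby->start;              # the standby must now follow that switch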


Until PG12, the primary signals end-of-timeline to the standby and
switches to the next timeline.  Since PG13, that doesn't happen and
the standby continues to request the segment of the older
timeline, which no longer exists.

FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000010000000000000003 has already been removed

It is because WalSndSegmentOpen() can fail to detect a timeline switch
on a historic timeline, due to the use of the wrong variable in the
check: it compares state->seg.ws_segno, the segment currently open,
where it should compare nextSegNo, the segment it is being asked to
open.  This seems to be a thinko introduced when the surrounding code
was refactored in 709d003fbd.

The first patch detects the wrong behavior.  The second small patch
fixes it.

In the first patch, the test added to 001_stream_rep.pl involves two
copied functions related to server-log investigation from
019_replslot_limit.pl.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From f7fbb1574a412100a6732dbe17b0d7f66a497298 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Wed, 9 Dec 2020 17:21:49 +0900
Subject: [PATCH 1/2] Add a new test to detect a replication bug

---
 src/test/recovery/t/001_stream_rep.pl | 64 ++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53de7..9dd1a70c90 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 37;
 
 # Initialize primary node
 my $node_primary = get_new_node('primary');
@@ -409,3 +409,65 @@ ok( ($phys_restart_lsn_pre cmp $phys_restart_lsn_post) == 0,
 my $primary_data = $node_primary->data_dir;
 ok(!-f "$primary_data/pg_wal/$segment_removed",
     "WAL segment $segment_removed recycled after physical slot advancing");
+
+#
+# Check if timeline-increment works while reading a historic timeline.
+my $node_primary_2 = get_new_node('primary_2');
+$node_primary_2->init(allows_streaming => 1);
+$node_primary_2->enable_archiving; # needed to make .partial segment
+$node_primary_2->start;
+$node_primary_2->backup($backup_name);
+my $node_standby_3 = get_new_node('standby_3');
+$node_standby_3->init_from_backup($node_primary_2, $backup_name,
+                                  has_streaming => 1);
+$node_primary_2->stop;
+$node_primary_2->set_standby_mode; # increment primary timeline
+$node_primary_2->start;
+$node_primary_2->promote;
+my $logstart = get_log_size($node_standby_3);
+$node_standby_3->start;
+
+my $success = 0;
+for (my $i = 0 ; $i < 10000; $i++)
+{
+    if (find_in_log(
+            $node_standby_3,
+            "requested WAL segment [0-9A-F]+ has already been removed",
+            $logstart))
+    {
+        last;
+    }
+    elsif (find_in_log(
+            $node_standby_3,
+            "End of WAL reached on timeline",
+               $logstart))
+    {
+        $success = 1;
+        last;
+    }
+    usleep(100_000);
+}
+
+ok($success, 'Timeline increment while reading a historic timeline');
+
+# return the size of logfile of $node in bytes
+sub get_log_size
+{
+    my ($node) = @_;
+
+    return (stat $node->logfile)[7];
+}
+
+# find $pat in logfile of $node after $off-th byte
+sub find_in_log
+{
+    my ($node, $pat, $off) = @_;
+
+    $off = 0 unless defined $off;
+    my $log = TestLib::slurp_file($node->logfile);
+    return 0 if (length($log) <= $off);
+
+    $log = substr($log, $off);
+
+    return $log =~ m/$pat/;
+}
-- 
2.27.0

From f3862b02b928c4f8aa6380bbccbac8e3c4bf800e Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Wed, 9 Dec 2020 17:22:41 +0900
Subject: [PATCH 2/2] Fix a bug of timeline-tracking while sending a historic
 timeline

Since PG13, we should track a timeline switch while sending a historic
timeline running physical replication, but WalSndSegmentOpen fails to
do that. It is a thinko of 709d003fbd.
---
 src/backend/replication/walsender.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2eb19ad293..0a3f2f1359 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2497,7 +2497,7 @@ WalSndSegmentOpen(XLogReaderState *state, XLogSegNo nextSegNo,
         XLogSegNo    endSegNo;
 
         XLByteToSeg(sendTimeLineValidUpto, endSegNo, state->segcxt.ws_segsize);
-        if (state->seg.ws_segno == endSegNo)
+        if (nextSegNo == endSegNo)
             *tli_p = sendTimeLineNextTLI;
     }
 
-- 
2.27.0


Re: A failure of standby to follow timeline switch

From
Fujii Masao
Date:

On 2020/12/09 17:43, Kyotaro Horiguchi wrote:
> Hello.
> 
> We found a behavioral change (which seems to be a bug) in recovery at
> PG13.
> 
> The following steps might seem somewhat strange, but the replication
> code deliberately copes with this case.  This is a sequence seen while
> operating an HA cluster using Pacemaker.
> 
> - Run initdb to create a primary.
> - Set archive_mode=on on the primary.
> - Start the primary.
> 
> - Create a standby using pg_basebackup from the primary.
> - Stop the standby.
> - Stop the primary.
> 
> - Put standby.signal in the primary's data directory, then start it.
> - Promote the primary.
> 
> - Start the standby.
> 
> 
> Until PG12, the primary signals end-of-timeline to the standby and
> switches to the next timeline.  Since PG13, that doesn't happen and
> the standby continues to request the segment of the older
> timeline, which no longer exists.
> 
> FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000010000000000000003 has already been removed
> 
> It is because WalSndSegmentOpen() can fail to detect a timeline switch
> on a historic timeline, due to the use of the wrong variable in the
> check. It is using state->seg.ws_segno, which seems to be a thinko
> introduced when the surrounding code was refactored in 709d003fbd.
> 
> The first patch detects the wrong behavior.  The second small patch
> fixes it.

Thanks for reporting this! This looks like a bug.

When I applied two patches in the master branch and
ran "make check-world", I got the following error.

============== creating database "contrib_regression" ==============
# Looks like you planned 37 tests but ran 36.
# Looks like your test exited with 255 just after 36.
t/001_stream_rep.pl ..................
Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 1/37 subtests
...
Test Summary Report
-------------------
t/001_stream_rep.pl                (Wstat: 65280 Tests: 36 Failed: 0)
   Non-zero exit status: 255
   Parse errors: Bad plan.  You planned 37 tests but ran 36.
Files=21, Tests=239, 302 wallclock secs ( 0.10 usr  0.05 sys + 41.69 cusr 39.84 csys = 81.68 CPU)
Result: FAIL
make[2]: *** [check] Error 1
make[1]: *** [check-recovery-recurse] Error 2
make[1]: *** Waiting for unfinished jobs....
t/070_dropuser.pl ......... ok


Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: A failure of standby to follow timeline switch

From
Kyotaro Horiguchi
Date:
Thank you for looking into this.

At Thu, 24 Dec 2020 15:33:04 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> When I applied two patches in the master branch and
> ran "make check-world", I got the following error.
> 
> ============== creating database "contrib_regression" ==============
> # Looks like you planned 37 tests but ran 36.
> # Looks like your test exited with 255 just after 36.
> t/001_stream_rep.pl ..................
> Dubious, test returned 255 (wstat 65280, 0xff00)
> Failed 1/37 subtests
> ...
> Test Summary Report
> -------------------
> t/001_stream_rep.pl                (Wstat: 65280 Tests: 36 Failed: 0)
>   Non-zero exit status: 255
>   Parse errors: Bad plan.  You planned 37 tests but ran 36.
> Files=21, Tests=239, 302 wallclock secs ( 0.10 usr 0.05 sys + 41.69
> cusr 39.84 csys = 81.68 CPU)
> Result: FAIL
> make[2]: *** [check] Error 1
> make[1]: *** [check-recovery-recurse] Error 2
> make[1]: *** Waiting for unfinished jobs....
> t/070_dropuser.pl ......... ok

Mmm. I retried that and saw it succeed (with 0002 applied).

If I modified "user Test::More tests => 37" to 38 in the perl file, I
got a similar result.

> t/001_stream_rep.pl .. 37/38 # Looks like you planned 38 tests but ran 37.
> t/001_stream_rep.pl .. Dubious, test returned 255 (wstat 65280, 0xff00)
> Failed 1/38 subtests 
> 
> Test Summary Report
> -------------------
> t/001_stream_rep.pl (Wstat: 65280 Tests: 37 Failed: 0)
>   Non-zero exit status: 255
>   Parse errors: Bad plan.  You planned 38 tests but ran 37.
> Files=1, Tests=37, 10 wallclock secs ( 0.03 usr  0.00 sys +  3.64 cusr  2.05 csys =  5.72 CPU)
> Result: FAIL
> make: *** [Makefile:19: check] Error 1

I can't guess what happened in your environment...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: A failure of standby to follow timeline switch

From
Fujii Masao
Date:

On 2020/12/25 12:03, Kyotaro Horiguchi wrote:
> Thank you for looking into this.
> 
> At Thu, 24 Dec 2020 15:33:04 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>> When I applied two patches in the master branch and
>> ran "make check-world", I got the following error.
>>
>> ============== creating database "contrib_regression" ==============
>> # Looks like you planned 37 tests but ran 36.
>> # Looks like your test exited with 255 just after 36.
>> t/001_stream_rep.pl ..................
>> Dubious, test returned 255 (wstat 65280, 0xff00)
>> Failed 1/37 subtests
>> ...
>> Test Summary Report
>> -------------------
>> t/001_stream_rep.pl                (Wstat: 65280 Tests: 36 Failed: 0)
>>    Non-zero exit status: 255
>>    Parse errors: Bad plan.  You planned 37 tests but ran 36.
>> Files=21, Tests=239, 302 wallclock secs ( 0.10 usr 0.05 sys + 41.69
>> cusr 39.84 csys = 81.68 CPU)
>> Result: FAIL
>> make[2]: *** [check] Error 1
>> make[1]: *** [check-recovery-recurse] Error 2
>> make[1]: *** Waiting for unfinished jobs....
>> t/070_dropuser.pl ......... ok
> 
> Mmm. I retried that and saw it succeed (with 0002 applied).
> 
> If I modified "user Test::More tests => 37" to 38 in the perl file, I
> got a similar result.

What happens if you run make check-world with -j 4? When I ran that,
the test failed. But with -j 1, the test finished with success. I'm not sure
why this happened, though..

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: A failure of standby to follow timeline switch

From
Kyotaro Horiguchi
Date:
At Sat, 26 Dec 2020 02:15:06 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2020/12/25 12:03, Kyotaro Horiguchi wrote:
> > Thank you for looking into this.
> > At Thu, 24 Dec 2020 15:33:04 +0900, Fujii Masao
> > <masao.fujii@oss.nttdata.com> wrote in
> >> When I applied two patches in the master branch and
> >> ran "make check-world", I got the following error.
> >>
> >> ============== creating database "contrib_regression" ==============
> >> # Looks like you planned 37 tests but ran 36.
> >> # Looks like your test exited with 255 just after 36.
> >> t/001_stream_rep.pl ..................
> >> Dubious, test returned 255 (wstat 65280, 0xff00)
> >> Failed 1/37 subtests
> >> ...
> >> Test Summary Report
> >> -------------------
> >> t/001_stream_rep.pl                (Wstat: 65280 Tests: 36 Failed: 0)
> >>    Non-zero exit status: 255
> >>    Parse errors: Bad plan.  You planned 37 tests but ran 36.
> >> Files=21, Tests=239, 302 wallclock secs ( 0.10 usr 0.05 sys + 41.69
> >> cusr 39.84 csys = 81.68 CPU)
> >> Result: FAIL
> >> make[2]: *** [check] Error 1
> >> make[1]: *** [check-recovery-recurse] Error 2
> >> make[1]: *** Waiting for unfinished jobs....
> >> t/070_dropuser.pl ......... ok
> > Mmm. I retried that and saw it succeed (with 0002 applied).
> > If I modified "user Test::More tests => 37" to 38 in the perl file, I
> > got a similar result.
> 
> What happens if you run make check-world with -j 4? When I ran that,
> the test failed. But with -j 1, the test finished with success. I'm
> not sure
> why this happened, though..

Maybe this is it.

+    usleep(100_000);

If the script doesn't find the expected log line, it reaches the
usleep and barks "Undefined subroutine &main::usleep called...". I
thought I had tested that path, but perhaps I overlooked the error. "use
Time::HiRes" is needed.

The attached is the fixed version.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 9853edb792f81953567b68da0b79baa57a8c518c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Wed, 9 Dec 2020 17:21:49 +0900
Subject: [PATCH v2 1/2] Add a new test to detect a replication bug

---
 src/test/recovery/t/001_stream_rep.pl | 65 ++++++++++++++++++++++++++-
 1 file changed, 64 insertions(+), 1 deletion(-)

diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53de7..13b6e00da4 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -2,8 +2,9 @@
 use strict;
 use warnings;
 use PostgresNode;
+use Time::HiRes qw(usleep);
 use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 37;
 
 # Initialize primary node
 my $node_primary = get_new_node('primary');
@@ -409,3 +410,65 @@ ok( ($phys_restart_lsn_pre cmp $phys_restart_lsn_post) == 0,
 my $primary_data = $node_primary->data_dir;
 ok(!-f "$primary_data/pg_wal/$segment_removed",
     "WAL segment $segment_removed recycled after physical slot advancing");
+
+#
+# Check if timeline-increment works while reading a historic timeline.
+my $node_primary_2 = get_new_node('primary_2');
+$node_primary_2->init(allows_streaming => 1);
+$node_primary_2->enable_archiving; # needed to make .partial segment
+$node_primary_2->start;
+$node_primary_2->backup($backup_name);
+my $node_standby_3 = get_new_node('standby_3');
+$node_standby_3->init_from_backup($node_primary_2, $backup_name,
+                                  has_streaming => 1);
+$node_primary_2->stop;
+$node_primary_2->set_standby_mode; # increment primary timeline
+$node_primary_2->start;
+$node_primary_2->promote;
+my $logstart = get_log_size($node_standby_3);
+$node_standby_3->start;
+
+my $success = 0;
+for (my $i = 0 ; $i < 1000; $i++)
+{
+    if (find_in_log(
+            $node_standby_3,
+            "requested WAL segment [0-9A-F]+ has already been removed",
+            $logstart))
+    {
+        last;
+    }
+    elsif (find_in_log(
+            $node_standby_3,
+            "End of WAL reached on timeline",
+               $logstart))
+    {
+        $success = 1;
+        last;
+    }
+    usleep(100_000);
+}
+
+ok($success, 'Timeline increment while reading a historic timeline');
+
+# return the size of logfile of $node in bytes
+sub get_log_size
+{
+    my ($node) = @_;
+
+    return (stat $node->logfile)[7];
+}
+
+# find $pat in logfile of $node after $off-th byte
+sub find_in_log
+{
+    my ($node, $pat, $off) = @_;
+
+    $off = 0 unless defined $off;
+    my $log = TestLib::slurp_file($node->logfile);
+    return 0 if (length($log) <= $off);
+
+    $log = substr($log, $off);
+
+    return $log =~ m/$pat/;
+}
-- 
2.27.0

From 6c70a07de75b78830c296819824f2bc4ecfb4677 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Wed, 9 Dec 2020 17:22:41 +0900
Subject: [PATCH v2 2/2] Fix a bug of timeline-tracking while sending a
 historic timeline

Since PG13, we should track a timeline switch while sending a historic
timeline running physical replication, but WalSndSegmentOpen fails to
do that. It is a thinko of 709d003fbd.
---
 src/backend/replication/walsender.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index fe0d368a35..8545c6c423 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2491,7 +2491,7 @@ WalSndSegmentOpen(XLogReaderState *state, XLogSegNo nextSegNo,
         XLogSegNo    endSegNo;
 
         XLByteToSeg(sendTimeLineValidUpto, endSegNo, state->segcxt.ws_segsize);
-        if (state->seg.ws_segno == endSegNo)
+        if (nextSegNo == endSegNo)
             *tli_p = sendTimeLineNextTLI;
     }
 
-- 
2.27.0


Re: A failure of standby to follow timeline switch

From
Fujii Masao
Date:

On 2021/01/04 12:06, Kyotaro Horiguchi wrote:
> At Sat, 26 Dec 2020 02:15:06 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>>
>>
>> On 2020/12/25 12:03, Kyotaro Horiguchi wrote:
>>> Thank you for looking into this.
>>> At Thu, 24 Dec 2020 15:33:04 +0900, Fujii Masao
>>> <masao.fujii@oss.nttdata.com> wrote in
>>>> When I applied two patches in the master branch and
>>>> ran "make check-world", I got the following error.
>>>>
>>>> ============== creating database "contrib_regression" ==============
>>>> # Looks like you planned 37 tests but ran 36.
>>>> # Looks like your test exited with 255 just after 36.
>>>> t/001_stream_rep.pl ..................
>>>> Dubious, test returned 255 (wstat 65280, 0xff00)
>>>> Failed 1/37 subtests
>>>> ...
>>>> Test Summary Report
>>>> -------------------
>>>> t/001_stream_rep.pl                (Wstat: 65280 Tests: 36 Failed: 0)
>>>>     Non-zero exit status: 255
>>>>     Parse errors: Bad plan.  You planned 37 tests but ran 36.
>>>> Files=21, Tests=239, 302 wallclock secs ( 0.10 usr 0.05 sys + 41.69
>>>> cusr 39.84 csys = 81.68 CPU)
>>>> Result: FAIL
>>>> make[2]: *** [check] Error 1
>>>> make[1]: *** [check-recovery-recurse] Error 2
>>>> make[1]: *** Waiting for unfinished jobs....
>>>> t/070_dropuser.pl ......... ok
>>> Mmm. I retried that and saw it succeed (with 0002 applied).
>>> If I modified "user Test::More tests => 37" to 38 in the perl file, I
>>> got a similar result.
>>
>> What happens if you run make check-world with -j 4? When I ran that,
>> the test failed. But with -j 1, the test finished with success. I'm
>> not sure
>> why this happened, though..
> 
> Maybe this is it.
> 
> +    usleep(100_000);
> 
> If the script doesn't find the expected log line, it reaches the
> usleep and barks "Undefined subroutine &main::usleep called...". I
> thought I had tested that path, but perhaps I overlooked the error. "use
> Time::HiRes" is needed.

Yes.

> 
> The attached is the fixed version.

Thanks for updating the patches!

> In the first patch, the test added to 001_stream_rep.pl involves two
> copied functions related to server-log investigation from
> 019_replslot_limit.pl.

So you're planning to define them commonly in TestLib.pm or elsewhere?

+$node_primary_2->init(allows_streaming => 1);
+$node_primary_2->enable_archiving; # needed to make .partial segment

Isn't it better to use the has_archiving flag in init() instead of
calling enable_archiving, like other tests do?
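
That is, something like:

    $node_primary_2->init(allows_streaming => 1, has_archiving => 1);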

0002 looks good to me.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: A failure of standby to follow timeline switch

From
Kyotaro Horiguchi
Date:
At Mon, 4 Jan 2021 19:00:21 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2021/01/04 12:06, Kyotaro Horiguchi wrote:
> > At Sat, 26 Dec 2020 02:15:06 +0900, Fujii Masao
> > <masao.fujii@oss.nttdata.com> wrote in
> >>
> >> On 2020/12/25 12:03, Kyotaro Horiguchi wrote:
> > The attached is the fixed version.
> 
> Thanks for updating the patches!
> 
> > In the first patch, the test added to 001_stream_rep.pl involves two
> > copied functions related to server-log investigation from
> > 019_replslot_limit.pl.
> 
> So you're planning to define them commonly in TestLib.pm or elsewhere?

Yeah.. That's correct. Newly added as the first patch.

While making that change, I extended the interface of slurp_file to
allow reading from an arbitrary position. I attached this as a separate
patch just to clarify the changeset.

The existing messages for open() and OsFHandleOpen() look somewhat
strange after patching since they are not really "read" errors, but
they're harmless. (It also ran successfully on Windows 10.)

The first hunk below is a fix for a forgotten line-feed.

         my $fHandle = createFile($filename, "r", "rwd")
-          or croak "could not open \"$filename\": $^E";
+          or croak "could not open \"$filename\": $^E\n";
         OsFHandleOpen(my $fh = IO::Handle->new(), $fHandle, 'r')
           or croak "could not read \"$filename\": $^E\n";
+        seek($fh, $from, 0)
+          or croak "could not seek \"$filename\" to $from: $^E\n";


> +$node_primary_2->init(allows_streaming => 1);
> +$node_primary_2->enable_archiving; # needed to make .partial segment
> 
> Isn't it better to use the has_archiving flag in init() instead of
> calling enable_archiving, like other tests do?

Agreed. Fixed 0002 (formerly 0001).

> 0002 looks good to me.

Thanks. The attached is the revised patchset. 

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 41169fdadeecc04c20e537cc4e8f68da1a6b3f94 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Tue, 5 Jan 2021 13:34:36 +0900
Subject: [PATCH v3 1/3] Move TAP log-searching feature to common modules

Some private functions in 019_replslot_limit.pl will be used in other
tests. Move them to common modules so that they are available to the
new tests.
---
 src/test/perl/PostgresNode.pm             | 33 ++++++++++++++++++++
 src/test/perl/TestLib.pm                  | 16 +++++++---
 src/test/recovery/t/019_replslot_limit.pl | 37 ++++-------------------
 3 files changed, 51 insertions(+), 35 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 9667f7667e..b09273efe5 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -2223,6 +2223,39 @@ sub pg_recvlogical_upto
 
 =pod
 
+=item $node->current_log_position()
+
+Return the current position of the server log.
+
+=cut
+
+sub current_log_position
+{
+    my $self = shift;
+
+    return (stat $self->logfile)[7];
+}
+
+=pod
+
+=item $node->find_in_log($pattern, $startpos)
+
+Returns whether the $pattern occurs after $startpos in the server log.
+
+=cut
+
+sub find_in_log
+{
+    my ($self, $pattern, $startpos) = @_;
+
+    $startpos = 0 unless defined $startpos;
+    my $log = TestLib::slurp_file($self->logfile, $startpos);
+
+    return $log =~ m/$pattern/;
+}
+
+=pod
+
 =back
 
 =cut
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 1baf6bd001..8751f48dfe 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -430,30 +430,38 @@ sub slurp_dir
 
 =pod
 
-=item slurp_file(filename)
+=item slurp_file(filename, pos)
 
-Return the full contents of the specified file.
+Return the contents after pos of the specified file.
+Returns the full contents if pos is omitted.
 
 =cut
 
 sub slurp_file
 {
-    my ($filename) = @_;
+    my ($filename, $from) = @_;
     local $/;
     my $contents;
+
+    $from = 0 unless defined $from;
+
     if ($Config{osname} ne 'MSWin32')
     {
         open(my $in, '<', $filename)
           or croak "could not read \"$filename\": $!";
+        seek($in, $from, 0)
+          or croak "could not seek \"$filename\" to $from: $!";
         $contents = <$in>;
         close $in;
     }
     else
     {
         my $fHandle = createFile($filename, "r", "rwd")
-          or croak "could not open \"$filename\": $^E";
+          or croak "could not open \"$filename\": $^E\n";
         OsFHandleOpen(my $fh = IO::Handle->new(), $fHandle, 'r')
           or croak "could not read \"$filename\": $^E\n";
+        seek($fh, $from, 0)
+          or croak "could not seek \"$filename\" to $from: $^E\n";
         $contents = <$fh>;
         CloseHandle($fHandle)
           or croak "could not close \"$filename\": $^E\n";
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 20f4a1bbc3..b298d9992f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -165,19 +165,17 @@ $node_primary->wait_for_catchup($node_standby, 'replay', $start_lsn);
 
 $node_standby->stop;
 
-ok( !find_in_log(
-        $node_standby,
-        "requested WAL segment [0-9A-F]+ has already been removed"),
+ok( !$node_standby->find_in_log(
+         "requested WAL segment [0-9A-F]+ has already been removed"),
     'check that required WAL segments are still available');
 
 # Advance WAL again, the slot loses the oldest segment.
-my $logstart = get_log_size($node_primary);
+my $logstart = $node_primary->current_log_position();
 advance_wal($node_primary, 7);
 $node_primary->safe_psql('postgres', "CHECKPOINT;");
 
 # WARNING should be issued
-ok( find_in_log(
-        $node_primary,
+ok( $node_primary->find_in_log(
         "invalidating slot \"rep1\" because its restart_lsn [0-9A-F/]+ exceeds max_slot_wal_keep_size",
         $logstart),
     'check that the warning is logged');
@@ -190,14 +188,13 @@ is($result, "rep1|f|t|lost|",
     'check that the slot became inactive and the state "lost" persists');
 
 # The standby no longer can connect to the primary
-$logstart = get_log_size($node_standby);
+$logstart = $node_standby->current_log_position();
 $node_standby->start;
 
 my $failed = 0;
 for (my $i = 0; $i < 10000; $i++)
 {
-    if (find_in_log(
-            $node_standby,
+    if ($node_standby->find_in_log(
             "requested WAL segment [0-9A-F]+ has already been removed",
             $logstart))
     {
@@ -264,25 +261,3 @@ sub advance_wal
     }
     return;
 }
-
-# return the size of logfile of $node in bytes
-sub get_log_size
-{
-    my ($node) = @_;
-
-    return (stat $node->logfile)[7];
-}
-
-# find $pat in logfile of $node after $off-th byte
-sub find_in_log
-{
-    my ($node, $pat, $off) = @_;
-
-    $off = 0 unless defined $off;
-    my $log = TestLib::slurp_file($node->logfile);
-    return 0 if (length($log) <= $off);
-
-    $log = substr($log, $off);
-
-    return $log =~ m/$pat/;
-}
-- 
2.27.0

From 918d75aa4064cda0fe6050b80b3aa914ab7eef47 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Wed, 9 Dec 2020 17:21:49 +0900
Subject: [PATCH v3 2/3] New test for timeline-tracking of walsender

Walsender should track timeline changes while sending a historic
timeline. Add a test for it.
---
 src/test/recovery/t/001_stream_rep.pl | 41 ++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53de7..e78e98fb43 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -2,8 +2,9 @@
 use strict;
 use warnings;
 use PostgresNode;
+use Time::HiRes qw(usleep);
 use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 37;
 
 # Initialize primary node
 my $node_primary = get_new_node('primary');
@@ -409,3 +410,41 @@ ok( ($phys_restart_lsn_pre cmp $phys_restart_lsn_post) == 0,
 my $primary_data = $node_primary->data_dir;
 ok(!-f "$primary_data/pg_wal/$segment_removed",
     "WAL segment $segment_removed recycled after physical slot advancing");
+
+#
+# Check if timeline-increment works while reading a historic timeline.
+my $node_primary_2 = get_new_node('primary_2');
+# archiving is needed to create .partial segment
+$node_primary_2->init(allows_streaming => 1, has_archiving => 1);
+$node_primary_2->start;
+$node_primary_2->backup($backup_name);
+my $node_standby_3 = get_new_node('standby_3');
+$node_standby_3->init_from_backup($node_primary_2, $backup_name,
+                                  has_streaming => 1);
+$node_primary_2->stop;
+$node_primary_2->set_standby_mode; # increment primary timeline
+$node_primary_2->start;
+$node_primary_2->promote;
+my $logstart = $node_standby_3->current_log_position();
+$node_standby_3->start;
+
+my $success = 0;
+for (my $i = 0 ; $i < 1000; $i++)
+{
+    if ($node_standby_3->find_in_log(
+            "requested WAL segment [0-9A-F]+ has already been removed",
+            $logstart))
+    {
+        last;
+    }
+    elsif ($node_standby_3->find_in_log(
+            "End of WAL reached on timeline",
+               $logstart))
+    {
+        $success = 1;
+        last;
+    }
+    usleep(100_000);
+}
+
+ok($success, 'Timeline increment while reading a historic timeline');
-- 
2.27.0

From 5212aa7733aecd7c4ae8cc615584f26caa8a2ee4 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Wed, 9 Dec 2020 17:22:41 +0900
Subject: [PATCH v3 3/3] Fix timeline-tracking failure while sending a historic
 timeline

Walsender should track timeline switches while sending a historic
timeline. Regain that behavior, which was broken in PG13, by a thinko
of 709d003fbd.
---
 src/backend/replication/walsender.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index fe0d368a35..8545c6c423 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2491,7 +2491,7 @@ WalSndSegmentOpen(XLogReaderState *state, XLogSegNo nextSegNo,
         XLogSegNo    endSegNo;
 
         XLByteToSeg(sendTimeLineValidUpto, endSegNo, state->segcxt.ws_segsize);
-        if (state->seg.ws_segno == endSegNo)
+        if (nextSegNo == endSegNo)
             *tli_p = sendTimeLineNextTLI;
     }
 
-- 
2.27.0


Re: A failure of standby to follow timeline switch

From
Kyotaro Horiguchi
Date:
At Tue, 05 Jan 2021 17:26:02 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> Thanks. The attached is the revised patchset. 

The patchset does not apply to PG13 due to wording changes. Attached is
an all-in-one version that applies to PG13.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From d4b11d93f93f3af2aa55033b91ddbec176763325 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Tue, 5 Jan 2021 13:34:36 +0900
Subject: [PATCH] Fix timeline-tracking failure while sending a historic
 timeline

Walsender should track timeline switches while sending a historic
timeline. Regain that behavior, which was broken in PG13, by a thinko
of 709d003fbd. Backpatch to PG13.
---
 src/backend/replication/walsender.c       |  2 +-
 src/test/perl/PostgresNode.pm             | 33 ++++++++++++++++++
 src/test/perl/TestLib.pm                  | 16 ++++++---
 src/test/recovery/t/001_stream_rep.pl     | 41 ++++++++++++++++++++++-
 src/test/recovery/t/019_replslot_limit.pl | 37 ++++----------------
 5 files changed, 92 insertions(+), 37 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 7f87eb7f19..04f6c3ebb4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2478,7 +2478,7 @@ WalSndSegmentOpen(XLogReaderState *state, XLogSegNo nextSegNo,
         XLogSegNo    endSegNo;
 
         XLByteToSeg(sendTimeLineValidUpto, endSegNo, state->segcxt.ws_segsize);
-        if (state->seg.ws_segno == endSegNo)
+        if (nextSegNo == endSegNo)
             *tli_p = sendTimeLineNextTLI;
     }
 
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 980f1f1533..687aa3ac88 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -2138,6 +2138,39 @@ sub pg_recvlogical_upto
 
 =pod
 
+=item $node->current_log_position()
+
+Return the current position of the server log.
+
+=cut
+
+sub current_log_position
+{
+    my $self = shift;
+
+    return (stat $self->logfile)[7];
+}
+
+=pod
+
+=item $node->find_in_log($pattern, $startpos)
+
+Returns whether the $pattern occurs after $startpos in the server log.
+
+=cut
+
+sub find_in_log
+{
+    my ($self, $pattern, $startpos) = @_;
+
+    $startpos = 0 unless defined $startpos;
+    my $log = TestLib::slurp_file($self->logfile, $startpos);
+
+    return $log =~ m/$pattern/;
+}
+
+=pod
+
 =back
 
 =cut
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index a7490d2ce7..a0ce9521e2 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -402,30 +402,38 @@ sub slurp_dir
 
 =pod
 
-=item slurp_file(filename)
+=item slurp_file(filename, pos)
 
-Return the full contents of the specified file.
+Return the contents after pos of the specified file.
+Returns the full contents if pos is omitted.
 
 =cut
 
 sub slurp_file
 {
-    my ($filename) = @_;
+    my ($filename, $from) = @_;
     local $/;
     my $contents;
+
+    $from = 0 unless defined $from;
+
     if ($Config{osname} ne 'MSWin32')
     {
         open(my $in, '<', $filename)
           or die "could not read \"$filename\": $!";
+        seek($in, $from, 0)
+          or die "could not seek \"$filename\" to $from: $!";
         $contents = <$in>;
         close $in;
     }
     else
     {
         my $fHandle = createFile($filename, "r", "rwd")
-          or die "could not open \"$filename\": $^E";
+          or die "could not open \"$filename\": $^E\n";
         OsFHandleOpen(my $fh = IO::Handle->new(), $fHandle, 'r')
           or die "could not read \"$filename\": $^E\n";
+        seek($fh, $from, 0)
+          or die "could not seek \"$filename\" to $from: $^E\n";
         $contents = <$fh>;
         CloseHandle($fHandle)
           or die "could not close \"$filename\": $^E\n";
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 778f11b28b..8d2b24fe55 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -2,8 +2,9 @@
 use strict;
 use warnings;
 use PostgresNode;
+use Time::HiRes qw(usleep);
 use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 37;
 
 # Initialize master node
 my $node_master = get_new_node('master');
@@ -409,3 +410,41 @@ ok( ($phys_restart_lsn_pre cmp $phys_restart_lsn_post) == 0,
 my $master_data = $node_master->data_dir;
 ok(!-f "$master_data/pg_wal/$segment_removed",
     "WAL segment $segment_removed recycled after physical slot advancing");
+
+#
+# Check if timeline-increment works while reading a historic timeline.
+my $node_primary_2 = get_new_node('primary_2');
+# archiving is needed to create .partial segment
+$node_primary_2->init(allows_streaming => 1, has_archiving => 1);
+$node_primary_2->start;
+$node_primary_2->backup($backup_name);
+my $node_standby_3 = get_new_node('standby_3');
+$node_standby_3->init_from_backup($node_primary_2, $backup_name,
+                                  has_streaming => 1);
+$node_primary_2->stop;
+$node_primary_2->set_standby_mode; # increment primary timeline
+$node_primary_2->start;
+$node_primary_2->promote;
+my $logstart = $node_standby_3->current_log_position();
+$node_standby_3->start;
+
+my $success = 0;
+for (my $i = 0 ; $i < 1000; $i++)
+{
+    if ($node_standby_3->find_in_log(
+            "requested WAL segment [0-9A-F]+ has already been removed",
+            $logstart))
+    {
+        last;
+    }
+    elsif ($node_standby_3->find_in_log(
+            "End of WAL reached on timeline",
+               $logstart))
+    {
+        $success = 1;
+        last;
+    }
+    usleep(100_000);
+}
+
+ok($success, 'Timeline increment while reading a historic timeline');
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index a7231dcd47..8b3c5de057 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -165,19 +165,17 @@ $node_master->wait_for_catchup($node_standby, 'replay', $start_lsn);
 
 $node_standby->stop;
 
-ok( !find_in_log(
-        $node_standby,
-        "requested WAL segment [0-9A-F]+ has already been removed"),
+ok( !$node_standby->find_in_log(
+         "requested WAL segment [0-9A-F]+ has already been removed"),
     'check that required WAL segments are still available');
 
 # Advance WAL again, the slot loses the oldest segment.
-my $logstart = get_log_size($node_master);
+my $logstart = $node_master->current_log_position();
 advance_wal($node_master, 7);
 $node_master->safe_psql('postgres', "CHECKPOINT;");
 
 # WARNING should be issued
-ok( find_in_log(
-        $node_master,
+ok( $node_master->find_in_log(
         "invalidating slot \"rep1\" because its restart_lsn [0-9A-F/]+ exceeds max_slot_wal_keep_size",
         $logstart),
     'check that the warning is logged');
@@ -190,14 +188,13 @@ is($result, "rep1|f|t|lost|",
     'check that the slot became inactive and the state "lost" persists');
 
 # The standby no longer can connect to the master
-$logstart = get_log_size($node_standby);
+$logstart = $node_standby->current_log_position();
 $node_standby->start;
 
 my $failed = 0;
 for (my $i = 0; $i < 10000; $i++)
 {
-    if (find_in_log(
-            $node_standby,
+    if ($node_standby->find_in_log(
             "requested WAL segment [0-9A-F]+ has already been removed",
             $logstart))
     {
@@ -264,25 +261,3 @@ sub advance_wal
     }
     return;
 }
-
-# return the size of logfile of $node in bytes
-sub get_log_size
-{
-    my ($node) = @_;
-
-    return (stat $node->logfile)[7];
-}
-
-# find $pat in logfile of $node after $off-th byte
-sub find_in_log
-{
-    my ($node, $pat, $off) = @_;
-
-    $off = 0 unless defined $off;
-    my $log = TestLib::slurp_file($node->logfile);
-    return 0 if (length($log) <= $off);
-
-    $log = substr($log, $off);
-
-    return $log =~ m/$pat/;
-}
-- 
2.27.0


Re: A failure of standby to follow timeline switch

From
Fujii Masao
Date:

On 2021/01/05 17:26, Kyotaro Horiguchi wrote:
> At Mon, 4 Jan 2021 19:00:21 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>>
>>
>> On 2021/01/04 12:06, Kyotaro Horiguchi wrote:
>>> At Sat, 26 Dec 2020 02:15:06 +0900, Fujii Masao
>>> <masao.fujii@oss.nttdata.com> wrote in
>>>>
>>>> On 2020/12/25 12:03, Kyotaro Horiguchi wrote:
>>> The attached is the fixed version.
>>
>> Thanks for updating the patches!
>>
>>> In the first patch, the test added to 001_stream_rep.pl involves two
>>> copied functions related to server-log investigation from
>>> 019_replslot_limit.pl.
>>
>> So you're planning to define them commonly in TestLib.pm or elsewhere?
> 
> Yeah.. That's correct. Newly added as the first patch.
> 
> While making that change, I extended the interface of slurp_file to
> allow reading from an arbitrary position.

Is this extension really helpful for the current use case?
At least I'd like to avoid back-patching this since it's an extension...

          OsFHandleOpen(my $fh = IO::Handle->new(), $fHandle, 'r')
            or croak "could not read \"$filename\": $^E\n";
+        seek($fh, $from, 0)
+          or croak "could not seek \"$filename\" to $from: $^E\n";

I'm not familiar with this area, but isn't SetFilePointer() more
suitable than seek()?


> Thanks. The attached is the revised patchset.

Thanks!

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: A failure of standby to follow timeline switch

From
Kyotaro Horiguchi
Date:
At Thu, 7 Jan 2021 11:55:33 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2021/01/05 17:26, Kyotaro Horiguchi wrote:
> > At Mon, 4 Jan 2021 19:00:21 +0900, Fujii Masao <masao.fujii@oss.nttdata.com>
> > wrote in
> >>
> >>
> >> On 2021/01/04 12:06, Kyotaro Horiguchi wrote:
> >>> At Sat, 26 Dec 2020 02:15:06 +0900, Fujii Masao
> >>> <masao.fujii@oss.nttdata.com> wrote in
> >>>>
> >>>> On 2020/12/25 12:03, Kyotaro Horiguchi wrote:
> >>> The attached is the fixed version.
> >>
> >> Thanks for updating the patches!
> >>
> >>> In the first patch, the test added to 001_stream_rep.pl involves two
> >>> copied functions related to server-log investigation from
> >>> 019_replslot_limit.pl.
> >>
> >> So you're planning to define them commonly in TestLib.pm or elsewhere?
> > Yeah.. That's correct. Newly added as the first patch.
> > While making that change, I extended the interface of slurp_file to
> > allow reading from an arbitrary position.
> 
> Is this extension really helpful for the current use case?
> At least I'd like to avoid back-patching this since it's an extension...

Yeah, I hesitated about it a bit. It's less useful, assuming
that log files won't get very large. Removed in this version.

>          OsFHandleOpen(my $fh = IO::Handle->new(), $fHandle, 'r')
>            or croak "could not read \"$filename\": $^E\n";
> +        seek($fh, $from, 0)
> +          or croak "could not seek \"$filename\" to $from: $^E\n";
> 
> I'm not familiar with this area, but isn't SetFilePointer() more
> suitable than seek()?

SetFilePointer() works on a native handle ($fHandle here), while seek()
works on $fh, a Perl handle.  If ReadFile() were used later,
SetFilePointer() might be needed separately.

Anyway, it is removed.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 39f284c2109b63e444eb8d437fe51ae3268d305b Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Tue, 5 Jan 2021 13:34:36 +0900
Subject: [PATCH v4 1/3] Move TAP log-searching feature to common modules

Some private functions in 019_replslot_limit.pl will be used in other
tests. Move them to common modules so that they are available to the
new tests.
---
 src/test/perl/PostgresNode.pm             | 36 ++++++++++++++++++++++
 src/test/recovery/t/019_replslot_limit.pl | 37 ++++-------------------
 2 files changed, 42 insertions(+), 31 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 9667f7667e..d78e9f93fb 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -2223,6 +2223,42 @@ sub pg_recvlogical_upto
 
 =pod
 
+=item $node->current_log_position()
+
+Return the current position of the server log.
+
+=cut
+
+sub current_log_position
+{
+    my $self = shift;
+
+    return (stat $self->logfile)[7];
+}
+
+=pod
+
+=item $node->find_in_log($pattern, $startpos)
+
+Returns whether the $pattern occurs after $startpos in the server log.
+
+=cut
+
+sub find_in_log
+{
+    my ($self, $pattern, $startpos) = @_;
+
+    $startpos = 0 unless defined $startpos;
+    my $log = TestLib::slurp_file($self->logfile);
+    return 0 if (length($log) <= $startpos);
+
+    $log = substr($log, $startpos);
+
+    return $log =~ m/$pattern/;
+}
+
+=pod
+
 =back
 
 =cut
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 20f4a1bbc3..b298d9992f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -165,19 +165,17 @@ $node_primary->wait_for_catchup($node_standby, 'replay', $start_lsn);
 
 $node_standby->stop;
 
-ok( !find_in_log(
-        $node_standby,
-        "requested WAL segment [0-9A-F]+ has already been removed"),
+ok( !$node_standby->find_in_log(
+         "requested WAL segment [0-9A-F]+ has already been removed"),
     'check that required WAL segments are still available');
 
 # Advance WAL again, the slot loses the oldest segment.
-my $logstart = get_log_size($node_primary);
+my $logstart = $node_primary->current_log_position();
 advance_wal($node_primary, 7);
 $node_primary->safe_psql('postgres', "CHECKPOINT;");
 
 # WARNING should be issued
-ok( find_in_log(
-        $node_primary,
+ok( $node_primary->find_in_log(
         "invalidating slot \"rep1\" because its restart_lsn [0-9A-F/]+ exceeds max_slot_wal_keep_size",
         $logstart),
     'check that the warning is logged');
@@ -190,14 +188,13 @@ is($result, "rep1|f|t|lost|",
     'check that the slot became inactive and the state "lost" persists');
 
 # The standby no longer can connect to the primary
-$logstart = get_log_size($node_standby);
+$logstart = $node_standby->current_log_position();
 $node_standby->start;
 
 my $failed = 0;
 for (my $i = 0; $i < 10000; $i++)
 {
-    if (find_in_log(
-            $node_standby,
+    if ($node_standby->find_in_log(
             "requested WAL segment [0-9A-F]+ has already been removed",
             $logstart))
     {
@@ -264,25 +261,3 @@ sub advance_wal
     }
     return;
 }
-
-# return the size of logfile of $node in bytes
-sub get_log_size
-{
-    my ($node) = @_;
-
-    return (stat $node->logfile)[7];
-}
-
-# find $pat in logfile of $node after $off-th byte
-sub find_in_log
-{
-    my ($node, $pat, $off) = @_;
-
-    $off = 0 unless defined $off;
-    my $log = TestLib::slurp_file($node->logfile);
-    return 0 if (length($log) <= $off);
-
-    $log = substr($log, $off);
-
-    return $log =~ m/$pat/;
-}
-- 
2.27.0

From 97d3cfbad959ca889f26fe0d666566340161bd39 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Wed, 9 Dec 2020 17:21:49 +0900
Subject: [PATCH v4 2/3] New test for timeline-tracking of walsender

Walsender should track timeline changes while sending a historic
timeline. Add a test for it.
---
 src/test/recovery/t/001_stream_rep.pl | 41 ++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53de7..e78e98fb43 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -2,8 +2,9 @@
 use strict;
 use warnings;
 use PostgresNode;
+use Time::HiRes qw(usleep);
 use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 37;
 
 # Initialize primary node
 my $node_primary = get_new_node('primary');
@@ -409,3 +410,41 @@ ok( ($phys_restart_lsn_pre cmp $phys_restart_lsn_post) == 0,
 my $primary_data = $node_primary->data_dir;
 ok(!-f "$primary_data/pg_wal/$segment_removed",
     "WAL segment $segment_removed recycled after physical slot advancing");
+
+#
+# Check if timeline-increment works while reading a historic timeline.
+my $node_primary_2 = get_new_node('primary_2');
+# archiving is needed to create .partial segment
+$node_primary_2->init(allows_streaming => 1, has_archiving => 1);
+$node_primary_2->start;
+$node_primary_2->backup($backup_name);
+my $node_standby_3 = get_new_node('standby_3');
+$node_standby_3->init_from_backup($node_primary_2, $backup_name,
+                                  has_streaming => 1);
+$node_primary_2->stop;
+$node_primary_2->set_standby_mode; # increment primary timeline
+$node_primary_2->start;
+$node_primary_2->promote;
+my $logstart = $node_standby_3->current_log_position();
+$node_standby_3->start;
+
+my $success = 0;
+for (my $i = 0 ; $i < 1000; $i++)
+{
+    if ($node_standby_3->find_in_log(
+            "requested WAL segment [0-9A-F]+ has already been removed",
+            $logstart))
+    {
+        last;
+    }
+    elsif ($node_standby_3->find_in_log(
+            "End of WAL reached on timeline",
+               $logstart))
+    {
+        $success = 1;
+        last;
+    }
+    usleep(100_000);
+}
+
+ok($success, 'Timeline increment while reading a historic timeline');
-- 
2.27.0

From 29decc6dc2ccea5ef4e2c8b92f9c2b65a3898fcb Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Wed, 9 Dec 2020 17:22:41 +0900
Subject: [PATCH v4 3/3] Fix timeline-tracking failure while sending a historic
 timeline

Walsender should track timeline switches while sending a historic
timeline. Regain that behavior, which was broken in PG13, by a thinko
of 709d003fbd.
---
 src/backend/replication/walsender.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index fe0d368a35..8545c6c423 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2491,7 +2491,7 @@ WalSndSegmentOpen(XLogReaderState *state, XLogSegNo nextSegNo,
         XLogSegNo    endSegNo;
 
         XLByteToSeg(sendTimeLineValidUpto, endSegNo, state->segcxt.ws_segsize);
-        if (state->seg.ws_segno == endSegNo)
+        if (nextSegNo == endSegNo)
             *tli_p = sendTimeLineNextTLI;
     }
 
-- 
2.27.0

From 933dd946ca547b7de2dfed84807cd9d871c12f6f Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Tue, 5 Jan 2021 13:34:36 +0900
Subject: [PATCH] Fix timeline-tracking failure while sending a historic
 timeline

Walsender should track timeline switches while sending a historic
timeline. Regain that behavior, which was broken in PG13, by a thinko
of 709d003fbd. Backpatch to PG13.
---
 src/backend/replication/walsender.c       |  2 +-
 src/test/perl/PostgresNode.pm             | 36 ++++++++++++++++++++
 src/test/recovery/t/001_stream_rep.pl     | 41 ++++++++++++++++++++++-
 src/test/recovery/t/019_replslot_limit.pl | 37 ++++----------------
 4 files changed, 83 insertions(+), 33 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 7f87eb7f19..04f6c3ebb4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2478,7 +2478,7 @@ WalSndSegmentOpen(XLogReaderState *state, XLogSegNo nextSegNo,
         XLogSegNo    endSegNo;
 
         XLByteToSeg(sendTimeLineValidUpto, endSegNo, state->segcxt.ws_segsize);
-        if (state->seg.ws_segno == endSegNo)
+        if (nextSegNo == endSegNo)
             *tli_p = sendTimeLineNextTLI;
     }
 
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 980f1f1533..a08c71b549 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -2138,6 +2138,42 @@ sub pg_recvlogical_upto
 
 =pod
 
+=item $node->current_log_position()
+
+Return the current position of the server log.
+
+=cut
+
+sub current_log_position
+{
+    my $self = shift;
+
+    return (stat $self->logfile)[7];
+}
+
+=pod
+
+=item $node->find_in_log($pattern, $startpos)
+
+Returns whether the $pattern occurs after $startpos in the server log.
+
+=cut
+
+sub find_in_log
+{
+    my ($self, $pattern, $startpos) = @_;
+
+    $startpos = 0 unless defined $startpos;
+    my $log = TestLib::slurp_file($self->logfile);
+    return 0 if (length($log) <= $startpos);
+
+    $log = substr($log, $startpos);
+
+    return $log =~ m/$pattern/;
+}
+
+=pod
+
 =back
 
 =cut
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 778f11b28b..8d2b24fe55 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -2,8 +2,9 @@
 use strict;
 use warnings;
 use PostgresNode;
+use Time::HiRes qw(usleep);
 use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 37;
 
 # Initialize master node
 my $node_master = get_new_node('master');
@@ -409,3 +410,41 @@ ok( ($phys_restart_lsn_pre cmp $phys_restart_lsn_post) == 0,
 my $master_data = $node_master->data_dir;
 ok(!-f "$master_data/pg_wal/$segment_removed",
     "WAL segment $segment_removed recycled after physical slot advancing");
+
+#
+# Check if timeline-increment works while reading a historic timeline.
+my $node_primary_2 = get_new_node('primary_2');
+# archiving is needed to create .partial segment
+$node_primary_2->init(allows_streaming => 1, has_archiving => 1);
+$node_primary_2->start;
+$node_primary_2->backup($backup_name);
+my $node_standby_3 = get_new_node('standby_3');
+$node_standby_3->init_from_backup($node_primary_2, $backup_name,
+                                  has_streaming => 1);
+$node_primary_2->stop;
+$node_primary_2->set_standby_mode; # increment primary timeline
+$node_primary_2->start;
+$node_primary_2->promote;
+my $logstart = $node_standby_3->current_log_position();
+$node_standby_3->start;
+
+my $success = 0;
+for (my $i = 0 ; $i < 1000; $i++)
+{
+    if ($node_standby_3->find_in_log(
+            "requested WAL segment [0-9A-F]+ has already been removed",
+            $logstart))
+    {
+        last;
+    }
+    elsif ($node_standby_3->find_in_log(
+            "End of WAL reached on timeline",
+               $logstart))
+    {
+        $success = 1;
+        last;
+    }
+    usleep(100_000);
+}
+
+ok($success, 'Timeline increment while reading a historic timeline');
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index a7231dcd47..8b3c5de057 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -165,19 +165,17 @@ $node_master->wait_for_catchup($node_standby, 'replay', $start_lsn);
 
 $node_standby->stop;
 
-ok( !find_in_log(
-        $node_standby,
-        "requested WAL segment [0-9A-F]+ has already been removed"),
+ok( !$node_standby->find_in_log(
+         "requested WAL segment [0-9A-F]+ has already been removed"),
     'check that required WAL segments are still available');
 
 # Advance WAL again, the slot loses the oldest segment.
-my $logstart = get_log_size($node_master);
+my $logstart = $node_master->current_log_position();
 advance_wal($node_master, 7);
 $node_master->safe_psql('postgres', "CHECKPOINT;");
 
 # WARNING should be issued
-ok( find_in_log(
-        $node_master,
+ok( $node_master->find_in_log(
         "invalidating slot \"rep1\" because its restart_lsn [0-9A-F/]+ exceeds max_slot_wal_keep_size",
         $logstart),
     'check that the warning is logged');
@@ -190,14 +188,13 @@ is($result, "rep1|f|t|lost|",
     'check that the slot became inactive and the state "lost" persists');
 
 # The standby no longer can connect to the master
-$logstart = get_log_size($node_standby);
+$logstart = $node_standby->current_log_position();
 $node_standby->start;
 
 my $failed = 0;
 for (my $i = 0; $i < 10000; $i++)
 {
-    if (find_in_log(
-            $node_standby,
+    if ($node_standby->find_in_log(
             "requested WAL segment [0-9A-F]+ has already been removed",
             $logstart))
     {
@@ -264,25 +261,3 @@ sub advance_wal
     }
     return;
 }
-
-# return the size of logfile of $node in bytes
-sub get_log_size
-{
-    my ($node) = @_;
-
-    return (stat $node->logfile)[7];
-}
-
-# find $pat in logfile of $node after $off-th byte
-sub find_in_log
-{
-    my ($node, $pat, $off) = @_;
-
-    $off = 0 unless defined $off;
-    my $log = TestLib::slurp_file($node->logfile);
-    return 0 if (length($log) <= $off);
-
-    $log = substr($log, $off);
-
-    return $log =~ m/$pat/;
-}
-- 
2.27.0


Re: A failure of standby to follow timeline switch

From
Alvaro Herrera
Date:
Masao-san: Are you intending to act as committer for these?  Since the
bug is mine I can look into it, but since you already did all the
reviewing work, I'm good with you giving it the final push.

0001 looks good to me; let's get that one committed quickly so that we
can focus on the interesting stuff.  While the implementation of
find_in_log is quite dumb (not this patch's fault), it seems sufficient
to deal with small log files.  We can improve the implementation later,
if needed, but we have to get the API right on the first try.

0003: The fix looks good to me.  I verified that the test fails without
the fix, and it passes with the fix.


The test added in 0002 is a bit optimistic regarding timing, as well as
potentially slow; it loops 1000 times and sleeps 100 milliseconds each
time.  On a very slow server (valgrind or clobber_cache animals) this
might not be sufficient time, while on fast servers it may end up
waiting longer than needed.  Maybe we can do something like this:

for (my $i = 0 ; $i < 1000; $i++)
{
    my $current_log_size = determine_current_log_size();

    if ($node_standby_3->find_in_log(
            "requested WAL segment [0-9A-F]+ has already been removed",
            $logstart))
    {
        last;
    }
    elsif ($node_standby_3->find_in_log(
            "End of WAL reached on timeline",
               $logstart))
    {
        $success = 1;
        last;
    }
    $logstart = $current_log_size;

    while (determine_current_log_size() == $current_log_size)
    {
        usleep(10_000);
        # with a retry count?
    }
}

With test patch, make check PROVE_FLAGS="--timer" PROVE_TESTS=t/001_stream_rep.pl

ok     6386 ms ( 0.00 usr  0.00 sys +  1.14 cusr  0.93 csys =  2.07 CPU)
ok     6352 ms ( 0.00 usr  0.00 sys +  1.10 cusr  0.94 csys =  2.04 CPU)
ok     6255 ms ( 0.01 usr  0.00 sys +  0.99 cusr  0.97 csys =  1.97 CPU)

Without the test patch:

ok     4954 ms ( 0.00 usr  0.00 sys +  0.71 cusr  0.64 csys =  1.35 CPU)
ok     5033 ms ( 0.01 usr  0.00 sys +  0.71 cusr  0.73 csys =  1.45 CPU)
ok     4991 ms ( 0.01 usr  0.00 sys +  0.73 cusr  0.59 csys =  1.33 CPU)

-- 
Álvaro Herrera



Re: A failure of standby to follow timeline switch

From
Fujii Masao
Date
On Sat, Jan 9, 2021 at 5:08 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> Masao-san: Are you intending to act as committer for these?  Since the
> bug is mine I can look into it, but since you already did all the
> reviewing work, I'm good with you giving it the final push.

Thanks! I'm planning to push the patch.


> 0001 looks good to me; let's get that one committed quickly so that we
> can focus on the interesting stuff.  While the implementation of
> find_in_log is quite dumb (not this patch's fault), it seems sufficient
> to deal with small log files.  We can improve the implementation later,
> if needed, but we have to get the API right on the first try.
>
> 0003: The fix looks good to me.  I verified that the test fails without
> the fix, and it passes with the fix.

Yes.


> The test added in 0002 is a bit optimistic regarding timing, as well as
> potentially slow; it loops 1000 times and sleeps 100 milliseconds each
> time.  On a very slow server (valgrind or clobber_cache animals) this
> might not be sufficient time, while on fast servers it may end up
> waiting longer than needed.  Maybe we can do something like this:

On second thought, I think that the regression test should be in
004_timeline_switch.pl instead of 001_stream_rep.pl because it's
a test of the timeline switch.  Also, I think it's better to test
the timeline switch by checking whether some data is successfully
replicated, as the existing regression test for timeline switch in
004_timeline_switch.pl does, instead of searching for a specific
message in the log file.  I attached a POC patch.  Thoughts?
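
For readers without the attachment, the core of such a data-based check
might look roughly like this (the table name is illustrative; the actual
POC may differ):

# Hypothetical sketch: after the standby reconnects to the promoted
# primary, verify that a row inserted on the new timeline is replicated.
$node_primary_2->safe_psql('postgres', 'CREATE TABLE tab_tli (a int)');
$node_primary_2->safe_psql('postgres', 'INSERT INTO tab_tli VALUES (1)');
$node_primary_2->wait_for_catchup($node_standby_3, 'replay',
    $node_primary_2->lsn('insert'));
my $result =
  $node_standby_3->safe_psql('postgres', 'SELECT count(*) FROM tab_tli');
is($result, '1', 'standby follows the timeline switch');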

Regards,

-- 
Fujii Masao

Attachments

Re: A failure of standby to follow timeline switch

From
Kyotaro Horiguchi
Date
At Tue, 12 Jan 2021 10:47:21 +0900, Fujii Masao <masao.fujii@gmail.com> wrote in 
> On Sat, Jan 9, 2021 at 5:08 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > Masao-san: Are you intending to act as committer for these?  Since the
> > bug is mine I can look into it, but since you already did all the
> > reviewing work, I'm good with you giving it the final push.
> 
> Thanks! I'm planning to push the patch.
> 
> 
> > 0001 looks good to me; let's get that one committed quickly so that we
> > can focus on the interesting stuff.  While the implementation of
> > find_in_log is quite dumb (not this patch's fault), it seems sufficient
> > to deal with small log files.  We can improve the implementation later,
> > if needed, but we have to get the API right on the first try.
> >
> > 0003: The fix looks good to me.  I verified that the test fails without
> > the fix, and it passes with the fix.
> 
> Yes.
> 
> 
> > The test added in 0002 is a bit optimistic regarding timing, as well as
> > potentially slow; it loops 1000 times and sleeps 100 milliseconds each
> > time.  On a very slow server (valgrind or clobber_cache animals) this
> > might not be sufficient time, while on fast servers it may end up
> > waiting longer than needed.  Maybe we can do something like this:
> 
> On second thought, I think that the regression test should be in
> 004_timeline_switch.pl instead of 001_stream_rep.pl because it's

Agreed. It's definitely the right place.

> a test of the timeline switch.  Also, I think it's better to test
> the timeline switch by checking whether some data is successfully
> replicated, as the existing regression test for timeline switch in
> 004_timeline_switch.pl does, instead of searching for a specific
> message in the log file.  I attached a POC patch.  Thoughts?

It's effectively a check for this issue, and looks better.  The 180s
timeout in the failure case seems a bit annoying, but that's the way
all tests of this kind work.

The last check on the table contents is not strictly necessary, but it
might make sense as a confirmation that replication is actually
working.  However, I don't think the test needs to insert as many as
1000 tuples; a single tuple would suffice.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: A failure of standby to follow timeline switch

From
Fujii Masao
Date
On Wed, Jan 13, 2021 at 10:48 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Tue, 12 Jan 2021 10:47:21 +0900, Fujii Masao <masao.fujii@gmail.com> wrote in
> > On Sat, Jan 9, 2021 at 5:08 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > >
> > > Masao-san: Are you intending to act as committer for these?  Since the
> > > bug is mine I can look into it, but since you already did all the
> > > reviewing work, I'm good with you giving it the final push.
> >
> > Thanks! I'm planning to push the patch.
> >
> >
> > > 0001 looks good to me; let's get that one committed quickly so that we
> > > can focus on the interesting stuff.  While the implementation of
> > > find_in_log is quite dumb (not this patch's fault), it seems sufficient
> > > to deal with small log files.  We can improve the implementation later,
> > > if needed, but we have to get the API right on the first try.
> > >
> > > 0003: The fix looks good to me.  I verified that the test fails without
> > > the fix, and it passes with the fix.
> >
> > Yes.
> >
> >
> > > The test added in 0002 is a bit optimistic regarding timing, as well as
> > > potentially slow; it loops 1000 times and sleeps 100 milliseconds each
> > > time.  On a very slow server (valgrind or clobber_cache animals) this
> > > might not be sufficient time, while on fast servers it may end up
> > > waiting longer than needed.  Maybe we can do something like this:
> >
> > On second thought, I think that the regression test should be in
> > 004_timeline_switch.pl instead of 001_stream_rep.pl because it's
>
> Agreed. It's definitely the right place.
>
> > a test of the timeline switch.  Also, I think it's better to test
> > the timeline switch by checking whether some data is successfully
> > replicated, as the existing regression test for timeline switch in
> > 004_timeline_switch.pl does, instead of searching for a specific
> > message in the log file.  I attached a POC patch.  Thoughts?
>
> It's effectively a check for this issue, and looks better.  The 180s
> timeout in the failure case seems a bit annoying, but that's the way
> all tests of this kind work.

Yes.

>
> The last check on the table contents is not strictly necessary, but it
> might make sense as a confirmation that replication is actually
> working.  However, I don't think the test needs to insert as many as
> 1000 tuples; a single tuple would suffice.

Thanks for the review!
I'm OK with this change (i.e., inserting only a single row).
Attached is the updated version of the patch.

Regards,

-- 
Fujii Masao

Attachments

Re: A failure of standby to follow timeline switch

From
Alvaro Herrera
Date
On 2021-Jan-13, Fujii Masao wrote:

> Thanks for the review!
> I'm OK with this change (i.e., inserting only a single row).
> Attached is the updated version of the patch.

Looks good to me, thanks!

-- 
Álvaro Herrera                            39°49'30"S 73°17'W



Re: A failure of standby to follow timeline switch

From
Kyotaro Horiguchi
Date
At Wed, 13 Jan 2021 16:51:32 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in 
> On 2021-Jan-13, Fujii Masao wrote:
> 
> > Thanks for the review!
> > I'm OK with this change (i.e., inserting only a single row).
> > Attached is the updated version of the patch.
> 
> Looks good to me, thanks!

+1

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: A failure of standby to follow timeline switch

From
Fujii Masao
Date
On Thu, Jan 14, 2021 at 10:10 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Wed, 13 Jan 2021 16:51:32 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in
> > On 2021-Jan-13, Fujii Masao wrote:
> >
> > > Thanks for the review!
> > > I'm OK with this change (i.e., inserting only a single row).
> > > Attached is the updated version of the patch.
> >
> > Looks good to me, thanks!
>
> +1

Thanks Alvaro and Horiguchi for the review! I pushed the patch.

Regards,

-- 
Fujii Masao



Re: A failure of standby to follow timeline switch

From
Kyotaro Horiguchi
Date
At Thu, 14 Jan 2021 12:34:01 +0900, Fujii Masao <masao.fujii@gmail.com> wrote in 
> On Thu, Jan 14, 2021 at 10:10 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > At Wed, 13 Jan 2021 16:51:32 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in
> > > On 2021-Jan-13, Fujii Masao wrote:
> > >
> > > > Thanks for the review!
> > > > I'm OK with this change (i.e., inserting only a single row).
> > > > Attached is the updated version of the patch.
> > >
> > > Looks good to me, thanks!
> >
> > +1
> 
> Thanks Alvaro and Horiguchi for the review! I pushed the patch.

Thanks for committing this fix!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center