BUG #14109: pg_rewind fails to update target control file in one scenario

From johnlumby@hotmail.com
Subject BUG #14109: pg_rewind fails to update target control file in one scenario
Date
Msg-id 20160424192549.2725.71787@wrigleys.postgresql.org
Responses Re: BUG #14109: pg_rewind fails to update target control file in one scenario  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      14109
Logged by:          John Lumby
Email address:      johnlumby@hotmail.com
PostgreSQL version: 9.5.1
Operating system:   linux 64-bit
Description:

Scenario:
 Two systems currently in a working streaming replication relationship:
       Primary SystemA             Standby SystemB
 with no WAL queued and no inserts/updates/deletes being performed on SystemA.

  then in chronological sequence :
   .  shut down SystemA
   .  pg_ctl promote SystemB
              and verify systemB is running correctly stand-alone
   .  pg_rewind SystemA
              output is something like
                    connected to server
                    fetched file "global/pg_control", length 8192
                    fetched file "pg_xlog/0000000D.history", length 388
                    servers diverged at WAL position 9/A90002A8 on timeline 12
                    no rewind required

   .  set up correct recovery.conf on SystemA
   .  start SystemA postgres server

 At this point, both SystemB and SystemA appear to be running correctly,
 but any insert/update/delete now performed on SystemB is not replicated to
 SystemA.
 Also, the pg_stat_replication view on SystemB shows state 'startup', not
 'streaming'.
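
For readers decoding the pg_rewind output above: a WAL position such as 9/A90002A8 is the high and low 32-bit halves of a 64-bit position printed in hexadecimal, and a timeline history file is named after the timeline ID as eight hexadecimal digits, so 0000000D.history presumably belongs to timeline 13 on the source. The following is a minimal sketch of both conversions. The helper names parse_wal_position and history_file_name are hypothetical and are not part of PostgreSQL.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse "hi/lo" hexadecimal WAL position text into a
 * 64-bit value. Returns 0 on success, -1 if the text is malformed or has
 * trailing characters.
 */
static int
parse_wal_position(const char *text, uint64_t *pos)
{
    unsigned int hi;
    unsigned int lo;
    char         trailing;

    /* Exactly two conversions must succeed; a third means trailing junk. */
    if (sscanf(text, "%X/%X%c", &hi, &lo, &trailing) != 2)
        return -1;

    *pos = ((uint64_t) hi << 32) | lo;
    return 0;
}

/*
 * Hypothetical helper: build the history file name for a timeline ID,
 * i.e. the ID as eight upper-case hex digits followed by ".history".
 */
static void
history_file_name(uint32_t tli, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%08" PRIX32 ".history", tli);
}
```

With the values from the output above, parse_wal_position("9/A90002A8", &pos) stores 0x9A90002A8, and history_file_name(13, ...) produces "0000000D.history".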

I believe there is a bug in pg_rewind for this scenario, where it finds that
the following conditions are true:
  1 - source and target cluster are not on the same timeline
  2 - the histories diverged exactly at the end of the
      shutdown checkpoint record on the target,
      so there are no WAL records in the target
      that don't belong in the source's history

The code then concludes that no rewind is needed.

That is true. However, what I believe *is* needed is to update the target
control file with the new timeline and other information from the source.
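
To illustrate the distinction, here is a simplified sketch that treats "is a rewind needed?" and "does the target control file need updating?" as two separate questions. It is not pg_rewind's actual code: the types WalPos, TimelineId and RewindPlan and the function plan_rewind are hypothetical names introduced only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not pg_rewind's real types. */
typedef uint64_t WalPos;
typedef uint32_t TimelineId;

typedef struct
{
    bool rewind_needed;         /* must target data files be rewritten? */
    bool control_update_needed; /* must the target control file be refreshed? */
} RewindPlan;

/*
 * Decide what has to happen to the target cluster.
 *
 * target_chkpt_end is the end of the target's shutdown checkpoint record,
 * diverge_pos is the point where the two histories diverged.
 */
static RewindPlan
plan_rewind(TimelineId source_tli, TimelineId target_tli,
            WalPos target_chkpt_end, WalPos diverge_pos)
{
    RewindPlan plan;

    /* No target WAL past the divergence point means nothing to rewind. */
    plan.rewind_needed = (target_chkpt_end != diverge_pos);

    /*
     * Even without a rewind, a timeline change means the target control
     * file is stale and should be refreshed from the source.
     */
    plan.control_update_needed =
        plan.rewind_needed || (source_tli != target_tli);

    return plan;
}
```

For the scenario in this report (target on timeline 12, source assumed to be on timeline 13, divergence exactly at the end of the shutdown checkpoint), plan_rewind(13, 12, pos, pos) gives rewind_needed == false and control_update_needed == true.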

This patch seems to fix the problem on my system:

--- src/bin/pg_rewind/pg_rewind.c.orig    2016-02-08 16:12:28.000000000 -0500
+++ src/bin/pg_rewind/pg_rewind.c    2016-04-24 14:50:52.646737233 -0400
@@ -247,7 +247,14 @@ main(int argc, char **argv)
              * needed.
              */
             if (chkptendrec == divergerec)
+            {
                 rewind_needed = false;
+                /*  however we must still copy the control file from source to target
+                 *  because of the timeline change.
+                 */
+                printf(_("no rewind required but will update global control file from source for increase in timeline.\n"));
+                goto updateControlFile;
+            }
             else
                 rewind_needed = true;
         }
@@ -318,6 +325,7 @@ main(int argc, char **argv)
     pg_log(PG_PROGRESS, "\ncreating backup label and updating control file\n");
     createBackupLabel(chkptredo, chkpttli, chkptrec);

+  updateControlFile:
     /*
      * Update control file of target. Make it ready to perform archive
      * recovery when restarting.
