Discussion: async streaming and recovery_target_timeline=latest


async streaming and recovery_target_timeline=latest

From
Ben Chobot
Date:
We have an async streaming setup using 9.1.9 and 3 nodes - let's call them A, B, and C. A is the master, B and C are slaves. Today, A crashed, so we made B the master and told C to follow along with the switch by changing the primary_conninfo in its recovery.conf, making sure the history file had made it to the WAL archive, and then restarting it (a rough sketch of the recovery.conf settings is at the end of this message). That has worked very well for us in the past, but not so much today. When C came back online, it started complaining about missing WALs:

2013-07-03T21:22:42.441347+00:00 pgdb41-vpc postgres[25779]: [18184-1] db=,user= LOG:  shutting down
2013-07-03T21:22:42.457728+00:00 pgdb41-vpc postgres[25779]: [18185-1] db=,user= LOG:  database system is shut down
2013-07-03T21:22:46.852845+00:00 pgdb41-vpc postgres[28942]: [1-1] db=,user= LOG:  database system was shut down in recovery at 2013-07-03 21:22:42 UTC
2013-07-03T21:22:46.866127+00:00 pgdb41-vpc postgres[28947]: [1-1] db=[unknown],user=[unknown] LOG:  incomplete startup packet
2013-07-03T21:22:47.368871+00:00 pgdb41-vpc postgres[28942]: [2-1] db=,user= LOG:  restored log file "00000010.history" from archive
2013-07-03T21:22:47.413588+00:00 pgdb41-vpc postgres[28956]: [1-1] db=postgres,user=postgres FATAL:  the database system is starting up
2013-07-03T21:22:47.767182+00:00 pgdb41-vpc postgres[28942]: [3-1] db=,user= LOG:  restored log file "00000010.history" from archive
2013-07-03T21:22:47.767289+00:00 pgdb41-vpc postgres[28942]: [4-1] db=,user= LOG:  entering standby mode
2013-07-03T21:22:47.930394+00:00 pgdb41-vpc postgres[28978]: [1-1] db=postgres,user=postgres FATAL:  the database system is starting up
2013-07-03T21:22:48.410056+00:00 pgdb41-vpc postgres[28942]: [5-1] db=,user= LOG:  redo starts at 1469/A2604868
2013-07-03T21:22:48.445921+00:00 pgdb41-vpc postgres[28986]: [1-1] db=postgres,user=postgres FATAL:  the database system is starting up
2013-07-03T21:22:48.962090+00:00 pgdb41-vpc postgres[28994]: [1-1] db=postgres,user=postgres FATAL:  the database system is starting up
2013-07-03T21:22:49.477279+00:00 pgdb41-vpc postgres[29020]: [1-1] db=postgres,user=postgres FATAL:  the database system is starting up
2013-07-03T21:22:49.993021+00:00 pgdb41-vpc postgres[29027]: [1-1] db=postgres,user=postgres FATAL:  the database system is starting up
2013-07-03T21:22:50.508848+00:00 pgdb41-vpc postgres[29034]: [1-1] db=postgres,user=postgres FATAL:  the database system is starting up
2013-07-03T21:23:30.651775+00:00 pgdb41-vpc postgres[28942]: [6-1] db=,user= LOG:  consistent recovery state reached at 146A/14FFFA8
2013-07-03T21:23:30.651805+00:00 pgdb41-vpc postgres[28942]: [7-1] db=,user= LOG:  invalid magic number 0000 in log file 5226, segment 1, offset 5242880
2013-07-03T21:23:30.653214+00:00 pgdb41-vpc postgres[28917]: [1-1] db=,user= LOG:  database system is ready to accept read only connections
2013-07-03T21:23:31.123588+00:00 pgdb41-vpc postgres[29754]: [2-1] db=,user= LOG:  streaming replication successfully connected to primary
2013-07-03T21:23:31.123647+00:00 pgdb41-vpc postgres[29754]: [3-1] db=,user= FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000100000146A00000001 has already been removed

At this point, my understanding of postgres must be wrong, because it appears that the slave is looking for WAL segment 146A/01 since that's where it reached a consistent state. However, that position was on the previous timeline - we didn't switch to timeline 0x10 until 146A/0C:

# cat 00000010.history 
15 0000000F0000146A0000000C no recovery target specified


Shouldn't postgres know to be looking for "0000000F0000146A00000001" rather than "000000100000146A00000001"? I'm trying to see which part of our process we got wrong to end up in this state, but I'm missing it.
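
(As a sanity check on my own reading - this is just illustrative arithmetic, not anything postgres prints - with 16MB segments the consistent-state LSN 146A/14FFFA8 falls into segment 0x01 of log 0x146A, which on timeline 0xF would give the file name I'd expect:)

# illustrative only: derive the expected segment file name from the LSN and timeline
TLI=0F            # the timeline the LSN was written on (0x0F)
LOG=146A          # high 32 bits of the LSN
OFFSET=014FFFA8   # low 32 bits of the LSN
SEG=$(( 0x$OFFSET / 0x01000000 ))    # 16MB segments -> segment 0x01
printf '000000%s0000%s%08X\n' "$TLI" "$LOG" "$SEG"
# prints 0000000F0000146A00000001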


For what it's worth, the new master (node B) certainly seems to have all the WAL files you would expect. Here are some snippets of an ls -l; all the files in between the snippets are present as well.

-rw------- 1 postgres postgres 16777216 Jul  3 21:13 0000000F000014690000009F
-rw------- 1 postgres postgres 16777216 Jul  3 21:13 0000000F00001469000000A0
-rw------- 1 postgres postgres 16777216 Jul  3 21:13 0000000F00001469000000A1
-rw------- 1 postgres postgres 16777216 Jul  3 21:13 0000000F00001469000000A2
-rw------- 1 postgres postgres 16777216 Jul  3 21:13 0000000F00001469000000A3
-rw------- 1 postgres postgres 16777216 Jul  3 21:13 0000000F00001469000000A4
-rw------- 1 postgres postgres 16777216 Jul  3 21:13 0000000F00001469000000A5
-rw------- 1 postgres postgres 16777216 Jul  3 21:13 0000000F00001469000000A6
.
.
.
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000000
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000001 <- the consistent state seems to be found here
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000002
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000003
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000004
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000005
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000006
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000007
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000008
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A00000009
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A0000000A
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A0000000B
-rw------- 1 postgres postgres 16777216 Jul  3 21:15 0000000F0000146A0000000C
-rw------- 1 postgres postgres 16777216 Jul  3 20:56 0000000F0000146A0000000D
-rw------- 1 postgres postgres 16777216 Jul  3 20:57 0000000F0000146A0000000E
-rw------- 1 postgres postgres 16777216 Jul  3 20:57 0000000F0000146A0000000F
-rw------- 1 postgres postgres 16777216 Jul  3 20:57 0000000F0000146A00000010
-rw------- 1 postgres postgres 16777216 Jul  3 20:57 0000000F0000146A00000011
.
.
.
-rw------- 1 postgres postgres 16777216 Jul  3 20:58 0000000F0000146A000000CF
-rw------- 1 postgres postgres 16777216 Jul  3 20:58 0000000F0000146A000000D0
-rw------- 1 postgres postgres 16777216 Jul  3 20:58 0000000F0000146A000000D1
-rw------- 1 postgres postgres 16777216 Jul  3 20:58 0000000F0000146A000000D2
-rw------- 1 postgres postgres 16777216 Jul  3 20:58 0000000F0000146A000000D3
-rw------- 1 postgres postgres 16777216 Jul  3 20:58 0000000F0000146A000000D4
-rw------- 1 postgres postgres 16777216 Jul  3 20:58 0000000F0000146A000000D5
-rw------- 1 postgres postgres 16777216 Jul  3 21:23 000000100000146A0000000C <- timeline switches here
-rw------- 1 postgres postgres 16777216 Jul  3 21:25 000000100000146A0000000D
-rw------- 1 postgres postgres 16777216 Jul  3 21:27 000000100000146A0000000E
-rw------- 1 postgres postgres 16777216 Jul  3 21:28 000000100000146A0000000F
-rw------- 1 postgres postgres 16777216 Jul  3 21:30 000000100000146A00000010
-rw------- 1 postgres postgres 16777216 Jul  3 21:32 000000100000146A00000011
-rw------- 1 postgres postgres 16777216 Jul  3 21:34 000000100000146A00000012
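
(For completeness, the recovery.conf on C after the re-point looked roughly like this - a minimal sketch, with the hostname, user, and archive path as placeholders rather than our real values:)

standby_mode = 'on'
primary_conninfo = 'host=node-b.example.com port=5432 user=replication'  # re-pointed from A to B
recovery_target_timeline = 'latest'                                      # follow the new timeline
restore_command = 'cp /wal_archive/%f %p'                                # archive holds 00000010.history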






Re: async streaming and recovery_target_timeline=latest

From
Ben Chobot
Date:
Anybody?

On Jul 3, 2013, at 3:23 PM, Ben Chobot wrote:

[...]







Re: async streaming and recovery_target_timeline=latest

From
Amit Langote
Date:
On Mon, Jul 29, 2013 at 3:02 AM, Ben Chobot <bench@silentmedia.com> wrote:
> Anybody?
>
> On Jul 3, 2013, at 3:23 PM, Ben Chobot wrote:
>
> We have an async streaming setup using 9.1.9 and 3 nodes - let's call them
> A, B, and C. A is the master, B and C are slaves. Today, A crashed, so we
> made B be the master and told C to follow along with the switch by changing
> the primary_conninfo in its recovery.conf, making sure the history file had
> made it to the WAL archive, then restarting it. That's worked very well for
> us in the past, but not so much today. When C came back online, it started
> complaining about missing WALs:
>
[...]
> LOG:  streaming replication successfully connected to primary
> 2013-07-03T21:23:31.123647+00:00 pgdb41-vpc postgres[29754]: [3-1] db=,user=
> FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
> segment 000000100000146A00000001 has already been removed
>
> At this point, my understanding of postgres must be wrong, because it
> appears to me that the slave is looking for WAL 146A/01 because that's where
> it reached consistent state. However, that was in the previous timeline - we
> didn't get to the 10 history timeline till 146A/0C:
>
> # cat 00000010.history
> 15 0000000F0000146A0000000C no recovery target specified
>
>
> Shouldn't postgres know to be looking for "0000000F0000146A00000001", not
> "000000100000146A00000001"? I'm trying to see what part of our process we
> have wrong to have ended up in this state but I'm missing it.
>
>
> For what it's worth the new master (node B) certainly seems to have all the
> WAL files you might expect. Here's some snippets of an ls -l, but all the
> files are there in between the snippets.
>
[...]
> -rw------- 1 postgres postgres 16777216 Jul  3 21:15
> 0000000F0000146A00000000
> -rw------- 1 postgres postgres 16777216 Jul  3 21:15
> 0000000F0000146A00000001 <- the consistent state seems to be found here
> -rw------- 1 postgres postgres 16777216 Jul  3 21:15
> 0000000F0000146A00000002
> -rw------- 1 postgres postgres 16777216 Jul  3 21:15
[...]
> 000000100000146A0000000C <- timeline switches here
> -rw------- 1 postgres postgres 16777216 Jul  3 21:25
> 000000100000146A0000000D
> -rw------- 1 postgres postgres 16777216 Jul  3 21:27
> 000000100000146A0000000E
> -rw------- 1 postgres postgres 16777216 Jul  3 21:28
> 000000100000146A0000000F
> -rw------- 1 postgres postgres 16777216 Jul  3 21:30
> 000000100000146A00000010
> -rw------- 1 postgres postgres 16777216 Jul  3 21:32
> 000000100000146A00000011
> -rw------- 1 postgres postgres 16777216 Jul  3 21:34
> 000000100000146A00000012
>


I think WAL recycling on the standby names the recycled segments with
the latest timeline ID (0x10 in this case), which creates WAL files that
shouldn't exist, such as 000000100000146A00000001 instead of
0000000F0000146A00000001. This patch, recently applied to the 9.1
branch (but not yet in any stable release), appears to solve the
problem as far as I can see. Try and see if you can apply it:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=424cc31a3785bd01108e6f4b20941c6442d3d2d0
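
In the meantime, one way to check whether the standby's pg_xlog really does contain recycled segments named with the new timeline ID but numbered before the switch point might be something along these lines (a rough sketch; the data directory path is only an example):

# rough sketch (example path): look for segments that carry timeline 0x10 but
# sit below the 0x0C switch segment recorded in 00000010.history
cd /var/lib/postgresql/9.1/main/pg_xlog
for f in 000000100000146A000000??; do
    [ -e "$f" ] || continue
    seg=$(( 16#${f#000000100000146A} ))   # last 8 hex digits = segment number
    [ "$seg" -lt 12 ] && echo "$f"        # 12 = 0x0C, where timeline 0x10 begins
done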


--
Amit Langote


Re: async streaming and recovery_target_timeline=latest

From
Ben Chobot
Date:
On Jul 28, 2013, at 5:29 PM, Amit Langote wrote:

> I think WAL recycling on the standby names the recycled segments with
> the latest timeline ID (0x10 in this case), which creates WAL files that
> shouldn't exist, such as 000000100000146A00000001 instead of
> 0000000F0000146A00000001. This patch, recently applied to the 9.1
> branch (but not yet in any stable release), appears to solve the
> problem as far as I can see. Try and see if you can apply it:
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=424cc31a3785bd01108e6f4b20941c6442d3d2d0

Thanks for the link, Amit. That makes two patches we're eagerly awaiting in 9.1.10.