Discussion: Issue with recovery_target = 'immediate'
Hackers,

We have noticed an issue when performing recovery with recovery_target = 'immediate' when the latest timeline cannot be replayed to from the backup (current) timeline.

First, create two backups:

$ pgbackrest --stanza=demo --type=full --start-fast backup
$ pgbackrest --stanza=demo --type=full --start-fast backup

$ pgbackrest info
<snip>
full backup: 20220713-175710F
    timestamp start/stop: 2022-07-13 17:57:10 / 2022-07-13 17:57:14
    wal start/stop: 000000010000000000000003 / 000000010000000000000003
    database size: 23.2MB, database backup size: 23.2MB
    repo1: backup set size: 2.8MB, backup size: 2.8MB

full backup: 20220713-175748F
    timestamp start/stop: 2022-07-13 17:57:48 / 2022-07-13 17:57:52
    wal start/stop: 000000010000000000000005 / 000000010000000000000005
    database size: 23.2MB, database backup size: 23.2MB
    repo1: backup set size: 2.8MB, backup size: 2.8MB

Restore the first backup:

$ pg_ctlcluster 13 demo stop
$ pgbackrest --stanza=demo --delta --set=20220713-175710F --type=immediate --target-action=promote restore

Recovery settings:

$ cat /var/lib/postgresql/13/demo/postgresql.auto.conf
<snip>
restore_command = 'pgbackrest --stanza=demo archive-get %f "%p"'
recovery_target = 'immediate'
recovery_target_action = 'promote'

Starting PostgreSQL performs recovery as expected:

$ pg_ctlcluster 13 demo start
$ cat /var/log/postgresql/postgresql-13-demo.log
LOG: database system was interrupted; last known up at 2022-07-13 17:57:10 UTC
LOG: starting point-in-time recovery to earliest consistent point
LOG: restored log file "000000010000000000000003" from archive
LOG: redo starts at 0/3000028
LOG: consistent recovery state reached at 0/3000138
LOG: recovery stopping after reaching consistency
LOG: redo done at 0/3000138
LOG: database system is ready to accept read only connections
LOG: selected new timeline ID: 2
LOG: archive recovery complete
LOG: database system is ready to accept connections

Now restore the second backup (recovery settings are identical):

$ pg_ctlcluster 13 demo stop
$ pgbackrest --stanza=demo --delta --set=20220713-175748F --type=immediate --target-action=promote restore

Recovery now fails:

$ pg_ctlcluster 13 demo start
$ cat /var/log/postgresql/postgresql-13-demo.log
LOG: database system was interrupted; last known up at 2022-07-13 17:57:48 UTC
LOG: restored log file "00000002.history" from archive
LOG: starting point-in-time recovery to earliest consistent point
LOG: restored log file "00000002.history" from archive
LOG: restored log file "000000010000000000000005" from archive
FATAL: requested timeline 2 is not a child of this server's history
DETAIL: Latest checkpoint is at 0/5000060 on timeline 1, but in the history of the requested timeline, the server forked off from that timeline at 0/3000138.
LOG: startup process (PID 511) exited with exit code 1
LOG: aborting startup due to startup process failure
LOG: database system is shut down

While it is certainly true that timeline 2 cannot be replayed to from timeline 1, it should not matter for an immediate recovery that stops at consistency. No timeline switch will occur until promotion. Of course the cluster could be shut down before promotion and the target changed, but in that case couldn't the timeline be adjusted at that point?

This works by default for PostgreSQL < 12 because the default timeline is current. Since PostgreSQL 12 the default has been latest, which does not work by default.
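
For reference, the fork point named in the DETAIL message above can be read straight out of the timeline history file that the archive serves. A small sketch, reusing the same archive-get invocation as the restore_command; the destination path is arbitrary and the reason column is approximate, while the fork LSN is the one reported above:

$ pgbackrest --stanza=demo archive-get 00000002.history /tmp/00000002.history
$ cat /tmp/00000002.history
1	0/3000138	reached consistency

Each entry is <parent timeline> <switch LSN> <reason>, so timeline 2 branched off timeline 1 at 0/3000138, which is before the second backup's checkpoint at 0/5000060.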
When a user does a number of recoveries it is pretty common for the timelines to get into a state that will make most subsequent recoveries fail. We think it makes sense for recovery_target = 'immediate' to be a fail-safe that always works no matter the state of the latest timeline.

Our solution has been to force recovery_target_timeline = 'current' when recovery_target = 'immediate', but it seems like this is something that should be done in PostgreSQL instead.

Thoughts?

Regards,
-David
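
As a concrete illustration of that workaround, it amounts to one extra line in the recovery settings shown earlier; a sketch of the resulting postgresql.auto.conf, assuming PostgreSQL 12 or later where 'current' is an accepted value (whether the line is written by the restore tooling or appended by hand is an operational detail):

restore_command = 'pgbackrest --stanza=demo archive-get %f "%p"'
recovery_target = 'immediate'
recovery_target_action = 'promote'
recovery_target_timeline = 'current'   # pin recovery to the backup's own timeline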
At Wed, 13 Jul 2022 14:41:40 -0400, David Steele <david@pgmasters.net> wrote in
> While it is certainly true that timeline 2 cannot be replayed to from
> timeline 1, it should not matter for an immediate recovery that stops
> at consistency. No timeline switch will occur until promotion. Of
> course the cluster could be shut down before promotion and the target
> changed, but in that case couldn't the timeline be adjusted at that
> point?
>
> This works by default for PostgreSQL < 12 because the default timeline
> is current. Since PostgreSQL 12 the default has been latest, which
> does not work by default.
>
> When a user does a number of recoveries it is pretty common for the
> timelines to get into a state that will make most subsequent recoveries
> fail. We think it makes sense for recovery_target = 'immediate' to be
> a fail-safe that always works no matter the state of the latest
> timeline.
>
> Our solution has been to force recovery_target_timeline = 'current'
> when recovery_target = 'immediate', but it seems like this is
> something that should be done in PostgreSQL instead.
>
> Thoughts?

I think it is natural that recovery targets the most recent update by default. In that sense, at the time the admin decided to recover the server from the first backup, the second backup became kind of dead, or at least something that should be ignored in future operations.

Even if we want "any" backup to be usable, just re-targeting the current timeline after the timeline error looks inconsistent with the behavior mentioned above. To make "dead" backups behave like "live" ones, we would need to check whether the backup is in the history of each "future" timeline, then choose the latest timeline among them.

# Mmm. I remember a recent patch for pg_rewind that does the same...

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
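
To make that check a bit more concrete, below is a rough shell sketch, purely illustrative and untested (it is not taken from any existing patch): it scans archived *.history files that have been fetched beforehand (for example with archive-get, as above) and picks the newest timeline whose history still contains the backup's timeline at or after the backup's consistency point. The paths, the example values, and the exact comparison are all assumptions.

#!/usr/bin/env bash
# Illustrative sketch only: choose the newest timeline a given backup can be
# replayed along, by walking archived timeline history files.
set -euo pipefail
shopt -s nullglob

backup_tli=1                  # timeline the backup was taken on (example value)
backup_stop_lsn="0/3000138"   # LSN where the backup reaches consistency (example value)
history_dir=/tmp/history      # *.history files fetched here beforehand

# Convert an "X/Y" LSN into a single integer so it can be compared numerically.
lsn_to_int() {
    local hi=${1%%/*} lo=${1##*/}
    echo $(( 16#$hi * 4294967296 + 16#$lo ))
}

stop=$(lsn_to_int "$backup_stop_lsn")
best=$backup_tli

for f in "$history_dir"/*.history; do
    tli=$(( 16#$(basename "$f" .history) ))
    # Each history entry is: <parentTLI> <switch LSN> <reason>
    while read -r parent switch _; do
        [[ $parent =~ ^[0-9]+$ ]] || continue   # skip blank or malformed lines
        if (( parent == backup_tli )) && (( $(lsn_to_int "$switch") >= stop )); then
            # This timeline forked off the backup's timeline at or after the
            # point the backup needs, so the backup can follow it.
            (( tli > best )) && best=$tli
        fi
    done < "$f"
done

echo "recovery_target_timeline = '$best'"

The final echo is only to show how the result would feed back into the recovery settings; in PostgreSQL itself this selection would presumably happen where recovery_target_timeline = 'latest' is resolved.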
On 7/14/22 04:26, Kyotaro Horiguchi wrote:
> At Wed, 13 Jul 2022 14:41:40 -0400, David Steele <david@pgmasters.net> wrote in
>> While it is certainly true that timeline 2 cannot be replayed to from
>> timeline 1, it should not matter for an immediate recovery that stops
>> at consistency. No timeline switch will occur until promotion. Of
>> course the cluster could be shut down before promotion and the target
>> changed, but in that case couldn't the timeline be adjusted at that
>> point?
>>
>> This works by default for PostgreSQL < 12 because the default timeline
>> is current. Since PostgreSQL 12 the default has been latest, which
>> does not work by default.
>>
>> When a user does a number of recoveries it is pretty common for the
>> timelines to get into a state that will make most subsequent recoveries
>> fail. We think it makes sense for recovery_target = 'immediate' to be
>> a fail-safe that always works no matter the state of the latest
>> timeline.
>>
>> Our solution has been to force recovery_target_timeline = 'current'
>> when recovery_target = 'immediate', but it seems like this is
>> something that should be done in PostgreSQL instead.
>>
>> Thoughts?
>
> I think it is natural that recovery targets the most recent update by
> default. In that sense, at the time the admin decided to recover the
> server from the first backup, the second backup became kind of dead, or
> at least something that should be ignored in future operations.

Well, I dislike the idea of a dead backup. Certainly no backup can be followed along all timelines, but it should still be recoverable.

> Even if we want "any" backup to be usable, just re-targeting the
> current timeline after the timeline error looks inconsistent with the
> behavior mentioned above. To make "dead" backups behave like "live"
> ones, we would need to check whether the backup is in the history of
> each "future" timeline, then choose the latest timeline among them.

I think this makes sense for non-immediate targets. The idea is that recovering to the "latest" timeline would actually recover to the latest timeline that is valid for the backup. Is that what you had in mind?

However, for immediate targets, only the current timeline makes sense, so I feel like it would be better to simply force the current timeline.

> # Mmm. I remember a recent patch for pg_rewind that does the same...

Do you have a link for this?

Regards,
-David