Simon Riggs wrote:
> On Thu, 2009-01-29 at 12:22 +0200, Heikki Linnakangas wrote:
>> It comes from the fact that we set minSafeStartPoint beyond the actual end
>> of WAL, if the last WAL segment is only partially filled (= fails CRC
>> check at some point). If we crash after setting minSafeStartPoint like
>> that, and then restart recovery, we'll get the error.
>
> Look again please. My proposal would avoid the error when it is not
> relevant, yet keep it when it is (while recovering base backups).
I fail to see what base backups have to do with this. The problem arises
in this scenario:
0. A base backup is unzipped. recovery.conf is copied in place, and the
remaining unarchived WAL segments are copied from the primary server to
pg_xlog. The last WAL segment is only partially filled. Let's say that
the redo point is in WAL segment 1. The last, partial, WAL segment is 3,
and WAL ends at 0/3500000.
1. postmaster is started, recovery starts.
2. WAL segment 1 is restored from archive.
3. We reach the consistent recovery point.
4. We restore WAL segment 2 from archive. minSafeStartPoint is advanced
to 0/3000000.
5. WAL segment 2 is completely replayed, we move on to WAL segment 3. It
is not in archive, but it's found in pg_xlog. minSafeStartPoint is
advanced to 0/4000000. Note that this is beyond the end of WAL.
6. At replay of WAL record 0/3200000, recovery is interrupted, for
example by a fast shutdown request or a crash.
Now when we restart the recovery, we will never reach minSafeStartPoint,
which is now 0/4000000, and we'll fail with the error that Fujii-san
pointed out. We're already way past the minimum recovery point of the
base backup by then.
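To make the arithmetic concrete, here is a rough Python sketch of the scenario. It is only a model, not the actual xlog.c logic: it assumes the default 16 MB segment size, and the helper names (segment_end, format_lsn, recovery_can_finish) are made up for illustration.

```python
# Toy model of the scenario above; not PostgreSQL code.
XLOG_SEG_SIZE = 0x1000000  # assumed: default 16 MB WAL segment size


def segment_end(segno):
    """Return the position just past the end of WAL segment `segno`."""
    return (segno + 1) * XLOG_SEG_SIZE


def format_lsn(lsn):
    """Format a position in the usual hi/lo hexadecimal notation."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def recovery_can_finish(end_of_wal, min_safe_start_point):
    """Recovery may only end once replay has reached minSafeStartPoint."""
    return end_of_wal >= min_safe_start_point


END_OF_WAL = 0x3500000  # segment 3 is only partially filled

# Steps 4 and 5: each time a segment is opened for replay,
# minSafeStartPoint is pushed to the end of that segment.
min_safe_start_point = 0
for segno in (2, 3):
    min_safe_start_point = max(min_safe_start_point, segment_end(segno))

# After the interruption in step 6, the restarted recovery can replay at
# most up to END_OF_WAL, which is short of minSafeStartPoint.
print(format_lsn(min_safe_start_point))
print(recovery_can_finish(END_OF_WAL, min_safe_start_point))
```

The point is that the value stored after step 5 depends only on the segment boundary, not on how much valid WAL segment 3 actually contains.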
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com