Thread: are WAL file segment boundaries a point of consistency?

are WAL file segment boundaries a point of consistency?

From

John Lumby

Date:

September 6, 2013, 23:26:16

We use log-shipping replication and have recently noticed a nasty bug
where, in certain very rare cases, the primary's archive_command program
fails to send the WAL file to the standby but still reports return code 0 to PostgreSQL.
In such cases, if the standby is then triggered to end recovery mode,
it comes up in normal accessible mode but is missing the log records from that last WAL file.

This is a bug in our code, which we will fix, but I am wondering whether it could lead to
something worse than missing some updates. That is, could it leave this former standby cluster with
a corrupt database (e.g. an index entry with no matching heap slot, or something like that, or worse)?

I think the question is whether the end of a WAL file is a point of consistency,
like the timestamp you can specify in recovery.conf for a point-in-time recovery.
Or does the PostgreSQL xlogger just chop each WAL segment at a physical page boundary?

Cheers, John Lumby
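
For illustration, the failure described above (a copy that fails while the command still returns 0) can be guarded against by verifying the copy before reporting success. The following is only a minimal sketch, not the poster's actual program: the function names `file_digest` and `archive_segment` and the idea of archiving to a locally mounted directory are assumptions made for this example.

```python
import hashlib
import os
import shutil


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def archive_segment(src, dest_dir):
    """Copy one WAL segment into dest_dir (a hypothetical archive directory).

    Returns 0 only when the archived copy is verified to match the source,
    and 1 on any failure, so the caller can pass the value on as exit status.
    """
    dest = os.path.join(dest_dir, os.path.basename(src))
    if os.path.exists(dest):
        # Never overwrite an archived file; succeed only if it is identical.
        return 0 if file_digest(dest) == file_digest(src) else 1
    tmp = dest + ".tmp"
    try:
        with open(src, "rb") as fin, open(tmp, "wb") as fout:
            shutil.copyfileobj(fin, fout)
            fout.flush()
            os.fsync(fout.fileno())  # push file contents to disk before verifying
        if file_digest(tmp) != file_digest(src):
            os.remove(tmp)
            return 1
        os.rename(tmp, dest)  # publish under the final name only after verification
    except OSError:
        # Any I/O failure must surface as a non-zero status.
        if os.path.exists(tmp):
            os.remove(tmp)
        return 1
    return 0
```

A small wrapper script could call `sys.exit(archive_segment(sys.argv[1], sys.argv[2]))` and be wired in with something like `archive_command = 'python3 /path/to/archive_wal.py %p /mnt/standby_archive'`, where both paths are hypothetical. The key design point is that every failure path returns non-zero, so PostgreSQL keeps the segment and retries.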

Re: are WAL file segment boundaries a point of consistency?

From

Amador Alvarez

Date:

September 9, 2013, 19:51:55

I would look at the WAL as a sequence of commits within a timeline rather than as a sequence of files. With either recovery_target_time or recovery_target_xid you can specify the point of consistency you want to reach.

Cheers,

A.A.
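
As a rough illustration of the two settings mentioned above, here is a minimal sketch that renders a recovery.conf with a single recovery target. The helper names `quote_value` and `render_recovery_conf`, the restore command and the archive path are all assumptions for this example; only `restore_command`, `recovery_target_time` and `recovery_target_xid` are actual recovery.conf parameters.

```python
def quote_value(value):
    """Single-quote a value, doubling any embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def render_recovery_conf(restore_command, target_time=None, target_xid=None):
    """Build recovery.conf text with at most one recovery target."""
    if target_time is not None and target_xid is not None:
        raise ValueError("give at most one recovery target")
    lines = ["restore_command = " + quote_value(restore_command)]
    if target_time is not None:
        # Stop replay at the given timestamp.
        lines.append("recovery_target_time = " + quote_value(target_time))
    if target_xid is not None:
        # Stop replay at the given transaction ID.
        lines.append("recovery_target_xid = " + quote_value(target_xid))
    return "\n".join(lines) + "\n"


# Hypothetical archive location and target time, for illustration only.
print(render_recovery_conf("cp /mnt/archive/%f %p",
                           target_time="2013-09-06 13:00:00"))
```

With neither target given, the rendered file contains only `restore_command`, in which case recovery replays all the WAL it can obtain.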


Re: are WAL file segment boundaries a point of consistency?

From

Jeff Janes

Date:

September 9, 2013, 20:22:22

On Fri, Sep 6, 2013 at 1:26 PM, John Lumby <johnlumby@hotmail.com> wrote:
> We use logshipping replication, and have recently noticed a nasty bug
> where, in certain very rare cases, the primary archive_command program
> will fail to send the WAL file to the standby but report good return code 0 to postgresql.
> In such cases, if the standby then triggers its termination of recovery mode,
> it will come up in normal accessible mode but missing the log records from that last WAL file.
>
> This is a bug in our code which we will fix, but I am wondering if it means there is a possibility
> of worse than missing some updates. I.e. could it result in this was-standby cluster now having
> a corrupt database (e.g. an index entry with no matching heap slot or something like that - or worse)?

As long as the standby reached consistency in the first place,
it should not lose it because of this issue. Once consistency is
reached, changes to the data files are driven only by replay of the
WAL records, and those should only take the database from one
consistent state to another.

Where you risk corruption is if the problem occurred while you were
taking the base backup. Then some of the base files that were copied
might already contain data from the "future", but that
future cannot be reached because recovery stops early due to the lost
file. The database should detect this situation and refuse to start,
forcing you to retake the base backup or use an earlier one. But
there were known bugs in this general area, some of them fixed in 9.2.3.

Cheers,

Jeff
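
Since recovery stops early at a lost file, one way to catch the problem before triggering the standby is to check the archive directory for holes in the segment sequence. The sketch below is only an illustration under stated assumptions: the names `parse_segment` and `find_gaps` are made up for this example, the file-name layout (8 hex digits each for timeline, log and segment) assumes the default 16 MB segment size, and `segments_per_log=256` assumes 9.3-style numbering (older releases skip the segment ending in FF, for which 255 would be passed instead).

```python
import re

# A WAL segment file name is 24 upper-case hex digits; this skips
# other archive entries such as .history and .backup files.
SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def parse_segment(name, segments_per_log=256):
    """Split a segment file name into (timeline, running segment number)."""
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * segments_per_log + seg


def find_gaps(names, segments_per_log=256):
    """Return (previous, next) name pairs with missing segments in between."""
    by_timeline = {}
    for name in names:
        if SEGMENT_NAME.fullmatch(name):
            timeline, segno = parse_segment(name, segments_per_log)
            by_timeline.setdefault(timeline, []).append((segno, name))
    gaps = []
    for timeline in sorted(by_timeline):
        segs = sorted(by_timeline[timeline])
        for (prev_no, prev_name), (next_no, next_name) in zip(segs, segs[1:]):
            if next_no - prev_no > 1:
                gaps.append((prev_name, next_name))
    return gaps


archived = [
    "000000010000000000000001",
    "000000010000000000000002",
    "000000010000000000000004",  # segment ...03 was never delivered
]
print(find_gaps(archived))
```

A check like this can only find holes between files that did arrive. It cannot tell that the most recent segment is missing, so it complements a correct archive_command rather than replacing one.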
