Re: Missing pg_control crashes postmaster

From Andres Freund
Subject Re: Missing pg_control crashes postmaster
Date
Msg-id 20180725150952.qsviniepf3m4gqzg@alap3.anarazel.de
In response to Re: Missing pg_control crashes postmaster  (David Steele <david@pgmasters.net>)
Responses Re: Missing pg_control crashes postmaster  (David Steele <david@pgmasters.net>)
List pgsql-hackers
Hi,

On 2018-07-25 10:52:08 -0400, David Steele wrote:
> On 7/25/18 10:37 AM, Andres Freund wrote:
> > On July 25, 2018 7:18:30 AM PDT, David Steele <david@pgmasters.net> wrote:
> > > 
> > > It seems like an easy win if we can find a safe way to do it, though I
> > > admit that this is only a benefit in corner cases.
> > 
> > What would we win here? Which scenario that's not contrived would be
> > less bad due to the proposed change? This seems like complexity for
> > its own sake.
 
> 
> I think it's worth preserving pg_control even in the case where there is
> other damage to the cluster.  The alternative in this case (if no backup
> exists) is to run pg_resetwal which means data since the last checkpoint
> will not be written out causing even more data loss.  I have run clusters
> with checkpoint_timeout = 60m so data loss in this case is a real concern.

Wait, what? How is data loss "a real concern" in this case? Not even a
remotely realistic scenario has been described where this matters so
far.


> I favor the contrived scenario that helps preserve the current cluster
> instead of a hypothetical newly init'd one.  I also don't think that users
> deleting files out of a cluster is all that contrived.

But trying to limp on in that case, and that actually being helpful, is
contrived.


> Adding O_CREAT to open() doesn't seem too complex to me.  I'm not really in
> favor of the renaming idea, but I'm not against it either if it gets me a
> copy of the pg_control file.

The problem is that it'll just hide the issue for a bit longer while the
server keeps running (with O_CREAT we won't PANIC anymore).  That can
lead to a lot of followup issues, like checkpoints removing old WAL that
would have been useful for data recovery.
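
To make the difference concrete, here is a minimal sketch of how open() behaves on a missing file with and without O_CREAT. It is not PostgreSQL code: the helper name open_control_file() and its allow_create parameter are hypothetical and exist only for illustration. Without O_CREAT the caller gets ENOENT and can react (in the server's case, PANIC); with O_CREAT the missing file is silently replaced by a new, empty one and the caller sees success.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from PostgreSQL): try to open a control
 * file for reading and writing.
 *
 * Returns 0 on success, or the errno value set by open() on failure.
 */
static int
open_control_file(const char *path, bool allow_create)
{
    int flags = O_RDWR;
    int fd;

    assert(path != NULL);

    /* With O_CREAT, a missing file is created instead of reported. */
    if (allow_create)
        flags |= O_CREAT;

    /* The mode argument is only consulted when O_CREAT is set. */
    fd = open(path, flags, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return errno;   /* e.g. ENOENT when the file is missing */

    close(fd);
    return 0;
}
```

Called on a path that does not exist, open_control_file(path, false) returns ENOENT, while open_control_file(path, true) returns 0 and leaves a zero-length file behind. That second case is the "hidden" failure described above: nothing signals that the original contents are gone.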

Greetings,

Andres Freund

