Re: Updated backup APIs for non-exclusive backups

From Magnus Hagander
Subject Re: Updated backup APIs for non-exclusive backups
Date
Msg-id CABUevEy38oXGHaSfa=SgZGjpDVDnLugPcw+X7SshwdDi1FX7Jw@mail.gmail.com
In response to Re: Updated backup APIs for non-exclusive backups  (David Steele <david@pgmasters.net>)
Responses Re: Updated backup APIs for non-exclusive backups  (David Steele <david@pgmasters.net>)
List pgsql-hackers


On Wed, Mar 30, 2016 at 4:10 AM, David Steele <david@pgmasters.net> wrote:
On 3/29/16 2:09 PM, Magnus Hagander wrote:

> I had a chat with Heikki, and here's another suggestion:
>
> 1. We don't touch the current exclusive backups at all, as previously
> discussed, other than deprecating their use. For backwards compat.
>
> 2. For new backups, we return the contents of pg_control as a bytea from
> pg_stop_backup(). We tell backup programs they are supposed to write
> this out as pg_control.backup, *not* as pg_control.
>
> 3a. On recovery, if it's an exclusive backup, we do as we did before.
>
> 3b. on recovery, in non-exclusive backups (determined from
> backup_label), we check that pg_control.backup exists *and* that
> pg_control does *not* exist. That guards us reasonably against backup
> programs that do the wrong thing, and we know we get the correct version
> of pg_control.
>
> 4. (we can still add the stop location to the backup_label file in case
> backup programs find it useful, but we don't use it in recovery)
>
> Thoughts about this approach?

This certainly looks like it would work but it raises the barrier for
implementing backups by quite a lot.  It's fine for backrest or barman
but it won't be pleasant for anyone who has home-grown scripts.


How much does it really raise the bar, though?

It would go from "copy all files and make damn sure you copy pg_control last, and rename it to pg_control.backup" to "take this binary blob you got from the server and write it to pg_control.backup"?
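As a rough illustration of the second variant, here is a minimal Python sketch of what a backup tool would have to do with the blob. It assumes the proposal as described in this thread: the tool has already received the pg_control contents as bytes from pg_stop_backup(). The function name write_control_backup and the target_dir parameter are hypothetical, and the write-to-temp-then-rename step is a precaution added in this sketch, not something the proposal requires.

```python
import os
import tempfile


def write_control_backup(target_dir: str, control_blob: bytes) -> str:
    """Store the control-file blob as pg_control.backup, never as pg_control.

    control_blob is assumed to be the bytes handed back by the server at the
    end of a non-exclusive backup (hypothetical, per the proposal).
    """
    if not control_blob:
        raise ValueError("empty control blob; refusing to write pg_control.backup")

    final_path = os.path.join(target_dir, "pg_control.backup")

    # Write to a temporary file in the same directory, then rename, so a
    # crash mid-write never leaves a truncated pg_control.backup behind.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".pg_control.backup.")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(control_blob)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

The point of the sketch is only that the tool no longer has to care about copy ordering: the one file that matters is written from data the server supplied.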

Also, the target of these APIs is specifically backup tools, not home-written scripts. A simple shell script will have enough trouble using them in the first place, since they require a persistent connection to the database. But those scripts are likely broken anyway.

You can of course keep the current requirement, which is just "copy pg_control last", but if we do that we have zero way of checking that it happened, and you may end up with subtly broken restores if the backup software gets it wrong. (Of course it can get the rename/write-out step wrong as well, but that is going to be a lot more obvious if you're doing it wrong.)

The main reason for Heikki to suggest this one over the other basic one is that it brings protection against the "backup script/program crashed halfway through but the user still tried to restore from that" case. Such restores will fail outright because there is no pg_control.backup in that case. If we don't care about that, then we can go back to just saying "copy pg_control last and we're done". But you yourself complained about that requirement because it's too easy to get wrong (though you advocated using backup_label to transfer the data over -- but that has the potential for getting more complicated if we now, or at any point in the future, want to transfer more than one field, for example).
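To make the recovery-side check from 3b concrete, here is a small Python sketch of the decision it describes. This is only an illustration of the rule, not server code: the names ControlCheck and check_nonexclusive_restore are hypothetical, and control_dir stands for whatever directory holds the control file in the restored copy.

```python
import os
from enum import Enum


class ControlCheck(Enum):
    OK = "ok"
    MISSING_BACKUP_COPY = "pg_control.backup is missing (incomplete backup?)"
    STRAY_PG_CONTROL = "pg_control is present (backup tool copied the live file)"


def check_nonexclusive_restore(control_dir: str) -> ControlCheck:
    """Apply the 3b rule: pg_control.backup must exist, pg_control must not."""
    has_backup_copy = os.path.isfile(os.path.join(control_dir, "pg_control.backup"))
    has_live_copy = os.path.isfile(os.path.join(control_dir, "pg_control"))

    # A backup that crashed before pg_stop_backup() never got the blob,
    # so the restore is rejected instead of silently proceeding.
    if not has_backup_copy:
        return ControlCheck.MISSING_BACKUP_COPY
    # A copied live pg_control means the tool did not follow the protocol.
    if has_live_copy:
        return ControlCheck.STRAY_PG_CONTROL
    return ControlCheck.OK
```

A backup that died halfway through lands in the first branch, which is exactly the failure mode this variant is meant to catch.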

