Re: replay of CREATE TABLESPACE eats data at wal_level=minimal

From Noah Misch
Subject Re: replay of CREATE TABLESPACE eats data at wal_level=minimal
Date
Msg-id 20210810012307.GB2693@rfd.leadboat.com
In response to replay of CREATE TABLESPACE eats data at wal_level=minimal  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: replay of CREATE TABLESPACE eats data at wal_level=minimal  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Aug 09, 2021 at 01:08:42PM -0400, Robert Haas wrote:
> To reproduce, initialize a cluster with wal_level=minimal and
> max_wal_senders=0. Then from psql:
> 
> \! mkdir /tmp/goose
> 
> CHECKPOINT;
> CREATE TABLESPACE goose LOCATION '/tmp/goose';
> SET wal_skip_threshold=0;
> BEGIN;
> CREATE TABLE wild (a int, b text) TABLESPACE goose;
> INSERT INTO wild VALUES (1, 'chase');
> COMMIT;
> SELECT * FROM wild;
> 
> As expected, you will see one row in table 'wild'. Now perform an
> immediate shutdown. Restart the server. Table 'wild' is now empty.

Thanks for finding the problem.  It's a bad problem.

> The problem appears to be that tblspc_redo() calls
> create_tablespace_directories(), which says:
> 
>         /*
>          * Our theory for replaying a CREATE is to forcibly drop the target
>          * subdirectory if present, and then recreate it. This may be more
>          * work than needed, but it is simple to implement.
>          */
> 
> Unfortunately, this theory (which dates to
> c86f467d18aa58e18fd85b560b46d8de014e6017, vintage 2010, by Bruce) is
> correct only with wal_level>minimal. At wal_level='minimal', we can
> replay the record to recreate the relfilenode, but not the records
> that would have created the contents. However, note that if the table
> is smaller than wal_skip_threshold, then we'll log full-page images of
> the contents at commit time even at wal_level='minimal' after which we
> have no problem. As far as I can see, this bug has "always" existed,
> but before c6b92041d38512a4176ed76ad06f713d2e6c01a8 (2020, Noah) you
> would have needed a different test case. Specifically, you would have
> needed to use COPY to put the row in the table, and you would have
> needed to omit setting wal_skip_threshold since it didn't exist yet.

Agreed.

> I don't presently have a specific idea about how to fix this.

Can't recovery just not delete the directory, create it if it doesn't exist,
and be happy if it does exist?  Like the attached WIP.  If we think it's
possible for a crash during mkdir to leave a directory having the wrong
permissions, adding a chmod would be in order.
