Re: Error with index on unlogged table

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Error with index on unlogged table
Дата
Msg-id CAB7nPqSMENEK7nqmwGsiyLTSbrZNJKx80tBX3qF6cQsS49sjag@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Error with index on unlogged table  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы Re: Error with index on unlogged table
Re: Error with index on unlogged table
Список pgsql-hackers
On Tue, Dec 1, 2015 at 11:11 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello, I studied your latest patch.

Thanks!

> I feel quite uncomfortable that it solves the problem from a kind
> of nature of unlogged object by arbitrary flagging which is not
> fully corresponds to the nature. If we can deduce the necessity
> of fsync from some nature, it would be preferable.

INIT_FORKNUM is not something only related to unlogged relations,
indexes use them as well. And that's actually
If you look at for example BRIN indexes that do not sync immediately
their INIT_FORKNUM when index is created, I think that we still are
going to need a new flag to control the sync at WAL replay because
startup process cannot know a relation's persistence, thing that we
can know when the XLOG_FPI record is created. For BRIN indexes, we
want particularly to not sync the INIT_FORKNUM when the relation is
not an unlogged one.

> In the current patch, is_sync for log_newpage is generally true
> for and only for INIT_FORKNUM pages. Exceptions as far as I can
> see are,

Yep, that's not always the case.

> copy_relation_data: called with arbitrary forknum but it doesn't
>    set is_fsync even for coying INIT_FORKNUM. (Is this not a
>    problem?)

Oh. And actually, it seems that use_wal is broken there as well. I
think that we would still want to issue a XLOG_FPI record for an
unlogged relation should it be moved to a new tablespace to copy its
INIT_FORKNUM correctly to its new home.
After moving an unlogged relation to a new tablespace, and after
promoting a standby the standby fails to start because of this error:
FATAL:  could not create file
"pg_tblspc/16388/PG_9.6_201511071/16384/16389": File exists
This could be fixed separately though.

> spgbuildempty, ginbuildempty: these emits two or three newpage
>   logs at once so only the last one is set is_fsync for
>   performance reason.

It doesn't matter to fsync just at the last one. Each one of them
would be replayed either way, the last one triggering the sync, no?

> In short, it seems to me that the reason to choose using
> XLOG_FPI_FOR_SYNC here is only performance of processing
> successive FPIs for INIT_FORKNUM.

Yeah, there is a one-way link between this WAL record a INIT_FORKNUM.
However please note that having a INIT_FORKNUM does not always imply
that a sync is wanted. copy_relation_data is an example of that.

> INIT_FORKNUM is generated only for unlogged tables and their
> belongings. I suppose such successive fsyncs doesn't cause
> observable performance drop assuming that the number of unlogged
> tables and belongings is not so high, especially with smarter
> storages. All we should do is that just fsync only for
> INIT_FORKNUM's FPIs for the case. If the performance does matter
> even so, we still can fsync the last md-file when any wal record
> other than FPI for INIT_FORK comes. (But this would be a bit
> complex..)

Hm. If a system uses a bunch of temporary relations with brin index or
other included I would not say so. For back branches we may have to do
it unconditionally using INIT_FORKNUM, but having a control flag to
have it only done for unlogged relations would leverage that.
-- 
Michael



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: proposal: multiple psql option -c
Следующее
От: David Rowley
Дата:
Сообщение: Removing Functionally Dependent GROUP BY Columns