Re: logical decoding vs. VACUUM FULL / CLUSTER on table with TOAST-ed data

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: logical decoding vs. VACUUM FULL / CLUSTER on table with TOAST-ed data
Дата
Msg-id CAD21AoB0EfKjgvo3wsT8qhbUpUR=e-B0iqxcv16A-EQujxp0xg@mail.gmail.com
обсуждение исходный текст
Ответ на logical decoding vs. VACUUM FULL / CLUSTER on table with TOAST-eddata  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: logical decoding vs. VACUUM FULL / CLUSTER on table with TOAST-eddata  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On Mon, Nov 19, 2018 at 6:52 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> Hi,
>
> It seems we have pretty annoying problem with logical decoding when
> performing VACUUM FULL / CLUSTER on a table with toast-ed data.
>
> The trouble is that the rewritten heap is WAL-logged using XLOG/FPI
> records, the TOAST data is logged as regular INSERT records. XLOG/FPI is
> ignored in logical decoding, and so reorderbuffer never gets those
> records. But we do decode the TOAST data, and reorderbuffer stashes them
> in toast_hash hash table. Which gets reset only when handling a row from
> the main heap, and that never arrives. So we end up stashing all the
> TOAST data in memory :-(
>
> So do VACUUM FULL (or CLUSTER) on a sufficiently large table, and you're
> likely to break any logical replication connection. And it does not
> matter if you replicate this particular table.
>
> Luckily enough, this can leverage some of the pieces introduced by
> e9edc1ba which was meant to deal with rewrites of system tables, and in
> raw_heap_insert it added this:
>
>     /*
>      * The new relfilenode's relcache entrye doesn't have the necessary
>      * information to determine whether a relation should emit data for
>      * logical decoding.  Force it to off if necessary.
>      */
>     if (!RelationIsLogicallyLogged(state->rs_old_rel))
>         options |= HEAP_INSERT_NO_LOGICAL;
>
> As raw_heap_insert is used only for heap rewrites, we can simply remove
> the if condition and use the HEAP_INSERT_NO_LOGICAL flag for all TOAST
> data logged from here.
>

This fix seems fine to me.

> This does fix the issue, because we still decode the TOAST changes but
> there are no data and so
>
>     if (change->data.tp.newtuple != NULL)
>     {
>         dlist_delete(&change->node);
>         ReorderBufferToastAppendChunk(rb, txn, relation,
>                                       change);
>     }
>
> ends up not stashing the change in the hash table.

With the below change you proposed can we remove the above condition
because toast-insertion changes being processed by the reorder buffer
always have a new tuple? If a toast-insertion record doesn't have a
new tuple it has already ignored when decoding.

> It's imperfect,
> because we still decode the changes (and stash them to disk), but ISTM
> that can be fixed by tweaking DecodeInsert a bit to just ignore those
> changes entirely.
>
> Attached is a PoC patch with these two fixes.
>

I think this change is also fine.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: ToDo: show size of partitioned table
Следующее
От: Amit Khandekar
Дата:
Сообщение: Trigger tuple slotification (extracted from pluggable storage patch)