Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()

From Melanie Plageman
Subject Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Date
Msg-id CAAKRu_adrsViY6pNQcGLF555dbp9C6f5OO7nwZavTfECSM5kow@mail.gmail.com
In response to Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-bugs
On Fri, Apr 26, 2024 at 6:21 PM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Fri, Apr 26, 2024 at 5:56 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Fri, Apr 26, 2024 at 5:28 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > >
> > > On Fri, Apr 26, 2024 at 4:46 PM Melanie Plageman
> > > <melanieplageman@gmail.com> wrote:
> > > > I have a more basic question. How could GlobalVisState->maybe_needed
> > > > going backwards cause a problem with relfrozenxid? Yes, if
> > > > maybe_needed goes backwards, we may not remove a tuple whose xmin/xmax
> > > > are older than VacuumCutoffs->OldestXmin. But, if that tuple's
> > > > xmin/xmax are older than OldestXmin, then wouldn't we freeze it?
> > >
> > > You can't freeze every XID older than OldestXmin.
> > > heap_prepare_freeze_tuple() isn't prepared for HEAPTUPLE_DEAD tuples,
> > > and expects that those will be taken care of by the time it is called.
> >
> > But, the tuple isn't HEAPTUPLE_DEAD -- it's HEAPTUPLE_RECENTLY_DEAD.
>
> Why? What tuple is this?

In 17, we don't ever get a new HTSV_Result, so if the tuple is not
removed, it would be because HeapTupleSatisfiesVacuumHorizon()
returned HEAPTUPLE_RECENTLY_DEAD and, if GlobalVisTestIsRemovableXid()
was called, dead_after did not precede GlobalVisState->maybe_needed.
This tuple, during this vacuum of the relation, would never be
determined to be HEAPTUPLE_DEAD or it would have been removed.
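
To make the rule above concrete, here is a minimal C sketch of the
removability test. It is a toy model, not the PostgreSQL code: the names
toy_prune_status, ToyStatus and xid_precedes are hypothetical, XIDs are
plain 32-bit values, and the real GlobalVisTestIsRemovableXid() also
consults a second boundary and may recompute the horizons, which is
ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Wraparound-aware "a is older than b" for normal XIDs (modulo 2^32). */
static bool xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical stand-in for the two outcomes discussed in the text. */
typedef enum { TOY_RECENTLY_DEAD, TOY_DEAD } ToyStatus;

/*
 * Toy model (assumption): a deleted tuple is removable only if the XID
 * after which it is dead precedes maybe_needed. Otherwise it keeps the
 * RECENTLY_DEAD status for the rest of this vacuum.
 */
static ToyStatus toy_prune_status(TransactionId dead_after,
                                  TransactionId maybe_needed)
{
    /* Only normal XIDs (>= 3) are modelled here. */
    assert(dead_after >= 3 && maybe_needed >= 3);

    return xid_precedes(dead_after, maybe_needed) ? TOY_DEAD
                                                  : TOY_RECENTLY_DEAD;
}
```

In this model, if maybe_needed moves backwards between two calls, a
tuple that would have been TOY_DEAD simply stays TOY_RECENTLY_DEAD; it
is never reclassified later.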

> > It will always be HEAPTUPLE_RECENTLY_DEAD in 17 and in <= 16, if
> > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, we wouldn't call
> > heap_prepare_freeze_tuple() because of the retry loop.
>
> The retry loop exists precisely because heap_prepare_freeze_tuple()
> isn't prepared to deal with HEAPTUPLE_DEAD tuples. So I agree that
> that won't be allowed to happen on versions that have the retry loop
> (14 - 16).

So, it can't happen in back branches. Let's just address 17. Help me
understand how this can happen in 17.

We are talking about a tuple with xmin/xmax older than
VacuumCutoffs->OldestXmin in heap_prepare_freeze_tuple(). So,
freeze_xmin is true:
        freeze_xmin = TransactionIdPrecedes(xid, cutoffs->OldestXmin);

Then, in heap_tuple_should_freeze(), we may ratchet back
NoFreezePageRelfrozenXid:
        if (TransactionIdPrecedes(xid, *NoFreezePageRelfrozenXid))
            *NoFreezePageRelfrozenXid = xid;
NoFreezePageRelfrozenXid was initialized with
VacuumCutoffs->OldestXmin and our tuple xmin/xmax is older than
VacuumCutoffs->OldestXmin.
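
The two checks quoted above can be put together in a small
self-contained sketch. This is only an illustration under assumptions:
ToyCutoffs, toy_track_xid and xid_precedes are hypothetical names, and
the real functions handle multixacts, xmax and several other cases that
are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Wraparound-aware "a is older than b" for normal XIDs (modulo 2^32). */
static bool xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical, reduced stand-in for the vacuum cutoffs. */
typedef struct
{
    TransactionId OldestXmin;
} ToyCutoffs;

/*
 * Toy model (assumption): report whether xid is old enough to be a
 * freeze candidate, and ratchet the "no freeze" tracker back so that it
 * never ends up newer than an XID that might remain unfrozen.
 */
static bool toy_track_xid(TransactionId xid, const ToyCutoffs *cutoffs,
                          TransactionId *NoFreezePageRelfrozenXid)
{
    bool freeze_xmin;

    assert(cutoffs != NULL && NoFreezePageRelfrozenXid != NULL);

    freeze_xmin = xid_precedes(xid, cutoffs->OldestXmin);

    if (xid_precedes(xid, *NoFreezePageRelfrozenXid))
        *NoFreezePageRelfrozenXid = xid;

    return freeze_xmin;
}
```

Starting the tracker at OldestXmin and feeding it an older XID moves the
tracker back to that XID; newer XIDs leave it unchanged.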

We make a freeze plan for the tuple and move on. Assuming
HeapPageFreeze->freeze_required is never set to true and we don't end
up meeting the other criteria for opportunistic freezing, we may
decide not to freeze any tuples on the page (including this tuple). In
this case, we set relfrozenxid to NoFreezePageRelfrozenXid. This
should not be a value newer than our tuple's xids.
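
As a sanity check on the argument in this paragraph, the page-level
outcome can be sketched as follows. Again this is a toy model with
hypothetical names (toy_nofreeze_relfrozenxid, toy_relfrozenxid_is_safe),
assuming no tuple on the page is frozen and only plain XIDs are present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Wraparound-aware "a is older than b" for normal XIDs (modulo 2^32). */
static bool xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Toy model (assumption): the value relfrozenxid may advance to when
 * nothing on the page is frozen. The tracker starts at OldestXmin and
 * is ratcheted back by every older XID left on the page.
 */
static TransactionId toy_nofreeze_relfrozenxid(const TransactionId *xids,
                                               size_t n,
                                               TransactionId oldest_xmin)
{
    TransactionId tracker = oldest_xmin;

    assert(n == 0 || xids != NULL);

    for (size_t i = 0; i < n; i++)
    {
        if (xid_precedes(xids[i], tracker))
            tracker = xids[i];
    }
    return tracker;
}

/* Invariant: no unfrozen XID on the page may precede relfrozenxid. */
static bool toy_relfrozenxid_is_safe(TransactionId relfrozenxid,
                                     const TransactionId *xids, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (xid_precedes(xids[i], relfrozenxid))
            return false;
    }
    return true;
}
```

For a page holding XIDs 1200, 900 and 950 with OldestXmin at 1000, the
tracker ends at 900, which satisfies the invariant; advancing to 1000
instead would violate it.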

> As Andres pointed out, even if we were to call
> heap_prepare_freeze_tuple() with a HEAPTUPLE_DEAD tuple, we'd get a
> "can't happen" error (though it's hard to see this because it doesn't
> actually rely on the hint bits set in the tuple).

I just don't see how in 17 we could end up calling
heap_prepare_freeze_tuple() on a HEAPTUPLE_DEAD tuple. If
heap_prune_satisfies_vacuum() returned HEAPTUPLE_DEAD, we wouldn't
call heap_prepare_freeze_tuple() on the tuple.

I assume you are talking about a HEAPTUPLE_RECENTLY_DEAD tuple that
*should* be HEAPTUPLE_DEAD. But it doesn't really matter if it should
be HEAPTUPLE_DEAD and we fail to remove it, as long as we don't leave
it unfrozen and advance relfrozenxid past its xids. And I don't see
how we would do that.

- Melanie
