Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()

From: Alena Rybakina
Subject: Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Msg-id: d1ca3a1d-7ead-41a7-bfd0-5b66ad97b1cd@yandex.ru
In response to: Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()  (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
List: pgsql-bugs

On 02.05.2024 19:52, Peter Geoghegan wrote:
> On Sat, Apr 27, 2024 at 10:38 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>> In 17, we don't ever get a new HTSV_Result, so if the tuple is not
>> removed, it would be because HeapTupleSatisfiesVacuumHorizon()
>> returned HEAPTUPLE_RECENTLY_DEAD and, if GlobalVisTestIsRemovableXid()
>> was called, dead_after did not precede GlobalVisState->maybe_needed.
>> This tuple, during this vacuum of the relation, would never be
>> determined to be HEAPTUPLE_DEAD or it would have been removed.
> That makes sense.
>
>> It will always be HEAPTUPLE_RECENTLY_DEAD in 17 and in <= 16, if
>> HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, we wouldn't call
>> heap_prepare_freeze_tuple() because of the retry loop.
> The retry loop exists precisely because heap_prepare_freeze_tuple()
> isn't prepared to deal with HEAPTUPLE_DEAD tuples. So I agree that
> that won't be allowed to happen on versions that have the retry loop
> (14 - 16).
>> So, it can't happen in back branches. Let's just address 17. Help me
>> understand how this can happen in 17.
> Just to be clear, I never said that it was possible in 17. If I
> somehow implied it, then I didn't mean to.

Hi! I also investigated this issue and reproduced it with a test that I added to the isolation tests. The test inserts 2 tuples, deletes them, runs vacuum, and prints the deleted-tuple and dead-tuple statistics (the test is attached to this email as a patch). Within the first 400 or so iterations, I got this result:

 n_dead_tup|n_live_tup|n_tup_del
 ----------+----------+---------
          0|         0|        0
 (1 row)

After 400 or more iterations, I saw the difference, as shown earlier:

 n_dead_tup|n_live_tup|n_tup_del
 ----------+----------+---------
-         0|         0|        0
+         2|         0|        0
 (1 row)


I debugged this and found that the test reports 0 dead tuples when the tuple's xmax precedes GlobalVisTempRels.maybe_needed. In the code, this is the condition checked in heap_prune_satisfies_vacuum:

    else if (GlobalVisTestIsRemovableXid(prstate->vistest, dead_after))
    {
        res = HEAPTUPLE_DEAD;
    }

But when GlobalVisTempRels.maybe_needed is equal to the tuple's xmax, vacuum does not touch the tuple, because heap_prune_satisfies_vacuum returns HEAPTUPLE_RECENTLY_DEAD for it.
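
To make the boundary case concrete, here is a minimal standalone sketch of the comparison as I understand it. It is a model, not the PostgreSQL code: the Sketch* types and sketch_* functions are hypothetical stand-ins, xids are plain 64-bit integers, and the definitely_needed bound and the horizon-refresh path of the real check are left out. It only assumes that a tuple counts as removable when dead_after strictly precedes maybe_needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a full (64-bit) transaction id. */
typedef uint64_t SketchFullXid;

/* Hypothetical stand-ins for the two outcomes discussed above. */
typedef enum
{
    SKETCH_RECENTLY_DEAD,
    SKETCH_DEAD
} SketchResult;

/* Simplified visibility state: only the maybe_needed bound is modelled. */
typedef struct
{
    SketchFullXid maybe_needed;
} SketchVisState;

/*
 * Assumption: an xid is removable only if it strictly precedes
 * maybe_needed. An xid equal to maybe_needed is not removable.
 */
static bool
sketch_is_removable(const SketchVisState *state, SketchFullXid dead_after)
{
    return dead_after < state->maybe_needed;
}

/* Mirrors the shape of the branch quoted above, using the stand-ins. */
static SketchResult
sketch_prune_status(const SketchVisState *state, SketchFullXid dead_after)
{
    if (sketch_is_removable(state, dead_after))
        return SKETCH_DEAD;

    return SKETCH_RECENTLY_DEAD;
}
```

With maybe_needed = 1000 in this model, a tuple with xmax 999 comes out as SKETCH_DEAD, while a tuple with xmax 1000 stays SKETCH_RECENTLY_DEAD, which matches the case where vacuum leaves the tuple alone.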

Unfortunately, I have not yet found an explanation for why GlobalVisTempRels.maybe_needed does not change after 400 or more iterations. I'm still studying it. Perhaps this information will help you.

I reproduced the problem on REL_16_STABLE.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company