Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae

From: Andres Freund
Subject: Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae
Msg-id: 20240516203838.hk5djwfa2dhpabdc@awork3.anarazel.de
In reply to: Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae  (Andres Freund <andres@anarazel.de>)
List: pgsql-bugs

Hi,

On 2024-05-16 13:29:49 -0700, Andres Freund wrote:
> On 2024-05-16 16:13:35 -0400, Peter Geoghegan wrote:
> > > Now I wonder if there is some codepath triggering catalog lookups during bulk
> > > delete.
> >
> > I don't think that there's any rule that says that VACUUM cannot do
> > catalog lookups during bulk deletions. B-Tree page deletion needs to
> > generate an insertion scan key, so that it can "refind" a page
> > undergoing deletion. That might require catalog lookups.
>
> I'm not saying there's a hard rule against it. Just that there wasn't an
> immediately apparent, nor immediately observable, path for it. As I didn't see
> the path to the horizon recomputation, I didn't know how a btbulkdelete in the
> middle of the scan would potentially trigger the problem.

Hm. Actually. I think it might not be correct to do catalog lookups at that
point. But it's a bigger issue than just catalog lookups during bulkdelete:
Once we've done
     MyProc->statusFlags |= PROC_IN_VACUUM;

the current backend's snapshots don't prevent rows from being removed
anymore.
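
To make the effect concrete, here is a minimal toy model of a horizon computation that skips backends flagged as being in lazy VACUUM. It is only a sketch: ToyProc, TOY_PROC_IN_VACUUM and toy_oldest_xmin() are hypothetical stand-ins, not the real procarray code, and XID wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; this is NOT PostgreSQL's actual procarray code. */
typedef uint32_t ToyXid;

#define TOY_INVALID_XID     ((ToyXid) 0)
#define TOY_PROC_IN_VACUUM  0x02    /* illustrative flag bit */

typedef struct ToyProc
{
    ToyXid      xmin;           /* oldest XID this backend's snapshot needs */
    uint8_t     statusFlags;
} ToyProc;

/*
 * Return the oldest xmin among all backends, ignoring any backend that has
 * the in-vacuum flag set.  Plain integer comparison is used, so XID
 * wraparound is deliberately not handled in this sketch.
 */
static ToyXid
toy_oldest_xmin(const ToyProc *procs, size_t nprocs, ToyXid next_xid)
{
    ToyXid      result = next_xid;

    for (size_t i = 0; i < nprocs; i++)
    {
        /* A flagged backend's snapshot no longer holds the horizon back. */
        if (procs[i].statusFlags & TOY_PROC_IN_VACUUM)
            continue;

        if (procs[i].xmin != TOY_INVALID_XID && procs[i].xmin < result)
            result = procs[i].xmin;
    }

    return result;
}
```

In this model, once a backend sets the flag, another backend computing the horizon no longer takes its xmin into account, so rows that its snapshot could still see become removable.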

I first wrote:
> That's not a huge issue for the pg_class entry itself, as the locks should
> prevent it from being updated. But there are a lot of catalog lookups that
> aren't protected by locks, just normal snapshot semantics.

but as it turns out we haven't even locked the relation at the point we set
PROC_IN_VACUUM.

That seems quite broken.


WRT bulkdelete, there's this comment where we set PROC_IN_VACUUM:

         * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
         * other concurrent VACUUMs know that they can ignore this one while
         * determining their OldestXmin.  (The reason we don't set it during a
         * full VACUUM is exactly that we may have to run user-defined
         * functions for functional indexes, and we want to make sure that if
         * they use the snapshot set above, any tuples it requires can't get
         * removed from other tables.  An index function that depends on the
         * contents of other tables is arguably broken, but we won't break it
         * here by violating transaction semantics.)

The parenthetical explains why we can't evaluate user-defined functions
there, which seems to be violated by doing key comparisons, no?

Greetings,

Andres Freund


