Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used

From: Kyotaro Horiguchi
Subject: Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used
Msg-id: 20230726.112923.27361680552823861.horikyota.ntt@gmail.com
In reply to: Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used  (Alexander Lakhin <exclusion@gmail.com>)
Responses: Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List: pgsql-bugs

At Tue, 25 Jul 2023 13:00:00 +0300, Alexander Lakhin <exclusion@gmail.com> wrote in
> Hi Tom,
>
> 21.07.2023 22:21, Tom Lane wrote:
> > Yes, we certainly want to do that during LockRelationOid.  But what
> > seems to be happening here is an inval while we are closing/unlocking
> > the catalog we got the syscache entry from.  That is, the expected
> > behavior here is:
> >
> > SearchSysCacheExists:
> >
> >    * is entry present-and-valid?
> >      No, so...
> >
> >    * open and lock relevant catalog (with possible inval)
> >
> >    * scan catalog, find desired row, create valid syscache entry
> >
> >    * close and unlock catalog
> >
> >    * return success
> >
> > SearchSysCache1 (from pg_class_aclmask_ext):
> >
> >    * is entry present-and-valid?
> >      Yes, so increment its refcount and return it
> >
> > There is no inval in the entry-already-present code path in syscache
> > lookup.  So if we are seeing this failure, ISTM it must mean that an
> > inval is happening during "close and unlock catalog", which seems like
> > something that we don't want.  But I've not traced exactly how that
> > happens.
>
> Yes, but here we deal with -DCATCACHE_FORCE_RELEASE (added to config_env
> on prion), so the cache entry, that was just found in
> SearchSysCacheExists(), is removed immediately because of
> SearchSysCacheExists() -> ReleaseSysCache(tuple) ->
> ReleaseCatCache(tuple).
>
> So, while the construction "if (SearchSysCacheExists()) ... SearchSysCache1()"
> seems robust for normal conditions, it might be broken when catcache

I agree about the safety of the construct.

> entries released forcefully. Thus, if the worst consequence of the issue is
> sporadic test failures on prion, then may be fix it in a least invasive way
> (on level 1).
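
To make the failure mode concrete, here is a toy model of the "exists, then fetch" construct. It is only a sketch and not PostgreSQL code: ToyCatCache, its pending list, and exists_then_fetch are hypothetical names, and the way a concurrent catalog change becomes visible (only at the next catalog scan) is a deliberate simplification of what happens in a parallel worker.

```python
class ToyCatCache:
    """Toy stand-in for a syscache; a simplification, not PostgreSQL code."""

    def __init__(self, catalog, force_release=False):
        self.catalog = dict(catalog)        # key -> row, stands in for the catalog
        self.entries = {}                   # key -> cached row
        self.force_release = force_release  # models -DCATCACHE_FORCE_RELEASE
        self.pending = []                   # concurrent changes, seen at the next scan

    def _scan_catalog(self, key):
        # Opening the catalog is where concurrent changes become visible
        # in this model.
        for change in self.pending:
            change(self.catalog)
        self.pending.clear()
        return self.catalog.get(key)

    def search(self, key):
        # Entry already present: return it without touching the catalog.
        if key in self.entries:
            return self.entries[key]
        row = self._scan_catalog(key)
        if row is not None:
            self.entries[key] = row
        return row

    def release(self, key):
        # Normally the entry stays cached; with force_release it is dropped.
        if self.force_release:
            self.entries.pop(key, None)

    def exists(self, key):
        row = self.search(key)
        if row is None:
            return False
        self.release(key)
        return True


def exists_then_fetch(cache, key):
    """The 'if exists(...) then search(...)' construct discussed above."""
    if not cache.exists(key):
        return "absent"
    # Assumption: a concurrent session drops the row between the two lookups.
    cache.pending.append(lambda catalog: catalog.pop(key, None))
    row = cache.search(key)
    return "ok" if row is not None else "lookup failed"
```

In this model the second lookup is a cache hit in the normal case and never rescans the catalog, whereas with force_release set it rescans and can see the newer catalog state.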

> 1) test xmlmap fails sporadically due to the catalog changes caused by
>  parallel tests activity
> 2) schema_to_xmlschemaX() can fail when parallel workers are used

> 3) has_table_privilegeX() can fail sporadically when executed within a
>  parallel worker

Doesn't this imply that the function isn't parallel-safe? The issue goes
away when it and all its variants are marked parallel-restricted. That
seems to be a reasonable way to address this issue.

> 4) SearchSysCacheX(RELOID, ...) can switch to a newer catalog snapshot,
>  when repeated in a parallel worker

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


