Обсуждение: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

Поиск
Список
Период
Сортировка

"out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
"shiy.fnst@fujitsu.com"
Дата:
Hi hackers,

After multiple calls to the function pg_logical_slot_get_binary_changes() in
single client backend (the output plugin of the slot is pgoutput), I got the
following error:

client backend FATAL:  out of relcache_callback_list slots
client backend CONTEXT:  slot "testslot", output plugin "pgoutput", in the startup callback
client backend STATEMENT:  SELECT data FROM pg_logical_slot_get_binary_changes('testslot', NULL, NULL, 'proto_version',
'3','streaming', 'off', 'publication_names', 'pub');
 

I tried to look into it and found that it's because every time the function
(pg_logical_slot_get_binary_changes) is called, relcache callback and syscache
callbacks are registered when initializing pgoutput (see pgoutput_startup() and
init_rel_sync_cache()), but they are not unregistered when it shutdowns. So,
after multiple calls to the function, MAX_RELCACHE_CALLBACKS is exceeded. This
is mentioned in the following comment.

    /*
     * We can get here if the plugin was used in SQL interface as the
     * RelSchemaSyncCache is destroyed when the decoding finishes, but there
     * is no way to unregister the relcache invalidation callback.
     */
    if (RelationSyncCache == NULL)
        return;

Could we fix it by adding two new function to unregister relcache callback and
syscache callback? I tried to do so in the attached patch.

Regards,
Shi Yu

Вложения

Re: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
Kyotaro Horiguchi
Дата:
Good catch!

At Sun, 19 Feb 2023 02:40:31 +0000, "shiy.fnst@fujitsu.com" <shiy.fnst@fujitsu.com> wrote in 
> init_rel_sync_cache()), but they are not unregistered when it shutdowns. So,
> after multiple calls to the function, MAX_RELCACHE_CALLBACKS is exceeded. This
> is mentioned in the following comment.
> 
>     /*
>      * We can get here if the plugin was used in SQL interface as the
>      * RelSchemaSyncCache is destroyed when the decoding finishes, but there
>      * is no way to unregister the relcache invalidation callback.
>      */
>     if (RelationSyncCache == NULL)
>         return;
> 
> Could we fix it by adding two new function to unregister relcache callback and
> syscache callback? I tried to do so in the attached patch.

I'm pretty sure that everytime an output plugin is initialized on a
process, it installs the same set of syscache/relcache callbacks each
time.  Do you think we could simply stop duplicate registration of
those callbacks by using a static boolean?  It would be far simpler.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> I'm pretty sure that everytime an output plugin is initialized on a
> process, it installs the same set of syscache/relcache callbacks each
> time.  Do you think we could simply stop duplicate registration of
> those callbacks by using a static boolean?  It would be far simpler.

Yeah, I think that's the way it's done elsewhere.  Removing and
re-registering your callback seems expensive, and it also destroys
any reasoning that anyone might have made about the order in which
different callbacks will get called.  (Admittedly, that's probably not
important for invalidation callbacks, but it does matter for e.g.
process exit callbacks.)

            regards, tom lane



RE: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
"shiy.fnst@fujitsu.com"
Дата:
On Mon, Feb 20, 2023 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> > I'm pretty sure that everytime an output plugin is initialized on a
> > process, it installs the same set of syscache/relcache callbacks each
> > time.  Do you think we could simply stop duplicate registration of
> > those callbacks by using a static boolean?  It would be far simpler.
> 
> Yeah, I think that's the way it's done elsewhere.  Removing and
> re-registering your callback seems expensive, and it also destroys
> any reasoning that anyone might have made about the order in which
> different callbacks will get called.  (Admittedly, that's probably not
> important for invalidation callbacks, but it does matter for e.g.
> process exit callbacks.)
> 

Thanks for your reply. I agree that's expensive. Attach a new patch which adds a
static boolean to avoid duplicate registration.

Regards,
Shi Yu

Вложения

Re: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
Kyotaro Horiguchi
Дата:
At Tue, 21 Feb 2023 10:31:29 +0000, "shiy.fnst@fujitsu.com" <shiy.fnst@fujitsu.com> wrote in 
> Thanks for your reply. I agree that's expensive. Attach a new patch which adds a
> static boolean to avoid duplicate registration.

Thank you for the patch.  It is exactly what I had in my mind. But now
that I've had a chance to mull it over, I came to think it might be
better to register the callbacks at one place. I'm thinking we could
create a new function called register_callbacks() or something and
move all the calls to CacheRegisterSyscacheCallback() into that. What
do you think about that refactoring?

I guess you could say that that refactoring somewhat weakens the
connection or dependency between init_rel_sync_cache and
rel_sync_cache_relation_cb, but anyway the callback works even if
RelationSyncCache is not around.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
Peter Smith
Дата:
On Wed, Feb 22, 2023 at 12:03 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Tue, 21 Feb 2023 10:31:29 +0000, "shiy.fnst@fujitsu.com" <shiy.fnst@fujitsu.com> wrote in
> > Thanks for your reply. I agree that's expensive. Attach a new patch which adds a
> > static boolean to avoid duplicate registration.
>
> Thank you for the patch.  It is exactly what I had in my mind. But now
> that I've had a chance to mull it over, I came to think it might be
> better to register the callbacks at one place. I'm thinking we could
> create a new function called register_callbacks() or something and
> move all the calls to CacheRegisterSyscacheCallback() into that. What
> do you think about that refactoring?
>
> I guess you could say that that refactoring somewhat weakens the
> connection or dependency between init_rel_sync_cache and
> rel_sync_cache_relation_cb, but anyway the callback works even if
> RelationSyncCache is not around.
>

If you are going to do that, then won't just copying the
CacheRegisterSyscacheCallback(PUBLICATIONOID...  into function
init_rel_sync_cache() be effectively the same as doing that?

Then almost nothing else to do...e.g. no need for a new extra static
boolean if static RelationSyncCache is acting as the one-time guard
anyway.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
Kyotaro Horiguchi
Дата:
Thanks for the comment.

At Wed, 22 Feb 2023 12:29:59 +1100, Peter Smith <smithpb2250@gmail.com> wrote in 
> On Wed, Feb 22, 2023 at 12:03 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > At Tue, 21 Feb 2023 10:31:29 +0000, "shiy.fnst@fujitsu.com" <shiy.fnst@fujitsu.com> wrote in
> > > Thanks for your reply. I agree that's expensive. Attach a new patch which adds a
> > > static boolean to avoid duplicate registration.
> >
> > Thank you for the patch.  It is exactly what I had in my mind. But now
> > that I've had a chance to mull it over, I came to think it might be
> > better to register the callbacks at one place. I'm thinking we could
> > create a new function called register_callbacks() or something and
> > move all the calls to CacheRegisterSyscacheCallback() into that. What
> > do you think about that refactoring?
> >
> > I guess you could say that that refactoring somewhat weakens the
> > connection or dependency between init_rel_sync_cache and
> > rel_sync_cache_relation_cb, but anyway the callback works even if
> > RelationSyncCache is not around.
> >
> 
> If you are going to do that, then won't just copying the
> CacheRegisterSyscacheCallback(PUBLICATIONOID...  into function
> init_rel_sync_cache() be effectively the same as doing that?

I'm not sure if it has anything to do with the relation sync cache.
On the other hand, moving all the content of init_rel_sync_cache() up
to pgoutput_startup() doesn't seem like a good idea.. Another option,
as you see, was to separate callback registration code.

> Then almost nothing else to do...e.g. no need for a new extra static
> boolean if static RelationSyncCache is acting as the one-time guard
> anyway.

Unfortunately, RelationSyncCache doesn't work - it is set to NULL at
plugin shutdown.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
Michael Paquier
Дата:
On Wed, Feb 22, 2023 at 12:07:06PM +0900, Kyotaro Horiguchi wrote:
> At Wed, 22 Feb 2023 12:29:59 +1100, Peter Smith <smithpb2250@gmail.com> wrote in
>> If you are going to do that, then won't just copying the
>> CacheRegisterSyscacheCallback(PUBLICATIONOID...  into function
>> init_rel_sync_cache() be effectively the same as doing that?
>
> I'm not sure if it has anything to do with the relation sync cache.
> On the other hand, moving all the content of init_rel_sync_cache() up
> to pgoutput_startup() doesn't seem like a good idea.. Another option,
> as you see, was to separate callback registration code.

Both are kept separate in the code, so keeping this separation makes
sense to me.

+       /* Register callbacks if we didn't do that. */
+       if (!callback_registered)
+           CacheRegisterSyscacheCallback(PUBLICATIONOID,
+                                         publication_invalidation_cb,
+                                         (Datum) 0);

        /* Initialize relation schema cache. */
        init_rel_sync_cache(CacheMemoryContext);
+       callback_registered = true;
[...]
+   /* Register callbacks if we didn't do that. */
+   if (!callback_registered)

I am a bit confused by the use of one single flag called
callback_registered to track both the publication callback and the
relation callbacks.  Wouldn't it be cleaner to use two flags?  I don't
think that we'll have soon a second code path calling
init_rel_sync_cache(), but if we do then the callback load could again
be messed up.

(FYI, we use this method of callback registration for everything
that's not a one-time code path, like hash tables for RI triggers,
base backup callbacks, etc.)
--
Michael

Вложения

RE: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
"shiy.fnst@fujitsu.com"
Дата:
On Wed, Feb 22, 2023 2:20 PM Michael Paquier <michael@paquier.xyz> wrote:
> 
> On Wed, Feb 22, 2023 at 12:07:06PM +0900, Kyotaro Horiguchi wrote:
> > At Wed, 22 Feb 2023 12:29:59 +1100, Peter Smith <smithpb2250@gmail.com>
> wrote in
> >> If you are going to do that, then won't just copying the
> >> CacheRegisterSyscacheCallback(PUBLICATIONOID...  into function
> >> init_rel_sync_cache() be effectively the same as doing that?
> >
> > I'm not sure if it has anything to do with the relation sync cache.
> > On the other hand, moving all the content of init_rel_sync_cache() up
> > to pgoutput_startup() doesn't seem like a good idea.. Another option,
> > as you see, was to separate callback registration code.
> 
> Both are kept separate in the code, so keeping this separation makes
> sense to me.
> 
> +       /* Register callbacks if we didn't do that. */
> +       if (!callback_registered)
> +           CacheRegisterSyscacheCallback(PUBLICATIONOID,
> +                                         publication_invalidation_cb,
> +                                         (Datum) 0);
> 
>         /* Initialize relation schema cache. */
>         init_rel_sync_cache(CacheMemoryContext);
> +       callback_registered = true;
> [...]
> +   /* Register callbacks if we didn't do that. */
> +   if (!callback_registered)
> 
> I am a bit confused by the use of one single flag called
> callback_registered to track both the publication callback and the
> relation callbacks.  Wouldn't it be cleaner to use two flags?  I don't
> think that we'll have soon a second code path calling
> init_rel_sync_cache(), but if we do then the callback load could again
> be messed up.
> 

Thanks for your reply. Using two flags makes sense to me.
Attach the updated patch.

Regards,
Shi Yu

Вложения

Re: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
Michael Paquier
Дата:
On Wed, Feb 22, 2023 at 10:21:51AM +0000, shiy.fnst@fujitsu.com wrote:
> Thanks for your reply. Using two flags makes sense to me.
> Attach the updated patch.

Fine by me as far as it goes.  Any thoughts from others?
--
Michael

Вложения

Re: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

От
Peter Smith
Дата:
On Thu, Feb 23, 2023 at 11:28 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Feb 22, 2023 at 10:21:51AM +0000, shiy.fnst@fujitsu.com wrote:
> > Thanks for your reply. Using two flags makes sense to me.
> > Attach the updated patch.
>
> Fine by me as far as it goes.  Any thoughts from others?
> --

Should the 'relation_callback_registered' variable name be plural?

Otherwise, LGTM.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Peter Smith <smithpb2250@gmail.com> writes:
> Should the 'relation_callback_registered' variable name be plural?

Yeah, plural seems better to me too.  I fixed that and did a little
comment-editing and pushed it.

            regards, tom lane