Обсуждение: plruby: rb_iterate symbol clash with libruby.so

Поиск
Список
Период
Сортировка

plruby: rb_iterate symbol clash with libruby.so

От
Pavel Raiskup
Дата:
Hi, I'm curious how it worked before (seems like the function is defined
in both PostgreSQL and Ruby projects for quite some time) - but I recently
came across this situation:

    - /bin/postgres is build-time linked with 'ld -E'
    - /bin/postgres dlopen()s plruby.so
    - plruby.so calls rb_iterate, but linker prefers rb_iterate defined in
      /bin/postgres, instead of (the wanted one) rb_iterate from libruby.so

This means an ugly PG server crash.  I'm curious what to do with this;
ideally it would be solvable from plruby.so itself, but there doesn't seem
to be nice solution (one could do some hacks with dlopen('libruby.so')).

Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
'rbt_*' to avoid this clash?

Pavel





Re: plruby: rb_iterate symbol clash with libruby.so

От
Tom Lane
Дата:
Pavel Raiskup <praiskup@redhat.com> writes:
> Hi, I'm curious how it worked before (seems like the function is defined
> in both PostgreSQL and Ruby projects for quite some time) - but I recently
> came across this situation:
>     - /bin/postgres is build-time linked with 'ld -E'
>     - /bin/postgres dlopen()s plruby.so
>     - plruby.so calls rb_iterate, but linker prefers rb_iterate defined in
>       /bin/postgres, instead of (the wanted one) rb_iterate from libruby.so
> This means an ugly PG server crash.  I'm curious what to do with this;
> ideally it would be solvable from plruby.so itself, but there doesn't seem
> to be nice solution (one could do some hacks with dlopen('libruby.so')).

Bleah.

We recently noticed that using a --version-script symbol filter on shared
libraries fixes some cases of this problem, because a non-exported symbol
will be preferentially resolved inside the library.  I guess that's of no
use for this particular case though, since evidently Ruby has to export
its rb_iterate for Ruby extensions to use.

> Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
> 'rbt_*' to avoid this clash?

That's not terribly appetizing, because it essentially means we're giving
Ruby (and potentially every other library on the planet) veto power over
our function namespace.  That does not scale, especially not when the
feedback loop has a time constant measured in years :-(

I don't have a huge objection to renaming the rbtree functions, other
than the precedent it sets ...

            regards, tom lane


Re: plruby: rb_iterate symbol clash with libruby.so

От
Andres Freund
Дата:
Hi,

On 2018-11-03 14:19:46 -0400, Tom Lane wrote:
> Pavel Raiskup <praiskup@redhat.com> writes:
> > Hi, I'm curious how it worked before (seems like the function is defined
> > in both PostgreSQL and Ruby projects for quite some time) - but I recently
> > came across this situation:
> >     - /bin/postgres is build-time linked with 'ld -E'
> >     - /bin/postgres dlopen()s plruby.so
> >     - plruby.so calls rb_iterate, but linker prefers rb_iterate defined in
> >       /bin/postgres, instead of (the wanted one) rb_iterate from libruby.so
> > This means an ugly PG server crash.  I'm curious what to do with this;
> > ideally it would be solvable from plruby.so itself, but there doesn't seem
> > to be nice solution (one could do some hacks with dlopen('libruby.so')).
> 
> Bleah.

ISTM this specific case we could solve the issue by opening plruby.so /
extension sos with RTLD_DEEPBIND.  That doesn't work if ruby extensions
that are loaded later use rb_iterate, but should work for the case above.


> We recently noticed that using a --version-script symbol filter on shared
> libraries fixes some cases of this problem, because a non-exported symbol
> will be preferentially resolved inside the library.  I guess that's of no
> use for this particular case though, since evidently Ruby has to export
> its rb_iterate for Ruby extensions to use.

I wondered every now and then if we shouldn't use dlmopen to open
extension objects, when available. If we were to do it right we'd create
a namespace for every shared object, use dlmopen to expose postgres'
symbols, and then load the extension objects in separate namespaces.
But I think that's not feasible, because:
"The glibc implementation supports a maximum of 16 namespaces."


> > Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
> > 'rbt_*' to avoid this clash?
> 
> That's not terribly appetizing, because it essentially means we're giving
> Ruby (and potentially every other library on the planet) veto power over
> our function namespace.  That does not scale, especially not when the
> feedback loop has a time constant measured in years :-(
> 
> I don't have a huge objection to renaming the rbtree functions, other
> than the precedent it sets ...

I don't mind the precedent that much, but isn't that also not unlikely
to break other extensions that use those functions?

Greetings,

Andres Freund


Re: plruby: rb_iterate symbol clash with libruby.so

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
>> Pavel Raiskup <praiskup@redhat.com> writes:
>>> Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
>>> 'rbt_*' to avoid this clash?

> ISTM this specific case we could solve the issue by opening plruby.so /
> extension sos with RTLD_DEEPBIND.  That doesn't work if ruby extensions
> that are loaded later use rb_iterate, but should work for the case above.

Doesn't work on non-glibc platforms, though.

> On 2018-11-03 14:19:46 -0400, Tom Lane wrote:
>> That's not terribly appetizing, because it essentially means we're giving
>> Ruby (and potentially every other library on the planet) veto power over
>> our function namespace.  That does not scale, especially not when the
>> feedback loop has a time constant measured in years :-(
>> I don't have a huge objection to renaming the rbtree functions, other
>> than the precedent it sets ...

> I don't mind the precedent that much, but isn't that also not unlikely
> to break other extensions that use those functions?

I rather doubt there are any.  Also, if there are, the RTLD_DEEPBIND
solution would break them too, no?

            regards, tom lane


Re: plruby: rb_iterate symbol clash with libruby.so

От
Andres Freund
Дата:
On 2018-11-03 14:39:45 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> >> Pavel Raiskup <praiskup@redhat.com> writes:
> >>> Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
> >>> 'rbt_*' to avoid this clash?
> 
> > ISTM this specific case we could solve the issue by opening plruby.so /
> > extension sos with RTLD_DEEPBIND.  That doesn't work if ruby extensions
> > that are loaded later use rb_iterate, but should work for the case above.
> 
> Doesn't work on non-glibc platforms, though.

Yea, but I'm not sure there's anything portable to do about such cases :/

> > On 2018-11-03 14:19:46 -0400, Tom Lane wrote:
> >> That's not terribly appetizing, because it essentially means we're giving
> >> Ruby (and potentially every other library on the planet) veto power over
> >> our function namespace.  That does not scale, especially not when the
> >> feedback loop has a time constant measured in years :-(
> >> I don't have a huge objection to renaming the rbtree functions, other
> >> than the precedent it sets ...
> 
> > I don't mind the precedent that much, but isn't that also not unlikely
> > to break other extensions that use those functions?
> 
> I rather doubt there are any.  Also, if there are, the RTLD_DEEPBIND
> solution would break them too, no?

Why would it break? Deepbind just means the to-be-opened .so is put
first in the search path, if there's no match it'll still look in other
sos.

Greetings,

Andres Freund


Re: plruby: rb_iterate symbol clash with libruby.so

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2018-11-03 14:39:45 -0400, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> ISTM this specific case we could solve the issue by opening plruby.so /
>>> extension sos with RTLD_DEEPBIND.  That doesn't work if ruby extensions
>>> that are loaded later use rb_iterate, but should work for the case above.

>> Doesn't work on non-glibc platforms, though.

> Yea, but I'm not sure there's anything portable to do about such cases :/

The portable answer is to rename to avoid the symbol conflict.

>>> I don't mind the precedent that much, but isn't that also not unlikely
>>> to break other extensions that use those functions?

>> I rather doubt there are any.  Also, if there are, the RTLD_DEEPBIND
>> solution would break them too, no?

> Why would it break? Deepbind just means the to-be-opened .so is put
> first in the search path, if there's no match it'll still look in other
> sos.

Yeah, but once plruby is loaded, any subsequently loaded .so has
got two possible ways to resolve rb_iterate.  No matter what we do,
the behavior will be wrong for some of them.

            regards, tom lane


Re: plruby: rb_iterate symbol clash with libruby.so

От
Robert Haas
Дата:
On Sat, Nov 3, 2018 at 2:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
> > 'rbt_*' to avoid this clash?
>
> That's not terribly appetizing, because it essentially means we're giving
> Ruby (and potentially every other library on the planet) veto power over
> our function namespace.  That does not scale, especially not when the
> feedback loop has a time constant measured in years :-(
>
> I don't have a huge objection to renaming the rbtree functions, other
> than the precedent it sets ...

Maybe prefixing with pg_ would better than rb_ to rbt_.  That's our
semi-standard namespace prefix, I think.  Of course nothing keeps
somebody else from using it, too, but we can hope that they won't.
It's certainly not very surprising that Ruby has symbols starting with
rb_...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: plruby: rb_iterate symbol clash with libruby.so

От
Pavel Raiskup
Дата:
On Monday, November 5, 2018 9:06:41 PM CET Robert Haas wrote:
> On Sat, Nov 3, 2018 at 2:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
> > > 'rbt_*' to avoid this clash?
> >
> > That's not terribly appetizing, because it essentially means we're giving
> > Ruby (and potentially every other library on the planet) veto power over
> > our function namespace.  That does not scale, especially not when the
> > feedback loop has a time constant measured in years :-(
> >
> > I don't have a huge objection to renaming the rbtree functions, other
> > than the precedent it sets ...
> 
> Maybe prefixing with pg_ would better than rb_ to rbt_.  That's our
> semi-standard namespace prefix, I think.  Of course nothing keeps
> somebody else from using it, too, but we can hope that they won't.
> It's certainly not very surprising that Ruby has symbols starting with
> rb_...

I now realized that there's rb_block_call() alternative for rb_iterate()
Ruby call -- which fortunately doesn't collide with PostgreSQL internals.

It means that for sufficiently new Ruby there exists some solution (not
that something similar can not re-appear elsewhere).

Pavel





Re: plruby: rb_iterate symbol clash with libruby.so

От
Tom Lane
Дата:
Pavel Raiskup <praiskup@redhat.com> writes:
> On Monday, November 5, 2018 9:06:41 PM CET Robert Haas wrote:
>> On Sat, Nov 3, 2018 at 2:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
>>>> 'rbt_*' to avoid this clash?

>>> I don't have a huge objection to renaming the rbtree functions, other
>>> than the precedent it sets ...

>> Maybe prefixing with pg_ would better than rb_ to rbt_. ...
>> It's certainly not very surprising that Ruby has symbols starting with
>> rb_...

> I now realized that there's rb_block_call() alternative for rb_iterate()
> Ruby call -- which fortunately doesn't collide with PostgreSQL internals.
> It means that for sufficiently new Ruby there exists some solution (not
> that something similar can not re-appear elsewhere).

Yeah.  The long and short of this is that we're trampling on namespace
that reasonably belongs to Ruby --- if they had some functions named
"pg_something" and complained about a collision with libpq, would we
change?  Nope.  So really we should rename these.

After looking at the code a bit I like the idea of s/rb/rbt/g
better than s/rb/pg_rb/g.  The latter seems verbose, and it would
also open the question of whether we need to rename rbtree.h/.c,
which would be an additional level of complication I don't want.
The rbt approach will allow skipping some other renamings that
would be needed for consistency if we use pg_rb.

Barring objections I'll go make this happen shortly.

It's too late for this week's releases, but Pavel could pick up the commit
once it happens and carry it as a Fedora patch until the next releases.

            regards, tom lane


Re: plruby: rb_iterate symbol clash with libruby.so

От
Pavel Raiskup
Дата:
On Tuesday, November 6, 2018 3:49:46 PM CET Tom Lane wrote:
> Pavel Raiskup <praiskup@redhat.com> writes:
> > On Monday, November 5, 2018 9:06:41 PM CET Robert Haas wrote:
> >> On Sat, Nov 3, 2018 at 2:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>>> Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
> >>>> 'rbt_*' to avoid this clash?
> 
> >>> I don't have a huge objection to renaming the rbtree functions, other
> >>> than the precedent it sets ...
> 
> >> Maybe prefixing with pg_ would better than rb_ to rbt_. ...
> >> It's certainly not very surprising that Ruby has symbols starting with
> >> rb_...
> 
> > I now realized that there's rb_block_call() alternative for rb_iterate()
> > Ruby call -- which fortunately doesn't collide with PostgreSQL internals.
> > It means that for sufficiently new Ruby there exists some solution (not
> > that something similar can not re-appear elsewhere).
> 
> Yeah.  The long and short of this is that we're trampling on namespace
> that reasonably belongs to Ruby --- if they had some functions named
> "pg_something" and complained about a collision with libpq, would we
> change?  Nope.  So really we should rename these.
> 
> After looking at the code a bit I like the idea of s/rb/rbt/g
> better than s/rb/pg_rb/g.  The latter seems verbose, and it would
> also open the question of whether we need to rename rbtree.h/.c,
> which would be an additional level of complication I don't want.
> The rbt approach will allow skipping some other renamings that
> would be needed for consistency if we use pg_rb.
> 
> Barring objections I'll go make this happen shortly.
> 
> It's too late for this week's releases, but Pavel could pick up the commit
> once it happens and carry it as a Fedora patch until the next releases.

Thanks for looking at that, and no problem with this release.  To make
plruby work against supported PostgreSQL versions we'll have to first fix
quite a few _other_ legacy problems first.

Pavel





Re: plruby: rb_iterate symbol clash with libruby.so

От
Tom Lane
Дата:
I wrote:
> Yeah.  The long and short of this is that we're trampling on namespace
> that reasonably belongs to Ruby --- if they had some functions named
> "pg_something" and complained about a collision with libpq, would we
> change?  Nope.  So really we should rename these.

> Barring objections I'll go make this happen shortly.

Done.  I realized that the immediate problem, rb_iterate(), was only
added as of PG v10, which may explain why we hadn't heard complaints
about this till now.  So I've made the change only as far back as v10.
In principle we could change the rbtree code in 9.5/9.6 as well, but
I think that's more likely to create problems than fix any.

            regards, tom lane


Re: plruby: rb_iterate symbol clash with libruby.so

От
Pavel Raiskup
Дата:
On Tuesday, November 6, 2018 7:28:21 PM CET Tom Lane wrote:
> I wrote:
> > Yeah.  The long and short of this is that we're trampling on namespace
> > that reasonably belongs to Ruby --- if they had some functions named
> > "pg_something" and complained about a collision with libpq, would we
> > change?  Nope.  So really we should rename these.
> 
> > Barring objections I'll go make this happen shortly.
> 
> Done.  I realized that the immediate problem, rb_iterate(), was only
> added as of PG v10, which may explain why we hadn't heard complaints
> about this till now.  So I've made the change only as far back as v10.
> In principle we could change the rbtree code in 9.5/9.6 as well, but
> I think that's more likely to create problems than fix any.

The 'rb_iterate' seems to exist at least in REL9_2_STABLE branch, so it is
probably much older.  No need to backpatch of course -- just saying.  That
said, I'm still not sure how this could work before ...  Maybe it has not
been working for some time.

Pavel





Re: plruby: rb_iterate symbol clash with libruby.so

От
Tom Lane
Дата:
Pavel Raiskup <praiskup@redhat.com> writes:
> On Tuesday, November 6, 2018 7:28:21 PM CET Tom Lane wrote:
>> Done.  I realized that the immediate problem, rb_iterate(), was only
>> added as of PG v10, which may explain why we hadn't heard complaints
>> about this till now.  So I've made the change only as far back as v10.

> The 'rb_iterate' seems to exist at least in REL9_2_STABLE branch, so it is
> probably much older.

Oh!  Hmm ... I think I jumped to conclusions when the part of my patch
that touched struct RBTreeIterator failed to apply.  But you're right,
rb_iterate has been there since 9.0 now that I look more carefully.
So we really ought to back-patch further.  However:

> That
> said, I'm still not sure how this could work before ...  Maybe it has not
> been working for some time.

Yeah, I'm now mighty confused about this as well.  PL/Ruby is pretty old
too, so how come nobody noticed this before?  Is its rb_iterate call in
someplace that hardly gets any use?

            regards, tom lane


Re: plruby: rb_iterate symbol clash with libruby.so

От
Pavel Raiskup
Дата:
On Wednesday, November 7, 2018 3:25:31 PM CET Tom Lane wrote:
> Pavel Raiskup <praiskup@redhat.com> writes:
> > On Tuesday, November 6, 2018 7:28:21 PM CET Tom Lane wrote:
> >> Done.  I realized that the immediate problem, rb_iterate(), was only
> >> added as of PG v10, which may explain why we hadn't heard complaints
> >> about this till now.  So I've made the change only as far back as v10.
> 
> > The 'rb_iterate' seems to exist at least in REL9_2_STABLE branch, so it is
> > probably much older.
> 
> Oh!  Hmm ... I think I jumped to conclusions when the part of my patch
> that touched struct RBTreeIterator failed to apply.  But you're right,
> rb_iterate has been there since 9.0 now that I look more carefully.
> So we really ought to back-patch further.  However:
> 
> > That
> > said, I'm still not sure how this could work before ...  Maybe it has not
> > been working for some time.
> 
> Yeah, I'm now mighty confused about this as well.  PL/Ruby is pretty old
> too, so how come nobody noticed this before?  Is its rb_iterate call in
> someplace that hardly gets any use?

I can not authoritatively answer this (getting familiar with the code) but
many of the rb_iterate() calls were replaced with alternative API (if
available), so you might be right.

That said, I reproduced this by trigger function from tests/plt test-case
on the first try, more info and related fixes in [1].  But it is truth
that we haven't run the test-suite for RPM builds so far.

Might the reason be that nobody used plruby at all for a very long time?

[1] https://github.com/devrimgunduz/postgresql-plruby/pull/3

Pavel





Re: plruby: rb_iterate symbol clash with libruby.so

От
Tom Lane
Дата:
Pavel Raiskup <praiskup@redhat.com> writes:
> On Wednesday, November 7, 2018 3:25:31 PM CET Tom Lane wrote:
>> Yeah, I'm now mighty confused about this as well.  PL/Ruby is pretty old
>> too, so how come nobody noticed this before?  Is its rb_iterate call in
>> someplace that hardly gets any use?

> ... I reproduced this by trigger function from tests/plt test-case
> on the first try, more info and related fixes in [1].  But it is truth
> that we haven't run the test-suite for RPM builds so far.
> Might the reason be that nobody used plruby at all for a very long time?

I'm starting to think that.  I just browsed through the code of plruby,
and while I don't know that language at all, there are rb_iterate calls
in places that look like they'd be hard to avoid in typical use.

The fact that the upstream git repo hasn't been touched in nine years
doesn't exactly indicate a lively project, either :-(

If you guys want to pursue this, I'll finish up the back-patch,
but I wonder if we're just wasting our time.

            regards, tom lane


Re: plruby: rb_iterate symbol clash with libruby.so

От
Pavel Raiskup
Дата:
On Wednesday, November 7, 2018 4:55:13 PM CET Tom Lane wrote:
> Pavel Raiskup <praiskup@redhat.com> writes:
> > On Wednesday, November 7, 2018 3:25:31 PM CET Tom Lane wrote:
> >> Yeah, I'm now mighty confused about this as well.  PL/Ruby is pretty old
> >> too, so how come nobody noticed this before?  Is its rb_iterate call in
> >> someplace that hardly gets any use?
> 
> > ... I reproduced this by trigger function from tests/plt test-case
> > on the first try, more info and related fixes in [1].  But it is truth
> > that we haven't run the test-suite for RPM builds so far.
> > Might the reason be that nobody used plruby at all for a very long time?
>
> I'm starting to think that.  I just browsed through the code of plruby,
> and while I don't know that language at all, there are rb_iterate calls
> in places that look like they'd be hard to avoid in typical use.
>
> The fact that the upstream git repo hasn't been touched in nine years
> doesn't exactly indicate a lively project, either :-(
>
> If you guys want to pursue this, I'll finish up the back-patch, but I
> wonder if we're just wasting our time.

FWIW, I plan to help with fixing the plruby project, but to be honest - I
still think that the code can be fixed to not use rb_iterate() anyways.
So back-patch is not needed by me - and certainly not now.

Pavel