Обсуждение: Revert: Remove useless self-joins *and* -DREALLOCATE_BITMAPSETS make server crash, regress test fail.

Поиск
Список
Период
Сортировка
hi,

SELECT table_name, column_name, is_updatable
  FROM information_schema.columns
 WHERE table_name LIKE E'r_\\_view%'
 ORDER BY table_name, ordinal_position;

at d1d286d83c0eed695910cb20d970ea9bea2e5001,
this query in src/test/regress/sql/updatable_views.sql
makes regress tests fail. maybe other query also,
but this is the first one that invokes the server crash.


src3=# SELECT table_name, column_name, is_updatable
  FROM information_schema.columns
 WHERE table_name LIKE E'r_\\_view%'
 ORDER BY table_name, ordinal_position;
TRAP: failed Assert("bms_is_valid_set(a)"), File:
"../../Desktop/pg_src/src3/postgres/src/backend/nodes/bitmapset.c",
Line: 515, PID: 158266
postgres: jian src3 [local] SELECT(ExceptionalCondition+0x106)[0x5579188c0b6f]
postgres: jian src3 [local] SELECT(bms_is_member+0x56)[0x5579183581c7]
postgres: jian src3 [local]
SELECT(join_clause_is_movable_to+0x72)[0x557918439711]
postgres: jian src3 [local] SELECT(+0x73e26c)[0x5579183a126c]
postgres: jian src3 [local] SELECT(create_index_paths+0x3b8)[0x55791839d0ce]
postgres: jian src3 [local] SELECT(+0x719b4d)[0x55791837cb4d]
postgres: jian src3 [local] SELECT(+0x719400)[0x55791837c400]
postgres: jian src3 [local] SELECT(+0x718e90)[0x55791837be90]
postgres: jian src3 [local] SELECT(make_one_rel+0x187)[0x55791837bac5]
postgres: jian src3 [local] SELECT(query_planner+0x4e8)[0x5579183d2dc2]
postgres: jian src3 [local] SELECT(+0x7734ad)[0x5579183d64ad]
postgres: jian src3 [local] SELECT(subquery_planner+0x14b9)[0x5579183d57e4]
postgres: jian src3 [local] SELECT(standard_planner+0x365)[0x5579183d379e]
postgres: jian src3 [local] SELECT(planner+0x81)[0x5579183d3426]
postgres: jian src3 [local] SELECT(pg_plan_query+0xbb)[0x5579186100c8]
postgres: jian src3 [local] SELECT(pg_plan_queries+0x11a)[0x5579186102de]
postgres: jian src3 [local] SELECT(+0x9ad8f1)[0x5579186108f1]
postgres: jian src3 [local] SELECT(PostgresMain+0xd4a)[0x557918618603]
postgres: jian src3 [local] SELECT(+0x9a76a8)[0x55791860a6a8]
postgres: jian src3 [local]
SELECT(postmaster_child_launch+0x14d)[0x5579184d3430]
postgres: jian src3 [local] SELECT(+0x879c28)[0x5579184dcc28]
postgres: jian src3 [local] SELECT(+0x875278)[0x5579184d8278]
postgres: jian src3 [local] SELECT(PostmasterMain+0x205f)[0x5579184d7837]


                                    version
--------------------------------------------------------------------------------
 PostgreSQL 17devel_debug_build on x86_64-linux, compiled by gcc-11.4.0, 64-bit


meson config:
-Dpgport=5458 \
-Dplperl=enabled \
-Dplpython=enabled \
-Dssl=openssl \
-Dldap=enabled \
-Dlibxml=enabled \
-Dlibxslt=enabled \
-Duuid=e2fs \
-Dzstd=enabled \
-Dlz4=enabled \
-Dcassert=true \
-Db_coverage=true \
-Dicu=enabled \
-Dbuildtype=debug \
-Dwerror=true \
-Dc_args='-Wunused-variable
-Wuninitialized
-Werror=maybe-uninitialized
-Wreturn-type
-DWRITE_READ_PARSE_PLAN_TREES
-DREALLOCATE_BITMAPSETS
-DCOPY_PARSE_PLAN_TREES
-DRAW_EXPRESSION_COVERAGE_TEST -fno-omit-frame-pointer' \
-Ddocs_pdf=disabled \
-Ddocs_html_style=website \
-Dllvm=disabled \
-Dtap_tests=enabled \
-Dextra_version=_debug_build


This commit: d1d286d83c0eed695910cb20d970ea9bea2e5001
Revert: Remove useless self-joins make it fail.

the preceding commit (81b2252e609cfa74550dd6804949485c094e4b85)
won't make the regress fail.

i also found that not specifying c_args: `-DREALLOCATE_BITMAPSETS`,
the regress test won't fail.


later, i found out that `select 1 from information_schema.columns`
would also crash the server.

information_schema.columns view is very complex.
I get the view information_schema.columns definitions,
omit unnecessary const and where qual parts of the it
so the minimum reproducer is:

SELECT 1
FROM (((((
    (pg_attribute a LEFT JOIN pg_attrdef ad ON (((a.attrelid =
ad.adrelid) AND (a.attnum = ad.adnum))))
JOIN (pg_class c JOIN pg_namespace nc ON (c.relnamespace = nc.oid)) ON
(a.attrelid = c.oid))
JOIN (pg_type t JOIN pg_namespace nt ON ((t.typnamespace = nt.oid)))
ON (a.atttypid = t.oid))
LEFT JOIN (pg_type bt JOIN pg_namespace nbt ON (bt.typnamespace =
nbt.oid))  ON ( t.typbasetype = bt.oid  ))
LEFT JOIN (pg_collation co JOIN pg_namespace nco ON ( co.collnamespace
= nco.oid)) ON (a.attcollation = co.oid))
LEFT JOIN (pg_depend dep JOIN pg_sequence seq ON (dep.objid =
seq.seqrelid )) ON (((dep.refobjid = c.oid) AND (dep.refobjsubid =
a.attnum))));




On Tue, May 7, 2024 at 11:35 AM jian he <jian.universality@gmail.com> wrote:
hi,

SELECT table_name, column_name, is_updatable
  FROM information_schema.columns
 WHERE table_name LIKE E'r_\\_view%'
 ORDER BY table_name, ordinal_position;

at d1d286d83c0eed695910cb20d970ea9bea2e5001,
this query in src/test/regress/sql/updatable_views.sql
makes regress tests fail. maybe other query also,
but this is the first one that invokes the server crash.

Thank you for the report.  I looked at this a little bit and I think
here is what happened.  In deconstruct_distribute_oj_quals we call
distribute_quals_to_rels using the uncopied sjinfo->syn_lefthand as
outerjoin_nonnullable, which eventually becomes rinfo->outer_relids.
Later on, when we remove useless left joins, we modify
sjinfo->syn_lefthand using bms_del_member and recycle
sjinfo->syn_lefthand.  And that causes the rinfo->outer_relids becomes
invalid, and finally triggers this issue in join_clause_is_movable_to.

Maybe we want to bms_copy sjinfo->syn_lefthand first before using it as
nonnullable_rels.

--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -1888,7 +1888,7 @@ deconstruct_distribute_oj_quals(PlannerInfo *root,
    qualscope = bms_union(sjinfo->syn_lefthand, sjinfo->syn_righthand);
    qualscope = bms_add_member(qualscope, sjinfo->ojrelid);
    ojscope = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
-   nonnullable_rels = sjinfo->syn_lefthand;
+   nonnullable_rels = bms_copy(sjinfo->syn_lefthand);

I will take a closer look in the afternoon.

Thanks
Richard
On Tue, 7 May 2024 at 16:47, Richard Guo <guofenglinux@gmail.com> wrote:
> --- a/src/backend/optimizer/plan/initsplan.c
> +++ b/src/backend/optimizer/plan/initsplan.c
> @@ -1888,7 +1888,7 @@ deconstruct_distribute_oj_quals(PlannerInfo *root,
>     qualscope = bms_union(sjinfo->syn_lefthand, sjinfo->syn_righthand);
>     qualscope = bms_add_member(qualscope, sjinfo->ojrelid);
>     ojscope = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
> -   nonnullable_rels = sjinfo->syn_lefthand;
> +   nonnullable_rels = bms_copy(sjinfo->syn_lefthand);

I was busy looking at this too and I came to the same conclusion.

David



Richard Guo <guofenglinux@gmail.com> writes:
> Thank you for the report.  I looked at this a little bit and I think
> here is what happened.  In deconstruct_distribute_oj_quals we call
> distribute_quals_to_rels using the uncopied sjinfo->syn_lefthand as
> outerjoin_nonnullable, which eventually becomes rinfo->outer_relids.
> Later on, when we remove useless left joins, we modify
> sjinfo->syn_lefthand using bms_del_member and recycle
> sjinfo->syn_lefthand.  And that causes the rinfo->outer_relids becomes
> invalid, and finally triggers this issue in join_clause_is_movable_to.

Hmm, the SJE code didn't really touch any of this logic, so why
didn't we see the failure before?

            regards, tom lane



On Tue, 7 May 2024 at 17:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Richard Guo <guofenglinux@gmail.com> writes:
> > Thank you for the report.  I looked at this a little bit and I think
> > here is what happened.  In deconstruct_distribute_oj_quals we call
> > distribute_quals_to_rels using the uncopied sjinfo->syn_lefthand as
> > outerjoin_nonnullable, which eventually becomes rinfo->outer_relids.
> > Later on, when we remove useless left joins, we modify
> > sjinfo->syn_lefthand using bms_del_member and recycle
> > sjinfo->syn_lefthand.  And that causes the rinfo->outer_relids becomes
> > invalid, and finally triggers this issue in join_clause_is_movable_to.
>
> Hmm, the SJE code didn't really touch any of this logic, so why
> didn't we see the failure before?

The bms_free() occurs in remove_rel_from_query() on:

sjinf->syn_lefthand = bms_del_member(sjinf->syn_lefthand, relid);

I've not looked, but I assumed the revert must have removed some
common code that was added and reverted with SJE.

David



On Tue, 7 May 2024 at 17:11, David Rowley <dgrowleyml@gmail.com> wrote:
> sjinf->syn_lefthand = bms_del_member(sjinf->syn_lefthand, relid);
>
> I've not looked, but I assumed the revert must have removed some
> common code that was added and reverted with SJE.

Yeah, before the revert, that did:

-       sjinf->syn_lefthand = replace_relid(sjinf->syn_lefthand, relid, subst);

That replace code seems to have always done a bms_copy()

-static Bitmapset *
-replace_relid(Relids relids, int oldId, int newId)
-{
-   if (oldId < 0)
-       return relids;
-
-   /* Delete relid without substitution. */
-   if (newId < 0)
-       return bms_del_member(bms_copy(relids), oldId);
-
-   /* Substitute newId for oldId. */
-   if (bms_is_member(oldId, relids))
-       return bms_add_member(bms_del_member(bms_copy(relids), oldId), newId);
-
-   return relids;
-}


David



On Tue, 7 May 2024 at 15:35, jian he <jian.universality@gmail.com> wrote:
> i also found that not specifying c_args: `-DREALLOCATE_BITMAPSETS`,
> the regress test won't fail.

It would be good to get some build farm coverage of this so we don't
have to rely on manual testing.  I wonder if it's a good idea to just
define REALLOCATE_BITMAPSETS when #ifdef CLOBBER_FREED_MEMORY... or if
we should ask on the buildfarm-members list if someone wouldn't mind
defining it?

David



David Rowley <dgrowleyml@gmail.com> writes:
> Yeah, before the revert, that did:
> -       sjinf->syn_lefthand = replace_relid(sjinf->syn_lefthand, relid, subst);
> That replace code seems to have always done a bms_copy()

Hmm, not always; see e0477837c.

What I'm trying to figure out here is whether we have a live bug
in this area in released branches; and if so, why we've not seen
reports of that.

            regards, tom lane



On Tue, 7 May 2024 at 17:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > Yeah, before the revert, that did:
> > -       sjinf->syn_lefthand = replace_relid(sjinf->syn_lefthand, relid, subst);
> > That replace code seems to have always done a bms_copy()
>
> Hmm, not always; see e0477837c.

It was the discussion on that thread that led to the invention of
REALLOCATE_BITMAPSETS

> What I'm trying to figure out here is whether we have a live bug
> in this area in released branches; and if so, why we've not seen
> reports of that.

We could check what portions of REALLOCATE_BITMAPSETS are
backpatchable. It may not be applicable very far back because of v16's
00b41463c. The bms_del_member() would have left a zero set rather than
doing bms_free() prior to that commit.  There could be a bug in v16.

David




On Tue, May 7, 2024 at 1:22 PM David Rowley <dgrowleyml@gmail.com> wrote:
It would be good to get some build farm coverage of this so we don't
have to rely on manual testing.  I wonder if it's a good idea to just
define REALLOCATE_BITMAPSETS when #ifdef CLOBBER_FREED_MEMORY... or if
we should ask on the buildfarm-members list if someone wouldn't mind
defining it?

+1 to have build farm coverage of REALLOCATE_BITMAPSETS.  This flag
seems quite useful.

Thanks
Richard

On Tue, May 7, 2024 at 1:46 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 7 May 2024 at 17:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What I'm trying to figure out here is whether we have a live bug
> in this area in released branches; and if so, why we've not seen
> reports of that.

We could check what portions of REALLOCATE_BITMAPSETS are
backpatchable. It may not be applicable very far back because of v16's
00b41463c. The bms_del_member() would have left a zero set rather than
doing bms_free() prior to that commit.  There could be a bug in v16.

I also think there might be a bug in v16, as long as
'sjinfo->syn_lefthand' and 'rinfo->outer_relids' are referencing the
same bitmapset and the content of this bitmapset is altered through
'sjinfo->syn_lefthand' without 'rinfo->outer_relids' being aware of
these changes.  I tried to compose a query that can trigger this bug but
failed though.

Another thing that comes to my mind is that this issue shows that
RestrictInfo.outer_relids could contain references to removed rels and
joins, and RestrictInfo.outer_relids could be referenced after the
removal of useless left joins.  Currently we do not have a mechanism to
clean out the bits in outer_relids during outer join removal.  That is
to say, RestrictInfo.outer_relids might be referenced while including
rels that should have been removed.  I'm not sure if this is a problem.

Thanks
Richard
On Tue, May 7, 2024 at 1:19 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Tue, May 7, 2024 at 1:46 PM David Rowley <dgrowleyml@gmail.com> wrote:
>>
>> On Tue, 7 May 2024 at 17:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > What I'm trying to figure out here is whether we have a live bug
>> > in this area in released branches; and if so, why we've not seen
>> > reports of that.
>>
>> We could check what portions of REALLOCATE_BITMAPSETS are
>> backpatchable. It may not be applicable very far back because of v16's
>> 00b41463c. The bms_del_member() would have left a zero set rather than
>> doing bms_free() prior to that commit.  There could be a bug in v16.
>
>
> I also think there might be a bug in v16, as long as
> 'sjinfo->syn_lefthand' and 'rinfo->outer_relids' are referencing the
> same bitmapset and the content of this bitmapset is altered through
> 'sjinfo->syn_lefthand' without 'rinfo->outer_relids' being aware of
> these changes.  I tried to compose a query that can trigger this bug but
> failed though.

Can sjinfo->syn_lefthand became empty set after bms_del_member()?  If
so, rinfo->outer_relids will become an invalid pointer.  If so, it's
obviously a bug, while it still might be very hard to make this
trigger a segfault.

------
Regards,
Alexander Korotkov
Supabase



On Tue, May 7, 2024 at 8:29 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <dgrowleyml@gmail.com> writes:
> > Yeah, before the revert, that did:
> > -       sjinf->syn_lefthand = replace_relid(sjinf->syn_lefthand, relid, subst);
> > That replace code seems to have always done a bms_copy()
>
> Hmm, not always; see e0477837c.
>
> What I'm trying to figure out here is whether we have a live bug
> in this area in released branches; and if so, why we've not seen
> reports of that.

I didn't yet spot a particular bug.  But this place looks dangerous,
and it's very hard to prove there is no bug.  Even if there is no bug,
it seems very easy to unintentionally add a bug here.  Should we just
accept to always do bms_copy()?

------
Regards,
Alexander Korotkov
Supabase




On 2024-05-07 Tu 06:05, Richard Guo wrote:

On Tue, May 7, 2024 at 1:22 PM David Rowley <dgrowleyml@gmail.com> wrote:
It would be good to get some build farm coverage of this so we don't
have to rely on manual testing.  I wonder if it's a good idea to just
define REALLOCATE_BITMAPSETS when #ifdef CLOBBER_FREED_MEMORY... or if
we should ask on the buildfarm-members list if someone wouldn't mind
defining it?

+1 to have build farm coverage of REALLOCATE_BITMAPSETS.  This flag
seems quite useful.



I have added it to the CPPFLAGS on prion.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2024-05-07 Tu 06:05, Richard Guo wrote:
>> +1 to have build farm coverage of REALLOCATE_BITMAPSETS. This flag
>> seems quite useful.

> I have added it to the CPPFLAGS on prion.

... and as expected, prion fell over.

I find that Richard's proposed fix makes the core regression tests
pass, but we still fail check-world.  So I'm afraid we need something
more aggressive, like the attached which makes make_restrictinfo
copy all its input bitmapsets.  Without that, we still have sharing
of bitmapsets across different RestrictInfos, which seems pretty
scary given what we now see about the effects of 00b41463c.  This
seems annoyingly expensive, but maybe there's little choice?

Given this, we could remove ad-hoc bms_copy calls from the callers
of make_restrictinfo, distribute_quals_to_rels, etc.  I didn't go
looking for possible wins of that sort; there's unlikely to be a
lot of them.

            regards, tom lane

diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c
index 0b406e9334..e81861bc8b 100644
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -132,8 +132,8 @@ make_restrictinfo_internal(PlannerInfo *root,
     restrictinfo->is_clone = is_clone;
     restrictinfo->can_join = false; /* may get set below */
     restrictinfo->security_level = security_level;
-    restrictinfo->incompatible_relids = incompatible_relids;
-    restrictinfo->outer_relids = outer_relids;
+    restrictinfo->incompatible_relids = bms_copy(incompatible_relids);
+    restrictinfo->outer_relids = bms_copy(outer_relids);

     /*
      * If it's potentially delayable by lower-level security quals, figure out
@@ -191,7 +191,7 @@ make_restrictinfo_internal(PlannerInfo *root,

     /* required_relids defaults to clause_relids */
     if (required_relids != NULL)
-        restrictinfo->required_relids = required_relids;
+        restrictinfo->required_relids = bms_copy(required_relids);
     else
         restrictinfo->required_relids = restrictinfo->clause_relids;


On Wed, 8 May 2024 at 06:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I find that Richard's proposed fix makes the core regression tests
> pass, but we still fail check-world.  So I'm afraid we need something
> more aggressive, like the attached which makes make_restrictinfo
> copy all its input bitmapsets.  Without that, we still have sharing
> of bitmapsets across different RestrictInfos, which seems pretty
> scary given what we now see about the effects of 00b41463c.  This
> seems annoyingly expensive, but maybe there's little choice?

We could make the policy copy-on-modify.  If you put bms_copy around
the bms_del_member() calls in remove_rel_from_query(), does it pass
then?

David



On Wed, 8 May 2024 at 10:35, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Wed, 8 May 2024 at 06:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I find that Richard's proposed fix makes the core regression tests
> > pass, but we still fail check-world.  So I'm afraid we need something
> > more aggressive, like the attached which makes make_restrictinfo
> > copy all its input bitmapsets.  Without that, we still have sharing
> > of bitmapsets across different RestrictInfos, which seems pretty
> > scary given what we now see about the effects of 00b41463c.  This
> > seems annoyingly expensive, but maybe there's little choice?
>
> We could make the policy copy-on-modify.  If you put bms_copy around
> the bms_del_member() calls in remove_rel_from_query(), does it pass
> then?

err, I mean bms_copy() the set before passing to bms_del_member().

David



David Rowley <dgrowleyml@gmail.com> writes:
> On Wed, 8 May 2024 at 06:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I find that Richard's proposed fix makes the core regression tests
>> pass, but we still fail check-world.  So I'm afraid we need something
>> more aggressive, like the attached which makes make_restrictinfo
>> copy all its input bitmapsets.  Without that, we still have sharing
>> of bitmapsets across different RestrictInfos, which seems pretty
>> scary given what we now see about the effects of 00b41463c.  This
>> seems annoyingly expensive, but maybe there's little choice?

> We could make the policy copy-on-modify.  If you put bms_copy around
> the bms_del_member() calls in remove_rel_from_query(), does it pass
> then?

Didn't test, but that route seems awfully invasive and fragile: how
will we find all the places to modify, or ensure that the policy
is followed by future patches?

            regards, tom lane



On Wed, 8 May 2024 at 10:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > We could make the policy copy-on-modify.  If you put bms_copy around
> > the bms_del_member() calls in remove_rel_from_query(), does it pass
> > then?
>
> Didn't test, but that route seems awfully invasive and fragile: how
> will we find all the places to modify, or ensure that the policy
> is followed by future patches?

REALLOCATE_BITMAPSETS was invented for this and IMO, it found exactly
the problem it was invented to find.

Copy-on-modify is our policy for node mutation. Why is it ok there but
awfully fragile here?

David



David Rowley <dgrowleyml@gmail.com> writes:
> On Wed, 8 May 2024 at 10:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Didn't test, but that route seems awfully invasive and fragile: how
>> will we find all the places to modify, or ensure that the policy
>> is followed by future patches?

> REALLOCATE_BITMAPSETS was invented for this and IMO, it found exactly
> the problem it was invented to find.

Not in a way that gives me any confidence that we found *all* the
problems.  If check-world finds a problem that the core tests did not,
then there's no reason to think there aren't still more issues that
check-world happened not to trip over either.

> Copy-on-modify is our policy for node mutation. Why is it ok there but
> awfully fragile here?

It's only partly our policy: there are all those places where we don't
do it that way.  The main problem that I see for trying to be 100%
consistent in that way is that once you modify a sub-node, full
copy-on-modify dictates replacing every ancestor node all the way to
the top of the tree.  That's clearly impractical in the planner data
structures.  So where are we going to stop exactly?

            regards, tom lane



On Wed, 8 May 2024 at 10:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > REALLOCATE_BITMAPSETS was invented for this and IMO, it found exactly
> > the problem it was invented to find.
>
> Not in a way that gives me any confidence that we found *all* the
> problems.

Here are some statements I believe to be true:
1. If REALLOCATE_BITMAPSETS is defined then modifications to a
Bitmapset will make a copy and free the original.
2. If a query runs successfully without REALLOCATE_BITMAPSETS and
Assert fails due to an invalid Bitmapset when REALLOCATE_BITMAPSETS is
defined, then we have > 1 pointer pointing to the same set and not all
of them are being updated when the members are added/removed.

Given the above, I can't see what Bitmapset sharing problems we won't
find with REALLOCATE_BITMAPSETS.

Can you share the exact scenario you're worried that we won't find so
I can understand your concern?

David



David Rowley <dgrowleyml@gmail.com> writes:
> On Wed, 8 May 2024 at 10:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Not in a way that gives me any confidence that we found *all* the
>> problems.

> Here are some statements I believe to be true:
> 1. If REALLOCATE_BITMAPSETS is defined then modifications to a
> Bitmapset will make a copy and free the original.
> 2. If a query runs successfully without REALLOCATE_BITMAPSETS and
> Assert fails due to an invalid Bitmapset when REALLOCATE_BITMAPSETS is
> defined, then we have > 1 pointer pointing to the same set and not all
> of them are being updated when the members are added/removed.

> Given the above, I can't see what Bitmapset sharing problems we won't
> find with REALLOCATE_BITMAPSETS.

Anything where the trouble spots are in a code path we fail to
exercise with our available test suites.  If you think there are
no such code paths, I'm sorry to disillusion you.

I spent a little bit of time wondering if we could find problems in a
more static way by marking bitmapset fields as "const", but I fear
that would create a huge number of false positives.

            regards, tom lane



I traced down the other failure I was seeing in check-world, and
found that it came from deconstruct_distribute doing this:

        distribute_quals_to_rels(root, my_quals,
                                 jtitem,
                                 sjinfo,
                                 root->qual_security_level,
                                 jtitem->qualscope,
                                 ojscope, jtitem->nonnullable_rels,
                                 NULL,    /* incompatible_relids */
                                 true,    /* allow_equivalence */
                                 false, false,    /* not clones */
                                 postponed_oj_qual_list);

where jtitem->nonnullable_rels is the same as the jtitem's left_rels,
which ends up as the syn_lefthand of the join's SpecialJoinInfo, and
then when remove_rel_from_query tries to adjust the syn_lefthand it
breaks the outer_relids of whatever RestrictInfos got made here.

I was able to fix that by not letting jtitem->nonnullable_rels be
the same as left_rels.  The attached alternative_1.patch does pass
check-world.  But I find it mighty unprincipled: the JoinTreeItem
data structures are full of shared relid sets, so why is this
particular sharing not OK?  I still don't have any confidence that
there aren't more problems.

Along about here I started to wonder how come we are only seeing
SpecialJoinInfo-vs-RestrictInfo sharing as a problem, when surely
there is plenty of cross-RestrictInfo sharing going on as well.
(The above call is perfectly capable of making a lot of RestrictInfos,
all with the same outer_relids.)  That thought led me to look at
remove_rel_from_restrictinfo, and darn if I didn't find this:

    /*
     * The clause_relids probably aren't shared with anything else, but let's
     * copy them just to be sure.
     */
    rinfo->clause_relids = bms_copy(rinfo->clause_relids);
    ...
    /* Likewise for required_relids */
    rinfo->required_relids = bms_copy(rinfo->required_relids);

So the reason we don't see cross-RestrictInfo breakage is that
analyzejoins.c is careful not to modify the original relid sets
when modifying a RestrictInfo.  (This comment is clearly wrong.)

And that leads to the thought that we can fix our current sharing
problems by similarly avoiding overwriting the original sets
in remove_rel_from_query.  The attached alternative-2.patch
attacks it that way, and also passes check-world.

I like alternative-2.patch a lot better, not least because it
only adds cycles when join removal actually fires.  Basically
this is putting the onus on the data structure modifier to
cope with shared bitmapsets, rather than trying to say that
sharing is disallowed.

Thoughts?

            regards, tom lane

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index e2c68fe6f9..4198e9c83a 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -979,7 +979,7 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode,
                                                     right_item->inner_join_rels);
                 jtitem->left_rels = left_item->qualscope;
                 jtitem->right_rels = right_item->qualscope;
-                jtitem->nonnullable_rels = left_item->qualscope;
+                jtitem->nonnullable_rels = bms_copy(left_item->qualscope);
                 break;
             case JOIN_SEMI:
                 /* This node belongs to parent_domain, as do its children */
@@ -1053,7 +1053,7 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode,
                 jtitem->left_rels = left_item->qualscope;
                 jtitem->right_rels = right_item->qualscope;
                 /* each side is both outer and inner */
-                jtitem->nonnullable_rels = jtitem->qualscope;
+                jtitem->nonnullable_rels = bms_copy(jtitem->qualscope);
                 break;
             default:
                 /* JOIN_RIGHT was eliminated during reduce_outer_joins() */
@@ -1888,7 +1888,7 @@ deconstruct_distribute_oj_quals(PlannerInfo *root,
     qualscope = bms_union(sjinfo->syn_lefthand, sjinfo->syn_righthand);
     qualscope = bms_add_member(qualscope, sjinfo->ojrelid);
     ojscope = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
-    nonnullable_rels = sjinfo->syn_lefthand;
+    nonnullable_rels = bms_copy(sjinfo->syn_lefthand);

     /*
      * If this join can commute with any other ones per outer-join identity 3,
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index aa72592567..1c9acf864c 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -390,6 +390,17 @@ remove_rel_from_query(PlannerInfo *root, int relid, SpecialJoinInfo *sjinfo)
     {
         SpecialJoinInfo *sjinf = (SpecialJoinInfo *) lfirst(l);

+        /*
+         * initsplan.c is fairly cavalier about allowing SpecialJoinInfos'
+         * lefthand/righthand relid sets to be shared with other data
+         * structures.  Ensure that we don't modify the original relid sets.
+         * (The commute_xxx sets are always per-SpecialJoinInfo though.)
+         */
+        sjinf->min_lefthand = bms_copy(sjinf->min_lefthand);
+        sjinf->min_righthand = bms_copy(sjinf->min_righthand);
+        sjinf->syn_lefthand = bms_copy(sjinf->syn_lefthand);
+        sjinf->syn_righthand = bms_copy(sjinf->syn_righthand);
+        /* Now remove relid and ojrelid bits from the sets: */
         sjinf->min_lefthand = bms_del_member(sjinf->min_lefthand, relid);
         sjinf->min_righthand = bms_del_member(sjinf->min_righthand, relid);
         sjinf->syn_lefthand = bms_del_member(sjinf->syn_lefthand, relid);
@@ -551,8 +562,10 @@ static void
 remove_rel_from_restrictinfo(RestrictInfo *rinfo, int relid, int ojrelid)
 {
     /*
-     * The clause_relids probably aren't shared with anything else, but let's
-     * copy them just to be sure.
+     * initsplan.c is fairly cavalier about allowing RestrictInfos to share
+     * relid sets with other RestrictInfos (and SpecialJoinInfos for that
+     * matter).  To avoid breaking things, make sure this RestrictInfo has its
+     * own clause_relids set before we modify it.
      */
     rinfo->clause_relids = bms_copy(rinfo->clause_relids);
     rinfo->clause_relids = bms_del_member(rinfo->clause_relids, relid);

BTW, now that I've wrapped my head around what's happening here,
I believe that -DREALLOCATE_BITMAPSETS is introducing a bug where
there was none before.  The changes that left-join removal makes
won't cause any of these sets to go to empty, so the bms_del_member
calls won't free the sets but just modify them in-place.  And the
same change will/should be made in every relevant relid set, so
the fact that the sets may be shared isn't hurting anything.

This is, of course, pretty fragile and I'm totally on board with
making it safer.  But there's no live bug in released branches,
and not in HEAD either unless you add -DREALLOCATE_BITMAPSETS.
That accounts for the lack of related field reports, and it means
we don't really need to back-patch anything.

This conclusion also reinforces my previously-vague feeling that
we should not consider making -DREALLOCATE_BITMAPSETS the default in
debug builds, as was proposed upthread.  It's really a fundamentally
different behavior, and I strongly suspect that it can mask bugs as
well as introduce them (by hiding sharing in cases that'd be less
benign than this turns out to be).  I'd rather not do development on
top of bitmapset infrastructure that acts entirely different from
production bitmapset infrastructure.

            regards, tom lane



On Thu, 9 May 2024 at 06:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> BTW, now that I've wrapped my head around what's happening here,
> I believe that -DREALLOCATE_BITMAPSETS is introducing a bug where
> there was none before.  The changes that left-join removal makes
> won't cause any of these sets to go to empty, so the bms_del_member
> calls won't free the sets but just modify them in-place.  And the
> same change will/should be made in every relevant relid set, so
> the fact that the sets may be shared isn't hurting anything.

FWIW, it just feels like we're willing to accept that the
bms_del_member() is not updating all copies of the set in this case as
that particular behaviour is ok for this particular case. I know
you're not proposing this, but I don't think that would warrant
relaxing REALLOCATE_BITMAPSETS to not reallocate Bitmapsets on
bms_del_member() and bms_del_members().

If all we have to do to make -DREALLOCATE_BITMAPSETS builds happy in
make check-world is to add a bms_copy inside the bms_del_member()
calls in remove_rel_from_query(), then I think it's a small price to
pay to allow us to maintain the additional coverage that
REALLOCATE_BITMAPSETS provides. That seems like a small price to pay
when the gains are removing an entire join.

> This conclusion also reinforces my previously-vague feeling that
> we should not consider making -DREALLOCATE_BITMAPSETS the default in
> debug builds, as was proposed upthread.  It's really a fundamentally
> different behavior, and I strongly suspect that it can mask bugs as
> well as introduce them (by hiding sharing in cases that'd be less
> benign than this turns out to be).  I'd rather not do development on
> top of bitmapset infrastructure that acts entirely different from
> production bitmapset infrastructure.

My primary interest in this feature is using it to catch bugs that
we're unlikely to ever hit in the regression tests.  For example, the
planner works when there are <= 63 RTEs but falls over when there are
64 because some bms_add_member() must reallocate more memory to store
the 64th RTI in a Bitmapset. I'd like to have something to make it
more likely we'll find bugs like this before the release instead of
someone having a crash when they run some obscure query shape
containing > 63 RTEs 2 or 4 years after the release.

I'm happy Andrew added this to prion. Thanks for doing that.

David



On Thu, 9 May 2024 at 06:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I like alternative-2.patch a lot better, not least because it
> only adds cycles when join removal actually fires.  Basically
> this is putting the onus on the data structure modifier to
> cope with shared bitmapsets, rather than trying to say that
> sharing is disallowed.
>
> Thoughts?

I'm fine with this one as it's the same as what I already mentioned
earlier.  I had imagined doing bms_del_member(bms_copy ... but maybe
the compiler is able to optimise away the additional store. Likely, it
does not matter much as pallocing memory likely adds far more overhead
anyway.

David



David Rowley <dgrowleyml@gmail.com> writes:
> On Thu, 9 May 2024 at 06:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, now that I've wrapped my head around what's happening here,
>> I believe that -DREALLOCATE_BITMAPSETS is introducing a bug where
>> there was none before.  The changes that left-join removal makes
>> won't cause any of these sets to go to empty, so the bms_del_member
>> calls won't free the sets but just modify them in-place.  And the
>> same change will/should be made in every relevant relid set, so
>> the fact that the sets may be shared isn't hurting anything.

> FWIW, it just feels like we're willing to accept that the
> bms_del_member() is not updating all copies of the set in this case as
> that particular behaviour is ok for this particular case. I know
> you're not proposing this,

No, I'm not.  I was just trying to explain how come there's not
a visible bug.  I quite agree that this is too fragile to leave
as-is going forward.  (One thing I'm wondering about is whether
we should back-patch despite the lack of visible bug, just in
case some future back-patch relies on the safer behavior.)

>> This conclusion also reinforces my previously-vague feeling that
>> we should not consider making -DREALLOCATE_BITMAPSETS the default in
>> debug builds, as was proposed upthread.

> My primary interest in this feature is using it to catch bugs that
> we're unlikely to ever hit in the regression tests.  For example, the
> planner works when there are <= 63 RTEs but falls over when there are
> 64 because some bms_add_member() must reallocate more memory to store
> the 64th RTI in a Bitmapset. I'd like to have something to make it
> more likely we'll find bugs like this before the release instead of
> someone having a crash when they run some obscure query shape
> containing > 63 RTEs 2 or 4 years after the release.

Again, I think -DREALLOCATE_BITMAPSETS adds a valuable testing weapon.
But if we make that the default in debug builds, then we'll get next
door to zero testing of the behavior without it, and that seems like
a really bad idea given how different the behavior is.

(Speaking of which, I wonder how many buildfarm members build without
--enable-cassert.  The answer could well be "zero", and that's likely
not good.)

> I'm happy Andrew added this to prion. Thanks for doing that.

+1

            regards, tom lane



David Rowley <dgrowleyml@gmail.com> writes:
> I'm fine with this one as it's the same as what I already mentioned
> earlier.  I had imagined doing bms_del_member(bms_copy ... but maybe
> the compiler is able to optimise away the additional store. Likely, it
> does not matter much as pallocing memory likely adds far more overhead
> anyway.

I actually wrote it that way to start with, but undid it after
noticing that the existing code in remove_rel_from_restrictinfo
does it in separate steps, and thinking that that was good for
both separation of concerns and a cleaner git history.  I too
can't believe that an extra fetch will be noticeable compared
to the cost of the adjacent bms_xxx operations.

            regards, tom lane




On Thu, May 9, 2024 at 6:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
> I'm fine with this one as it's the same as what I already mentioned
> earlier.  I had imagined doing bms_del_member(bms_copy ... but maybe
> the compiler is able to optimise away the additional store. Likely, it
> does not matter much as pallocing memory likely adds far more overhead
> anyway.

I actually wrote it that way to start with, but undid it after
noticing that the existing code in remove_rel_from_restrictinfo
does it in separate steps, and thinking that that was good for
both separation of concerns and a cleaner git history.  I too
can't believe that an extra fetch will be noticeable compared
to the cost of the adjacent bms_xxx operations.

I also think it seems better to do bms_copy in separate steps, not only
because this keeps consistent with the existing code in
remove_rel_from_restrictinfo, but also because we need to do
bms_del_member twice for each lefthand/righthand relid set.

Speaking of consistency, do you think it would improve the code's
readability if we rearrange the code in remove_rel_from_query so that
the modifications of the same relid set are grouped together, just like
what we do in remove_rel_from_restrictinfo?  I mean something like:

   sjinf->min_lefthand = bms_copy(sjinf->min_lefthand);
   sjinf->min_lefthand = bms_del_member(sjinf->min_lefthand, relid);
   sjinf->min_lefthand = bms_del_member(sjinf->min_lefthand, ojrelid);

   ...

Thanks
Richard
Richard Guo <guofenglinux@gmail.com> writes:
> I also think it seems better to do bms_copy in separate steps, not only
> because this keeps consistent with the existing code in
> remove_rel_from_restrictinfo, but also because we need to do
> bms_del_member twice for each lefthand/righthand relid set.

Yeah.  Of course, we don't need a bms_copy() in the second one,
but that'd just add even more asymmetry and chance for confusion.

> Speaking of consistency, do you think it would improve the code's
> readability if we rearrange the code in remove_rel_from_query so that
> the modifications of the same relid set are grouped together, just like
> what we do in remove_rel_from_restrictinfo?

I left it alone, just because it didn't seem worth cluttering "git
blame" here.

            regards, tom lane