Обсуждение: ExecTypeSetColNames is fundamentally broken

Поиск

Список

Период

Сортировка

ExecTypeSetColNames is fundamentally broken

От

Tom Lane

Дата:

05 декабря 2021 г., 21:45:47

Changing
Fcc: +inbox
--------
I looked into the failure reported at [1].  Basically what's happening
there is that we're allowing a composite datum of type RECORD to get
stored into a table, whereupon other backends can't make sense of it
since they lack the appropriate typcache entry.  The reason the datum
is marked as type RECORD is that ExecTypeSetColNames set things up
that way, after observing that the tuple descriptor obtained from
the current table definition didn't have the column names it thought
it should have.

Now, in the test case as given,    ExecTypeSetColNames is in error to
think that, because it fails to account for the possibility that the
tupdesc contains dropped columns that weren't dropped when the relevant
RTE was made.  However, if the test case is modified so that we just
rename rather than drop some pre-existing column, then even with a
fix for that problem ExecTypeSetColNames would do the wrong thing.

I made several attempts to work around this problem, but I've now
concluded that changing the type OID in ExecTypeSetColNames is just
fundamentally broken.  It can't be okay to decide that a Var that
claims to emit type A actually emits type B, and especially not to
do so as late as executor startup.  I speculated in the other thread
that we could do so during planning instead, but that turns out to
just move the problems around.  I think this must be so, because the
whole idea is bogus.  For example, if we have a function that is
declared to take type "ct", it can't be okay in general to pass it
type "record" instead.  We've mistakenly thought that we could fuzz
this as long as the two types are physically compatible --- but how
can we think that the column names of a composite type aren't a
basic part of its definition?

So 0001 attached fixes this by revoking the decision to apply
ExecTypeSetColNames in cases where a Var or RowExpr is declared
to return a named composite type.  This causes a couple of regression
test results to change, but they are ones that were specifically
added to exercise this behavior that we now see to be invalid.
(In passing, this also adjusts ExecEvalWholeRowVar to fail if the
Var claims to be of a domain-over-composite.  I'm not sure what
I was thinking when I changed it to allow that; the case should
not arise, and if it did, it'd be questionable because we're not
checking domain constraints here.)

0002 is some inessential mop-up that avoids storing useless column
name lists in RowExprs where they won't be used.

I'm not really thrilled about back-patching a behavioral change
of this sort, but I don't see any good alternative.  Corrupting
people's tables is not okay.

Thoughts?

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAOFAq3BeawPiw9pc3bVGZ%3DRint2txWEBCeDC2wNAhtCZoo2ZqA%40mail.gmail.com

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 892b4e17e0..f0f3b4ffb0 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -1913,16 +1913,16 @@ ExecInitExprRec(Expr *node, ExprState *state,
                 {
                     /* generic record, use types of given expressions */
                     tupdesc = ExecTypeFromExprList(rowexpr->args);
+                    /* ... but adopt RowExpr's column aliases */
+                    ExecTypeSetColNames(tupdesc, rowexpr->colnames);
+                    /* Bless the tupdesc so it can be looked up later */
+                    BlessTupleDesc(tupdesc);
                 }
                 else
                 {
                     /* it's been cast to a named type, use that */
                     tupdesc = lookup_rowtype_tupdesc_copy(rowexpr->row_typeid, -1);
                 }
-                /* In either case, adopt RowExpr's column aliases */
-                ExecTypeSetColNames(tupdesc, rowexpr->colnames);
-                /* Bless the tupdesc in case it's now of type RECORD */
-                BlessTupleDesc(tupdesc);

                 /*
                  * In the named-type case, the tupdesc could have more columns
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index eb49817cee..b7b79e0b75 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -4021,12 +4021,8 @@ ExecEvalWholeRowVar(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
              * generates an INT4 NULL regardless of the dropped column type).
              * If we find a dropped column and cannot verify that case (1)
              * holds, we have to use the slow path to check (2) for each row.
-             *
-             * If vartype is a domain over composite, just look through that
-             * to the base composite type.
              */
-            var_tupdesc = lookup_rowtype_tupdesc_domain(variable->vartype,
-                                                        -1, false);
+            var_tupdesc = lookup_rowtype_tupdesc(variable->vartype, -1);

             slot_tupdesc = slot->tts_tupleDescriptor;

@@ -4063,9 +4059,8 @@ ExecEvalWholeRowVar(ExprState *state, ExprEvalStep *op, ExprContext *econtext)

             /*
              * Use the variable's declared rowtype as the descriptor for the
-             * output values, modulo possibly assigning new column names
-             * below. In particular, we *must* absorb any attisdropped
-             * markings.
+             * output values.  In particular, we *must* absorb any
+             * attisdropped markings.
              */
             oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
             output_tupdesc = CreateTupleDescCopy(var_tupdesc);
@@ -4083,39 +4078,38 @@ ExecEvalWholeRowVar(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
             oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
             output_tupdesc = CreateTupleDescCopy(slot->tts_tupleDescriptor);
             MemoryContextSwitchTo(oldcontext);
-        }

-        /*
-         * Construct a tuple descriptor for the composite values we'll
-         * produce, and make sure its record type is "blessed".  The main
-         * reason to do this is to be sure that operations such as
-         * row_to_json() will see the desired column names when they look up
-         * the descriptor from the type information embedded in the composite
-         * values.
-         *
-         * We already got the correct physical datatype info above, but now we
-         * should try to find the source RTE and adopt its column aliases, in
-         * case they are different from the original rowtype's names.  For
-         * example, in "SELECT foo(t) FROM tab t(x,y)", the first two columns
-         * in the composite output should be named "x" and "y" regardless of
-         * tab's column names.
-         *
-         * If we can't locate the RTE, assume the column names we've got are
-         * OK.  (As of this writing, the only cases where we can't locate the
-         * RTE are in execution of trigger WHEN clauses, and then the Var will
-         * have the trigger's relation's rowtype, so its names are fine.)
-         * Also, if the creator of the RTE didn't bother to fill in an eref
-         * field, assume our column names are OK.  (This happens in COPY, and
-         * perhaps other places.)
-         */
-        if (econtext->ecxt_estate &&
-            variable->varno <= econtext->ecxt_estate->es_range_table_size)
-        {
-            RangeTblEntry *rte = exec_rt_fetch(variable->varno,
-                                               econtext->ecxt_estate);
+            /*
+             * It's possible that the input slot is a relation scan slot and
+             * so is marked with that relation's rowtype.  But we're supposed
+             * to be returning RECORD, so reset to that.
+             */
+            output_tupdesc->tdtypeid = RECORDOID;
+            output_tupdesc->tdtypmod = -1;

-            if (rte->eref)
-                ExecTypeSetColNames(output_tupdesc, rte->eref->colnames);
+            /*
+             * We already got the correct physical datatype info above, but
+             * now we should try to find the source RTE and adopt its column
+             * aliases, since it's unlikely that the input slot has the
+             * desired names.
+             *
+             * If we can't locate the RTE, assume the column names we've got
+             * are OK.  (As of this writing, the only cases where we can't
+             * locate the RTE are in execution of trigger WHEN clauses, and
+             * then the Var will have the trigger's relation's rowtype, so its
+             * names are fine.)  Also, if the creator of the RTE didn't bother
+             * to fill in an eref field, assume our column names are OK. (This
+             * happens in COPY, and perhaps other places.)
+             */
+            if (econtext->ecxt_estate &&
+                variable->varno <= econtext->ecxt_estate->es_range_table_size)
+            {
+                RangeTblEntry *rte = exec_rt_fetch(variable->varno,
+                                                   econtext->ecxt_estate);
+
+                if (rte->eref)
+                    ExecTypeSetColNames(output_tupdesc, rte->eref->colnames);
+            }
         }

         /* Bless the tupdesc if needed, and save it in the execution state */
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index f447802843..5004b3b165 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2022,51 +2022,40 @@ ExecTypeFromExprList(List *exprList)
 }

 /*
- * ExecTypeSetColNames - set column names in a TupleDesc
+ * ExecTypeSetColNames - set column names in a RECORD TupleDesc
  *
  * Column names must be provided as an alias list (list of String nodes).
- *
- * For some callers, the supplied tupdesc has a named rowtype (not RECORD)
- * and it is moderately likely that the alias list matches the column names
- * already present in the tupdesc.  If we do change any column names then
- * we must reset the tupdesc's type to anonymous RECORD; but we avoid doing
- * so if no names change.
  */
 void
 ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
 {
-    bool        modified = false;
     int            colno = 0;
     ListCell   *lc;

+    /* It's only OK to change col names in a not-yet-blessed RECORD type */
+    Assert(typeInfo->tdtypeid == RECORDOID);
+    Assert(typeInfo->tdtypmod < 0);
+
     foreach(lc, namesList)
     {
         char       *cname = strVal(lfirst(lc));
         Form_pg_attribute attr;

-        /* Guard against too-long names list */
+        /* Guard against too-long names list (probably can't happen) */
         if (colno >= typeInfo->natts)
             break;
         attr = TupleDescAttr(typeInfo, colno);
         colno++;

-        /* Ignore empty aliases (these must be for dropped columns) */
-        if (cname[0] == '\0')
+        /*
+         * Do nothing for empty aliases or dropped columns (these cases
+         * probably can't arise in RECORD types, either)
+         */
+        if (cname[0] == '\0' || attr->attisdropped)
             continue;

-        /* Change tupdesc only if alias is actually different */
-        if (strcmp(cname, NameStr(attr->attname)) != 0)
-        {
-            namestrcpy(&(attr->attname), cname);
-            modified = true;
-        }
-    }
-
-    /* If we modified the tupdesc, it's now a new record type */
-    if (modified)
-    {
-        typeInfo->tdtypeid = RECORDOID;
-        typeInfo->tdtypmod = -1;
+        /* OK, assign the column name */
+        namestrcpy(&(attr->attname), cname);
     }
 }

diff --git a/src/test/regress/expected/rowtypes.out b/src/test/regress/expected/rowtypes.out
index 0e71baf8fb..3245fb6a35 100644
--- a/src/test/regress/expected/rowtypes.out
+++ b/src/test/regress/expected/rowtypes.out
@@ -1010,14 +1010,15 @@ select row_to_json(i) from int8_tbl i;
  {"q1":4567890123456789,"q2":-4567890123456789}
 (5 rows)

+-- since "i" is of type "int8_tbl", attaching aliases doesn't change anything:
 select row_to_json(i) from int8_tbl i(x,y);
-                 row_to_json
-----------------------------------------------
- {"x":123,"y":456}
- {"x":123,"y":4567890123456789}
- {"x":4567890123456789,"y":123}
- {"x":4567890123456789,"y":4567890123456789}
- {"x":4567890123456789,"y":-4567890123456789}
+                  row_to_json
+------------------------------------------------
+ {"q1":123,"q2":456}
+ {"q1":123,"q2":4567890123456789}
+ {"q1":4567890123456789,"q2":123}
+ {"q1":4567890123456789,"q2":4567890123456789}
+ {"q1":4567890123456789,"q2":-4567890123456789}
 (5 rows)

 create temp view vv1 as select * from int8_tbl;
@@ -1032,15 +1033,16 @@ select row_to_json(i) from vv1 i;
 (5 rows)

 select row_to_json(i) from vv1 i(x,y);
-                 row_to_json
-----------------------------------------------
- {"x":123,"y":456}
- {"x":123,"y":4567890123456789}
- {"x":4567890123456789,"y":123}
- {"x":4567890123456789,"y":4567890123456789}
- {"x":4567890123456789,"y":-4567890123456789}
+                  row_to_json
+------------------------------------------------
+ {"q1":123,"q2":456}
+ {"q1":123,"q2":4567890123456789}
+ {"q1":4567890123456789,"q2":123}
+ {"q1":4567890123456789,"q2":4567890123456789}
+ {"q1":4567890123456789,"q2":-4567890123456789}
 (5 rows)

+-- in these examples, we'll report the exposed column names of the subselect:
 select row_to_json(ss) from
   (select q1, q2 from int8_tbl) as ss;
                   row_to_json
diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql
index 845e3305f3..b8786603d9 100644
--- a/src/test/regress/sql/rowtypes.sql
+++ b/src/test/regress/sql/rowtypes.sql
@@ -417,12 +417,14 @@ select longname(f) from fullname f;
 --

 select row_to_json(i) from int8_tbl i;
+-- since "i" is of type "int8_tbl", attaching aliases doesn't change anything:
 select row_to_json(i) from int8_tbl i(x,y);

 create temp view vv1 as select * from int8_tbl;
 select row_to_json(i) from vv1 i;
 select row_to_json(i) from vv1 i(x,y);

+-- in these examples, we'll report the exposed column names of the subselect:
 select row_to_json(ss) from
   (select q1, q2 from int8_tbl) as ss;
 select row_to_json(ss) from
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 387a35e112..43674aace4 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -2279,8 +2279,8 @@ pullup_replace_vars_callback(Var *var,
          * If generating an expansion for a var of a named rowtype (ie, this
          * is a plain relation RTE), then we must include dummy items for
          * dropped columns.  If the var is RECORD (ie, this is a JOIN), then
-         * omit dropped columns. Either way, attach column names to the
-         * RowExpr for use of ruleutils.c.
+         * omit dropped columns.  In the latter case, attach column names to
+         * the RowExpr for use of the executor and ruleutils.c.
          *
          * In order to be able to cache the results, we always generate the
          * expansion with varlevelsup = 0, and then adjust if needed.
@@ -2301,7 +2301,7 @@ pullup_replace_vars_callback(Var *var,
         rowexpr->args = fields;
         rowexpr->row_typeid = var->vartype;
         rowexpr->row_format = COERCE_IMPLICIT_CAST;
-        rowexpr->colnames = colnames;
+        rowexpr->colnames = (var->vartype == RECORDOID) ? colnames : NIL;
         rowexpr->location = var->location;
         newnode = (Node *) rowexpr;

diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index e57c3aa124..b30e8c9be6 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -802,6 +802,7 @@ flatten_join_alias_vars_mutator(Node *node,
             rowexpr->args = fields;
             rowexpr->row_typeid = var->vartype;
             rowexpr->row_format = COERCE_IMPLICIT_CAST;
+            /* vartype will always be RECORDOID, so we always need colnames */
             rowexpr->colnames = colnames;
             rowexpr->location = var->location;

diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index d4e0b8b4de..30922a006a 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -1424,8 +1424,8 @@ ReplaceVarsFromTargetList_callback(Var *var,
          * If generating an expansion for a var of a named rowtype (ie, this
          * is a plain relation RTE), then we must include dummy items for
          * dropped columns.  If the var is RECORD (ie, this is a JOIN), then
-         * omit dropped columns.  Either way, attach column names to the
-         * RowExpr for use of ruleutils.c.
+         * omit dropped columns.  In the latter case, attach column names to
+         * the RowExpr for use of the executor and ruleutils.c.
          */
         expandRTE(rcon->target_rte,
                   var->varno, var->varlevelsup, var->location,
@@ -1438,7 +1438,7 @@ ReplaceVarsFromTargetList_callback(Var *var,
         rowexpr->args = fields;
         rowexpr->row_typeid = var->vartype;
         rowexpr->row_format = COERCE_IMPLICIT_CAST;
-        rowexpr->colnames = colnames;
+        rowexpr->colnames = (var->vartype == RECORDOID) ? colnames : NIL;
         rowexpr->location = var->location;

         return (Node *) rowexpr;
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 433437643e..b039ea305a 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -1052,15 +1052,13 @@ typedef struct ArrayExpr
  * than vice versa.)  It is important not to assume that length(args) is
  * the same as the number of columns logically present in the rowtype.
  *
- * colnames provides field names in cases where the names can't easily be
- * obtained otherwise.  Names *must* be provided if row_typeid is RECORDOID.
- * If row_typeid identifies a known composite type, colnames can be NIL to
- * indicate the type's cataloged field names apply.  Note that colnames can
- * be non-NIL even for a composite type, and typically is when the RowExpr
- * was created by expanding a whole-row Var.  This is so that we can retain
- * the column alias names of the RTE that the Var referenced (which would
- * otherwise be very difficult to extract from the parsetree).  Like the
- * args list, colnames is one-for-one with physical fields of the rowtype.
+ * colnames provides field names if the ROW() result is of type RECORD.
+ * Names *must* be provided if row_typeid is RECORDOID; but if it is a
+ * named composite type, colnames will be ignored in favor of using the
+ * type's cataloged field names, so colnames should be NIL.  Like the
+ * args list, colnames is defined to be one-for-one with physical fields
+ * of the rowtype (although dropped columns shouldn't appear in the
+ * RECORD case, so this fine point is currently moot).
  */
 typedef struct RowExpr
 {

Re: ExecTypeSetColNames is fundamentally broken

От

Robert Haas

Дата:

06 декабря 2021 г., 23:30:28

On Sun, Dec 5, 2021 at 1:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So 0001 attached fixes this by revoking the decision to apply
> ExecTypeSetColNames in cases where a Var or RowExpr is declared
> to return a named composite type.  This causes a couple of regression
> test results to change, but they are ones that were specifically
> added to exercise this behavior that we now see to be invalid.

I don't understand the code so I can't comment on the code, but I find
the regression test changes pretty suspect. Attaching any alias list
to the RTE ought to rename the output columns for all purposes, not
just the ones we as implementers find convenient. I understand that we
have to do *something* here and that the present behavior is buggy and
unacceptable ... but I'm not sure I accept that the only possible fix
is the one you propose here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: ExecTypeSetColNames is fundamentally broken

От

Tom Lane

Дата:

07 декабря 2021 г., 00:05:32

Robert Haas <robertmhaas@gmail.com> writes:
> I don't understand the code so I can't comment on the code, but I find
> the regression test changes pretty suspect. Attaching any alias list
> to the RTE ought to rename the output columns for all purposes, not
> just the ones we as implementers find convenient.

Well, that was what I thought when I wrote bf7ca1587, but it leads
to logical contradictions.  Consider

create table t (a int, b int);

create function f(t) returns ... ;

select f(t) from t;

select f(t) from t(x,y);

If we adopt the "rename for all purposes" interpretation, then
the second SELECT must fail, because what f() is being passed is
no longer of type t.  If you ask me, that'll be a bigger problem
for users than the change I'm proposing (quite independently of
how hard it might be to implement).  It certainly will break
a behavior that goes back much further than bf7ca1587.

            regards, tom lane

Re: ExecTypeSetColNames is fundamentally broken

От

Robert Haas

Дата:

07 декабря 2021 г., 19:19:27

On Mon, Dec 6, 2021 at 4:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, that was what I thought when I wrote bf7ca1587, but it leads
> to logical contradictions.  Consider
>
> create table t (a int, b int);
>
> create function f(t) returns ... ;
>
> select f(t) from t;
>
> select f(t) from t(x,y);
>
> If we adopt the "rename for all purposes" interpretation, then
> the second SELECT must fail, because what f() is being passed is
> no longer of type t.  If you ask me, that'll be a bigger problem
> for users than the change I'm proposing (quite independently of
> how hard it might be to implement).  It certainly will break
> a behavior that goes back much further than bf7ca1587.

For me, the second SELECT does fail:

rhaas=# select f(t) from t(x,y);
ERROR:  column "x" does not exist
LINE 1: select f(t) from t(x,y);
                           ^

If it didn't fail, what would the behavior be? I suppose you could
make an argument for trying to match up the columns by position, but
if so this ought to work:

rhaas=# create table u(a int, b int);
CREATE TABLE
rhaas=# select f(u) from u;
ERROR:  function f(u) does not exist
rhaas=# select f(u::t) from u;
ERROR:  cannot cast type u to t

Matching columns by name can't work because the names don't match.
Matching columns by position does not work. So if that example
succeeds, the only real explanation is that we magically know that
we've still got a value of type t despite the user's best attempt to
decree otherwise. I know PostgreSQL sometimes ... does things like
that. I have no idea why anyone would consider it a desirable
behavior, though.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: ExecTypeSetColNames is fundamentally broken

От

Tom Lane

Дата:

07 декабря 2021 г., 20:30:48

Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Dec 6, 2021 at 4:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> select f(t) from t(x,y);
>> 
>> If we adopt the "rename for all purposes" interpretation, then
>> the second SELECT must fail, because what f() is being passed is
>> no longer of type t.

> For me, the second SELECT does fail:

> rhaas=# select f(t) from t(x,y);
> ERROR:  column "x" does not exist

Ah, sorry, I fat-fingered the alias syntax.  Here's a tested example:

regression=# create table t (a int, b int);
CREATE TABLE
regression=# insert into t values(11,12);
INSERT 0 1
regression=# create function f(t) returns int as 'select $1.a' language sql;
CREATE FUNCTION
regression=# select f(t) from t as t(x,y);
 f  
----
 11
(1 row)

If we consider that the alias renames the columns "for all purposes",
how is it okay for f() to select the "a" column?

Another way to phrase the issue is that the column names seen
by f() are currently different from those seen by row_to_json():

regression=# select row_to_json(t) from t as t(x,y);
   row_to_json   
-----------------
 {"x":11,"y":12}
(1 row)

and that seems hard to justify.

            regards, tom lane

Re: ExecTypeSetColNames is fundamentally broken

От

Robert Haas

Дата:

07 декабря 2021 г., 20:58:48

On Tue, Dec 7, 2021 at 12:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If we consider that the alias renames the columns "for all purposes",
> how is it okay for f() to select the "a" column?

I'd say it isn't.

> Another way to phrase the issue is that the column names seen
> by f() are currently different from those seen by row_to_json():
>
> regression=# select row_to_json(t) from t as t(x,y);
>    row_to_json
> -----------------
>  {"x":11,"y":12}
> (1 row)
>
> and that seems hard to justify.

Yeah, I agree. The problem I have here is that, with your proposed
fix, it still won't be very consistent. True, row_to_json() and f()
will both see the unaliased column names ... but a straight select *
from t as t(x,y) will show the aliased names. That's unarguably better
than corrupting your data, but it seems "astonishing" in the POLA
sense.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: ExecTypeSetColNames is fundamentally broken

От

Tom Lane

Дата:

07 декабря 2021 г., 21:19:29

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Dec 7, 2021 at 12:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we consider that the alias renames the columns "for all purposes",
>> how is it okay for f() to select the "a" column?

> I'd say it isn't.

In a green field I'd probably agree with you, but IMO that will
break far too much existing SQL code.

It'd cause problems for us too, not only end-users.  As an example,
ruleutils.c would have to avoid attaching new column aliases to
tables that are referenced as whole-row Vars.  I'm not very sure
that that's even possible without creating insurmountable ambiguity
issues.  There are also fun issues around what happens to a stored
query after a table column rename.  Right now the query acts as
though it uses the old name as a column alias, and that introduces
no semantic problem; but that behavior would no longer be
acceptable.

So the alternatives I see are to revert what bf7ca1587 tried to do
here, or to try to make it work that way across-the-board, which
implies (a) a very much larger amount of work, and (b) breaking
important behaviors that are decades older than that commit.
It's not even entirely clear that we could get to complete
consistency if we went down that path.

            regards, tom lane

Re: ExecTypeSetColNames is fundamentally broken

От

Robert Haas

Дата:

15 марта 2022 г., 23:23:56

On Tue, Dec 7, 2021 at 1:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So the alternatives I see are to revert what bf7ca1587 tried to do
> here, or to try to make it work that way across-the-board, which
> implies (a) a very much larger amount of work, and (b) breaking
> important behaviors that are decades older than that commit.
> It's not even entirely clear that we could get to complete
> consistency if we went down that path.

Continuing my tour through the "bug fixes" section of the CommitFest,
I came upon this thread. Unfortunately there's not that much I can do
to progress it, because I've already expressed all the opinions that I
have on this thread. If we back-patch Tom's originally proposed fix, i
expect we might get a complaint or too, but the current behavior of
being able to create unreadable tables is indisputably poor, and I'm
not in a position to tell Tom that he has to go write a different fix
instead, or even that such a fix is possible. Unless somebody else
wants to comment, which IMHO would be good, I think it's up to Tom to
make a decision here on how he'd like to proceed and then, probably,
just do it.

Anyone else have thoughts?

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: ExecTypeSetColNames is fundamentally broken

От

Aleksander Alekseev

Дата:

16 марта 2022 г., 15:47:15

Hi hackers,

> Anyone else have thoughts?

I came across this thread while looking for the patches that need review.

My understanding of the code is limited, but I can say that I don't
see anything particularly wrong with it. I can also confirm that it
fixes the problem reported by the user while passing the rest of the
tests.

I understand the concern expressed by Robert in respect of backward
compatibility. From the user's perspective, personally I would prefer
a fixed bug over backward compatibility. Especially if we consider the
fact that the current behaviour of cases like `select row_to_json(i)
from int8_tbl i(x,y)` is not necessarily the correct one, at least it
doesn't look right to me.

So unless anyone has strong objections against the proposed fix or can
propose a better patch, I would suggest merging it.

-- 
Best regards,
Aleksander Alekseev

Re: ExecTypeSetColNames is fundamentally broken

От

Tom Lane

Дата:

18 марта 2022 г., 01:28:11

Aleksander Alekseev <aleksander@timescale.com> writes:
> I understand the concern expressed by Robert in respect of backward
> compatibility. From the user's perspective, personally I would prefer
> a fixed bug over backward compatibility. Especially if we consider the
> fact that the current behaviour of cases like `select row_to_json(i)
> from int8_tbl i(x,y)` is not necessarily the correct one, at least it
> doesn't look right to me.

It's debatable anyway.  I'd be less worried about changing this behavior
if we didn't have to back-patch it ... but I see no good alternative.

> So unless anyone has strong objections against the proposed fix or can
> propose a better patch, I would suggest merging it.

Done now.

            regards, tom lane

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: ExecTypeSetColNames is fundamentally broken