Обсуждение: Assigning ROW variable having NULL value to RECORD type variabledoesn't give any structure to the RECORD variable.

Поиск
Список
Период
Сортировка
Hi All,

When a ROW variable having NULL value is assigned to a RECORD
variable, it gives no structure to the RECORD type variable. Let's
consider the following example.

create table t1(a int, b text);

insert into t1 values(1, 'str1');

create or replace function f1() returns void as
$$
declare
   row t1%ROWTYPE;
   rec RECORD;
begin
   row := NULL;
   rec := row;
   raise info 'rec.a = %, rec.b = %', rec.a, rec.b;
end;
$$ language plpgsql;

In above example as 'row' variable is having NULL value, assigning
this to 'rec' didn't give any structure to it although 'row' is having
a predefined structure. Here is the error observed when above function
is executed.

select f1();
ERROR:  record "rec" is not assigned yet

This started happening from the following git commit onwards,

commit 4b93f57999a2ca9b9c9e573ea32ab1aeaa8bf496
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Tue Feb 13 18:52:21 2018 -0500

    Make plpgsql use its DTYPE_REC code paths for composite-type variables.

I know this is expected to happen considering the changes done in
above commit because from this commit onwards, NULL value assigned to
any row variable represents a true NULL composite value before this
commit it used to be a tuple with each column having null value in it.
But, the point is, even if the row variable is having a NULL value it
still has a structure associated with it. Shouldn't that structure be
transferred to RECORD variable when it is assigned with a ROW type
variable ? Can we consider this behaviour change as a side effect of
the improvement done in the RECORD type of variable?

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com





st 1. 1. 2020 v 16:50 odesílatel Ashutosh Sharma <ashu.coek88@gmail.com> napsal:
Hi All,

When a ROW variable having NULL value is assigned to a RECORD
variable, it gives no structure to the RECORD type variable. Let's
consider the following example.

create table t1(a int, b text);

insert into t1 values(1, 'str1');

create or replace function f1() returns void as
$$
declare
   row t1%ROWTYPE;
   rec RECORD;
begin
   row := NULL;
   rec := row;
   raise info 'rec.a = %, rec.b = %', rec.a, rec.b;
end;
$$ language plpgsql;

In above example as 'row' variable is having NULL value, assigning
this to 'rec' didn't give any structure to it although 'row' is having
a predefined structure. Here is the error observed when above function
is executed.

select f1();
ERROR:  record "rec" is not assigned yet

This started happening from the following git commit onwards,

commit 4b93f57999a2ca9b9c9e573ea32ab1aeaa8bf496
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Tue Feb 13 18:52:21 2018 -0500

    Make plpgsql use its DTYPE_REC code paths for composite-type variables.

I know this is expected to happen considering the changes done in
above commit because from this commit onwards, NULL value assigned to
any row variable represents a true NULL composite value before this
commit it used to be a tuple with each column having null value in it.
But, the point is, even if the row variable is having a NULL value it
still has a structure associated with it. Shouldn't that structure be
transferred to RECORD variable when it is assigned with a ROW type
variable ? Can we consider this behaviour change as a side effect of
the improvement done in the RECORD type of variable?

+1

Pavel


--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


Further, if a table type (a.k.a. composite type or row type) having null value or holding no data in it is assigned to a record variable there is no structure provided to the record variable. However when the same table having no data in it is assigned to the record variable, it does provide structure to the record variable. I mean in both the cases we are assigning null value to the record type so it looks a bit weird to see that in one case we end up providing a proper structure to the record variable but not in the other case. Here is an example illustrating this scenario,

create table t1(a int, b text);

do $$
declare
   x t1;
   y record;
begin
   -- y := x;                           -- as mentioned earlier this doesn't provide any structure to variable 'y'.
   --select * into y from t1;   -- this does provide a structure to the variable 'y'.
   raise info 'y.a = %', y.a;    -- this errors out for 1st statement (y := x) but not for the later one (select ... into)
end;
$$ language plpgsql;


Investigating this revealed that in later case i.e. in case of into clause, although there is no tuple returned by the select query still a tuple descriptor is set by the query which provides the structure to the record variable being written because having non-null tuple descriptor allows the creation of an expanded object for the record variable eventually giving it a structure.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


On Wed, Jan 1, 2020 at 9:29 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


st 1. 1. 2020 v 16:50 odesílatel Ashutosh Sharma <ashu.coek88@gmail.com> napsal:
Hi All,

When a ROW variable having NULL value is assigned to a RECORD
variable, it gives no structure to the RECORD type variable. Let's
consider the following example.

create table t1(a int, b text);

insert into t1 values(1, 'str1');

create or replace function f1() returns void as
$$
declare
   row t1%ROWTYPE;
   rec RECORD;
begin
   row := NULL;
   rec := row;
   raise info 'rec.a = %, rec.b = %', rec.a, rec.b;
end;
$$ language plpgsql;

In above example as 'row' variable is having NULL value, assigning
this to 'rec' didn't give any structure to it although 'row' is having
a predefined structure. Here is the error observed when above function
is executed.

select f1();
ERROR:  record "rec" is not assigned yet

This started happening from the following git commit onwards,

commit 4b93f57999a2ca9b9c9e573ea32ab1aeaa8bf496
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Tue Feb 13 18:52:21 2018 -0500

    Make plpgsql use its DTYPE_REC code paths for composite-type variables.

I know this is expected to happen considering the changes done in
above commit because from this commit onwards, NULL value assigned to
any row variable represents a true NULL composite value before this
commit it used to be a tuple with each column having null value in it.
But, the point is, even if the row variable is having a NULL value it
still has a structure associated with it. Shouldn't that structure be
transferred to RECORD variable when it is assigned with a ROW type
variable ? Can we consider this behaviour change as a side effect of
the improvement done in the RECORD type of variable?

+1

Pavel


--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


On Wed, Jan 1, 2020 at 10:50 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> I know this is expected to happen considering the changes done in
> above commit because from this commit onwards, NULL value assigned to
> any row variable represents a true NULL composite value before this
> commit it used to be a tuple with each column having null value in it.
> But, the point is, even if the row variable is having a NULL value it
> still has a structure associated with it. Shouldn't that structure be
> transferred to RECORD variable when it is assigned with a ROW type
> variable ? Can we consider this behaviour change as a side effect of
> the improvement done in the RECORD type of variable?

I'm not an expert on this topic. However, I *think* that you're trying
to distinguish between two things that are actually the same. If it's
a "true NULL," it has no structure; it's just NULL. If it has a
structure, then it's really a composite value with a NULL in each
defined column, i.e. (NULL, NULL, NULL, ...) for some row type rather
than just NULL.

I have to admit that I've always found PL/pgsql to be a bit pedantic
about this whole thing. For instance:

rhaas=# do $$declare x record; begin raise notice '%', x.a; end;$$
language plpgsql;
ERROR:  record "x" is not assigned yet
DETAIL:  The tuple structure of a not-yet-assigned record is indeterminate.
CONTEXT:  SQL statement "SELECT x.a"
PL/pgSQL function inline_code_block line 1 at RAISE

But maybe it should just make x.a evaluate to NULL. It's one thing if
I have a record with columns 'a' and 'b' and I ask for column 'c'; I
guess you could call that NULL, but it feels reasonably likely to be a
programming error. But if we have no idea what the record columns are
at all, perhaps we could just assume that whatever column the user is
requesting is intended to be one of them, and that since the whole
thing is null, that column in particular is null.

On the other hand, maybe that would be too lenient and lead to subtle
and hard-to-find bugs in plpgsql programs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





pá 3. 1. 2020 v 19:57 odesílatel Robert Haas <robertmhaas@gmail.com> napsal:
On Wed, Jan 1, 2020 at 10:50 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> I know this is expected to happen considering the changes done in
> above commit because from this commit onwards, NULL value assigned to
> any row variable represents a true NULL composite value before this
> commit it used to be a tuple with each column having null value in it.
> But, the point is, even if the row variable is having a NULL value it
> still has a structure associated with it. Shouldn't that structure be
> transferred to RECORD variable when it is assigned with a ROW type
> variable ? Can we consider this behaviour change as a side effect of
> the improvement done in the RECORD type of variable?

I'm not an expert on this topic. However, I *think* that you're trying
to distinguish between two things that are actually the same. If it's
a "true NULL," it has no structure; it's just NULL. If it has a
structure, then it's really a composite value with a NULL in each
defined column, i.e. (NULL, NULL, NULL, ...) for some row type rather
than just NULL.

I have to admit that I've always found PL/pgsql to be a bit pedantic
about this whole thing. For instance:

rhaas=# do $$declare x record; begin raise notice '%', x.a; end;$$
language plpgsql;
ERROR:  record "x" is not assigned yet
DETAIL:  The tuple structure of a not-yet-assigned record is indeterminate.
CONTEXT:  SQL statement "SELECT x.a"
PL/pgSQL function inline_code_block line 1 at RAISE

But maybe it should just make x.a evaluate to NULL. It's one thing if
I have a record with columns 'a' and 'b' and I ask for column 'c'; I
guess you could call that NULL, but it feels reasonably likely to be a
programming error. But if we have no idea what the record columns are
at all, perhaps we could just assume that whatever column the user is
requesting is intended to be one of them, and that since the whole
thing is null, that column in particular is null.

I don't like this idea. We should not to invent record's fields created by reading or writing some field. At end it block any static code analyze and it can hide a errors. If we enhance a interface for json or jsonb, then this dynamic work can be done with these types.

We should to distinguish between typend and untyped NULL - it has sense for me (what was proposed by Ashutosh Sharma), but I don't see any sense to go far.

Regards

Pavel



On the other hand, maybe that would be too lenient and lead to subtle
and hard-to-find bugs in plpgsql programs.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jan 1, 2020 at 10:50 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> I know this is expected to happen considering the changes done in
>> above commit because from this commit onwards, NULL value assigned to
>> any row variable represents a true NULL composite value before this
>> commit it used to be a tuple with each column having null value in it.
>> But, the point is, even if the row variable is having a NULL value it
>> still has a structure associated with it. Shouldn't that structure be
>> transferred to RECORD variable when it is assigned with a ROW type
>> variable ? Can we consider this behaviour change as a side effect of
>> the improvement done in the RECORD type of variable?

> I'm not an expert on this topic. However, I *think* that you're trying
> to distinguish between two things that are actually the same. If it's
> a "true NULL," it has no structure; it's just NULL. If it has a
> structure, then it's really a composite value with a NULL in each
> defined column, i.e. (NULL, NULL, NULL, ...) for some row type rather
> than just NULL.

Yeah.  In general, we can't do this, because a null value of type
RECORD simply hasn't got any information about what specific rowtype
might be involved.  In the case where the null is of a named composite
type, rather than RECORD, we could choose to act differently ... but
I'm not really sure that such a change would be an improvement and not
just a decrease in consistency.

In any case, plpgsql's prior behavior was an implementation artifact
with very little to recommend it.  As a concrete example, consider

create table t1(a int, b text);

do $$
declare x t1; r record;
begin
  x := null;
  r := x;
  raise notice 'r.a = %', r.a;
end $$;

do $$
declare r record;
begin
  r := null::t1;
  raise notice 'r.a = %', r.a;
end $$;

I assert that in any sanely-defined semantics, these two examples
should give the same result.  In v11 and up, they both give
'record "r" is not assigned yet' ... but in prior versions, they
gave different results.  I do not want to go back to that.

On the other hand, we now have

do $$
declare x t1; r record;
begin
  x := null;
  r := x;
  raise notice 'x.a = %', x.a;
  raise notice 'r.a = %', r.a;
end $$;

which gives

NOTICE:  x.a = <NULL>
ERROR:  record "r" is not assigned yet

which is certainly also inconsistent.  The variable declared as
being type t1 behaves, for this purpose, as if it contained
"row(null,null)" not just a simple null.  But if you print it,
or assign it to something else as a whole, you'll find it just
contains a simple null.  One way to see that these are different
states is to do

do $$ declare x t1; begin x := null; raise notice 'x = %', x; end$$;
NOTICE:  x = <NULL>

versus

do $$ declare x t1; begin x := row(null,null); raise notice 'x = %', x; end$$;
NOTICE:  x = (,)

And, if you assign a row of nulls to a record-type variable, that works:

do $$
declare x t1; r record;
begin
  x := row(null,null);
  r := x;
  raise notice 'x.a = %', x.a;
  raise notice 'r.a = %', r.a;
end $$;

which gives

NOTICE:  x.a = <NULL>
NOTICE:  r.a = <NULL>

If we were to change this behavior, I think it would be tantamount
to sometimes expanding a simple null to a row of nulls, and I'm
not sure that's a great idea.

The SQL standard is confusing in this respect, because it seems
that at least the "x IS [NOT] NULL" construct is defined to
consider both a "simple NULL" and ROW(NULL,NULL,...) as "null".
But we've concluded that other parts of the spec do allow for
a distinction (I'm too lazy to search the archives for relevant
discussions, but there have been some).  The two things are
definitely different implementation-wise, so it would be hard
to hide the difference completely.

Another fun fact is that right now, assignment of any null value
to a composite plpgsql variable works the same: you can assign a simple
null of some other composite type, or even a scalar null, and behold you
get a null composite value without any error.  That's because
exec_assign_value's DTYPE_REC case pays no attention to the declared
type of the source value once it's found to be null.  Thus

do $$ declare x t1; begin x := 42; raise notice 'x = %', x; end$$;
ERROR:  cannot assign non-composite value to a record variable

do $$ declare x t1; begin x := null::int; raise notice 'x = %', x; end$$;
NOTICE:  x = <NULL>

That's pretty bizarre, and I don't think I'd agree with adopting those
semantics if we were in a green field.  But if we start paying attention
to the specific type of a null source value, I bet we're going to break
some code that works today.

Anyway, maybe this area could be improved, but I'm not fully convinced.
I definitely do not subscribe to the theory that we need to make it
work like v10 again.

            regards, tom lane



On Sat, Jan 4, 2020 at 2:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Wed, Jan 1, 2020 at 10:50 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >> I know this is expected to happen considering the changes done in
> >> above commit because from this commit onwards, NULL value assigned to
> >> any row variable represents a true NULL composite value before this
> >> commit it used to be a tuple with each column having null value in it.
> >> But, the point is, even if the row variable is having a NULL value it
> >> still has a structure associated with it. Shouldn't that structure be
> >> transferred to RECORD variable when it is assigned with a ROW type
> >> variable ? Can we consider this behaviour change as a side effect of
> >> the improvement done in the RECORD type of variable?
>
> > I'm not an expert on this topic. However, I *think* that you're trying
> > to distinguish between two things that are actually the same. If it's
> > a "true NULL," it has no structure; it's just NULL. If it has a
> > structure, then it's really a composite value with a NULL in each
> > defined column, i.e. (NULL, NULL, NULL, ...) for some row type rather
> > than just NULL.
>
> Yeah.  In general, we can't do this, because a null value of type
> RECORD simply hasn't got any information about what specific rowtype
> might be involved.  In the case where the null is of a named composite
> type, rather than RECORD, we could choose to act differently ... but
> I'm not really sure that such a change would be an improvement and not
> just a decrease in consistency.
>
> In any case, plpgsql's prior behavior was an implementation artifact
> with very little to recommend it.  As a concrete example, consider
>
> create table t1(a int, b text);
>
> do $$
> declare x t1; r record;
> begin
>   x := null;
>   r := x;
>   raise notice 'r.a = %', r.a;
> end $$;
>
> do $$
> declare r record;
> begin
>   r := null::t1;
>   raise notice 'r.a = %', r.a;
> end $$;
>
> I assert that in any sanely-defined semantics, these two examples
> should give the same result.  In v11 and up, they both give
> 'record "r" is not assigned yet' ... but in prior versions, they
> gave different results.  I do not want to go back to that.
>
> On the other hand, we now have
>
> do $$
> declare x t1; r record;
> begin
>   x := null;
>   r := x;
>   raise notice 'x.a = %', x.a;
>   raise notice 'r.a = %', r.a;
> end $$;
>
> which gives
>
> NOTICE:  x.a = <NULL>
> ERROR:  record "r" is not assigned yet
>
> which is certainly also inconsistent.  The variable declared as
> being type t1 behaves, for this purpose, as if it contained
> "row(null,null)" not just a simple null.  But if you print it,
> or assign it to something else as a whole, you'll find it just
> contains a simple null.  One way to see that these are different
> states is to do
>
> do $$ declare x t1; begin x := null; raise notice 'x = %', x; end$$;
> NOTICE:  x = <NULL>
>
> versus
>
> do $$ declare x t1; begin x := row(null,null); raise notice 'x = %', x; end$$;
> NOTICE:  x = (,)
>
> And, if you assign a row of nulls to a record-type variable, that works:
>
> do $$
> declare x t1; r record;
> begin
>   x := row(null,null);
>   r := x;
>   raise notice 'x.a = %', x.a;
>   raise notice 'r.a = %', r.a;
> end $$;
>
> which gives
>
> NOTICE:  x.a = <NULL>
> NOTICE:  r.a = <NULL>
>
> If we were to change this behavior, I think it would be tantamount
> to sometimes expanding a simple null to a row of nulls, and I'm
> not sure that's a great idea.
>
> The SQL standard is confusing in this respect, because it seems
> that at least the "x IS [NOT] NULL" construct is defined to
> consider both a "simple NULL" and ROW(NULL,NULL,...) as "null".
> But we've concluded that other parts of the spec do allow for
> a distinction (I'm too lazy to search the archives for relevant
> discussions, but there have been some).  The two things are
> definitely different implementation-wise, so it would be hard
> to hide the difference completely.
>
> Another fun fact is that right now, assignment of any null value
> to a composite plpgsql variable works the same: you can assign a simple
> null of some other composite type, or even a scalar null, and behold you
> get a null composite value without any error.  That's because
> exec_assign_value's DTYPE_REC case pays no attention to the declared
> type of the source value once it's found to be null.  Thus
>
> do $$ declare x t1; begin x := 42; raise notice 'x = %', x; end$$;
> ERROR:  cannot assign non-composite value to a record variable
>
> do $$ declare x t1; begin x := null::int; raise notice 'x = %', x; end$$;
> NOTICE:  x = <NULL>
>
> That's pretty bizarre, and I don't think I'd agree with adopting those
> semantics if we were in a green field.  But if we start paying attention
> to the specific type of a null source value, I bet we're going to break
> some code that works today.
>
> Anyway, maybe this area could be improved, but I'm not fully convinced.
> I definitely do not subscribe to the theory that we need to make it
> work like v10 again.

Okay. Thanks for sharing your thoughts on this.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com