Обсуждение: null values in non-nullable column

Поиск
Список
Период
Сортировка

null values in non-nullable column

От
"George Pavlov"
Дата:
In 8.1 I have a situation where nullability of user defined datatypes
does not seem to be enforced. Using the following steps I end up with a
table that has a column that should not be nullable, but has nulls in
it. Here's a pared down outline of the steps:

-- create a datatype that should enforce not null values
gp_test=# create domain boolean_not_null as boolean not null;

-- a table with a field using the datatype
gp_test=# create table a (id integer, test boolean_not_null);
gp_test=# insert into a values (1, true);
gp_test=# select * from a;id | test
----+------ 1 | t

-- as expected no nulls are allowed
gp_test=# insert into a values (100, null);
ERROR:  domain boolean_not_null does not allow null values

-- a second table (a parent)
gp_test=# create table b (id integer);
gp_test=# insert into b values (1);
gp_test=# insert into b values (2);
gp_test=# select * from b;id
---- 1 2

-- now create a table based on a left join
-- this creates a table with a not-nullable column (datatype
-- inherited from the original table) which contains nulls;
-- even though insertion of new nulls is not allowed
gp_test=# create table m as select id, test from b left join a using
(id);
gp_test=# \d m;          Table "public.m"Column |       Type       | Modifiers
--------+------------------+-----------id     | integer          | test   | boolean_not_null |

gp_test=# select * from m;id |  test
----+-------- 1 | t 2 | <NULL>

gp_test=# insert into m values (100, null);
ERROR:  domain boolean_not_null does not allow null values
gp_test=# insert into m (id) values (100);
ERROR:  domain boolean_not_null does not allow null values
gp_test=# update m set test = test; -- note no error here!
gp_test=# update m set test = (test and true);
ERROR:  domain boolean_not_null does not allow null values

I would have expected failure at the table creation step, but it
proceeds (and inserts the nulls). Interestingly, I do see a failure
after I try to restore the table from a dump (using pg_dump/pg_restore).


George




Re: null values in non-nullable column

От
Tom Lane
Дата:
"George Pavlov" <gpavlov@mynewplace.com> writes:
> -- this creates a table with a not-nullable column (datatype 
> -- inherited from the original table) which contains nulls;

Hm.  Arguably we should discard domain-ness in any SELECT result,
but I'm sure some people would complain about that ...
        regards, tom lane


Re: null values in non-nullable column

От
Michael Glaesemann
Дата:
On Dec 5, 2006, at 11:52 , Tom Lane wrote:

> "George Pavlov" <gpavlov@mynewplace.com> writes:
>> -- this creates a table with a not-nullable column (datatype  
>> -- inherited from the original table) which contains nulls;
>
> Hm.  Arguably we should discard domain-ness in any SELECT result,
> but I'm sure some people would complain about that ...

Tom,

Could you expand on that a bit? Here's what I've interpreted:

The column types of the select are assumed to be (int,  
boolean_not_null), and so the values aren't checked again before the  
insert during CREATE TABLE AS. "discarding domain-ness" would mean  
considering the results as their base type, and rechecking the domain  
would be checked when inserting into the table.

Just trying to fit my head around this.

Michael Glaesemann
grzm seespotcode net




Re: null values in non-nullable column

От
Tom Lane
Дата:
Michael Glaesemann <grzm@seespotcode.net> writes:
> Could you expand on that a bit? Here's what I've interpreted:

> The column types of the select are assumed to be (int,  
> boolean_not_null), and so the values aren't checked again before the  
> insert during CREATE TABLE AS. "discarding domain-ness" would mean  
> considering the results as their base type, and rechecking the domain  
> would be checked when inserting into the table.

To be clear: the problem is not with CREATE TABLE AS.  The problem is
with the definition of what a SELECT ... LEFT JOIN ... is supposed to
return.  C.T.A. is supposed to create a table matching the defined
column types of the SELECT result; either those types allow null-ness,
or they don't.

Actually I think this is a bug in the SQL spec :-(  The description of
<joined table> says that output columns are "possibly nullable" if
they're on the nullable side of the outer join, but it's not apparent
that that idea is meant to negate a domain constraint.  And yet, if it
does not, then an outer join with a NOT NULL domain column on the
nullable side is just invalid.
        regards, tom lane


Re: null values in non-nullable column

От
Michael Glaesemann
Дата:
On Dec 5, 2006, at 12:49 , Tom Lane wrote:

> Actually I think this is a bug in the SQL spec :-(  The description of
> <joined table> says that output columns are "possibly nullable" if
> they're on the nullable side of the outer join, but it's not apparent
> that that idea is meant to negate a domain constraint.  And yet, if it
> does not, then an outer join with a NOT NULL domain column on the
> nullable side is just invalid.

Very interesting. Thanks for the explanation. I should dig into the  
spec more.

Michael Glaesemann
grzm seespotcode net




Re: null values in non-nullable column

От
Markus Schaber
Дата:
Hi, George,

"George Pavlov" <gpavlov@mynewplace.com> wrote:

> In 8.1 I have a situation where nullability of user defined datatypes
> does not seem to be enforced. Using the following steps I end up with a
> table that has a column that should not be nullable, but has nulls in
> it.

Ouch. That hurts!

Now all those MySQL freaks can argue "Well, PostgreSQL does allow
constraint violation, too.", despite the fact that most of them don't
even basically understand the special case we have here.

So, please, keep silent about it :-)

How difficult is it to fix this (e. G. by disallowing NULL-generating
JOINs on NOT NULL domains?)

Regards,
Markus

-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org


Re: null values in non-nullable column

От
"George Pavlov"
Дата:
Yes, the thread did not seem to go very far. The SQL standard does seem
inconsistent in this area, but that is not an argument for allowing data
constraint violation. Until the standard is modified I think it would be
good for the reputation of the DBMS we all love to come up with a fix...


Even though, as Tom Lane explained, CREATE TABLE AS is not the problem
here, it seems to me that might be the cleanest, least obtrusive place
to add validation. If C.T.A failed at the table creation step because of
the JOIN-produces NULLs that would be an early and decent warning.
Fixing it from the JOIN side (e.g. disallowing NULL-generating JOINs on
NOT NULL domains) seems too strict -- JOINs are mostly used for result
sets that are not materialized and you hardly have the potential for a
problem until they are materialized as a table. Similarly, removing the
domain-ness of the JOIN column strikes me as too drastic and as having
the potential of breaking existing functionality. I am sure I am missing
something, just my two cents...

George



Re: null values in non-nullable column

От
Richard Huxton
Дата:
George Pavlov wrote:
> Even though, as Tom Lane explained, CREATE TABLE AS is not the problem
> here, it seems to me that might be the cleanest, least obtrusive place
> to add validation. If C.T.A failed at the table creation step because of
> the JOIN-produces NULLs that would be an early and decent warning.
> Fixing it from the JOIN side (e.g. disallowing NULL-generating JOINs on
> NOT NULL domains) seems too strict -- JOINs are mostly used for result
> sets that are not materialized and you hardly have the potential for a
> problem until they are materialized as a table. Similarly, removing the
> domain-ness of the JOIN column strikes me as too drastic and as having
> the potential of breaking existing functionality. I am sure I am missing
> something, just my two cents...

I think it has to go in the join...

If a result-set has nulls in a particular column, that column can't be 
NOT NULL (by definition). Therefore, either the column has its not-null 
constraint removed (through type-casting away the domain) or the query 
fails on that NOT NULL constraint.

Any query could result in this sort of problem, not just an explicit 
JOIN with NULLs. Imagine a domain "even_numbers_only" and a "SELECT 
my_even_numbers+1 FROM foo".

Hmm - it strikes me that any result-set should perhaps have the domain 
removed and substituted with its parent type, except perhaps in the 
simplest "pass column through" case.

--   Richard Huxton  Archonet Ltd