Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL
От | Andrew Gierth |
---|---|
Тема | Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL |
Дата | |
Msg-id | 87zip9qti4.fsf@news-spur.riddles.org.uk обсуждение исходный текст |
Ответ на | Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-bugs |
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> The executor, when doing IS [NOT] NULL on a composite value, looks >> at each column to see if it is the null value. It does NOT recurse >> into nested composite values, and my reading of the spec suggests >> that this is correct. Tom> Hmm. Of course the $64 question is whether that really is Tom> correct, or sensible. Tom> I went to look at the spec, and discovered that SQL:2011 actually Tom> has wording that is different from SQL99, which I think is what we Tom> relied on last time we considered this issue. I've been mainly using the 2008 spec, but it seems to have identical wording to the 2011 one. My reading is as follows, based on these parts of the spec: [Framework] 4.4.2 The null value Every data type includes a special value, called the null value, sometimes denoted by the keyword NULL. This value differs from other values in the following respects: [...] [Foundation] 8.8 <null predicate> (as you quoted) It seems to me that where the spec refers to "the null value", this is exactly what we do with the "isnull" flag, and that <null predicate> is a separate thing. Where the spec refers to the "fields" of a row, it at least strongly implies that these are never taken recursively: [Framework] 4.4.5.2 Row types A row type is a sequence of one or more (field name, data type) pairs, known as fields. A value of a row type consists of one value for each of its fields. Also in 6.2 <field definition> the "degree" of a row type is defined in such a way that a row type with one declared field, whose type is also a row type, has a degree of 1. So the comment in 8.8 about degree-1 rows only makes sense if 8.8 isn't supposed to recurse. Plus, if 8.8 had been intended to be recursive, it could easily have been written that way. Most of the references to nullity in the spec talk specifically about "the null value" rather than referencing IS [NOT] NULL. In particular, constructs like count(x) refer to "eliminating the null value", which in turn implies that count(x) is not the same thing as count(*) filter (where x is not null) if x is a row type. Also, IS DISTINCT FROM only refers to "the null value", so (x IS DISTINCT FROM NULL) is not the same as either (x IS NOT NULL) or (NOT (x IS NULL)). Likewise the rules for <routine invocation> only talk about "the null value" in deciding whether a routine declared RETURNS NULL ON NULL INPUT (i.e. STRICT) is actually called. The places that explicitly use IS [NOT] NULL are: - some definitions about whether values are known-not-null, which is ok because (x IS NOT NULL) clearly implies that x is not the null value - some rules for multiset operations where the operands must be multisets, not rows - NOT NULL constraints on columns are transformed into CHECK (col IS NOT NULL), which isn't what we do: our not-null constraint currently only excludes the null value, though this has been discussed in some past threads - COALESCE is implemented as CASE WHEN x IS NOT NULL THEN ... which again is not what we do. The logic of the spec's version of this is truly bizarre, and I can't imagine any possible use for it; having COALESCE(row(1,null),row(2,null)) return (2,null) as the spec demands seems to be all kinds of wrong. Tom> <null predicate> ::= <row value predicand> <null predicate part 2> Tom> <null predicate part 2> ::= IS [ NOT ] NULL Tom> (Oddly, SQL does not seem to allow IS [NOT] NULL on non-composite Tom> values, which is just silly.) <row value predicand> can be a <row value constructor predicand> which can be a <common value expression> or a <boolean predicand>, either of which can be a <value expression primary>, and in both of these cases the syntax rules specify that they are to be replaced by ROW(X). So non-composite values are allowed and are indistinguishable from composite values of degree 1. (Note that <row value predicand> can also be a <row value special case> which can be a <nonparenthesized value expression primary>, but that is required to be of a row type. The spec here seems to rely on constraints about the declared types to disambiguate between the case of a <row value special case> and a <common value expression> which is a <nonparenthesized value expression primary>.) Tom> Rule (2a) was not there in SQL99. But look at what this is doing: Tom> it is admitting straight out that a null composite value is not Tom> the same as a composite value all of whose fields are null. It is Tom> only asserting that a <null predicate> will not distinguish them. Right, that's my reading too. Tom> The implication is that it's just fine if, say, COALESCE() doesn't Tom> act that way. Not COALESCE, because COALESCE is defined as a syntactic transform to a CASE expression that uses IS NOT NULL. But I really don't think we should follow the spec on that one. >> It seems possible that this could be fixed by simply setting >> argisrow=false for all the null tests generated in such cases. Tom> I concur that this is an appropriate fix if we believe that Tom> ExecEvalNullTest's behavior is correct. I believe it is, as explained above. That said, there may also be merit in arguing that the spec is just broken on this point. I think back when IS NULL was originally changed to conform to the spec, it might have been better to just say "no, that's silly" and stick with the old behavior. We wouldn't strictly speaking be violating the spec if we did that, because we don't claim to support feature T051 "Row types", and so we can just say that our row types are a nonstandard extension without the standard IS NULL behavior. We do in fact violate the spec here in many ways; for example we treat types from CREATE TYPE foo AS (fields) as being row types, but in fact the spec says these are user-defined structured types and not row types. But whether it would be a good idea to actually revert to the old way... that's another question. Does anyone actually _want_ the spec behavior of IS [NOT] NULL? -- Andrew (irc:RhodiumToad)
В списке pgsql-bugs по дате отправления: