Обсуждение: AW: [HACKERS] having and union in v7beta
>> play=> explain select * from comuni union select * from comuni; > *However*, we have not fixed the bug that causes "select foo union > select foo" to be incorrectly simplified --- the UNION code is still > applying cnfify. (Which it probably shouldn't, but I haven't wanted > to touch that code until I have the time to rewrite it completely.) > The reason 7.0beta1 generates the "right" answer is that it has a > recently-introduced bug in the comparison routines that causes it to > think the two select subqueries aren't the same. But if the two queries are the same, the union CAN be simplified, since the union of two identical masses (I don't know the correct word here) is still that one mass. Thus 6.5 simplification is correct in this particular case. Andreas
Zeugswetter Andreas SB <ZeugswetterA@Wien.Spardat.at> writes: >>> play=> explain select * from comuni union select * from comuni; >> *However*, we have not fixed the bug that causes "select foo union >> select foo" to be incorrectly simplified --- the UNION code is still >> applying cnfify. > But if the two queries are the same, the union CAN be simplified, > since the union of two identical masses (I don't know the correct word here) > is still that one mass. > Thus 6.5 simplification is correct in this particular case. No, it is NOT right, because we're dealing with multisets not sets (I think "set" is the English word you were looking for). The SQL spec specifies that UNION implies DISTINCT, ie, removal of duplicate rows: i) Let R be a row that is a duplicate of some row in T1 or of some row in T2 or both. Let m bethe number of duplicates of R in T1 and let n be the number of duplicates of R in T2, wherem ii) If ALL is not specified, then Case: 1) If UNION is specified, then Case: A) If m > 0 or n > 0, then T contains exactly one dupli- cate of R. B) Otherwise, T contains no duplicate of R. If query "select foo" would produce X, Y, Y, Z, then the correct result of "select foo UNION select foo" is X, Y, Z. But that's not what 6.5 will give you. I think it would be correct to simplify the union to "select DISTINCT foo" but that requires all-new simplification code, as well as some thought about how it'd interact with any DISTINCT or DISTINCT ON already present. regards, tom lane
On Thu, 2 Mar 2000, Zeugswetter Andreas SB wrote: > But if the two queries are the same, the union CAN be simplified, > since the union of two identical masses (I don't know the correct word here) > is still that one mass. "set" :) > > Thus 6.5 simplification is correct in this particular case. The issue here seems to be that the queries could have side-effects, such as select nextval('sequence1') union select nextval('sequence1') which should arguably return two distinct rows. I gotta reread SQL's opinion on this, but I'm sure Tom has already done that. From a mathematical point of view, I believe your assumption "lexically equal queries yield mathematically equal sets" is wrong. -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden