>>>>> "Tomas" == Tomas Vondra <tv@fuzzy.cz> writes:
>> As for computing it all twice, there's currently no attempt to>> optimize multiple identical grouping sets into
multiple>>projections of a single grouping set result. CUBE(a,b,c,a) has>> twice as many grouping sets as CUBE(a,b,c)
does,even though all>> the extra ones are duplicates.
Tomas> Shouldn't this be solved by eliminating the excessiveTomas> ChainAggregate? Although it probably changes
GROUPING(...),Tomas>so it's not just about removing the duplicate column(s) fromTomas> the CUBE.
Eliminating the excess ChainAggregate would not change the number of
grouping sets, only where they are computed.
Tomas> Maybe preventing this completely (i.e. raising an ERROR withTomas> "duplicate columns in CUBE/ROLLUP/...
clauses")would beTomas> appropriate. Does the standard says anything about this?
The spec does not say anything explicitly about duplicates, so they
are allowed (and duplicate grouping _sets_ can't be removed, only
duplicate columns within a single GROUP BY clause after the grouping
sets have been eliminated by transformation). I have checked my
reading of the spec against oracle 11 and MSSQL using sqlfiddle.
The way the spec handles grouping sets is to define a sequence of
syntactic transforms that result in a query which is a UNION ALL of
ordinary GROUP BY queries. (We haven't tried to implement the
additional optional feature of GROUP BY DISTINCT.) Since it's UNION
ALL, any duplicates must be preserved, so a query with GROUPING SETS
((a),(a)) reduces to:
SELECT ... GROUP BY a UNION ALL SELECT ... GROUP BY a;
and therefore has duplicates of all its result rows.
I'm quite prepared to concede that I may have read the spec wrong
(wouldn't be the first time), but in this case I require any such
claim to be backed up by an example from some other db showing an
actual difference in behavior.
--
Andrew (irc:RhodiumToad)