Обсуждение: GROUP BY / HAVING - am I being thick?
I have a question about GROUP BY / HAVING. It seems I can't do something I *think* I should be able to do! Give the following table called 'summary': ------------------------ (int) (bool) (int) codon expl nexamples ------------------------286 t 1286 f 1286 f 4286 t 13286 t 37286 t 1287 f 3287 f 3287 f 4287 f 1288 f 2288 f 3288 f 4288 f 2289 f 1289 t 3 ------------------------ For each codon number, I want to obtain sum(nexamples) where expl = 'f'; i.e. I want to get the result: ------------------------ (int) (int) codon sum ------------------------286 5287 11288 11289 1 ------------------------ I can do this using a temporary table (or equally using a view): SELECT codon, nexamples INTO summary_tmp FROMsummary WHERE expl = 'f'; SELECT codon, sum(nexamples) FROM summary_tmp GROUP BY codon; DROP TABLE summary_tmp; but I thought I should be able to do it using GROUP BY and HAVING. The following looks as though it should work according to "The Practical SQL Handbook" (Bowman, Emerson & Darnovsky) SELECT codon, sum(nexamples) FROM summary WHERE expl = 'f' GROUP BY codon HAVING expl = 'f'; I've tried various contortions of putting expl in the SELECT clause and in the GROUP BY clause, but nothing seems to work. The docs say that PostgreSQL requires the HAVING clause "unambiguously reference a grouping column" whatever that means - I tried putting expl in the GROUP BY, but it didn't help. Any suggestions gratefully received! Thanks, Andrew -- Dr. Andrew C.R. Martin EMail: a.c.r.martin@reading.ac.uk (work) Lecturer in Bioinformatics andrew@stagleys.demon.co.uk (home) University of Reading Tel.: +44 (0)118 987 5123x7022 Fax: +44 (0)118 931 0180
"Andrew C.R. Martin" <a.c.r.martin@reading.ac.uk> writes: > For each codon number, I want to obtain sum(nexamples) where expl = 'f'; > The following looks as though it should work according to "The > Practical SQL Handbook" (Bowman, Emerson & Darnovsky) > SELECT codon, sum(nexamples) > FROM summary > WHERE expl = 'f' > GROUP BY codon > HAVING expl = 'f'; Just drop the HAVING clause and that will work --- you're already filtering by expl at the WHERE stage, so there's no need to do it again after grouping (quite aside from the fact that you can't reference an ungrouped column in HAVING). The thing to remember about WHERE vs HAVING is that WHERE filters tuples before they are aggregated into groups, while HAVING filters afterwards. Whatever tuples get past the WHERE filter are collected into groups, and then HAVING is applied to the per-group output tuples to filter out whole groups that you don't want. In practice, you only need HAVING if you want to filter on aggregate functions of the groups. For instance, if in your example you didn't want to hear about codons with sum(nexamples) < 100, you'd do SELECT codon, sum(nexamples) FROM summary WHERE expl = 'f' GROUP BY codon HAVING sum(nexamples) >= 100; Once you realize that HAVING applies post-group, it should be pretty clear why the HAVING clause can only reference group-column values and aggregate results. Anything else is not well-defined at the group level. For instance, the group for codon=286 has several different values for nexamples, so you can't ask about a specific value of nexamples in the HAVING clause --- which one would you get? regards, tom lane
> I have a question about GROUP BY / HAVING. It seems I can't do > something I *think* I should be able to do! [...] > SELECT codon, sum(nexamples) > FROM summary > WHERE expl = 'f' > GROUP BY codon > HAVING expl = 'f'; OK! As I thought, I was being thick - helps to read the book properly instead of dipping into it. Thanks Tom for clearing this up! Andrew -- Dr. Andrew C.R. Martin EMail: a.c.r.martin@reading.ac.uk (work) Lecturer in Bioinformatics andrew@stagleys.demon.co.uk (home) University of Reading Tel.: +44 (0)118 987 5123x7022 Fax: +44 (0)118 931 0180