Обсуждение: pgsql aggregate: conditional max

Поиск
Список
Период
Сортировка

pgsql aggregate: conditional max

От
Weimao Ke
Дата:
Hi,

I need a special aggregation function. For instance, given the following 
table data:
  aid    |   cat   | weight
----------+---------+---------a1  | Drama   |       1a1  | Romance |       6a1  | Short   |       1a1 | Other   |
7a2 | Comedy  |       1a2 | Drama   |       2a3  | Drama   |       1a3 | Adult   |       2a3 | Comedy  |       1a3 |
Other  |       1
 

I want to group by "aid" and choose the category (i.e., "cat") with the 
largest "weight":

aid   |   max_weighted_cat
----+---------------------
a1   |   Other
a2   |   Drama
a3   |   Adult

Any ideas? Thank you! :)

-- 
All best, 

Weimao

Weimao Ke
Indiana University Bloomington
School of Library and Information Science
http://ella.slis.indiana.edu/~wke



Re: pgsql aggregate: conditional max

От
Jeffrey Melloy
Дата:
Weimao Ke wrote:
> Hi,
>
> I need a special aggregation function. For instance, given the 
> following table data:
>
>   aid    |   cat   | weight
> ----------+---------+---------
> a1  | Drama   |       1
> a1  | Romance |       6
> a1  | Short   |       1
> a1 | Other   |       7
> a2  | Comedy  |       1
> a2 | Drama   |       2
> a3  | Drama   |       1
> a3 | Adult   |       2
> a3 | Comedy  |       1
> a3 | Other   |       1
>
> I want to group by "aid" and choose the category (i.e., "cat") with 
> the largest "weight":
>
> aid   |   max_weighted_cat
> ----+---------------------
> a1   |   Other
> a2   |   Drama
> a3   |   Adult
>
> Any ideas? Thank you! :)
>

Should be able to do this with a standard max() aggregate.

select aid, cat, max(weight)
from table
group by aid, cat;

Jeff


Re: pgsql aggregate: conditional max

От
Michael Fuhr
Дата:
On Sun, Mar 12, 2006 at 12:09:48AM -0500, Weimao Ke wrote:
> I want to group by "aid" and choose the category (i.e., "cat") with the 
> largest "weight":
> 
> aid   |   max_weighted_cat
> ----+---------------------
> a1   |   Other
> a2   |   Drama
> a3   |   Adult

PostgreSQL has a non-standard DISTINCT ON clause that would work.
See the weather_reports example in the documentation for SELECT:

http://www.postgresql.org/docs/8.1/interactive/sql-select.html

Try this query against your example data:

SELECT DISTINCT ON (aid) aid, cat
FROM tablename
ORDER BY aid, weight DESC, cat;

If multiple rows for a given aid match that aid's max weight then
the above query will return the first matching row according to the
given sort order.

Some people object to DISTINCT ON because it's non-deterministic if
you don't order by enough columns.  Here's something more standard;
it'll return all rows that match a given aid's max weight:

SELECT aid, cat
FROM tablename AS t
JOIN (SELECT aid, max(weight) AS weight     FROM tablename     GROUP BY aid) AS s USING (aid, weight);

-- 
Michael Fuhr


Re: pgsql aggregate: conditional max

От
Michael Fuhr
Дата:
On Sun, Mar 12, 2006 at 12:34:57AM -0500, Jeffrey Melloy wrote:
> Should be able to do this with a standard max() aggregate.
> 
> select aid, cat, max(weight)
> from table
> group by aid, cat;

That query returns the maximum weight for each (aid, cat) pair.
Against the example data it returns the entire table, not the
(aid, cat) pair with the max weight for a given aid.

-- 
Michael Fuhr


Re: pgsql aggregate: conditional max

От
Daniel CAUNE
Дата:
> Hi,
> 
> I need a special aggregation function. For instance, given the following
> table data:
> 
>    aid    |   cat   | weight
> ----------+---------+---------
>  a1  | Drama   |       1
>  a1  | Romance |       6
>  a1  | Short   |       1
>  a1 | Other   |       7
>  a2  | Comedy  |       1
>  a2 | Drama   |       2
>  a3  | Drama   |       1
>  a3 | Adult   |       2
>  a3 | Comedy  |       1
>  a3 | Other   |       1
> 
> I want to group by "aid" and choose the category (i.e., "cat") with the
> largest "weight":
> 
> aid   |   max_weighted_cat
> ----+---------------------
> a1   |   Other
> a2   |   Drama
> a3   |   Adult
> 
> Any ideas? Thank you! :)
> 

SELECT aid, cat FROM table, (   SELECT aid, max(weight) as weight     FROM table     GROUP BY aid) AS tablemaxweight
WHEREtable.aid = tablemaxweight.aid   AND table.weight = tablemaxweight.aid;
 

There is a limit case you don't specify how to deal with, when two or more categories have the same maximum weight.
Thequery I wrote retrieves all the categories that have the maximum weight, but perhaps you just want one per aid.
 

--
Daniel



Re: pgsql aggregate: conditional max

От
Weimao Ke
Дата:
Michael Fuhr wrote:

>On Sun, Mar 12, 2006 at 12:34:57AM -0500, Jeffrey Melloy wrote:
>  
>
>>Should be able to do this with a standard max() aggregate.
>>
>>select aid, cat, max(weight)
>>from table
>>group by aid, cat;
>>    
>>
>
>That query returns the maximum weight for each (aid, cat) pair.
>Against the example data it returns the entire table, not the
>(aid, cat) pair with the max weight for a given aid.
>
>  
>
Michael is right. This query does not solve the problem...

-- 
All best, 

Weimao

Weimao Ke
Indiana University Bloomington
School of Library and Information Science
http://ella.slis.indiana.edu/~wke



Re: pgsql aggregate: conditional max

От
Weimao Ke
Дата:
Michael Fuhr wrote:

>SELECT DISTINCT ON (aid) aid, cat
>FROM tablename
>ORDER BY aid, weight DESC, cat;
>  
>
Good pointer. I think this will solve my problem. :)

>SELECT aid, cat
>FROM tablename AS t
>JOIN (SELECT aid, max(weight) AS weight
>      FROM tablename
>      GROUP BY aid) AS s USING (aid, weight);
>  
>
This query will return duplicates if there are multiple categories (for 
one aid) with the same max weight. Yet, I should be able to remove the 
duplicates somehow...:)

I really appreciate your help!

-- 
All best, 

Weimao

Weimao Ke
Indiana University Bloomington
School of Library and Information Science
http://ella.slis.indiana.edu/~wke



Re: pgsql aggregate: conditional max

От
Weimao Ke
Дата:
Daniel CAUNE wrote:

> SELECT aid, cat
>
>  FROM table, (
>    SELECT aid, max(weight) as weight
>      FROM table
>      GROUP BY aid) AS tablemaxweight
>  WHERE table.aid = tablemaxweight.aid
>    AND table.weight = tablemaxweight.aid;
>
>There is a limit case you don't specify how to deal with, when two or more categories have the same maximum weight.
Thequery I wrote retrieves all the categories that have the maximum weight, but perhaps you just want one per aid.
 
>  
>
Yes, this will introduce duplicates. Yet it is not too difficult to 
select only one for each aid from the results. Thank you!

Weimao

>--
>Daniel
>
>
>  
>


-- 
All best, 

Weimao

Weimao Ke
Indiana University Bloomington
School of Library and Information Science
http://ella.slis.indiana.edu/~wke



Re: pgsql aggregate: conditional max

От
Markus Schaber
Дата:
Hi, Weimao Ke,

Weimao Ke wrote:
>> SELECT aid, cat
>> FROM tablename AS t
>> JOIN (SELECT aid, max(weight) AS weight
>>      FROM tablename
>>      GROUP BY aid) AS s USING (aid, weight);
>>
> This query will return duplicates if there are multiple categories (for
> one aid) with the same max weight. Yet, I should be able to remove the
> duplicates somehow...:)

Try

SELECT DISTINCT aid, cat
FROM tablename AS t
JOIN (SELECT aid, max(weight) AS weight    FROM tablename    GROUP BY aid) AS s USING (aid, weight);

HTH,
Markus


-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org