Re: query optimization: aggregate and distinct

Поиск

Список

Период

Сортировка

От	Jeff Davis
Тема	Re: query optimization: aggregate and distinct
Дата	21 августа 2003 г. 06:02:28
Msg-id	200308202305.48111.jdavis-pgsql@empires.org обсуждение исходный текст
Ответ на	query optimization: aggregate and distinct (Jeff Davis <jdavis-pgsql@empires.org>)
Ответы	Re: query optimization: aggregate and distinct (Tom Lane <tgl@sss.pgh.pa.us>)
Список	pgsql-general

Дерево обсуждения

On Wednesday 20 August 2003 04:26 pm, Jeff Davis wrote:
> I have below a simplified version of what I'm trying to do. Basically, I am
> trying to get both an aggregate (an average) and "most recent" value.
>
>  g | v  |             ts
> ---+----+----------------------------
>  1 | 10 | 2003-08-20 16:00:27.010769
>  1 | 20 | 2003-08-20 16:00:30.380476
>  2 | 40 | 2003-08-20 16:00:37.399717
>  2 | 80 | 2003-08-20 16:00:40.265717
>
> I would like, as output, something like this:
>
>  g | v  |        avg         |             ts
> ---+----+--------------------+----------------------------
>  1 | 20 | 15.000000000000000 | 2003-08-20 16:00:30.380476
>  2 | 80 | 60.000000000000000 | 2003-08-20 16:00:40.265717
>
> which I got by a query like:
>
> SELECT
>     t2.g,t2.v,t1.avg,t2.ts
> FROM
>     (SELECT
>         g,avg(v)
>         FROM t
>         GROUP BY g
>     ) t1,
>     (SELECT
>         DISTINCT ON (g)
>         * FROM t
>         ORDER BY g,ts DESC
>     ) t2
> WHERE t1.g = t2.g;
>
> That produces the results that I need, but it seems inefficient to join a
> table with itself like that. My real query (not this simplified example)
> takes 5+ seconds and I suspect this join is why.
>
> Is there a better way?
>
> For my real query, it's using index scans where I'd expect, and I
> frequently VACUUM ANALYZE the big table and I have all the stats turned on.
> Also, I have more shared buffers than needed to put everything in RAM.
>
> Right now I'm using 7.2.1. Any improvements in 7.3 or 7.4 that would help
> this issue?
>

I had an idea about using aggregates: what if I made an aggregate function
called "first" that just returned the value in the first tuple it
encountered?

That way, I could do an "ORDER BY", and then do a real aggregate on some
columns, and then just call first() on the rest and it will return the first
tuple according to the ordering.

Will that work? Are aggregates guarenteed to visit tuples in the same
direction as the ORDERing?

It seems like a bad hack, but I think it works. If not, I suppose I could pass
a user-defined aggregate a user-defined complex type.

Regards,
    Jeff Davis

В списке pgsql-general по дате отправления:

Предыдущее

От: Martijn van Oosterhout
Дата: 21 августа 2003 г., 05:40:57
Сообщение: Re: Buglist

Следующее

От: "Gregory S. Williamson"
Дата: 21 августа 2003 г., 06:13:37
Сообщение: Re: Postgresql Driver

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: query optimization: aggregate and distinct

Предыдущее

Следующее