Re: pg, mysql comparison with "group by" clause

From	Greg Stark
Subject	Re: pg, mysql comparison with "group by" clause
Date	October 13, 2005, 18:27:11
Msg-id	878xwxi87h.fsf@stark.xeocode.com
In reply to	Re: pg, mysql comparison with "group by" clause (Scott Marlowe <smarlowe@g2switchworks.com>)
Replies	Re: pg, mysql comparison with "group by" clause; Re: pg, mysql comparison with "group by" clause
List	pgsql-sql

Scott Marlowe <smarlowe@g2switchworks.com> writes:

> Sorry, but it's worse than that.  It is quite possible that two people
> could run this query at the same time and get different data from the
> same set and the same point in time.  That shouldn't happen accidentally
> in SQL, you should know it's coming.

I'm pretty unsympathetic to the "we should make a language less powerful and
more awkward because someone might use it wrong" argument.

> > In standard SQL you have to
> > write GROUP BY ... and list every single column you need from the master
> > table. Forcing the database to do a lot of redundant comparisons and sort on
> > uselessly long keys where in fact you only really need it to sort and group by
> > the primary key.
> 
> But again, you're getting whatever row the database feels like giving
> you.  A use of a simple, stupid aggregate like an any() aggregate would
> be fine here, and wouldn't require a lot of overhead, and would meet the
> SQL spec.

Great, so I have a user table with, oh, say, 40 columns. And I want to return
all those columns plus their current account balance in a single query.

The syntax under discussion would be:

select user.*, sum(money) from user join user_money using (user_id) group by user_id

You would prefer:

select user_id,
       any(username) as username, any(firstname) as firstname,
       any(lastname) as lastname, any(address) as address,
       any(city) as city, any(street) as street, any(phone) as phone,
       any(last_update) as last_update, any(last_login) as last_login,
       any(referrer_id) as referrer_id, any(register_date) as register_date,
       ...
       sum(money) as balance,
       count(money) as num_txns
  from user join user_money using (user_id)
 group by user_id

Having a safety is fine, but when I have to disengage the safety for every
single column it starts to get more than a little annoying.
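
To make the two forms concrete, here is a minimal sketch in Python using the
standard sqlite3 module. SQLite is only a stand-in, chosen because it happens
to accept ungrouped columns next to an aggregate; it is not Postgres or MySQL.
The three-column users table, the sample rows, and min() standing in for the
proposed any() aggregate are all assumptions made for illustration. Because
every users column is determined by user_id, both forms return the same rows.

```python
import sqlite3

# Hypothetical schema and data; "users" stands in for the 40-column user table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT, city TEXT);
    CREATE TABLE user_money (user_id INTEGER, money INTEGER);
    INSERT INTO users VALUES (1, 'alice', 'Oslo'), (2, 'bob', 'Lima');
    INSERT INTO user_money VALUES (1, 50), (1, -20), (2, 7);
""")

# Short form: group by the primary key only and take the other columns as-is.
short_form = conn.execute("""
    SELECT users.*, sum(money) AS balance
      FROM users JOIN user_money USING (user_id)
     GROUP BY user_id
     ORDER BY user_id
""").fetchall()

# Long form: wrap every non-key column in an aggregate.
# min() is an assumption standing in for the any() aggregate under discussion.
long_form = conn.execute("""
    SELECT user_id, min(username) AS username, min(city) AS city,
           sum(money) AS balance
      FROM users JOIN user_money USING (user_id)
     GROUP BY user_id
     ORDER BY user_id
""").fetchall()

print(short_form == long_form)
```

The long form grows by one aggregate call per column, which is the annoyance
described above; the short form stays the same length however wide the table is.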

Note that you cannot write the above with a single subquery in the select
list, since there are two aggregates and a scalar subquery can return only one
column. You could write it as a join against a view, but don't expect to get
the same plans from Postgres for that.
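
The join-against-a-view alternative can be sketched as follows. This is a
rough illustration in Python with the standard sqlite3 module, used only as a
convenient way to run the SQL. The schema, the sample rows, and the inline
subselect standing in for a view are assumptions, and the sketch says nothing
about which plan Postgres would choose.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT, city TEXT);
    CREATE TABLE user_money (user_id INTEGER, money INTEGER);
    INSERT INTO users VALUES (1, 'alice', 'Oslo'), (2, 'bob', 'Lima');
    INSERT INTO user_money VALUES (1, 50), (1, -20), (2, 7);
""")

# Aggregate user_money on its own, then join the one-row-per-user result
# back to users. No GROUP BY ever touches the users columns.
rows = db.execute("""
    SELECT u.*, m.balance, m.num_txns
      FROM users AS u
      JOIN (SELECT user_id,
                   sum(money)   AS balance,
                   count(money) AS num_txns
              FROM user_money
             GROUP BY user_id) AS m USING (user_id)
     ORDER BY u.user_id
""").fetchall()

print(rows)
```

Both aggregates come out of the same subselect, so the query is standard SQL
without listing every column. Like the original, it uses an inner join, so a
user with no rows in user_money is absent from the result.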

> Actually, for things like aggregates, I've often been able to improve
> performance with sub selects in PostgreSQL.  

If your experience is like mine, it's a case of two wrongs cancelling each
other out. The optimizer underestimates the efficiency of nested loops, which
is a separate problem. Since a subquery's only eligible plan is basically a
nested loop, it often turns out to be faster than the more exotic plans a join
can reach.

In an ideal world subqueries would be transformed into the equivalent join (or
some more general join structure that can cover both sets of semantics) and
then planned through the same code path. In an ideal world the user should be
guaranteed that equivalent queries would always result in the same plan
regardless of how they're written.

-- 
greg
