Обсуждение: RE: [SQL] Update problem I cannot work out

Поиск
Список
Период
Сортировка

RE: [SQL] Update problem I cannot work out

От
Michael J Davis
Дата:
Try something like:
Create function new_function(value_data_type)    'Select sum(t1.var1) from t1 where t1.var2 = $1'  LANGUAGE
'sql';
update t2 set var1=t2.var1+new_function(t2.var2);

-----Original Message-----From:    M Simms [SMTP:grim@argh.demon.co.uk]Sent:    Saturday, May 01, 1999 8:26 PMTo:
pgsql-sql@postgreSQL.orgSubject:   [SQL] Update problem I cannot work out
 
I cannot work out how to do the following query
testdb=> select * from t1;var1|var2----+----   1|   2   2|   2   2|   3(3 rows)
testdb=> select * from t2;var1|var2----+----   5|   2   9|   3   4|   4(3 rows)
I need to add to the value of each var1 in t2 the sum of all var1's
in t1that have the same value corresponding var2's
testdb=> update t2 set var1=t2.var1+sum(t1.var1) where
t1.var2=t2.var2;ERROR:  parser: illegal use of aggregates or non-group column in
target list
I imagine that this is failing because it isnt sure which values tosum(), but I cannot for the life of me figure out
howto solve this.
 
My second idea was to do
testdb=> update t2 set var1=t2.var1+t1.var1 where t2.var2=t1.var1;NOTICE:  Non-functional update, only first update is
performedUPDATE1
 
This will only, obviously, add one of the values I need to add.
The values I need to end up with, just so you know you are solving
theright problem, are:
testdb=> select * from t1;var1|var2----+----   1|   2   2|   2   2|   3(3 rows)
testdb=> select * from t2;var1|var2----+----   8|   2  11|   3   4|   4(3 rows)
Any help on this would be appreciated. I am a bit stuck here, and I
amsure I am just missing something blindingly obvious.I am loathe to use a cursor to do this, because my actual dataset
isupwardsof 10,000 records, and I do not wish to do 10,000 differentqueries.
 


Re: [SQL] Update problem I cannot work out

От
M Simms
Дата:
>
>     Try something like:
>
>     Create function new_function(value_data_type)
>         'Select sum(t1.var1) from t1 where t1.var2 = $1'  LANGUAGE
> 'sql';
>
>     update t2 set var1=t2.var1+new_function(t2.var2);
>

*snip*

Hi, thanks for replying.

My reply to this is the question, that as each function contains a
select, is this technically a single pass.

I get the feeling there will be tremendous overhead on this theory with a
large dataset, in comparison to a single query that does the
update. Can anyone comment on how much slower this could be (although
obviously if there is no way to do it in a single query then there is
no choice)?

Thanks

                        M Simms

Re: [SQL] Update problem I cannot work out

От
Tom Lane
Дата:
M Simms <grim@argh.demon.co.uk> writes:
>> Create function new_function(value_data_type)
>> 'Select sum(t1.var1) from t1 where t1.var2 = $1'  LANGUAGE
>> 'sql';
>> 
>> update t2 set var1=t2.var1+new_function(t2.var2);

> My reply to this is the question, that as each function contains a
> select, is this technically a single pass.

I think you are right: this way will repeat the sub-select for each
tuple in t2.  Now if you have an index on t1.var2, the system could
exploit it to pull out just the desired tuples in each sub-select, so
the total work wouldn't really be O(M*N).  But it still looks slow.

If I knew t1 would be large, I think I'd do it with a temporary table:
SELECT var2, sum(var1) FROM t1 GROUP BY var2 INTO temptable;UPDATE t2 SET var1 = t2.var1 + temptable.sum    WHERE
t2.var2= temptable.var2;DROP TABLE temptable;
 

(In 6.5 you can use a temporary table to avoid needing explicit DROP,
but that's just a convenience.)  Perhaps there's a way to accomplish
the same thing with a subselect, rather than explicitly creating a
table, but I don't know how.

This is more like O(N log N) in the size of each table, assuming that
the join is done with mergejoin.  Actually, if there's not a huge
number of distinct values of var2, then temptable should be small enough
for a hashjoin, which'd mean that there's no need to sort t2 at all.

BTW, I think your original try

update t2 set var1=t2.var1+t1.var1 where t2.var2=t1.var1;

should have worked; it's probably a bug that it doesn't work.  IMHO
the semantics ought to be that the addition is repeated for each t1
tuple that matches the current t2 tuple ... but I haven't read the
SQL spec.
        regards, tom lane