Обсуждение: RE: [SQL] Update problem I cannot work out
Try something like: Create function new_function(value_data_type) 'Select sum(t1.var1) from t1 where t1.var2 = $1' LANGUAGE 'sql'; update t2 set var1=t2.var1+new_function(t2.var2); -----Original Message-----From: M Simms [SMTP:grim@argh.demon.co.uk]Sent: Saturday, May 01, 1999 8:26 PMTo: pgsql-sql@postgreSQL.orgSubject: [SQL] Update problem I cannot work out I cannot work out how to do the following query testdb=> select * from t1;var1|var2----+---- 1| 2 2| 2 2| 3(3 rows) testdb=> select * from t2;var1|var2----+---- 5| 2 9| 3 4| 4(3 rows) I need to add to the value of each var1 in t2 the sum of all var1's in t1that have the same value corresponding var2's testdb=> update t2 set var1=t2.var1+sum(t1.var1) where t1.var2=t2.var2;ERROR: parser: illegal use of aggregates or non-group column in target list I imagine that this is failing because it isnt sure which values tosum(), but I cannot for the life of me figure out howto solve this. My second idea was to do testdb=> update t2 set var1=t2.var1+t1.var1 where t2.var2=t1.var1;NOTICE: Non-functional update, only first update is performedUPDATE1 This will only, obviously, add one of the values I need to add. The values I need to end up with, just so you know you are solving theright problem, are: testdb=> select * from t1;var1|var2----+---- 1| 2 2| 2 2| 3(3 rows) testdb=> select * from t2;var1|var2----+---- 8| 2 11| 3 4| 4(3 rows) Any help on this would be appreciated. I am a bit stuck here, and I amsure I am just missing something blindingly obvious.I am loathe to use a cursor to do this, because my actual dataset isupwardsof 10,000 records, and I do not wish to do 10,000 differentqueries.
> > Try something like: > > Create function new_function(value_data_type) > 'Select sum(t1.var1) from t1 where t1.var2 = $1' LANGUAGE > 'sql'; > > update t2 set var1=t2.var1+new_function(t2.var2); > *snip* Hi, thanks for replying. My reply to this is the question, that as each function contains a select, is this technically a single pass. I get the feeling there will be tremendous overhead on this theory with a large dataset, in comparison to a single query that does the update. Can anyone comment on how much slower this could be (although obviously if there is no way to do it in a single query then there is no choice)? Thanks M Simms
M Simms <grim@argh.demon.co.uk> writes: >> Create function new_function(value_data_type) >> 'Select sum(t1.var1) from t1 where t1.var2 = $1' LANGUAGE >> 'sql'; >> >> update t2 set var1=t2.var1+new_function(t2.var2); > My reply to this is the question, that as each function contains a > select, is this technically a single pass. I think you are right: this way will repeat the sub-select for each tuple in t2. Now if you have an index on t1.var2, the system could exploit it to pull out just the desired tuples in each sub-select, so the total work wouldn't really be O(M*N). But it still looks slow. If I knew t1 would be large, I think I'd do it with a temporary table: SELECT var2, sum(var1) FROM t1 GROUP BY var2 INTO temptable;UPDATE t2 SET var1 = t2.var1 + temptable.sum WHERE t2.var2= temptable.var2;DROP TABLE temptable; (In 6.5 you can use a temporary table to avoid needing explicit DROP, but that's just a convenience.) Perhaps there's a way to accomplish the same thing with a subselect, rather than explicitly creating a table, but I don't know how. This is more like O(N log N) in the size of each table, assuming that the join is done with mergejoin. Actually, if there's not a huge number of distinct values of var2, then temptable should be small enough for a hashjoin, which'd mean that there's no need to sort t2 at all. BTW, I think your original try update t2 set var1=t2.var1+t1.var1 where t2.var2=t1.var1; should have worked; it's probably a bug that it doesn't work. IMHO the semantics ought to be that the addition is repeated for each t1 tuple that matches the current t2 tuple ... but I haven't read the SQL spec. regards, tom lane