Обсуждение: Efficiency in Multiple Queries of Same Table in a PL/PgSQL Function
All... In the poly_example function below, I am emulating an actual requirement by querying the same table three (3) times in order to derive a solution to a problem. Is this the best or most efficient and effective way to implement this? The table (which consists of only five (5) FLOAT8 columns) I'm querying contains less than 50 rows.Thanks in advance for any insight or criticisms youoffer. CREATE OR REPLACE FUNCTION poly_example() RETURNS SETOF FLOAT8 AS $poly_example$ DECLARE term blah%ROWTYPE; sigma_l FLOAT8 := 0.0; sigma_b FLOAT8 := 0.0; sigma_r FLOAT8 := 0.0; BEGIN FOR term in SELECT * FROM blah LOOP sigma_l := sigma_l + (RANDOM() * 100) * (term.i * term.i) + RANDOM() * (term.j * term.j) + term.k; END LOOP; FOR term in SELECT * FROM blah LOOP sigma_b := sigma_b + (RANDOM() * 53) * (term.i * term.i) + (RANDOM()* 5) * (term.j * term.j) + term.k; END LOOP; FOR term in SELECT * FROM blah LOOP sigma_r := sigma_r + 96.232234 * (term.i * term.i) + 0.32322325* (term.j * term.j) + term.k; END LOOP; RETURN NEXT sigma_l + sigma_b + sigma_r; END; $poly_example$ LANGUAGE plpgsql STRICT IMMUTABLE; -- Gary Chambers /* Nothing fancy and nothing Microsoft! */
I don't see anything in the assignment statements (sigma_* :=) which would prevent one from doing all three of them within a single for loop. In fact, written as is there's some chance the values of the sigma_*s might change between repeated calls to the function since there is no explicit ordering of the rows returned from table blah. Putting all the assignments into a single select from blah would at least say that the sigma values are from the same dataset per run. As to efficiency in general, I would expect the entire table (~50 rows) would be entirely in memory after the first select, but you plan triples the time in the loop. This expense would likely only be noticeable if the function itself is called /lots/. Gary Chambers wrote: > All... > > In the poly_example function below, I am emulating an actual > requirement by querying the same table three (3) times in order to > derive a solution to a problem. Is this the best or most efficient > and effective way to implement this? The table (which consists of > only five (5) FLOAT8 columns) I'm querying contains less than 50 rows. > Thanks in advance for any insight or criticisms you offer. > > CREATE OR REPLACE FUNCTION poly_example() RETURNS SETOF FLOAT8 AS > $poly_example$ > DECLARE > term blah%ROWTYPE; > sigma_l FLOAT8 := 0.0; > sigma_b FLOAT8 := 0.0; > sigma_r FLOAT8 := 0.0; > > BEGIN > FOR term in SELECT * FROM blah LOOP > sigma_l := sigma_l + (RANDOM() * 100) * (term.i * term.i) + > RANDOM() * (term.j * term.j) + term.k; > END LOOP; > > FOR term in SELECT * FROM blah LOOP > sigma_b := sigma_b + (RANDOM() * 53) * (term.i * term.i) + > (RANDOM() * 5) * (term.j * term.j) + term.k; > END LOOP; > > FOR term in SELECT * FROM blah LOOP > sigma_r := sigma_r + 96.232234 * (term.i * term.i) + > 0.32322325 * (term.j * term.j) + term.k; > END LOOP; > > RETURN NEXT sigma_l + sigma_b + sigma_r; > END; > $poly_example$ LANGUAGE plpgsql STRICT IMMUTABLE; > > -- Gary Chambers > > /* Nothing fancy and nothing Microsoft! */ > >
Re: Efficiency in Multiple Queries of Same Table in a PL/PgSQL Function
От
matthias schoeneich
Дата:
Hi, as you don't seem to need the sigma_*'s, you could calc the whole result with one query using: CREATE OR REPLACE FUNCTION poly_example2() RETURNS SETOF FLOAT8 AS $poly_example$ DECLARE f_result FLOAT8 := 0.0; i_rowcount INT := 0 ; BEGIN SELECT sum((RANDOM() * 100 ) * (term.i * term.i) + RANDOM() * (term.j * term.j) + term.k) + sum((RANDOM() * 53 ) * (term.i * term.i) +(RANDOM()* 5) * (term.j * term.j) + term.k) + sum( 96.232234 * (term.i * term.i) + 0.32322325 * (term.j * term.j) + term.k) , count(*) INTO f_result , i_rowcount FROM blah AS term; IF i_rowcount > 0 THEN RETURN NEXT f_result; ELSE RETURN NEXT 0; END IF; END; $poly_example$ LANGUAGE plpgsql STRICT IMMUTABLE; I've just put it in your plpgsql body to handle the case where table blah contains no rows. Matthias Rob Sargent schrieb: > I don't see anything in the assignment statements (sigma_* :=) which > would prevent one from doing all three of them within a single for > loop. In fact, written as is there's some chance the values of the > sigma_*s might change between repeated calls to the function since there > is no explicit ordering of the rows returned from table blah. Putting > all the assignments into a single select from blah would at least say > that the sigma values are from the same dataset per run. > > > As to efficiency in general, I would expect the entire table (~50 rows) > would be entirely in memory after the first select, but you plan triples > the time in the loop. This expense would likely only be noticeable if > the function itself is called /lots/. > > Gary Chambers wrote: >> All... >> >> In the poly_example function below, I am emulating an actual >> requirement by querying the same table three (3) times in order to >> derive a solution to a problem. Is this the best or most efficient >> and effective way to implement this? The table (which consists of >> only five (5) FLOAT8 columns) I'm querying contains less than 50 rows. >> Thanks in advance for any insight or criticisms you offer. >> >> CREATE OR REPLACE FUNCTION poly_example() RETURNS SETOF FLOAT8 AS >> $poly_example$ >> DECLARE >> term blah%ROWTYPE; >> sigma_l FLOAT8 := 0.0; >> sigma_b FLOAT8 := 0.0; >> sigma_r FLOAT8 := 0.0; >> >> BEGIN >> FOR term in SELECT * FROM blah LOOP >> sigma_l := sigma_l + (RANDOM() * 100) * (term.i * term.i) + >> RANDOM() * (term.j * term.j) + term.k; >> END LOOP; >> >> FOR term in SELECT * FROM blah LOOP >> sigma_b := sigma_b + (RANDOM() * 53) * (term.i * term.i) + >> (RANDOM() * 5) * (term.j * term.j) + term.k; >> END LOOP; >> >> FOR term in SELECT * FROM blah LOOP >> sigma_r := sigma_r + 96.232234 * (term.i * term.i) + >> 0.32322325 * (term.j * term.j) + term.k; >> END LOOP; >> >> RETURN NEXT sigma_l + sigma_b + sigma_r; >> END; >> $poly_example$ LANGUAGE plpgsql STRICT IMMUTABLE; >> >> -- Gary Chambers >> >> /* Nothing fancy and nothing Microsoft! */ >> >> > >
But if I read the OP correctly the sigma are in fact used additively in each row in blah. "sigma_* = sigma_* +" matthias schoeneich wrote: > Hi, > > as you don't seem to need the sigma_*'s, you could calc the whole > result with one query using: > > CREATE OR REPLACE FUNCTION poly_example2() RETURNS SETOF FLOAT8 AS > $poly_example$ > DECLARE > f_result FLOAT8 := 0.0; > i_rowcount INT := 0 ; > > BEGIN > SELECT sum((RANDOM() * 100 ) * (term.i * term.i) + RANDOM() * > (term.j * term.j) + term.k) + > sum((RANDOM() * 53 ) * (term.i * term.i) +(RANDOM()* 5) * > (term.j * term.j) + term.k) + > sum( 96.232234 * (term.i * term.i) + 0.32322325 * > (term.j * term.j) + term.k) , > count(*) > INTO f_result , > i_rowcount > FROM blah AS term; > > IF i_rowcount > 0 THEN > RETURN NEXT f_result; > ELSE > RETURN NEXT 0; > END IF; > END; > $poly_example$ LANGUAGE plpgsql STRICT IMMUTABLE; > > I've just put it in your plpgsql body to handle the case where table > blah contains no rows. > > Matthias > > Rob Sargent schrieb: >> I don't see anything in the assignment statements (sigma_* :=) which >> would prevent one from doing all three of them within a single for >> loop. In fact, written as is there's some chance the values of the >> sigma_*s might change between repeated calls to the function since >> there is no explicit ordering of the rows returned from table blah. >> Putting all the assignments into a single select from blah would at >> least say that the sigma values are from the same dataset per run. >> >> >> As to efficiency in general, I would expect the entire table (~50 >> rows) would be entirely in memory after the first select, but you >> plan triples the time in the loop. This expense would likely only be >> noticeable if the function itself is called /lots/. >> >> Gary Chambers wrote: >>> All... >>> >>> In the poly_example function below, I am emulating an actual >>> requirement by querying the same table three (3) times in order to >>> derive a solution to a problem. Is this the best or most efficient >>> and effective way to implement this? The table (which consists of >>> only five (5) FLOAT8 columns) I'm querying contains less than 50 rows. >>> Thanks in advance for any insight or criticisms you offer. >>> >>> CREATE OR REPLACE FUNCTION poly_example() RETURNS SETOF FLOAT8 AS >>> $poly_example$ >>> DECLARE >>> term blah%ROWTYPE; >>> sigma_l FLOAT8 := 0.0; >>> sigma_b FLOAT8 := 0.0; >>> sigma_r FLOAT8 := 0.0; >>> >>> BEGIN >>> FOR term in SELECT * FROM blah LOOP >>> sigma_l := sigma_l + (RANDOM() * 100) * (term.i * term.i) + >>> RANDOM() * (term.j * term.j) + term.k; >>> END LOOP; >>> >>> FOR term in SELECT * FROM blah LOOP >>> sigma_b := sigma_b + (RANDOM() * 53) * (term.i * term.i) + >>> (RANDOM() * 5) * (term.j * term.j) + term.k; >>> END LOOP; >>> >>> FOR term in SELECT * FROM blah LOOP >>> sigma_r := sigma_r + 96.232234 * (term.i * term.i) + >>> 0.32322325 * (term.j * term.j) + term.k; >>> END LOOP; >>> >>> RETURN NEXT sigma_l + sigma_b + sigma_r; >>> END; >>> $poly_example$ LANGUAGE plpgsql STRICT IMMUTABLE; >>> >>> -- Gary Chambers >>> >>> /* Nothing fancy and nothing Microsoft! */ >>> >>> >> >> >
> But if I read the OP correctly the sigma are in fact used additively in each > row in blah. "sigma_* = sigma_* +" I apologize, but I omitted a CASE statement prior to each calculation of the values. The coefficients for each calculation change depending upon which case is valid. I could probably rewrite it using arrays and may, in fact, do so to eliminate the two added queries. Thank you for your input. -- Gary Chambers /* Nothing fancy and nothing Microsoft! */