Обсуждение: Using expression names in subsequent calculation

Поиск
Список
Период
Сортировка

Using expression names in subsequent calculation

От
Tony Theodore
Дата:
Hello,

I'm migrating an application from Access, and having great fun so far,
but I'm running into problems with some queries. What I'm trying to do
boils down to this:

SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate, qty * price AS val,
val * tax_rate AS tax_amount, val + tax_amount as total

but get a "column qty does not exist" error.

In a real query - qty, price and tax_rate would come from actual
columns, but then the error is on the "val" column. I'd like to use
the name of the expression rather than repeat the calculation.

Near the bottom of the SELECT docs, there's a note about namespaces
that explains why this won't work, but I'm wondering how to go about
this type of query.

So far I've come up with:

WITH order_details AS (SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate),
order_values AS (SELECT qty * price AS val from order_details),
order_taxes AS (SELECT val * tax_rate AS tax_amount FROM
order_details, order_values)
SELECT qty, price, tax_rate, val, tax_amount, val + tax_amount as
total FROM order_details, order_values, order_taxes

However, that looks at lot more complicated than simply repeating the
calculations ;)

What's the best way go about calculations like this that build upon
previous results?

Thanks,

Tony

Re: Using expression names in subsequent calculation

От
Tom Lane
Дата:
Tony Theodore <tony.theodore@gmail.com> writes:
> I'm migrating an application from Access, and having great fun so far,
> but I'm running into problems with some queries. What I'm trying to do
> boils down to this:

> SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate, qty * price AS val,
> val * tax_rate AS tax_amount, val + tax_amount as total

> but get a "column qty does not exist" error.

Yeah, this is entirely contrary to the SQL standard.  In the standard,
the columns of the result are notionally computed in parallel, so there
is no way for one to refer to another.  MS hasn't done the world any
favors by inventing this incompatible extension.

> What's the best way go about calculations like this that build upon
> previous results?

What you need to do is use a sub-select to create columns that can be
referenced at the next level.  The particular example you're showing
here requires multiple levels of sub-select because you're chaining
the operations.  Something like

SELECT *, val + tax_amount as total
FROM
  (SELECT *, val * tax_rate AS tax_amount
   FROM
     (SELECT *, qty * price AS val
      FROM
        (SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate) as ss1
     ) as ss2
  ) as ss3;

Keep in mind that the Postgres planner will typically flatten
sub-selects used this way into a single level of plan, with the same
result as if you hadn't chained the calculations but just expanded all
the expressions into their primitive constitutents by hand.  You can see
that in this slightly less silly version of your example:

regression=# create table sales (qty int, price numeric, tax_rate numeric);
CREATE TABLE
regression=# explain verbose SELECT *, val + tax_amount as total
FROM
(SELECT *, val * tax_rate AS tax_amount
FROM
(SELECT *,  qty * price AS val
FROM
sales
) as ss2
) as ss3;

                                    QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on public.sales  (cost=0.00..41.12 rows=830 width=68)
   Output: sales.qty, sales.price, sales.tax_rate, ((sales.qty)::numeric * sales.price), (((sales.qty)::numeric *
sales.price)* sales.tax_rate), (((sales.qty)::numeric * sales.price) + (((sales.qty)::numeric * sales.price) *
sales.tax_rate))
(2 rows)

In this particular example that's probably just fine, because the
individual calculations are cheap enough that repeating them probably
beats incurring the overhead of multiple run-time plan levels.
But if you were trying to use a structure like this to avoid multiple
evaluations of a very expensive function, you'd want to stick an
"OFFSET 0" into the sub-select level that had the expensive function,
so as to create an optimization fence.

            regards, tom lane

Re: Using expression names in subsequent calculation

От
Tony Theodore
Дата:
On 12 September 2011 02:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Tony Theodore <tony.theodore@gmail.com> writes:
>> I'm migrating an application from Access, and having great fun so far,
>> but I'm running into problems with some queries. What I'm trying to do
>> boils down to this:
>
>> SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate, qty * price AS val,
>> val * tax_rate AS tax_amount, val + tax_amount as total
>
>> but get a "column qty does not exist" error.
>
> Yeah, this is entirely contrary to the SQL standard.  In the standard,
> the columns of the result are notionally computed in parallel, so there
> is no way for one to refer to another.  MS hasn't done the world any
> favors by inventing this incompatible extension.
>
>> What's the best way go about calculations like this that build upon
>> previous results?
>
> What you need to do is use a sub-select to create columns that can be
> referenced at the next level.  The particular example you're showing
> here requires multiple levels of sub-select because you're chaining
> the operations.  Something like
>
> SELECT *, val + tax_amount as total
> FROM
>  (SELECT *, val * tax_rate AS tax_amount
>   FROM
>     (SELECT *, qty * price AS val
>      FROM
>        (SELECT 10 AS qty, 0.5 AS price, 0.1 AS tax_rate) as ss1
>     ) as ss2
>  ) as ss3;

Excellent, thanks for that, I didn't think that was possible and the
WITH statement was the only way to do such things.


> Keep in mind that the Postgres planner will typically flatten
> sub-selects used this way into a single level of plan, with the same
> result as if you hadn't chained the calculations but just expanded all
> the expressions into their primitive constitutents by hand.  You can see
> that in this slightly less silly version of your example:
>
> regression=# create table sales (qty int, price numeric, tax_rate numeric);
> CREATE TABLE
> regression=# explain verbose SELECT *, val + tax_amount as total
> FROM
> (SELECT *, val * tax_rate AS tax_amount
> FROM
> (SELECT *,  qty * price AS val
> FROM
> sales
> ) as ss2
> ) as ss3;
>
>                                    QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on public.sales  (cost=0.00..41.12 rows=830 width=68)
>   Output: sales.qty, sales.price, sales.tax_rate, ((sales.qty)::numeric * sales.price), (((sales.qty)::numeric *
sales.price)* sales.tax_rate), (((sales.qty)::numeric * sales.price) + (((sales.qty)::numeric * sales.price) *
sales.tax_rate))
> (2 rows)
>
> In this particular example that's probably just fine, because the
> individual calculations are cheap enough that repeating them probably
> beats incurring the overhead of multiple run-time plan levels.
> But if you were trying to use a structure like this to avoid multiple
> evaluations of a very expensive function, you'd want to stick an
> "OFFSET 0" into the sub-select level that had the expensive function,
> so as to create an optimization fence.
>
>                        regards, tom lane
>

Okay, I'll do some reading up on that.

Thanks again,

Tony