Обсуждение: Feature proposal and discussion: full-fledged column/function equivalence

Поиск
Список
Период
Сортировка

Feature proposal and discussion: full-fledged column/function equivalence

От
Chris Travers
Дата:
Hi all;

I had a pleasant surprise today when demonstrating a previous misfeature in PostgreSQL behaved unexpectedly.  In further investigation, there is a really interesting syntax which is very helpful for some things I had not known about.

Consider the following:

CREATE TABLE keyvaltest (
    key text primary key,
    value text not null
);
INSERT INTO keyvaltest VALUES ('foo', 'bar'), ('fooprime', 'barprime');
SELECT value(k) from keyvaltest k;

The latter performs exactly like 

SELECT k.value from keyvaltest k;

So the column/function equivalent is there.  This is probably not the best for production SQL code just because it is non-standard, but it is great for theoretical examples because it shows the functional dependency between tuple and tuple member.

It gets better:

CREATE OR REPLACE FUNCTION value(test) returns int language sql as $$
select 3; $$;
ERROR:  "value" is  already an attribute of type test

So this further suggests that value(test) is effectively an implicit function of test (because it is a trivial functional dependency).

So with all this in mind, is there any reason why we can't or shouldn't allow:

CREATE testfunction(test) returns int language sql as $$ select 1; $$;
SELECT testfunction FROM test;

That would allow first-class calculated columns.

I assume the work is mostly at the parser/grammatical level.  Is there any reason why supporting that would be a bad idea?
--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.

Re: Feature proposal and discussion: full-fledged column/function equivalence

От
Vik Fearing
Дата:
On 08/01/2014 04:57 PM, Chris Travers wrote:
> Hi all;
>
> I had a pleasant surprise today when demonstrating a previous misfeature
> in PostgreSQL behaved unexpectedly.  In further investigation, there is
> a really interesting syntax which is very helpful for some things I had
> not known about.
>
> Consider the following:
>
> CREATE TABLE keyvaltest (
>     key text primary key,
>     value text not null
> );
> INSERT INTO keyvaltest VALUES ('foo', 'bar'), ('fooprime', 'barprime');
> SELECT value(k) from keyvaltest k;
>
> The latter performs exactly like
>
> SELECT k.value from keyvaltest k;

Interesting.  I wasn't aware of that.

> So the column/function equivalent is there.  This is probably not the
> best for production SQL code just because it is non-standard, but it is
> great for theoretical examples because it shows the functional
> dependency between tuple and tuple member.
>
> It gets better:
>
> CREATE OR REPLACE FUNCTION value(test) returns int language sql as $$
> select 3; $$;
> ERROR:  "value" is  already an attribute of type test
>
> So this further suggests that value(test) is effectively an implicit
> function of test (because it is a trivial functional dependency).
>
> So with all this in mind, is there any reason why we can't or shouldn't
> allow:
>
> CREATE testfunction(test) returns int language sql as $$ select 1; $$;
> SELECT testfunction FROM test;
>
> That would allow first-class calculated columns.
>
> I assume the work is mostly at the parser/grammatical level.  Is there
> any reason why supporting that would be a bad idea?

This is already supported since forever.

SELECT test.testfunction FROM test;

This link might be of interest to you:
http://momjian.us/main/blogs/pgblog/2013.html#April_10_2013
--
Vik


Re: Feature proposal and discussion: full-fledged column/function equivalence

От
Vik Fearing
Дата:
On 08/01/2014 06:28 PM, Vik Fearing wrote:
>> So with all this in mind, is there any reason why we can't or shouldn't
>> > allow:
>> >
>> > CREATE testfunction(test) returns int language sql as $$ select 1; $$;
>> > SELECT testfunction FROM test;
>> >
>> > That would allow first-class calculated columns.
>> >
>> > I assume the work is mostly at the parser/grammatical level.  Is there
>> > any reason why supporting that would be a bad idea?
> This is already supported since forever.
>
> SELECT test.testfunction FROM test;
>
> This link might be of interest to you:
> http://momjian.us/main/blogs/pgblog/2013.html#April_10_2013

Sorry, that's not the link I wanted.  This one is:
http://momjian.us/main/blogs/pgblog/2013.html#April_1_2013

Despite being posted on April 1st, it is not a joke. :)
--
Vik


Re: Feature proposal and discussion: full-fledged column/function equivalence

От
David G Johnston
Дата:
Vik Fearing wrote
>> CREATE testfunction(test) returns int language sql as $$ select 1; $$;
>> SELECT testfunction FROM test;
>>
>> That would allow first-class calculated columns.
>>
>> I assume the work is mostly at the parser/grammatical level.  Is there
>> any reason why supporting that would be a bad idea?
>
> This is already supported since forever.
>
> SELECT test.testfunction FROM test;

More to the point: if you are writing a multiple-relation query and have
"testfunction" functions defined for at least two of the relations used in
the query how would the system decide which one to use?

SELECT testfunction FROM test JOIN test_extended USING (test_id)

I guess you could allow for the non-ambiguous cases and error out otherwise
but that seems to be adding quite a bit of complexity for little gain.

David J.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Feature-proposal-and-discussion-full-fledged-column-function-equivalence-tp5813533p5813571.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Re: Feature proposal and discussion: full-fledged column/function equivalence

От
Chris Travers
Дата:



On Fri, Aug 1, 2014 at 12:19 PM, David G Johnston <david.g.johnston@gmail.com> wrote:
Vik Fearing wrote
>> CREATE testfunction(test) returns int language sql as $$ select 1; $$;
>> SELECT testfunction FROM test;
>>
>> That would allow first-class calculated columns.
>>
>> I assume the work is mostly at the parser/grammatical level.  Is there
>> any reason why supporting that would be a bad idea?
>
> This is already supported since forever.
>
> SELECT test.testfunction FROM test;

More to the point: if you are writing a multiple-relation query and have
"testfunction" functions defined for at least two of the relations used in
the query how would the system decide which one to use?

Same way you do it for columns.  Throw an error that it is ambiguous.
 

SELECT testfunction FROM test JOIN test_extended USING (test_id)

I guess you could allow for the non-ambiguous cases and error out otherwise
but that seems to be adding quite a bit of complexity for little gain.

Hmm.  As I see it, there is one possible backwards compatibility issue but it is almost certainly extraordinarily rare.

Suppose in your above example, test_extended has a testfunction attribute but test has a testfunction function.  In the current codebase, there is no parsing ambiguity (the attribute wins because the function is ignored), but we'd have to throw the same error as if the function were an attribute if we did this.

It doesn't seem terribly logically complicated to do this (since it is a slight extension to the lookup in the system catalogs), and I am having trouble imagining that there are many cases where these sorts of functions are added.

The larger question becomes:

Would it be more useful to have such functions in the select * result, or to treat them as hidden columns from that?  (I am thinking that can be decided down the road though if I go through and take this up on -hackers). 

David J.



--
View this message in context: http://postgresql.1045698.n5.nabble.com/Feature-proposal-and-discussion-full-fledged-column-function-equivalence-tp5813533p5813571.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.

Re: Re: Feature proposal and discussion: full-fledged column/function equivalence

От
David Johnston
Дата:
On Fri, Aug 1, 2014 at 6:22 PM, Chris Travers <chris.travers@gmail.com> wrote:
On Fri, Aug 1, 2014 at 12:19 PM, David G Johnston <david.g.johnston@gmail.com> wrote:
Vik Fearing wrote
>> CREATE testfunction(test) returns int language sql as $$ select 1; $$;
>> SELECT testfunction FROM test;
>>
>> That would allow first-class calculated columns.
>>
>> I assume the work is mostly at the parser/grammatical level.  Is there
>> any reason why supporting that would be a bad idea?
>
> This is already supported since forever.
>
> SELECT test.testfunction FROM test;

More to the point: if you are writing a multiple-relation query and have
"testfunction" functions defined for at least two of the relations used in
the query how would the system decide which one to use?

Same way you do it for columns.  Throw an error that it is ambiguous.
 

​I'd rather approach the first-class issue by being able to say:  ALTER TABLE test ADD COLUMN ​testfunction(test) -- maybe with an "AS col_alias"...

I do not have anything particularly against your proposal but neither do I find it an overwhelming improvement over "testfunction(test) and test.testfunction" - especially when I can encapsulate them behind a VIEW.
 

SELECT testfunction FROM test JOIN test_extended USING (test_id)

I guess you could allow for the non-ambiguous cases and error out otherwise
but that seems to be adding quite a bit of complexity for little gain.

Hmm.  As I see it, there is one possible backwards compatibility issue but it is almost certainly extraordinarily rare.

Suppose in your above example, test_extended has a testfunction attribute but test has a testfunction function.  In the current codebase, there is no parsing ambiguity (the attribute wins because the function is ignored), but we'd have to throw the same error as if the function were an attribute if we did this.

It doesn't seem terribly logically complicated to do this (since it is a slight extension to the lookup in the system catalogs), and I am having trouble imagining that there are many cases where these sorts of functions are added.

The larger question becomes:

Would it be more useful to have such functions in the select * result, or to treat them as hidden columns from that?  (I am thinking that can be decided down the road though if I go through and take this up on -hackers). 



If they truly are "first class" members of the table they should probably appear with " SELECT * "; otherwise, and this is simply semantics, you are simply adding yet another syntax to remember to invoke a function since the user will still have to know said function exists.  I read "first class" to mean that the fact the value is being derived from a function call is invisible to the user.  And this then points leads back to the idea of defining a generated column on the actual table or, in absence of that capability - live with the fact the updateable can accomplish many, if not all, of the same goals today.

David J.

Re: Re: Feature proposal and discussion: full-fledged column/function equivalence

От
Tom Lane
Дата:
David Johnston <david.g.johnston@gmail.com> writes:
> On Fri, Aug 1, 2014 at 6:22 PM, Chris Travers <chris.travers@gmail.com>
> wrote:
>> On Fri, Aug 1, 2014 at 12:19 PM, David G Johnston <
>> david.g.johnston@gmail.com> wrote:
>>> More to the point: if you are writing a multiple-relation query and have
>>> "testfunction" functions defined for at least two of the relations used in
>>> the query how would the system decide which one to use?

>> Same way you do it for columns.  Throw an error that it is ambiguous.

> ​I'd rather approach the first-class issue by being able to say:  ALTER
> TABLE test ADD COLUMN ​testfunction(test) -- maybe with an "AS col_alias"...

The real reason not to do this is that there is already a SQL standard
feature for computed columns (they're called generated columns IIRC).
We don't need to, and shouldn't, invent nonstandard syntax for that.

We inherited the notion that "a.b is equivalent to b(a)" from PostQUEL;
it's nowhere to be seen in SQL.  While I don't feel a need to incur the
backwards compatibility hit of taking that out, I also don't feel a need
to extend it, especially not in directions that risk breaking existing
applications.

            regards, tom lane