Обсуждение: functions in WHERE clause

Поиск
Список
Период
Сортировка

functions in WHERE clause

От
sramsay@uga.edu
Дата:
Hi All,

I've got one of these:

SELECT * from some_table WHERE
test_for_equality_is_syntactically_ugly;

What I'd like to do is encapsulate the WHERE clause in a function,
but I'm having no end of trouble.

The WHERE clause expects the function to return a boolean value.  I
can certainly return a boolean value from a function, but here it
seems to me that what the function really has to do is return a 
set of boolean values -- the test in the WHERE clause sometimes
evaluates to true and sometimes evaluates to false, and that is in
turn used to constrain the query results.   But you can't return a
set of anything (I don't think) in a WHERE clause, because it seems
to want a singular boolean value.

Is it possible to do what I'm trying to do?  I've written a few
simple sql and pl/pgsql functions over the years, but I'm no expert.

Perhaps I'm barking up the wrong tree here.  It seems like what I
really need is a way to have some kind of rewrite rule (e.g.  when
you see "foo(x)" substitute ugly string).  Sort of like a Lisp macro
;)

Steve

-- 
Stephen Ramsay
Assistant Professor
Department of English
University of Georgia
email: sramsay@uga.edu
web: http://cantor.english.uga.edu/
PGP Public Key ID: 0xA38D7B11


Re: functions in WHERE clause

От
Michael Fuhr
Дата:
On Sun, Mar 05, 2006 at 10:16:52AM -0500, sramsay@uga.edu wrote:
> I've got one of these:
> 
> SELECT * from some_table WHERE
> test_for_equality_is_syntactically_ugly;
> 
> What I'd like to do is encapsulate the WHERE clause in a function,
> but I'm having no end of trouble.

Would a view work?  If not then please provide a more concrete
example that shows what you're trying to do.

CREATE VIEW foo AS
SELECT * FROM some_table
WHERE test_for_equality_is_syntactically_ugly;

-- 
Michael Fuhr


Re: functions in WHERE clause

От
sramsay@uga.edu
Дата:
On Sun, Mar 05, 2006 at 10:26:35AM -0700, Michael Fuhr wrote:
> On Sun, Mar 05, 2006 at 10:16:52AM -0500, sramsay@uga.edu wrote:
> > I've got one of these:
> > 
> > SELECT * from some_table WHERE
> > test_for_equality_is_syntactically_ugly;
> > 
> > What I'd like to do is encapsulate the WHERE clause in a function,
> > but I'm having no end of trouble.
> 
> Would a view work?  If not then please provide a more concrete
> example that shows what you're trying to do.
> 
> CREATE VIEW foo AS
> SELECT * FROM some_table
> WHERE test_for_equality_is_syntactically_ugly;

Thanks for reply!

No, because test_for_equality_is_syntactically_ugly varies.

Here's the specifics:

I've written an XML database (of sorts) that uses the ltree contrib
module to find elements that belong to particular nodes.  Right now,
if you wanted to find all the word tokens found within the
titleStmt element of a set of XML documents, you'd do something like this:

SELECT token FROM event WHERE ltree ~ '*.titleStmt.*';

(ltree contains a set of "paths" that together express the
hierarchical relationships in the documents, so you can format that
bit after the ~ to include any element, and you can also use other
kinds of operators to get ancestors, children, etc.).

Now, I realize this isn't *really* all that syntactically ugly, but
it would be really nice if the user of the db could type in an XPath
like so:

SELECT token FROM event WHERE xpath("//titleStmt");

This will require some munging inside a function to go from the
XPath to the ltree-style regex expression, but I'm finding that I'm
having more basic troubles.

You can't do this:

CREATE FUNCTION xpath(lquery) RETURNS ltree AS $$ SELECT ltree FROM event WHERE ltree ~ $1;
$$ LANGUAGE SQL;

Because that only returns the first value from the SELECT (and it's
not pl-pgsql anyway).

But I also can't get this kind of thing to work:

CREATE FUNCTION xpath(lquery) RETURNS SETOF ltree AS $$
DECLARE   tree record;
BEGINFOR tree IN SELECT ltree FROM event WHERE ltree ~ $1 LOOP    RETURN NEXT tree;END LOOP;RETURN;
END
$$ LANGUAGE plpgsql;

Because SETOF won't work in a WHERE context.

Part of the problem is that I don't really understand how WHERE
constraints work under the hood.  The lvalue in the WHERE clause
expects a boolean, but I think that really means "evaluate the truth
or falsehood of this expression on all rows in the table specified
in the FROM clause." So how do you get a function to play nicely
with that concept?  "The argument of WHERE must not return a set,"
as the psql shell keeps telling me, but is there anything it would
accept that would make this work?

Thanks again,

Steve

-- 
Stephen Ramsay
Assistant Professor
Department of English
University of Georgia
email: sramsay@uga.edu
web: http://cantor.english.uga.edu/
PGP Public Key ID: 0xA38D7B11


Re: functions in WHERE clause

От
Tom Lane
Дата:
sramsay@uga.edu writes:
> I've got one of these:
> SELECT * from some_table WHERE
> test_for_equality_is_syntactically_ugly;
> What I'd like to do is encapsulate the WHERE clause in a function,

You mean like replacing

SELECT * from some_table WHERE x = 42 AND y = 77

with

create function mytest(int,int) returns bool as
$$select $1 = 42 AND $2 = 77$$ language sql;

SELECT * from some_table WHERE mytest(x,y);

A SQL-language function like this should get inlined into the query,
so that you don't lose any performance compared to writing out the
full expression each time.
        regards, tom lane


Re: functions in WHERE clause

От
Tom Lane
Дата:
sramsay@uga.edu writes:
> You can't do this:

> CREATE FUNCTION xpath(lquery) RETURNS ltree AS $$
>   SELECT ltree FROM event WHERE ltree ~ $1;
> $$ LANGUAGE SQL;

That would work fine if you said RETURNS SETOF ltree.

> But I also can't get this kind of thing to work:

> CREATE FUNCTION xpath(lquery) RETURNS SETOF ltree AS $$
> DECLARE
>     tree record;
> BEGIN
>     FOR tree IN SELECT ltree FROM event WHERE ltree ~ $1 LOOP
>         RETURN NEXT tree;
>     END LOOP;
>     RETURN;
> END
> $$ LANGUAGE plpgsql;

That should work too, except that you are trying to return a record
not an ltree value.  Try "RETURN NEXT tree.ltree".

> Because SETOF won't work in a WHERE context.

Possibly you need to read the error messages you are getting more
closely, because I'm pretty sure whatever it said had nothing to
do with either SETOF or WHERE ...
        regards, tom lane


Re: functions in WHERE clause

От
Дата:
On Sun, Mar 05, 2006 at 01:16:40PM -0500, Tom Lane wrote:
> sramsay@uga.edu writes:
> That would work fine if you said RETURNS SETOF ltree.
> 
> That should work too, except that you are trying to return a record
> not an ltree value.  Try "RETURN NEXT tree.ltree".
> 
> > Because SETOF won't work in a WHERE context.
> 
> Possibly you need to read the error messages you are getting more
> closely, because I'm pretty sure whatever it said had nothing to
> do with either SETOF or WHERE ...

I think it does, actually.  I can write functions that return
ltrees, records, or sets of ltree, and they'll work in any part of
the query -- except the WHERE clause.  If the function returns
anything other than a bool, it complains that the function must
return a bool.

Which makes sense to me, actually, because the "result" of something
like:

x = 42 and y = 77

(to quote your earlier example) should be true or false -- not a set
of rows or records or types.  At least I think.

But let's return to your example for a moment, because it really
does look like the kind of "rewrite" rule that I want here.

You suggested replacing:

SELECT * from some_table WHERE x = 42 AND y = 77

with

create function mytest(int,int) returns bool as
$$select $1 = 42 AND $2 = 77$$ language sql;

So you could then do:

SELECT * from some_table WHERE mytest(x,y);

But imagine instead that this function is more generic.  You know
that you're trying to get something that's equal to x and equal to
y, but you don't know (until the function is called) what those
rvalues should be.  In other words, it's the 42 and the 47 that you
don't know until runtime -- you always know what columns your
searching on.

When I try do something along the lines of what you're doing, I get:

CREATE OR REPLACE FUNCTION xpath(lquery) RETURNS bool AS $$SELECT ltree ~ $1;
$$ LANGUAGE SQL;

But I can't even load that function, because it says:

ERROR:  column "ltree" does not exist

And round and round I go . . .

Thanks for the reply,

Steve

-- 
Stephen Ramsay
Assistant Professor
Department of English
University of Georgia
email: sramsay@uga.edu
web: http://cantor.english.uga.edu/
PGP Public Key ID: 0xA38D7B11


Re: functions in WHERE clause

От
Stephan Szabo
Дата:
On Sun, 5 Mar 2006 sramsay@uga.edu wrote:

> On Sun, Mar 05, 2006 at 01:16:40PM -0500, Tom Lane wrote:
> > sramsay@uga.edu writes:
> > That would work fine if you said RETURNS SETOF ltree.
> >
> > That should work too, except that you are trying to return a record
> > not an ltree value.  Try "RETURN NEXT tree.ltree".
> >
> > > Because SETOF won't work in a WHERE context.
> >
> > Possibly you need to read the error messages you are getting more
> > closely, because I'm pretty sure whatever it said had nothing to
> > do with either SETOF or WHERE ...
>
> I think it does, actually.  I can write functions that return
> ltrees, records, or sets of ltree, and they'll work in any part of
> the query -- except the WHERE clause.  If the function returns
> anything other than a bool, it complains that the function must
> return a bool.
>
> Which makes sense to me, actually, because the "result" of something
> like:
>
> x = 42 and y = 77
>
> (to quote your earlier example) should be true or false -- not a set
> of rows or records or types.  At least I think.
>
> But let's return to your example for a moment, because it really
> does look like the kind of "rewrite" rule that I want here.
>
> You suggested replacing:
>
> SELECT * from some_table WHERE x = 42 AND y = 77
>
> with
>
> create function mytest(int,int) returns bool as
> $$select $1 = 42 AND $2 = 77$$ language sql;
>
> So you could then do:
>
> SELECT * from some_table WHERE mytest(x,y);
>
> But imagine instead that this function is more generic.  You know
> that you're trying to get something that's equal to x and equal to
> y, but you don't know (until the function is called) what those
> rvalues should be.  In other words, it's the 42 and the 47 that you
> don't know until runtime -- you always know what columns your
> searching on.

Then you need to pass those in as well as something representing the row
that's being tested -- where clauses are filters on rows. The whole row
representation might be better than columns for some cases.

For example:

create table tt1(a int, b int);
create function f1(tt1, int) returns bool as 'select $1.a = $2' language
'sql';

select * from tt1 where f1(tt1, 1);

---

The other option is to do this as a set returning function in the first
place rather than trying to do a wierd where clause thing.

create function f2(int) returns setof tt1 as 'select * from tt1 where a =
$1' language 'sql';
select * from f2(1);



Re: functions in WHERE clause

От
sramsay@uga.edu
Дата:
On Sun, Mar 05, 2006 at 11:38:46AM -0800, Stephan Szabo wrote:
> > But imagine instead that this function is more generic.  You know
> > that you're trying to get something that's equal to x and equal to
> > y, but you don't know (until the function is called) what those
> > rvalues should be.  In other words, it's the 42 and the 47 that you
> > don't know until runtime -- you always know what columns your
> > searching on.
> 
> Then you need to pass those in as well as something representing the row
> that's being tested -- where clauses are filters on rows. The whole row
> representation might be better than columns for some cases.
> 
> For example:
> 
> create table tt1(a int, b int);
> create function f1(tt1, int) returns bool as 'select $1.a = $2' language
> 'sql';
> 
> select * from tt1 where f1(tt1, 1);

Eureka!  That does it:

CREATE OR REPLACE FUNCTION xpath(event, lquery) RETURNS bool AS $$   SELECT $1.ltree ~ $2;
$$ LANGUAGE SQL;

And it's fast (the event table is a view)!

Interestingly, this version is very slow:

CREATE OR REPLACE FUNCTION xpath(event, lquery) RETURNS bool AS
$$
BEGIN    RETURN $1.ltree ~ $2;
END
$$ LANGUAGE plpgsql STABLE;

EXPLAIN shows lots of sequential scans when you try to do it this
way.  No inlining, perhaps?

I guess I am still a little confused as to why you have to pass the
table in as a parameter -- why you can't just do:

RETURN event.ltree ~ $2;

But I'll live . . .

Thanks to one and all for the very generous assistance.  I have
learned much.

Steve

> The other option is to do this as a set returning function in the first
> place rather than trying to do a wierd where clause thing.

Mostly for syntactic clarity on the caller's end, but I understand what
you're saying.

-- 
Stephen Ramsay
Assistant Professor
Department of English
University of Georgia
email: sramsay@uga.edu
web: http://cantor.english.uga.edu/
PGP Public Key ID: 0xA38D7B11


Re: functions in WHERE clause

От
Stephan Szabo
Дата:
On Sun, 5 Mar 2006 sramsay@uga.edu wrote:

> On Sun, Mar 05, 2006 at 11:38:46AM -0800, Stephan Szabo wrote:
> > > But imagine instead that this function is more generic.  You know
> > > that you're trying to get something that's equal to x and equal to
> > > y, but you don't know (until the function is called) what those
> > > rvalues should be.  In other words, it's the 42 and the 47 that you
> > > don't know until runtime -- you always know what columns your
> > > searching on.
> >
> > Then you need to pass those in as well as something representing the row
> > that's being tested -- where clauses are filters on rows. The whole row
> > representation might be better than columns for some cases.
> >
> > For example:
> >
> > create table tt1(a int, b int);
> > create function f1(tt1, int) returns bool as 'select $1.a = $2' language
> > 'sql';
> >
> > select * from tt1 where f1(tt1, 1);
>
> Eureka!  That does it:
>
> CREATE OR REPLACE FUNCTION xpath(event, lquery) RETURNS bool AS $$
>     SELECT $1.ltree ~ $2;
> $$ LANGUAGE SQL;
>
> And it's fast (the event table is a view)!
>
> Interestingly, this version is very slow:
>
> CREATE OR REPLACE FUNCTION xpath(event, lquery) RETURNS bool AS
> $$
> BEGIN
>      RETURN $1.ltree ~ $2;
> END
> $$ LANGUAGE plpgsql STABLE;
>
> EXPLAIN shows lots of sequential scans when you try to do it this
> way.  No inlining, perhaps?

Yeah, I'd assume it's running the function once for each row after the
construction of the event row (so it's probably doing any joins in the
view first), whereas the SQL one might inline and be optimized differently
which might shrink the number of rows that any joins are being done on.

> I guess I am still a little confused as to why you have to pass the
> table in as a parameter -- why you can't just do:
>
> RETURN event.ltree ~ $2;

Well, the problem here is say you have a function f(int) that says RETURN
tt1.a = $1 like my example.  What does "select * from tt2 where f(1)" or
"select * from tt1 a, tt1 b where f(1)" mean? Functions don't really have
much context sensitivity, and it's not always reasonable to expect to be
able to open the box to know things like f(x) in the above depends on
there being a single tt1 entry in the from clause (and trying to deal
with outer references would be an utter mess).


Re: functions in WHERE clause

От
Greg Stark
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> A SQL-language function like this should get inlined into the query,
> so that you don't lose any performance compared to writing out the
> full expression each time.

I think what's going on here is that he doesn't really want a function in the
programming sense. A function takes arguments and returns a result based on
those arguments. It would be trivial to make a function that returns true or
false for his constraints but it would require passing in the columns as well
as the values he's testing against. So it would be "xpath(ltreecolumn, 42, 47)".

What he's looking for is a convenience macro that hides the columns he's
testing against so he doesn't have to write the columns in every query. So he
can just type "xpath(42,47)" without retyping "ltreecolumn".

Afaik there's no functionality for this in Postgres. functions aren't given
any context information when they're executed aside from the arguments passed.

If I understand what you want then I think you're much better off just typing
the name of the column you're testing against explicitly every time anyways.
One day you'll need more flexibility or you'll have someone else reading the
code and you'll be glad you're not hiding what's going on and hard coding
column names inside some macro function anyways.

-- 
greg



Re: functions in WHERE clause

От
Markus Schaber
Дата:
Hi, Steve,

sramsay@uga.edu wrote:

> SELECT * from some_table WHERE
> test_for_equality_is_syntactically_ugly;

> The WHERE clause expects the function to return a boolean value.  I
> can certainly return a boolean value from a function, but here it
> seems to me that what the function really has to do is return a 
> set of boolean values -- the test in the WHERE clause sometimes
> evaluates to true and sometimes evaluates to false, and that is in
> turn used to constrain the query results.   But you can't return a
> set of anything (I don't think) in a WHERE clause, because it seems
> to want a singular boolean value.

And this is as it is intended. The equality test is applied row-by-row,
and for each row, it is either true or false, but not undecided.

So your query should look like

SELECT * FROM some_table WHERE your_function(column_a, column_b);

> Is it possible to do what I'm trying to do?  I've written a few
> simple sql and pl/pgsql functions over the years, but I'm no expert.

Yes, it is.

HTH,
Markus

-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org