Обсуждение: Could regexp_matches be immutable?

Поиск
Список
Период
Сортировка

Could regexp_matches be immutable?

От
Rod Taylor
Дата:
I tried making a functional index based on an expression containing
the 2 argument regexp_matches() function. Is there a reason why this
function is not marked immutable instead of normal?

regards,

Rod Taylor


Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
Rod Taylor <rod.taylor@gmail.com> writes:
> I tried making a functional index based on an expression containing
> the 2 argument regexp_matches() function. Is there a reason why this
> function is not marked immutable instead of normal?

regex_flavor affects its result.
        regards, tom lane


Re: Could regexp_matches be immutable?

От
David Fetter
Дата:
On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote:
> Rod Taylor <rod.taylor@gmail.com> writes:
> > I tried making a functional index based on an expression
> > containing the 2 argument regexp_matches() function. Is there a
> > reason why this function is not marked immutable instead of
> > normal?
> 
> regex_flavor affects its result.

Speaking of which, can we see about deprecating and removing this GUC?
I've yet to hear of anyone using a flavor other than the default.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Could regexp_matches be immutable?

От
Andrew Dunstan
Дата:

David Fetter wrote:
> On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote:
>   
>> Rod Taylor <rod.taylor@gmail.com> writes:
>>     
>>> I tried making a functional index based on an expression
>>> containing the 2 argument regexp_matches() function. Is there a
>>> reason why this function is not marked immutable instead of
>>> normal?
>>>       
>> regex_flavor affects its result.
>>     
>
> Speaking of which, can we see about deprecating and removing this GUC?
> I've yet to hear of anyone using a flavor other than the default.
>
>
>   

You have now. I have a client who sadly uses a non-default setting. And 
on 8.4, what is more.

There are more things under heaven and earth ....

cheers

andrew


Re: Could regexp_matches be immutable?

От
"Kevin Grittner"
Дата:
David Fetter <david@fetter.org> wrote:
> On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote:
>> Rod Taylor <rod.taylor@gmail.com> writes:
>> > I tried making a functional index based on an expression
>> > containing the 2 argument regexp_matches() function. Is there a
>> > reason why this function is not marked immutable instead of
>> > normal?
>> 
>> regex_flavor affects its result.
> 
> Speaking of which, can we see about deprecating and removing this
> GUC?
+1  It would seem to me to be more valuable to have the benefits of
IMMUTABLE than preserve pre-7.4 compatibility forever.
-Kevin


Re: Could regexp_matches be immutable?

От
Josh Berkus
Дата:
On 10/14/09 2:07 PM, David Fetter wrote:
> On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote:
>> Rod Taylor <rod.taylor@gmail.com> writes:
>>> I tried making a functional index based on an expression
>>> containing the 2 argument regexp_matches() function. Is there a
>>> reason why this function is not marked immutable instead of
>>> normal?
>> regex_flavor affects its result.
> 
> Speaking of which, can we see about deprecating and removing this GUC?
> I've yet to hear of anyone using a flavor other than the default.

Actually, *we* (PGX) have a client who does.  You just haven't worked on
their stuff.

--Josh


Re: Could regexp_matches be immutable?

От
David Fetter
Дата:
On Wed, Oct 14, 2009 at 05:14:31PM -0400, Andrew Dunstan wrote:
> David Fetter wrote:
>> On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote:
>>   
>>> Rod Taylor <rod.taylor@gmail.com> writes:
>>>     
>>>> I tried making a functional index based on an expression
>>>> containing the 2 argument regexp_matches() function. Is there a
>>>> reason why this function is not marked immutable instead of
>>>> normal?
>>>>       
>>> regex_flavor affects its result.
>>>     
>>
>> Speaking of which, can we see about deprecating and removing this GUC?
>> I've yet to hear of anyone using a flavor other than the default.
>
> You have now. I have a client who sadly uses a non-default setting. And  
> on 8.4, what is more.

OK, now I've heard of one.  I still think we should deprecate and
remove.  Say, deprecate this next release and remove for the following
one?

> There are more things under heaven and earth ....

My philosophy doesn't include infinite backward compatibility.
Neither do heaven and earth, come to think of it. :)

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Could regexp_matches be immutable?

От
Josh Berkus
Дата:
> +1  It would seem to me to be more valuable to have the benefits of
> IMMUTABLE than preserve pre-7.4 compatibility forever.

Just create a shell function which calls it in a specific flavor, and
make that immutable.

--Josh


Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
Andrew Dunstan <andrew@dunslane.net> writes:
> David Fetter wrote:
>> Speaking of which, can we see about deprecating and removing this GUC?
>> I've yet to hear of anyone using a flavor other than the default.

> You have now. I have a client who sadly uses a non-default setting. And 
> on 8.4, what is more.

How critical is it to them?  It would be nice to get rid of that source
of variability.

It would be possible to keep using old-style regexes even without the
GUC, if they can interpose anything that can stick an "embedded options"
prefix on the pattern strings.  See 9.7.3.4:
http://developer.postgresql.org/pgdocs/postgres/functions-matching.html
        regards, tom lane


Re: Could regexp_matches be immutable?

От
David Fetter
Дата:
On Wed, Oct 14, 2009 at 06:06:23PM -0400, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > David Fetter wrote:
> >> Speaking of which, can we see about deprecating and removing this GUC?
> >> I've yet to hear of anyone using a flavor other than the default.
> 
> > You have now. I have a client who sadly uses a non-default setting. And 
> > on 8.4, what is more.
> 
> How critical is it to them?  It would be nice to get rid of that source
> of variability.
> 
> It would be possible to keep using old-style regexes even without the
> GUC, if they can interpose anything that can stick an "embedded options"
> prefix on the pattern strings.  See 9.7.3.4:
> http://developer.postgresql.org/pgdocs/postgres/functions-matching.html

Switching it to just embedded options solves the issue of leaving the
feature in while cutting the surprises down for those not using it. :)

The "embedded options" method is also doable by search-and-replace, as
they only work in AREs, which such people wouldn't be using.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Could regexp_matches be immutable?

От
Andrew Dunstan
Дата:

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> David Fetter wrote:
>>     
>>> Speaking of which, can we see about deprecating and removing this GUC?
>>> I've yet to hear of anyone using a flavor other than the default.
>>>       
>
>   
>> You have now. I have a client who sadly uses a non-default setting. And 
>> on 8.4, what is more.
>>     
>
> How critical is it to them?  It would be nice to get rid of that source
> of variability.
>
> It would be possible to keep using old-style regexes even without the
> GUC, if they can interpose anything that can stick an "embedded options"
> prefix on the pattern strings.  See 9.7.3.4:
> http://developer.postgresql.org/pgdocs/postgres/functions-matching.html
>
>         
>   


They are probably quite open to changing it, but IIRC it is a setting 
imposed by OpenACS, which is what they are based on.

cheers

andrew


Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
Andrew Dunstan <andrew@dunslane.net> writes:
> They are probably quite open to changing it, but IIRC it is a setting 
> imposed by OpenACS, which is what they are based on.

I seem to recall having asked this before ... but does OpenACS even
know what they're doing here?  The difference between ERE mode and ARE
mode is awfully slight.
        regards, tom lane


Re: Could regexp_matches be immutable?

От
Andrew Dunstan
Дата:

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> They are probably quite open to changing it, but IIRC it is a setting 
>> imposed by OpenACS, which is what they are based on.
>>     
>
> I seem to recall having asked this before ... but does OpenACS even
> know what they're doing here?  The difference between ERE mode and ARE
> mode is awfully slight.
>   

That's not the worst of it :-( See 
<http://openacs.org/xowiki/How_to_install_in_Postgres_8.x>

cheers

andrew


Re: Could regexp_matches be immutable?

От
David Fetter
Дата:
On Wed, Oct 14, 2009 at 11:51:13PM -0400, Andrew Dunstan wrote:
>
>
> Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>   
>>> They are probably quite open to changing it, but IIRC it is a
>>> setting imposed by OpenACS, which is what they are based on.
>>
>> I seem to recall having asked this before ... but does OpenACS even
>> know what they're doing here?  The difference between ERE mode and
>> ARE mode is awfully slight.
>
> That's not the worst of it :-( See
> <http://openacs.org/xowiki/How_to_install_in_Postgres_8.x>

This just illustrates the fact that at least as far as PostgreSQL is
concerned, OpenACS is a dead project.  It's been at least 3 major
versions since they even attempted to keep compatible with PostgreSQL.

OpenACS could be revived as a PostgreSQL-supporting piece of software,
and that might be a good thing.  It would entail adjusting OpenACS to
the things PostgreSQL has changed rather than the reverse.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
David Fetter <david@fetter.org> writes:
> On Wed, Oct 14, 2009 at 11:51:13PM -0400, Andrew Dunstan wrote:
>> That's not the worst of it :-( See
>> <http://openacs.org/xowiki/How_to_install_in_Postgres_8.x>

> This just illustrates the fact that at least as far as PostgreSQL is
> concerned, OpenACS is a dead project.  It's been at least 3 major
> versions since they even attempted to keep compatible with PostgreSQL.

That seems pretty harsh, considering that there's plenty of discussion
of how to use OpenACS with PG on their website.

What it looks like to me is that the recommendation about regex_flavor
might be just cargo-cult programming.  Somebody had some trouble with an
updated PG version, turned on every backwards-compatibility option he
could find, and when it worked he didn't inquire too closely into which
settings were actually important.  And ever since then that's been the
received wisdom about how to make OpenACS run with Postgres.  It would
be interesting to try turning off these options one at a time to see
which ones do matter.

(I'd bet lunch that the one about add_missing_from is bogus, too,
or could easily be made so.  mysql isn't forgiving about missing
FROM items, so it's hard to believe that they have a lot of such
things no matter how little they care about Postgres.)
        regards, tom lane


Re: Could regexp_matches be immutable?

От
Josh Berkus
Дата:
> (I'd bet lunch that the one about add_missing_from is bogus, too,
> or could easily be made so.  mysql isn't forgiving about missing
> FROM items, so it's hard to believe that they have a lot of such
> things no matter how little they care about Postgres.)

OpenACS does the old-style DELETEs without a subselect, so they rely on
add-missing-from for that.  I had to debug this for another user.

--Josh Berkus


Re: Could regexp_matches be immutable?

От
David Fetter
Дата:
On Thu, Oct 15, 2009 at 10:22:52AM -0700, Josh Berkus wrote:
> > (I'd bet lunch that the one about add_missing_from is bogus, too,
> > or could easily be made so.  mysql isn't forgiving about missing
> > FROM items, so it's hard to believe that they have a lot of such
> > things no matter how little they care about Postgres.)
> 
> OpenACS does the old-style DELETEs without a subselect, so they rely
> on add-missing-from for that.  I had to debug this for another user.

Is OpenACS getting enough new deployments to fix this?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
Rod Taylor <rod.taylor@gmail.com> writes:
> I tried making a functional index based on an expression containing
> the 2 argument regexp_matches() function. Is there a reason why this
> function is not marked immutable instead of normal?

So I went to see about making the changes to remove regex_flavor, and
was astonished to find that all the regex-related functions are already
marked immutable, and AFAICS always have been.  This is clearly wrong,
and we would have to fix it if we weren't about to remove the GUC.
(In principle we should advise people to change the markings in existing
databases, but given the lack of complaints it's probably not worth the
trouble --- I doubt many applications change regex_flavor on the fly.)

So, having dismissed my original off-the-cuff answer to Rod, the next
question is what's really going wrong for him.  I get this from
a quick trial:

regression=# create table tt1(f1 text, f2 text);
CREATE TABLE
regression=# create index tt1i on tt1(regexp_matches(f1,f2));
ERROR:  index expression cannot return a set

IOW the problem is that regexp_matches returns SETOF, not that it's
marked stable (as it should have been).  I'm not sure what semantics you
were expecting the index to have, but we don't have any useful support
for indexes on sets.
        regards, tom lane


Re: Could regexp_matches be immutable?

От
Peter Eisentraut
Дата:
On Tue, 2009-10-20 at 20:48 -0400, Tom Lane wrote:
> So I went to see about making the changes to remove regex_flavor, and
> was astonished to find that all the regex-related functions are already
> marked immutable, and AFAICS always have been.  This is clearly wrong,
> and we would have to fix it if we weren't about to remove the GUC.
> (In principle we should advise people to change the markings in existing
> databases, but given the lack of complaints it's probably not worth the
> trouble --- I doubt many applications change regex_flavor on the fly.)

Are you sure this wasn't intentional, because it breaks performance and
we doubted that many applications would change regex_flavor on the fly?




Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> On Tue, 2009-10-20 at 20:48 -0400, Tom Lane wrote:
>> So I went to see about making the changes to remove regex_flavor, and
>> was astonished to find that all the regex-related functions are already
>> marked immutable, and AFAICS always have been.  This is clearly wrong,
>> and we would have to fix it if we weren't about to remove the GUC.

> Are you sure this wasn't intentional, because it breaks performance and
> we doubted that many applications would change regex_flavor on the fly?

Intentional or not, it's wrong :-(

In practice I doubt there are many cases where constant-folding a regex
would be possible or performance-critical.  The real use of having it
be immutable is probably Rod's, ie, using it in an index.  And that is
*obviously* really dangerous if there's a GUC affecting the results.
        regards, tom lane


Re: Could regexp_matches be immutable?

От
Rod Taylor
Дата:
> So, having dismissed my original off-the-cuff answer to Rod, the next
> question is what's really going wrong for him.  I get this from
> a quick trial:

I wish I had kept specific notes on what I was actually trying to do.
I tried to_number first then the expression as seen below. I guess I
saw the error again and assumed it was the same as for to_number.

sk=# BEGIN;
BEGIN
sk=#
sk=# create table t1 (col1 text);
CREATE TABLE
sk=# INSERT INTO t1 values ('Z342432');
INSERT 0 1
sk=# INSERT INTO t1 values ('REW9432');
INSERT 0 1
sk=#
sk=# SELECT (regexp_matches(col1, '(\d+)$'))[1] from t1;regexp_matches
----------------3424329432
(2 rows)

sk=#
sk=# create index t1_idx ON t1 ((  (regexp_matches(col1, '(\d+)$'))[1]  ));
ERROR:  index expression cannot return a set
sk=#
sk=# ROLLBACK;
ROLLBACK



It is interesting that "citext" seems to be functional with exactly
the same statements.

sk=# BEGIN;
BEGIN
sk=#
sk=# create table t1 (col1 citext);
CREATE TABLE
sk=# INSERT INTO t1 values ('Z342432');
INSERT 0 1
sk=# INSERT INTO t1 values ('REW9432');
INSERT 0 1
sk=#
sk=# SELECT (regexp_matches(col1, '(\d+)$'))[1] from t1;regexp_matches
----------------3424329432
(2 rows)

sk=#
sk=# create index t1_idx ON t1 ((  (regexp_matches(col1, '(\d+)$'))[1]  ));
CREATE INDEX
sk=#
sk=# ROLLBACK;
ROLLBACK



The function regexp_replace(col1, '^[^0-9]+', '') does seem to do the
trick for text.


Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
Rod Taylor <rod.taylor@gmail.com> writes:
> It is interesting that "citext" seems to be functional with exactly
> the same statements.

Huh, it looks to me like that's an error in the declaration of the
citext versions of regexp_matches --- they should be declared to return
setof text[], the same as the underlying text functions.  David, do you
agree?
        regards, tom lane


Re: Could regexp_matches be immutable?

От
"David E. Wheeler"
Дата:
On Oct 21, 2009, at 7:27 AM, Tom Lane wrote:

> Huh, it looks to me like that's an error in the declaration of the
> citext versions of regexp_matches --- they should be declared to  
> return
> setof text[], the same as the underlying text functions.  David, do  
> you
> agree?

Ooh, yeah, dunno how I missed that.

Best,

David


Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
"David E. Wheeler" <david@kineticode.com> writes:
> On Oct 21, 2009, at 7:27 AM, Tom Lane wrote:
>> Huh, it looks to me like that's an error in the declaration of the
>> citext versions of regexp_matches --- they should be declared to  
>> return
>> setof text[], the same as the underlying text functions.  David, do  
>> you agree?

> Ooh, yeah, dunno how I missed that.

I think we're probably stuck in 8.4, but we should fix it going
forward.  Would you make a quick check if any of the other citext
functions have the same bug?
        regards, tom lane


Re: Could regexp_matches be immutable?

От
"David E. Wheeler"
Дата:
On Oct 21, 2009, at 9:37 AM, Tom Lane wrote:

>> Ooh, yeah, dunno how I missed that.
>
> I think we're probably stuck in 8.4, but we should fix it going
> forward.  Would you make a quick check if any of the other citext
> functions have the same bug?

I've fixed it in my [version for 8.3](https://svn.kineticode.com/citext/trunk 
). Is there a straight-foward way to check such a thing  
programmatically, with a query perhaps? Or should I just put aside an  
hour to do an audit?

Best,

David


Re: Could regexp_matches be immutable?

От
David E. Wheeler
Дата:
On Oct 21, 2009, at 9:40 AM, David E. Wheeler wrote:

> On Oct 21, 2009, at 9:37 AM, Tom Lane wrote:
>
>>> Ooh, yeah, dunno how I missed that.
>>
>> I think we're probably stuck in 8.4, but we should fix it going
>> forward.  Would you make a quick check if any of the other citext
>> functions have the same bug?
>
> I've fixed it in my [version for 8.3](https://svn.kineticode.com/citext/trunk 
> ). Is there a straight-foward way to check such a thing  
> programmatically, with a query perhaps? Or should I just put aside  
> an hour to do an audit?

FWIW, I think that this is a bug, and that the variation from the text  
version will be unexpected. I recommend fixing it for 8.4.2.

Best,

David



Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
"David E. Wheeler" <david@kineticode.com> writes:
> FWIW, I think that this is a bug, and that the variation from the text  
> version will be unexpected. I recommend fixing it for 8.4.2.

Well, it's certainly a bug, but I don't think it's back-patchable.
A back-patch will not affect existing installations anyway.  What it
will do is break user code that is expecting the existing behavior
(for instance, Rod's index).  I think it's something we can only
change at a major version boundary.
        regards, tom lane


Re: Could regexp_matches be immutable?

От
Tom Lane
Дата:
"David E. Wheeler" <david@kineticode.com> writes:
> Is there a straight-foward way to check such a thing  
> programmatically, with a query perhaps? Or should I just put aside an  
> hour to do an audit?

I was wondering whether you could query pg_proc to look for functions
with the same name and different arguments/results.  It's a bit tricky
though because you'd expect s/citext/text/ in at least some positions
(maybe not all)?
        regards, tom lane


Re: Could regexp_matches be immutable?

От
"David E. Wheeler"
Дата:
On Oct 21, 2009, at 9:48 AM, Tom Lane wrote:

> I was wondering whether you could query pg_proc to look for functions
> with the same name and different arguments/results.  It's a bit tricky
> though because you'd expect s/citext/text/ in at least some positions
> (maybe not all)?

Yeah, almost all. I'll poke around, though it might be a day or two…

Best,

David