Proof of concept:

PG 7.3 using regression database:

regression=# select count(*) from tenk1 where 'quotidian' ~ string4;
 count
-------
     0
(1 row)

Time: 676.14 ms

regression=# select count(*) from tenk1 where 'quotidian' ~ stringu1;
 count
-------
     0
(1 row)

Time: 3426.96 ms

regression=# select count(*) from tenk1 where 'quotidian' ~* stringu1;
 count
-------
     0
(1 row)

Time: 466344.48 ms

CVS tip plus code extracted from Tcl:

regression=# select count(*) from tenk1 where 'quotidian' ~ string4;
 count
-------
     0
(1 row)

Time: 472.48 ms

regression=# select count(*) from tenk1 where 'quotidian' ~ stringu1;
 count
-------
     0
(1 row)

Time: 4414.91 ms

regression=# select count(*) from tenk1 where 'quotidian' ~* stringu1;
 count
-------
     0
(1 row)

Time: 4608.49 ms

In the first query of each run (the one against string4) there are only
four distinct patterns used, so we're running with cached precompiled
regexes. In the other two queries (against stringu1) a new regex
compilation must occur at each row. So, regex execution is a little
faster than before (at least for trivial regexes); compilation seems to
be a shade slower, but it doesn't fall over and die when compiling
case-insensitive patterns (or bracket expressions, which is what the
code actually reduces a case-insensitive pattern to).
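
For readers who want to see why the number of distinct patterns matters, here is a rough sketch in Python. It is not the backend code and not the Tcl code. RegexCache, count_matches and case_fold_to_brackets are hypothetical names invented for this illustration, the cache size of 32 is arbitrary, and the pattern values are made-up stand-ins for the string4 and stringu1 columns. The bracket rewrite assumes plain ASCII literals and only illustrates the idea of reducing a case-insensitive match to bracket expressions.

```python
import re
from collections import OrderedDict


class RegexCache:
    """Tiny LRU cache of compiled patterns (hypothetical, illustration only)."""

    def __init__(self, capacity=32):          # capacity is an arbitrary choice
        self.capacity = capacity
        self.entries = OrderedDict()          # pattern text -> compiled regex
        self.compilations = 0                 # misses in *this* cache

    def get(self, pattern):
        compiled = self.entries.get(pattern)
        if compiled is not None:
            self.entries.move_to_end(pattern)     # mark as most recently used
            return compiled
        # Cache miss: compile and remember. (Python's re module also keeps
        # an internal cache of its own; the counter ignores that.)
        self.compilations += 1
        compiled = re.compile(pattern)
        self.entries[pattern] = compiled
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)      # evict least recently used
        return compiled


def count_matches(cache, subject, patterns):
    """Count patterns that match somewhere in subject, like 'subject ~ col'."""
    return sum(1 for p in patterns if cache.get(p).search(subject))


def case_fold_to_brackets(literal):
    """Turn an ASCII literal into a pattern with one bracket expression per
    letter, so that a case-sensitive match behaves case-insensitively."""
    out = []
    for ch in literal:
        if ch.lower() != ch.upper():              # character has two cases
            out.append("[" + ch.lower() + ch.upper() + "]")
        else:
            out.append(re.escape(ch))             # keep other chars literal
    return "".join(out)


if __name__ == "__main__":
    # Stand-in data: four repeating patterns versus 10000 distinct ones.
    few = ["AAAAxx", "HHHHxx", "OOOOxx", "VVVVxx"] * 2500
    many = ["P%05dxx" % i for i in range(10000)]

    for label, patterns in (("few distinct", few), ("all distinct", many)):
        cache = RegexCache()
        hits = count_matches(cache, "quotidian", patterns)
        print(label, "matches:", hits, "compilations:", cache.compilations)

    print(case_fold_to_brackets("ab1"))
```

With the repeating patterns every row after the first four is a cache hit, so the cost is almost entirely regex execution. With all-distinct patterns every row is a miss, so the cost is dominated by compilation, which is the case the stringu1 timings exercise.
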
This is nowhere near ready to commit, but it compiles cleanly and passes
regression tests ...
regards, tom lane