Andrew,
On 9/20/07, Andrew Dunstan <andrew@dunslane.net> wrote:
> Please try the attached patch, which goes back to using a special case
> for single-byte ILIKE. I want to make sure that at the very least we
> don't cause a performance regression with the code done this release. I
> can't see an obvious way around the problem for multi-byte case -
> lower() then requires converting to and from wchar, and I don't see a
> way of avoiding calling lower(). If this is a major blocker I would
> suggest you look at an alternative to using ILIKE for your UTF8 data.
I tested your patch with the latin1 and C encodings.
It's better, but still slower than 8.2.
C results:
cityvox_c=# SELECT e.numeve FROM evenement e WHERE e.libgeseve LIKE
'%hocus pocus%';
 numeve
--------
(0 rows)
Time: 113.655 ms
cityvox_c=# SELECT e.numeve FROM evenement e WHERE e.libgeseve ILIKE
'%hocus pocus%';
  numeve
-----------
 900024298
     87578
(2 rows)
Time: 124.829 ms
Latin1 results:
cityvox_latin1=# SELECT e.numeve FROM evenement e WHERE e.libgeseve
LIKE '%hocus pocus%';
 numeve
--------
(0 rows)
Time: 113.207 ms
cityvox_latin1=# SELECT e.numeve FROM evenement e WHERE e.libgeseve
ILIKE '%hocus pocus%';
  numeve
-----------
 900024298
     87578
(2 rows)
Time: 123.163 ms
And to answer your IRC question about switching to a regexp: it's even
slower than the new UTF-8 ILIKE of 8.3, so I don't think it's the way
to go :).
Regards,
--
Guillaume