Discussion: BUG #2261: ILIKE seems to be buggy on koi8 input
The following bug has been logged online:

Bug reference: 2261
Logged by: Evgeny Gridasov
Email address: eugrid@fpm.kubsu.ru
PostgreSQL version: 8.1.2
Operating system: Debian Linux
Description: ILIKE seems to be buggy on koi8 input
Details:

My terminal is RU_ru.KOI8-R, template1's encoding is UTF8.
ILIKE seems to be buggy when comparing Russian strings,
while UPPER/LOWER works OK.

template1=# \encoding koi8;

Try to get the uppercase of some Russian letters:

template1=# select upper('фыва');
 upper
-------
 ФЫВА
(1 row)

The result is OK! Next, try to compare uppercase and lowercase using ILIKE:

template1=# select true where 'фыва' ilike 'ФЫВА';
 bool
------
(0 rows)

OOPS! Nothing happened. But why?
Try the same with Latin letters:

template1=# select true where 'asdf' ilike 'ASDF';
 bool
------
 t
(1 row)

Try to compare lowercase with lowercase (Russian):

template1=# select true where 'фыва' ilike 'фыва';
 bool
------
 t
(1 row)

It works.

"Evgeny Gridasov" <eugrid@fpm.kubsu.ru> writes:
> my terminal is RU_ru.KOI8-R,
> template1's encoding is UTF8.
> ILIKE seems to be buggy when comparing russian strings,
> while UPPER/LOWER works OK.

I'll bet that the database's locale setting is expecting some encoding
other than UTF8 :-(. You need to have compatible locale and encoding
settings inside the database. You didn't say exactly what the database
LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
match UTF8.

regards, tom lane

Evgeny Gridasov <eugrid@fpm.kubsu.ru> writes:
> postgresql server starts with environment:
> LC_COLLATE=en_US.UTF-8
> LC_ALL=en_US.UTF-8
> LANG=en_US.UTF-8

Well, that setting shouldn't translate much except A-Z/a-z. If you want
cyrillic upper/lower case conversions you need database's LC_CTYPE to be
ru_RU.something.

regards, tom lane
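
As a rough illustration of the point above: if case folding only covers A-Z/a-z, a case-insensitive comparison shows the symptoms from the report. Latin strings match regardless of case, while Cyrillic strings match only when the case is already identical. The Python sketch below is a toy model under that assumption. It is not PostgreSQL's ILIKE implementation, and `ascii_fold` and `toy_ilike_equal` are hypothetical names introduced only for this example.

```python
def ascii_fold(s: str) -> str:
    """Lowercase only A-Z; every other character is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def toy_ilike_equal(a: str, b: str) -> bool:
    """Toy case-insensitive equality that folds ASCII letters only (assumption)."""
    return ascii_fold(a) == ascii_fold(b)


latin_mixed = toy_ilike_equal("asdf", "ASDF")      # Latin, different case
cyrillic_mixed = toy_ilike_equal("фыва", "ФЫВА")   # Cyrillic, different case
cyrillic_same = toy_ilike_equal("фыва", "фыва")    # Cyrillic, same case

# A Unicode-aware conversion, by contrast, does handle the Cyrillic letters.
unicode_upper_ok = "фыва".upper() == "ФЫВА"

print(latin_mixed, cyrillic_mixed, cyrillic_same, unicode_upper_ok)
```

Under this model only the mixed-case Cyrillic comparison fails, which is the same pattern as the three ILIKE queries in the report.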

postgresql server starts with environment:
LC_COLLATE=en_US.UTF-8
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8

I've tried to set different LC_COLLATE/LC_ALL/LANG settings
but it did not help.
I've tried to change my psql input to unicode russian, but it did not help,
either.

'show all' says I've got lc_collate and other lc_* set to en_US.UTF-8.
initdb was run with this locale.
It cannot be modified setting it in postgresql.conf (creation db constant?)
Should I reinit database to get this working or what?
If I should reinit db, what locale should I choose?

BTW, the ~* syntax also does not work with upper/lower case russian letters,
while upper()/lower() still work ok.

On Wed, 15 Feb 2006 12:44:18 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> "Evgeny Gridasov" <eugrid@fpm.kubsu.ru> writes:
> > my terminal is RU_ru.KOI8-R,
> > template1's encoding is UTF8.
> > ILIKE seems to be buggy when comparing russian strings,
> > while UPPER/LOWER works OK.
>
> I'll bet that the database's locale setting is expecting some encoding
> other than UTF8 :-(. You need to have compatible locale and encoding
> settings inside the database. You didn't say exactly what the database
> LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
> match UTF8.
>
> regards, tom lane

--
Evgeny Gridasov
Software Engineer
I-Free, Russia

Evgeny Gridasov wrote:
> It cannot be modified setting it in postgresql.conf (creation db
> constant?) Should I reinit database to get this working or what?

Yes.

> If I should reinit db, what locale should I choose?

Something like ru_RU.utf8.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
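
A rough illustration of why the locale's encoding has to agree with the database encoding, which is why a UTF8 database wants something like ru_RU.utf8 rather than a KOI8-R locale: the same four letters are different byte sequences in the two encodings, so bytes stored in one are misread when interpreted as the other. This is plain Python, not PostgreSQL code, and `decodes_as` is a hypothetical helper written for this sketch.

```python
text = "фыва"

koi8_bytes = text.encode("koi8_r")  # bytes a KOI8-R terminal would send
utf8_bytes = text.encode("utf-8")   # bytes a UTF8 database would store


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# One byte per letter in KOI8-R, two bytes per letter in UTF-8.
print(len(koi8_bytes), len(utf8_bytes))

# The KOI8-R bytes are not even valid UTF-8.
print(decodes_as(koi8_bytes, "utf-8"))

# The UTF-8 bytes do decode as KOI8-R, but to eight unrelated characters,
# so any per-character case mapping would act on the wrong letters.
misread = utf8_bytes.decode("koi8_r")
print(len(misread), misread == text)
```

The client-side `\encoding` setting only controls conversion between psql and the server. The mismatch sketched here is the one between the bytes the database stores and the encoding its locale assumes.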