Re: is this a bug or I am blind?

From Martijn van Oosterhout
Subject Re: is this a bug or I am blind?
Date
Msg-id 20051216175411.GA11985@svana.org
In reply to Re: is this a bug or I am blind?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: is this a bug or I am blind?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: is this a bug or I am blind?  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
On Fri, Dec 16, 2005 at 12:12:08PM -0500, Tom Lane wrote:
> Perhaps the fast-path check is a bad idea, but fixing this is not just
> a matter of removing that.  If we subscribe to strcoll's worldview then
> we have to conclude that *text strings are not hashable*, because
> strings that should be "equal" may have different hash codes.  And at
> least in the current PG code, that's not something we can flip on and off
> depending on the locale --- texteq would have to be marked non hashable
> in the system catalogs, meaning a big performance hit for *everybody*
> even if their locale is not this weird.

That's true, in the sense that unconverted strings are not hashable.
This is what strxfrm was created for, to return the sorting key for a
string. A quick C program demonstrates that indeed in that locale these
two strings are equal, whereas in en_AU they are not.

$ LC_ALL=hu_HU ./strxfrm potyty potty
String  1: potyty
Strxfrm 1: " ((\x01\x02\x02\x02\x02\x01\x02\x02\x02\x02
String  2: potty
Strxfrm 2: " ((\x01\x02\x02\x02\x02\x01\x02\x02\x02\x02
$ LC_ALL=en_AU ./strxfrm potyty potty
String  1: potyty
Strxfrm 1: \x1B\x1A\x1F$\x1F$\x01\x02\x02\x02\x02\x02\x02\x01\x02\x02\x02\x02\x02\x02
String  2: potty
Strxfrm 2: \x1B\x1A\x1F\x1F$\x01\x02\x02\x02\x02\x02\x01\x02\x02\x02\x02\x02
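The source of the test program is not included in this message. A minimal sketch of what such a program might look like follows; the function names (xfrm_key, print_escaped, compare_in_locale) are made up for illustration, and only setlocale() and strxfrm() are taken from the standard C library. The sort keys it prints depend entirely on the locale data installed on the system.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd strxfrm() sort key for s under the current
 * LC_COLLATE locale, or NULL if allocation fails. */
char *xfrm_key(const char *s)
{
    assert(s != NULL);
    /* With a zero-length buffer, strxfrm() only reports the key length. */
    size_t len = strxfrm(NULL, s, 0);
    char *key = malloc(len + 1);
    if (key != NULL)
        strxfrm(key, s, len + 1);
    return key;
}

/* Print a labelled byte string, escaping non-printable bytes as \xNN. */
void print_escaped(const char *label, const char *s)
{
    printf("%s: ", label);
    for (const unsigned char *p = (const unsigned char *)s; *p; p++) {
        if (*p >= 0x20 && *p < 0x7f)
            putchar(*p);
        else
            printf("\\x%02X", *p);
    }
    putchar('\n');
}

/* Switch to the named locale, print both strings with their sort keys,
 * and return 1 if the keys are identical, 0 if they differ, or -1 if
 * the locale is unavailable or memory ran out. */
int compare_in_locale(const char *locale, const char *a, const char *b)
{
    if (setlocale(LC_ALL, locale) == NULL)
        return -1;

    char *ka = xfrm_key(a);
    char *kb = xfrm_key(b);
    int same = -1;
    if (ka != NULL && kb != NULL) {
        print_escaped("String  1", a);
        print_escaped("Strxfrm 1", ka);
        print_escaped("String  2", b);
        print_escaped("Strxfrm 2", kb);
        same = (strcmp(ka, kb) == 0);
    }
    free(ka);
    free(kb);
    return same;
}
```

Calling compare_in_locale("hu_HU", "potyty", "potty") from main() would correspond to the first run above, provided that locale is installed; passing "" as the locale name picks up LC_ALL from the environment instead.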

I think the only way to make indexes properly locale sensitive would be
to either use strcoll() in all cases, or store the result from
strxfrm() in the index. Anything else will break somewhere.
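To illustrate the second option, here is a rough sketch of hashing the strxfrm() key instead of the raw bytes. This is not PostgreSQL code: collation_hash and collation_equal are hypothetical names, and FNV-1a is just a stand-in for whatever hash function would really be used. The point is only that two strings for which strcoll() returns 0 have identical strxfrm() keys, so they necessarily get the same hash code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hash the collation key of s rather than its raw bytes, so that
 * strings the current locale considers equal hash identically.
 * FNV-1a (64-bit) is used purely as an example hash. */
uint64_t collation_hash(const char *s)
{
    assert(s != NULL);

    /* First call reports the key length, second call fills the buffer. */
    size_t len = strxfrm(NULL, s, 0);
    char *key = malloc(len + 1);
    if (key == NULL)
        abort();
    strxfrm(key, s, len + 1);

    uint64_t h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)key[i];
        h *= UINT64_C(0x100000001b3);            /* FNV prime */
    }
    free(key);
    return h;
}

/* Equality that follows the locale's view, with no bytewise fast path. */
int collation_equal(const char *a, const char *b)
{
    return strcoll(a, b) == 0;
}
```

The cost is an extra transformation and allocation per hashed string, which is the kind of overhead Tom is worried about for people whose locale never needs it.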

In any case, we first need to determine which answer is correct, before
we run off trying to fix it.

This is Glibc 2.3.2 on a Debian Linux system.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
