Re: BUG #18362: unaccent rules and Old Greek text

From Thomas Munro
Subject Re: BUG #18362: unaccent rules and Old Greek text
Date
Msg-id CA+hUKGJmgaxpNn5x1Po1kmUxDiojsYWVWKKvhX+4QnyjDCWKKQ@mail.gmail.com
In response to Re: BUG #18362: unaccent rules and Old Greek text  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: BUG #18362: unaccent rules and Old Greek text
Re: BUG #18362: unaccent rules and Old Greek text
List pgsql-bugs
On Thu, May 16, 2024 at 1:40 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, May 15, 2024 at 2:45 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> > On 14.05.24 16:51, Robert Haas wrote:
> > The rules are only loaded once on first use, right?  I tested with
> >
> > date; for x in $(seq 1 1000); do psql -X -c "select unaccent('foobar')"
> > -o /dev/null; done; date
> >
> > and this had the same runtime (about 8 seconds here) with and without
> > the patch.
>
> Cool. Sounds like that's not a problem.

Thanks Peter for testing, and thanks Robert for kicking this thread.

> > Btw., with the patch I get
> >
> > WARNING:  duplicate source strings, first one will be used
> >
> > so it will need some adjustments in how the rules are produced.
>
> OK. Does anyone want to look into that?

I think the problem is that the new "simple redirection" rule from the
Unicode database produces some values that are also present in
Latin-ASCII.xml, and these are all tolerated as long as the "from" and
"to" strings both match, because we uniquify them as pairs.  But there
is one pair where the "to" string is different, resulting in this
clash:

ℌ      x
ℌ      H
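
To illustrate the clash, here is a minimal Python sketch. It is not the actual rule-generation code: the function name find_conflicts and the sample pairs are hypothetical. It uniquifies rules as (from, to) pairs, so exact duplicates collapse, and then reports any "from" string that still maps to more than one "to" string.

```python
from collections import defaultdict


def find_conflicts(pairs):
    """Return {source: sorted targets} for sources with more than one target."""
    unique_pairs = set(pairs)  # identical (from, to) pairs collapse here
    targets = defaultdict(set)
    for src, dst in unique_pairs:
        targets[src].add(dst)
    return {src: sorted(dsts) for src, dsts in targets.items() if len(dsts) > 1}


# Hypothetical sample: one harmless duplicate plus the clash from this thread.
rules = [
    ("\u00e9", "e"),  # é from one source
    ("\u00e9", "e"),  # é again from another source: same pair, tolerated
    ("\u210c", "x"),  # ℌ -> x
    ("\u210c", "H"),  # ℌ -> H
]

conflicts = find_conflicts(rules)
print(conflicts)  # {'ℌ': ['H', 'x']}
```

Only the source string with two different targets is reported; the repeated é pair is silently merged, which matches the "tolerated as long as both strings match" behaviour described above.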

I think the first line might actually be a bug in CLDR data.  I dunno,
but this just doesn't look right:

ℌ → x ; # 210C;BLACK-LETTER CAPITAL H (compat)

And in the tests I now see that Michael had already figured that out!
I've included a kludge to remove that.  Someone should file a ticket with CLDR.
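
For readers without the patch at hand, a workaround of this kind could look roughly like the Python sketch below. This is an assumption about the general shape, not the attached change: the names SUSPECT_RULES and drop_suspect_rules are hypothetical, and the only rule filtered is the ℌ → x pair quoted above.

```python
# Hypothetical: the one rule suspected to be wrong in the CLDR data.
SUSPECT_RULES = {("\u210c", "x")}  # ℌ -> x


def drop_suspect_rules(pairs):
    """Return the pairs in their original order, minus the suspect ones."""
    return [pair for pair in pairs if pair not in SUSPECT_RULES]


cleaned = drop_suspect_rules([("\u210c", "x"), ("\u210c", "H")])
print(cleaned)  # [('ℌ', 'H')]
```

Keeping the exclusion as an explicit (from, to) pair means the filter stops matching on its own if CLDR later corrects the rule.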
