Re: insensitive collations

Поиск
Список
Период
Сортировка
От Daniel Verite
Тема Re: insensitive collations
Дата
Msg-id 54ef5bfe-043e-4328-9d70-5818789838f4@manitou-mail.org
обсуждение исходный текст
Ответ на Re: insensitive collations  (Jim Finnerty <jfinnert@amazon.com>)
Ответы Re: insensitive collations  (Jim Finnerty <jfinnert@amazon.com>)
Список pgsql-hackers
Jim Finnerty wrote:

> For a UTF8 encoded, case-insensitive (nondeterministic), accent-sensitive
> ICU
> collation, a LIKE predicate can be used with a small transformation of the
> predicate, and the pattern can contain multi-byte characters:
>
> from:
>
> SELECT * FROM locations WHERE location LIKE 'midi-Pyrené%';
> -- ERROR:  nondeterministic collations are not supported for LIKE
>
> to:
>
> SELECT * FROM locations WHERE lower(location) COLLATE "C" LIKE
> lower('midi-Pyrené%');

For prefix matching, there's a simpler way with non-deterministic
collations based on the advice in [1]

The trick is that if an ICU collation is assigned to "location",
whether it's deterministic or not,

SELECT * FROM locations WHERE location LIKE 'midi-Pyrené%';

is equivalent to:

SELECT * FROM locations WHERE location BETWEEN
   'midi-Pyrené' AND 'midi-Pyrené' || E'\uFFFF';

and that will use a btree index if available.

Also, it works with all features of ND-collations and all encodings, not
just case-insensitiveness and UTF-8.

Now that doesn't solve LIKE '%midi-Pyrené%', or LIKE '%midi_Pyrené%',
but that trick could be a building block for an algorithm implementing
LIKE with ND-collations in the future.

[1]

https://unicode-org.github.io/icu/userguide/collation/architecture.html#generating-bounds-for-a-sort-key-prefix-matching


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Steele
Дата:
Сообщение: Re: SQL/JSON: JSON_TABLE
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.