Re: SQL Error [0A000]: ERROR: nondeterministic collations are not supported for LIKE

Поиск
Список
Период
Сортировка
От Daniel Verite
Тема Re: SQL Error [0A000]: ERROR: nondeterministic collations are not supported for LIKE
Дата
Msg-id ba826053-a7af-43d8-9fd3-ddfbcb28104e@manitou-mail.org
обсуждение исходный текст
Ответ на SQL Error [0A000]: ERROR: nondeterministic collations are notsupported for LIKE  (Jeff Lanzarotta <jeff.lanzarotta@gmail.com>)
Ответы Re: SQL Error [0A000]: ERROR: nondeterministic collations are notsupported for LIKE  (Jeff Lanzarotta <jeff.lanzarotta@gmail.com>)
Список pgsql-general
    Jeff Lanzarotta wrote:

> I have a question about nondeterministic collations in PostgreSQL 12.  I
> have created a new collation that is nondeterministic and created several
> columns which use this collation.  Querying these columns works great until
> I use LIKE.  When I do, I get the following error:
>
> SQL Error [0A000]: ERROR: nondeterministic collations are not supported for
> LIKE
>
> Is there any plan to allow this functionality?

PostgreSQL development is conducted without a roadmap [1]. Maybe
someone will submit a patch to enable LIKE with nondeterministic
collations, but so far it did not happen according to the current set
of patches at https://commitfest.postgresql.org

Such matches can be weirder than you might think (not to
mention much slower).
Consider for instance a collation that ignores punctuation:

CREATE COLLATION "nd3alt" (
  provider = 'icu',
  locale='und@colAlternate=shifted',
  deterministic = false
);

In the icu_ext extension, icu_strpos [2] can match a substring with
a nondeterministic collation, which is one part of what LIKE
would need to do for such collations. The implementation uses
the string search facility of the ICU library.

With the above-defined collation, we can have for instance:

SELECT icu_strpos('abc.  ...de', 'c,d' COLLATE nd3alt);
 icu_strpos
------------
      3

So even though 'c,d' is not a substring of 'abc.  ...de' in the common
sense, it is recognized as such by this collation, by design.

A LIKE operator for nondeterministic collations should be able to
recognize this too, but with an arbitrary number of substrings to
match in the pattern, plus it should handle the underscore wildcard
in a way that hopefully makes sense.

With the example above,
   'abc.  ...de' LIKE '%c,d%' COLLATE nd3alt
should certainly be a match, but in the case of this variant:
  'abc.  ...de' LIKE '%c_d%' COLLATE nd3alt
it's not necessarily clear how (or even if) it should work.


[1] https://www.postgresql.org/developer/roadmap/
[2] https://github.com/dverite/icu_ext#icu_strpos


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



В списке pgsql-general по дате отправления:

Предыдущее
От: Laurenz Albe
Дата:
Сообщение: Re: jsonb_set() strictness considered harmful to data
Следующее
От: Jeff Lanzarotta
Дата:
Сообщение: Re: SQL Error [0A000]: ERROR: nondeterministic collations are notsupported for LIKE