Re: encoding affects ICU regex character classification

From Jeremy Schneider
Subject Re: encoding affects ICU regex character classification
Date
Msg-id 452c0341-6c6a-4a87-8b90-6320831094ea@ardentperf.com
In response to Re: encoding affects ICU regex character classification  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: encoding affects ICU regex character classification  (Thomas Munro <thomas.munro@gmail.com>)
Re: encoding affects ICU regex character classification  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On 12/14/23 7:12 AM, Jeff Davis wrote:
> The concern over unassigned code points is misplaced. The application
> may be aware of newly-assigned code points, and there's no way they
> will be mapped correctly in Postgres if the provider is not aware of
> those code points. The user can either proceed in using unassigned code
> points and accept the risk of future changes, or wait for the provider
> to be upgraded.

This does not seem to me like a good way to view the situation.

Earlier this summer, a day or two after writing a document, I was
completely surprised to open it on my work computer and see "unknown
character" boxes. When I had previously written the document on my home
computer and when I had viewed it from my cell phone, everything was
fine. Apple does a very good job of always keeping iPhones and macOS
versions up-to-date with the latest versions of Unicode and latest
characters. iPhone keyboards make it very easy to access any character.
Emojis are the canonical example here. My work computer was one major
version of macOS behind my home computer.

And I'm probably one of a few people on this hackers email list who even
understand what the words "unassigned code point" mean. Generally DBAs,
sysadmins, architects and developers who are all part of the tangled web
of building and maintaining systems which use PostgreSQL on their
backend are never going to think about Unicode characters proactively.
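
For readers unfamiliar with the term, here is a minimal sketch of what
"unassigned code point" means in practice. It uses Python's standard
unicodedata module purely as a stand-in for a collation/ctype provider;
it is not how PostgreSQL, ICU or libc classify characters, and the
helper name is_unassigned is hypothetical. The point is only that the
answer depends on the Unicode version the library was built with.

```python
import unicodedata


def is_unassigned(ch: str) -> bool:
    """Return True if this interpreter's Unicode database has no assignment for ch."""
    # General category "Cn" means "Other, not assigned"
    # (it also covers the permanently reserved noncharacters).
    return unicodedata.category(ch) == "Cn"


# The result depends on the Unicode version compiled into this interpreter,
# just as a database's answer depends on its provider's Unicode version.
print("Unicode data version:", unicodedata.unidata_version)
print(is_unassigned("A"))       # an ordinary assigned letter
print(is_unassigned("\u0378"))  # a code point with no assignment at the time of writing
```

A character added in a newer Unicode release would be reported as
unassigned by an older library and as assigned by a newer one, which is
the same mismatch as the "unknown character" boxes described above.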

This goes back to my other thread (which sadly got very little
discussion): PostgreSQL really needs to be safe by /default/ ... having
GUCs is fine though; we can put an explanation in the docs about what users
should consider if they change a setting.

-Jeremy


-- 
http://about.me/jeremy_schneider



