Re: Move defaults toward ICU in 16?

From Jonathan S. Katz
Subject Re: Move defaults toward ICU in 16?
Msg-id 46d615da-ede5-ddc4-af50-d71ff1587ec8@postgresql.org
In response to Re: Move defaults toward ICU in 16?  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Move defaults toward ICU in 16?  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On 2/13/23 8:11 PM, Jeff Davis wrote:
> On Thu, 2023-02-02 at 05:13 -0800, Jeff Davis wrote:
>> As a project, do we want to nudge users toward ICU as the collation
>> provider as the best practice going forward?
> 
> One consideration here is security. Any vulnerability in ICU collation
> routines could easily become a vulnerability in Postgres.

Would it be any different than a vulnerability in OpenSSL et al? I know 
that's a general, nuanced question but it would be good to understand if 
we are exposing ourselves to any more vulnerabilities. And would it be 
any different than today, given people can build PG with libicu as is?

Continuing on $SUBJECT, I wanted to understand performance comparisons. 
I saw your comments[1] in response to Robert's question, looked at your 
benchmarks[2] and one that ICU ran on older versions[3]. It seems that 
in general, users would see performance gains switching to ICU. The only 
result in [3] that stood out to me was that the "ko_KR" collation 
underperformed on a list of Korean names, but maybe that is better in 
newer versions.

I agree with most of your points in [1]. The platform-consistent 
behavior is a good point, especially with more PG deployments running on 
different systems. While taking on a new dependency is a concern, ICU 
was released in 1999[4], has an active community, and seems to follow 
standards (i.e. the Unicode Consortium).

I do wonder about upgrades, beyond the ongoing work with pg_upgrade. I 
think the logical methods (pg_dumpall, logical replication) should 
generally be OK, but we should ensure we think of things that could go 
wrong and how we'd answer them.

Based on the available data, I think it's OK to move towards ICU as the 
default, or preferred, collation provider. I agree (for now) with not 
taking a hard dependency on ICU.

Thanks,

Jonathan

[1] 
https://www.postgresql.org/message-id/b676252eeb57ab8da9dbb411d0ccace95caeda0a.camel%40j-davis.com
[2] 
https://www.postgresql.org/message-id/64039a2dbcba6f42ed2f32bb5f0371870a70afda.camel@j-davis.com
[3] https://icu.unicode.org/charts/collation-icu4c48-glibc
[4] https://en.wikipedia.org/wiki/International_Components_for_Unicode
