Re: May "PostgreSQL server side GB18030 character set support" reconsidered?

From: Tatsuo Ishii
Subject: Re: May "PostgreSQL server side GB18030 character set support" reconsidered?
Msg-id: 20201005.174109.1358847281635212616.t-ishii@sraoss.co.jp
In reply to: May "PostgreSQL server side GB18030 character set support" reconsidered?  (Han Parker <parker.han@outlook.com>)
Responses: 回复: May "PostgreSQL server side GB18030 character set support" reconsidered?  (Han Parker <parker.han@outlook.com>)
List: pgsql-general

> Hi,
> 
> May "GB18030 server side support" deserve reconsidering, about 15 years after the release of GB18030-2005?
> It may be one of the most green features for PostgreSQL.

Making GB18030 a server-side encoding poses a technical challenge:
currently PostgreSQL's SQL parser, and perhaps other parts of the
backend, assume that no byte of a multibyte character can be confused
with an ASCII byte. Since the second byte of a two-byte GB18030
character can fall in the range 0x40 to 0x7e, and the second and
fourth bytes of a four-byte character fall in 0x30 to 0x39, the
backend will be confused. How exactly would you resolve this
technical challenge?
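
To make the concern concrete, here is a minimal sketch using Python's
built-in gb18030 codec (this is not PostgreSQL code; the helper name
ascii_range_bytes is made up for illustration). It lists the bytes
below 0x80 that appear inside the encoding of non-ASCII characters.

```python
def ascii_range_bytes(text, encoding):
    """Return (char, byte) pairs where the encoding of a non-ASCII
    character contains a byte in the ASCII range (below 0x80)."""
    hits = []
    for ch in text:
        if ord(ch) < 0x80:
            continue  # real ASCII characters legitimately use ASCII bytes
        for b in ch.encode(encoding):
            if b < 0x80:
                hits.append((ch, b))
    return hits


# U+4E02 is encoded as the two bytes 0x81 0x40 in GB18030; 0x40 is '@'.
two_byte = "\u4e02"
# Decode a two-byte sequence whose second byte is 0x5C, the backslash.
with_backslash = bytes([0x81, 0x5C]).decode("gb18030")
# U+20000 is encoded as the four bytes 0x95 0x32 0x82 0x36.
four_byte = "\U00020000"

for sample in (two_byte, with_backslash, four_byte):
    print(hex(ord(sample)),
          [hex(b) for _, b in ascii_range_bytes(sample, "gb18030")],
          ascii_range_bytes(sample, "utf-8"))
```

With UTF-8 the list is always empty, because every byte of a multibyte
UTF-8 sequence has its high bit set. With GB18030 a byte-oriented scan
can see '@', a backslash or a digit in the middle of a character.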

> 1. In this big data and mobile era, in the country with most population, 50% more disk energy consuming for Chinese
> characters (UTF-8 usually 3 bytes for a Chinese character, while GB18030 only 2 bytes) is indeed a harm to "Carbon
> Neutral", along with Polar ice melting.

Really? I thought GB18030 uses up to 4 bytes.
https://en.wikipedia.org/wiki/GB_18030#Encoding
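
As a quick check of the sizes, the sketch below (again plain Python with
its built-in codecs; the helper name encoded_lengths is an assumption of
this example) compares the encoded length of a few characters in UTF-8
and GB18030.

```python
def encoded_lengths(ch):
    """Return (utf8_length, gb18030_length) in bytes for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("gb18030"))


samples = {
    "ASCII letter A": "A",
    "common Chinese character U+4E2D": "\u4e2d",
    "CJK Extension B character U+20000": "\U00020000",
}

for label, ch in samples.items():
    utf8_len, gb_len = encoded_lengths(ch)
    print(f"{label}: utf-8={utf8_len} bytes, gb18030={gb_len} bytes")
```

A common Chinese character takes 3 bytes in UTF-8 and 2 in GB18030, but
a character outside the two-byte area, such as U+20000, takes 4 bytes in
both encodings.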

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


