Re: Trouble with UTF-8 data

From	Tom Lane
Subject	Re: Trouble with UTF-8 data
Date	17 January 2008, 22:39:05
Msg-id	16915.1200613130@sss.pgh.pa.us
In reply to	Trouble with UTF-8 data (Janine Sisk <janine@furfly.net>)
Responses	Re: Trouble with UTF-8 data ("Albe Laurenz" <laurenz.albe@wien.gv.at>)
List	pgsql-general

Janine Sisk <janine@furfly.net> writes:
> But I'm still getting this error when loading the data into the new
> database:

> ERROR:  invalid byte sequence for encoding "UTF8": 0xeda7a1

The reason PG doesn't like this sequence is that it decodes to U+D9E1,
a Unicode surrogate code point (one half of a "surrogate pair"), which
is not supposed to ever appear in UTF-8 representation --- surrogate
pairs are a kluge for UTF-16 to deal with Unicode code points of more
than 16 bits.  See

http://en.wikipedia.org/wiki/UTF-16
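
To see why the bytes in the error message are rejected, here is a minimal Python sketch that unpacks the three-byte sequence by hand and checks whether the result lands in the surrogate range U+D800..U+DFFF. The helper names `decode_three_byte` and `is_surrogate` are made up for illustration and are not part of PostgreSQL or iconv.

```python
def decode_three_byte(seq: bytes) -> int:
    """Unpack a 3-byte UTF-8-style sequence (1110xxxx 10xxxxxx 10xxxxxx)
    into a code point, deliberately skipping all validity checks."""
    b0, b1, b2 = seq
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


def is_surrogate(cp: int) -> bool:
    """True if cp is in the range reserved for UTF-16 surrogates."""
    return 0xD800 <= cp <= 0xDFFF


bad = bytes.fromhex("eda7a1")          # the bytes from the error message
cp = decode_three_byte(bad)
print(f"U+{cp:04X} surrogate={is_surrogate(cp)}")

# Python's strict UTF-8 decoder refuses the same bytes.
try:
    bad.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print("strict decode accepted:", strict_ok)
```

The value falls in U+D800..U+DBFF, so it is the leading (high) half of a pair and only means something if a trailing (low) surrogate follows it.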

I think you need a version of iconv that knows how to fold surrogate
pairs into proper UTF-8 form.  It might also be that the data is
outright broken --- if this sequence isn't followed by another
surrogate-pair sequence then it isn't valid Unicode by anybody's
interpretation.
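
As a rough illustration of what "folding" surrogate pairs into proper UTF-8 could look like, the following Python sketch is a stand-in for the iconv approach, not a description of what iconv does. It assumes the input encodes each surrogate as its own three-byte sequence, and the function name `fold_surrogates` is hypothetical. It relies on Python's `surrogatepass` error handler to let the surrogates through, then round-trips via UTF-16 so that valid pairs are combined and unpaired surrogates raise an error.

```python
def fold_surrogates(raw: bytes) -> bytes:
    """Re-encode bytes in which surrogate pairs were written as two 3-byte
    sequences, producing standard UTF-8.

    Raises UnicodeDecodeError if a surrogate is not part of a valid pair.
    """
    # Step 1: decode, allowing surrogate code points through unchanged.
    text = raw.decode("utf-8", errors="surrogatepass")
    # Step 2: write each code point as UTF-16; surrogates become code units.
    utf16 = text.encode("utf-16-le", errors="surrogatepass")
    # Step 3: a strict UTF-16 decode combines valid pairs and rejects lone ones.
    return utf16.decode("utf-16-le").encode("utf-8")


# U+1F600 written as two surrogate sequences (D83D, DE00).
paired = bytes.fromhex("eda0bdedb880")
fixed = fold_surrogates(paired)
print(fixed.hex())

# The sequence from the error message followed by plain ASCII: no partner.
try:
    fold_surrogates(bytes.fromhex("eda7a1") + b"A")
    lone_ok = True
except UnicodeDecodeError:
    lone_ok = False
print("lone surrogate accepted:", lone_ok)
```

If the second case applies to the dump, the data is broken in the sense described above and no re-encoding will repair it; the offending bytes would have to be located and fixed by hand.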

7.2.x unfortunately didn't check Unicode data carefully, and would
have let this data pass without comment ...

            regards, tom lane
