Re: UTF8 problem

From: Stephane Bortzmeyer
Subject: Re: UTF8 problem
Msg-id: 20060615143430.GA17590@nic.fr
In reply to: Re: UTF8 problem (Douglas McNaught <doug@mcnaught.org>)
List: pgsql-general
On Thu, Jun 08, 2006 at 07:25:35AM -0400,
 Douglas McNaught <doug@mcnaught.org> wrote
 a message of 29 lines which said:

> I would think it would (at least potentially) vary with each
> message.  The dbmail software should really set client_encoding
> based on the Content-Transfer-Encoding header in the message (or
> whatever it's called).

A *big* warning from someone who stores email in PostgreSQL: many
email messages *lie*. They declare one character encoding (the charset
parameter of the Content-Type header) and then actually use another.

If you blindly try to insert the body of the message into PostgreSQL
using the declared encoding, you will sometimes fail, for instance if
the message claims to be in UTF-8 but is not (something that
PostgreSQL will detect).
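
To illustrate the kind of check involved, here is a minimal sketch in
Python that tests whether a single-part message body really decodes
with the charset it declares. It uses only the standard library email
package; the function name body_matches_declared_charset is
hypothetical, and treating multipart containers as "fine" is a
simplifying assumption (each part would need its own check).

```python
from email import message_from_bytes


def body_matches_declared_charset(raw_message: bytes) -> bool:
    """Return True if the body decodes with the charset the message declares.

    Hypothetical helper for illustration only; handles single-part messages.
    """
    msg = message_from_bytes(raw_message)
    # Charset from the Content-Type header; us-ascii is the MIME default.
    charset = msg.get_content_charset() or "us-ascii"
    # decode=True undoes base64 / quoted-printable and returns bytes.
    payload = msg.get_payload(decode=True)
    if payload is None:
        # Multipart container: assumption, parts are checked elsewhere.
        return True
    try:
        payload.decode(charset)
    except (UnicodeDecodeError, LookupError):
        # Invalid byte sequence, or a charset name Python does not know.
        return False
    return True
```

A message that declares charset=utf-8 but carries a Latin-1 byte such
as 0xE9 in its body fails this check, which is the same situation in
which PostgreSQL would refuse the data.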

You have three options:

* "sanitize" all incoming data
* or accept that these invalid emails will be rejected
* or store them in an unstructured field (a blob)
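
The three options can be sketched as one small Python function. This
is only an illustration under stated assumptions: the name
prepare_body, the policy strings, and the idea of returning a
(text, raw) pair (text for a text column, raw bytes for a binary
column such as bytea) are all hypothetical, and "sanitizing" here
simply replaces undecodable bytes with U+FFFD, which is one possible
choice among several.

```python
from typing import Optional, Tuple


def prepare_body(payload: bytes, declared_charset: str,
                 policy: str) -> Tuple[Optional[str], Optional[bytes]]:
    """Return (text, raw) for storage; exactly one of the two is not None.

    Hypothetical helper: policy is "sanitize", "reject" or "blob".
    """
    try:
        # Honest message: the declared charset really decodes the body.
        return payload.decode(declared_charset), None
    except (UnicodeDecodeError, LookupError):
        pass  # The message lied (or named an unknown charset).

    if policy == "sanitize":
        # Keep what is readable, replace invalid bytes with U+FFFD.
        return payload.decode("utf-8", errors="replace"), None
    if policy == "reject":
        raise ValueError("body does not match declared charset "
                         + repr(declared_charset))
    if policy == "blob":
        # Store the untouched bytes in an unstructured (binary) field.
        return None, payload
    raise ValueError("unknown policy " + repr(policy))
```

Sanitizing loses information but keeps the body searchable as text;
the blob option preserves the original bytes at the cost of text
operations on that field.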



