Discussion: Automatic locale detection?

Automatic locale detection?

From

Matthew Peter

Date:

October 8, 2006, 19:04:06

Is it possible to automatically detect the language encoding of incoming data? For instance if Japanese is used, is there a way to know it is Japanese from a bit in the charset, a dictionary-based evaluation or otherwise?

Re: Automatic locale detection?

From

Martijn van Oosterhout

Date:

October 9, 2006, 11:11:53

On Sun, Oct 08, 2006 at 12:04:01PM -0700, Matthew Peter wrote:
> Is it possible to automatically detect the language encoding of
> incoming data? For instance if Japanese is used, is there a way to
> know it is Japanese from a bit in the charset, a dictionary-based
> evaluation or otherwise?

While technically possible, do you really want to run the risk of
getting it wrong?

Secondly, if you don't know the encoding of your data, you've got a
security problem, since you can't safely escape the data.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachments

signature.asc
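
One way the escaping problem mentioned above can arise, sketched in Python under stated assumptions: in Shift_JIS, the second byte of some characters is 0x5C, the ASCII backslash. The "naive" escaping routine below is hypothetical and only illustrates what goes wrong when bytes are escaped without knowing their encoding; it is not taken from any particular database driver.

```python
# Illustration only: why escaping needs to know the encoding.
text = "表"                       # a single Japanese character
raw = text.encode("shift_jis")    # two bytes in Shift_JIS

# The trailing byte of this character is 0x5C, the ASCII backslash.
has_backslash_byte = b"\\" in raw

# Hypothetical byte-wise escaping that is unaware of the encoding:
# it doubles every 0x5C byte, including the one inside the character.
naive = raw.replace(b"\\", b"\\\\")

# Encoding-aware escaping: work on characters, then encode.
# The text contains no backslash character, so nothing is changed.
aware = text.replace("\\", "\\\\").encode("shift_jis")

print(has_backslash_byte)  # whether a 0x5C byte occurs in the encoded form
print(naive != aware)      # whether the two approaches disagree
```

The byte-wise routine alters data that contains no backslash character at all, which is the kind of mismatch that makes escaping unsafe when the encoding is only guessed.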

Re: Automatic locale detection?

From

Lexington Luthor

Date:

October 9, 2006, 11:54:09

Matthew Peter wrote:
> Is it possible to automatically detect the language encoding of incoming
> data? For instance if Japanese is used, is there a way to know it is
> Japanese from a bit in the charset, a dictionary-based evaluation or
> otherwise?
>

Have a look at http://www.mozilla.org/projects/intl/chardet.html and
http://chardet.feedparser.org/ for some implementations of this idea.

These detectors are often inaccurate though (and sometimes fail
completely), see the warning at the bottom of
http://chardet.feedparser.org/docs/supported-encodings.html

Regards,
LL
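
For a feel of why such detectors can be wrong, here is a minimal sketch in Python of a much simpler technique than the statistical detectors linked above: trial decoding against a list of candidate encodings. The function name `guess_encoding` and the candidate list are assumptions made for illustration, not part of any of the linked projects.

```python
from typing import Optional, Sequence

# Hypothetical candidate list; order matters because the first match wins.
CANDIDATES = ("ascii", "utf-8", "euc_jp", "shift_jis")


def guess_encoding(data: bytes,
                   candidates: Sequence[str] = CANDIDATES) -> Optional[str]:
    """Return the first candidate under which `data` decodes strictly.

    Returns None if no candidate accepts the bytes. A match only means
    the bytes are *valid* in that encoding, not that it is the right one.
    """
    for name in candidates:
        try:
            data.decode(name)  # strict error handling is the default
        except UnicodeDecodeError:
            continue
        return name
    return None


sjis = "日本語".encode("shift_jis")

# With a sensible candidate order the guess matches the real encoding.
print(guess_encoding(sjis))

# latin-1 accepts every byte sequence, so listing it first yields a
# confident but wrong answer for the same input.
print(guess_encoding(sjis, ("latin-1", "shift_jis")))
```

The second call shows the failure mode: many byte strings are valid in more than one encoding, so a detector has to rank plausible answers rather than prove one, and it can rank them wrongly.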
