Discussion: Problem with LATIN1 characters from Perl-DBI

Problem with LATIN1 characters from Perl-DBI

From: Andreas Joseph Krogh
Date:
Hi.
I have a database created with -E LATIN1. Inserting Norwegian characters like
'ø' works perfectly from JDBC, but from Perl, the word 'søker' is stored as
'søker' (UNICODE).
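
The stored form looks like what results when the UTF-8 bytes of 'søker' are
read as Latin1, one character per byte. A minimal sketch of that effect,
assuming the core Encode module is available (it ships with Perl 5.8); the
variable names are illustrative only:

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 encoding of 'søker': the 'ø' is the two bytes 0xC3 0xB8.
my $utf8_bytes = "s\xC3\xB8ker";

# Misinterpret those bytes as ISO-8859-1, so every byte becomes one character.
my $misread = decode('ISO-8859-1', $utf8_bytes);

binmode(STDOUT, ':encoding(UTF-8)');
print length($misread), "\n";   # 6 characters instead of the expected 5
print "$misread\n";             # garbled form: two characters where 'ø' should be
```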

perl --version:
This is perl, v5.8.3 built for i386-linux-thread-multi
A Mandrake-10 Linux system.

I first had the problem printing out LATIN1 chars to stdout too, but solved
that by using the pragma
use encoding 'ISO-8859-1';

I've tried:
$dbh->do("set CLIENT_ENCODING TO 'ISO-8859-1'")
or die("Couldn't set encoding to ISO-8859-1");
but that didn't work.

Any hints anyone?

--
Andreas Joseph Krogh <andreak@officenet.no>
Senior Software Developer / Manager
gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
------------------------+---------------------------------------------+
OfficeNet AS            | Two tomatoes in a fridge. One tomato says   |
Hoffsveien 17           | to the other, "It's cold in here, isn't it?"|
PO. Box 425 Skøyen      | The other tomato says, "F**king hell,       |
0213 Oslo               | a talking tomato!"                          |
NORWAY                  |                                             |
Phone : +47 22 13 01 00 |                                             |
Direct: +47 22 13 10 03 |                                             |
Mobile: +47 909  56 963 |                                             |
------------------------+---------------------------------------------+


Re: Problem with LATIN1 characters from Perl-DBI

From: Andreas Joseph Krogh
Date:
On Tuesday 07 September 2004 14:06, you wrote:
> Hi.
> I have a database created with -E LATIN1. Inserting Norwegian characters
> like 'ø' works perfectly from JDBC, but from Perl, the word 'søker' is
> stored as 'søker' (UNICODE).
>
> perl --version:
> This is perl, v5.8.3 built for i386-linux-thread-multi
> A Mandrake-10 Linux system.
>
> I first had the problem printing out LATIN1 chars to stdout too, but solved
> that by using the pragma
> use encoding 'ISO-8859-1';
>
> I've tried:
> $dbh->do("set CLIENT_ENCODING TO 'ISO-8859-1'")
> or die("Couldn't set encoding to ISO-8859-1");
> but that didn't work.
>
> Any hints anyone?

Replying to myself:

I fixed it by using the following:
use encoding 'ISO-8859-1';
use Unicode::MapUTF8 qw(to_utf8 from_utf8 utf8_supported_charset);

# Convert the UTF-8 text to Latin1 before binding it to the INSERT.
$tmp_text = from_utf8({ -string => $plain_text, -charset => 'ISO-8859-1' });
$retval = $insert_stmt->execute($tmp_text);

The problem was that the contents of $plain_text came from a library that
returns text in UTF-8. When printing it to stdout, the 'use encoding' pragma
took care of the conversion, but that conversion does not happen when the
contents of $plain_text are inserted into the database. So I have to convert
the text to Latin1 with the from_utf8 subroutine *before* inserting it into
the DB.
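
The same conversion can also be sketched with the core Encode module instead
of Unicode::MapUTF8. This is only an illustration under assumptions:
utf8_to_latin1 is a hypothetical helper, and the input is assumed to hold raw
UTF-8 bytes.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn UTF-8 encoded bytes into ISO-8859-1 bytes.
sub utf8_to_latin1 {
    my ($utf8_bytes) = @_;
    my $chars = decode('UTF-8', $utf8_bytes);   # bytes -> Perl characters
    return encode('ISO-8859-1', $chars);        # characters -> Latin1 bytes
}

my $sample_utf8 = "s\xC3\xB8ker";               # UTF-8 bytes for 'søker'
my $sample_latin1 = utf8_to_latin1($sample_utf8);

# Show the result as hex so the output does not depend on the terminal.
print unpack('H*', $sample_latin1), "\n";       # prints 73f86b6572
```

The returned string would then be passed to $insert_stmt->execute() in place
of $tmp_text.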
