Thread: Inserting Unicode into Postgre

Inserting Unicode into Postgre

From: "Firestar"
Date:
Hi,

I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
strings in Big5 encoding and stores them in PostgreSQL (via JDBC). However,
every time I run an INSERT, the inserted strings turn into a series of '?'
(question marks), and when I retrieve them via JDBC I get those question
marks back.

Is the problem due to the Unicode encoding that Java Strings use, or must I
enable multibyte support in my PostgreSQL installation? If I enable multibyte
support, should I create my table with Unicode or with Big5?
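
For illustration, here is one common way such question marks can arise. This is a minimal sketch, not a diagnosis of this particular setup: when a Java String is encoded with a charset that cannot represent its characters, each unmappable character is replaced by '?'. The class name QuestionMarkDemo and the choice of ISO-8859-1 as the lossy charset are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

final class QuestionMarkDemo {
    // Encodes the text as ISO-8859-1 and decodes it again. Characters that
    // ISO-8859-1 cannot represent are replaced by '?' during encoding.
    static String roundTripLatin1(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(encoded, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two CJK characters
        System.out.println(roundTripLatin1(chinese)); // prints "??"
        System.out.println(roundTripLatin1("abc"));   // prints "abc"
    }
}
```

Once the characters have been replaced this way, the original text cannot be recovered from the stored bytes.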

Thanks in advance.

Firestar



Re: Inserting Unicode into Postgre

From: Tatsuo Ishii
Date:
> I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
> strings in Big5 encoding and stores them in PostgreSQL (via JDBC). However,
> every time I run an INSERT, the inserted strings turn into a series of '?'
> (question marks), and when I retrieve them via JDBC I get those question
> marks back.
>
> Is the problem due to the Unicode encoding that Java Strings use, or must I
> enable multibyte support in my PostgreSQL installation? If I enable multibyte
> support, should I create my table with Unicode or with Big5?

First of all, you cannot store Big5 data directly in PostgreSQL. You need to
convert it from Big5 to either EUC_TW or UTF-8 before storing it in a
PostgreSQL database. There are several ways to accomplish this.
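
One of those ways, sketched on the client side in Java: decode the Big5 bytes into a String, then encode that String as UTF-8. This is only an illustration under assumptions. The class and method names are made up for the example, and it assumes the JVM ships the "Big5" charset (standard JDK builds do, but it is not among the charsets every Java platform must support).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Big5ToUtf8 {
    // Decodes Big5 bytes to a Java String, then re-encodes the String as UTF-8.
    static byte[] big5ToUtf8(byte[] big5Bytes) {
        String text = new String(big5Bytes, Charset.forName("Big5"));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Big5 bytes for U+4E2D U+6587 (two CJK characters).
        byte[] big5 = {(byte) 0xA4, (byte) 0xA4, (byte) 0xA4, (byte) 0xE5};
        StringBuilder hex = new StringBuilder();
        for (byte b : big5ToUtf8(big5)) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim()); // prints "E4 B8 AD E6 96 87"
    }
}
```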

The easiest way would be to upgrade to 7.1 with multibyte support enabled
and create a database with UNICODE (actually UTF-8) or EUC_TW
encoding. In this environment, 7.1's JDBC driver would recognize the
database encoding correctly and convert automatically between the
database encoding and Unicode (UTF-16), which Java strings use internally.
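
As a rough sketch of what the client side could look like in that environment (written in modern Java, not as the definitive way to do it): decode the incoming Big5 bytes into a String and pass it to the driver through a PreparedStatement, leaving the conversion to the database encoding to the driver. The table `messages`, its column `body`, the database name `unicode_db`, and the credentials are all hypothetical, and running `main` needs a reachable server plus the PostgreSQL JDBC driver on the classpath.

```java
import java.nio.charset.Charset;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class Big5JdbcInsert {
    // Turns raw Big5 bytes into a Java String (assumes the JVM provides "Big5").
    static String decodeBig5(byte[] big5Bytes) {
        return new String(big5Bytes, Charset.forName("Big5"));
    }

    // Inserts one row into a hypothetical table: CREATE TABLE messages (body text);
    static void insertMessage(Connection conn, String body) throws SQLException {
        String sql = "INSERT INTO messages (body) VALUES (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, body); // the driver encodes the String for the server
            ps.executeUpdate();
        }
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder connection details; the database is assumed to have been
        // created with UNICODE encoding.
        String url = "jdbc:postgresql://localhost/unicode_db";
        byte[] big5 = {(byte) 0xA4, (byte) 0xA4, (byte) 0xA4, (byte) 0xE5};
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            insertMessage(conn, decodeBig5(big5));
        }
    }
}
```

Binding the value with setString rather than building the SQL text by hand keeps the string out of the statement itself, so the only encoding step left is the one the driver performs.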

Ask the Java experts on this list for more details.
--
Tatsuo Ishii


Re: Inserting Unicode into Postgre

From: He Weiping
Date:
Firestar wrote:

> Hi,
>
> I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
> strings in Big5 encoding and stores them in PostgreSQL (via JDBC). However,
> every time I run an INSERT, the inserted strings turn into a series of '?'
> (question marks), and when I retrieve them via JDBC I get those question
> marks back.
>
> Is the problem due to the Unicode encoding that Java Strings use, or must I
> enable multibyte support in my PostgreSQL installation? If I enable multibyte
> support, should I create my table with Unicode or with Big5?

Upgrade to the just-released 7.1; PostgreSQL can now do the Unicode
conversion for you (thanks to Mr. Tatsuo Ishii).
I think you should enable both the --enable-multibyte and
--enable-unicode-conversion switches when building PostgreSQL.

regards

Laser


Re: Inserting Unicode into Postgre

From: "Firestar"
Date:
Hi Tatsuo, thanks for your fast reply.

My string (which contains Big5 characters) is originally read from an
InputStream and created by:
    insertStmt = new String(bytes, "big5");

Since all strings in Java are Unicode, if I enable Unicode support with
PostgreSQL 7.1, should JDBC now be able to insert the string correctly into
the database?
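
For reference, a minimal sketch of decoding Big5 directly from a stream with an InputStreamReader instead of collecting the bytes first. This is an illustration, not the code from my program: the class and method names are hypothetical, and it assumes the JVM provides the "Big5" charset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

final class Big5StreamReader {
    // Reads the whole stream as Big5 text. The reader handles multi-byte
    // characters that happen to be split across read() calls.
    static String readBig5(InputStream in) {
        try (Reader reader = new InputStreamReader(in, Charset.forName("Big5"))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] big5 = {(byte) 0xA4, (byte) 0xA4, (byte) 0xA4, (byte) 0xE5};
        String text = readBig5(new ByteArrayInputStream(big5));
        System.out.println(text.equals("\u4e2d\u6587")); // prints "true"
    }
}
```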

By the way, I can't seem to find the JDBC driver for PostgreSQL 7.1 on the
website. I guess I have to build it myself during installation (as suggested
by the README file)?

Thanks in advance,
Firestar

"Tatsuo Ishii" <t-ishii@sra.co.jp> wrote in message
news:20010417161538B.t-ishii@sra.co.jp...
> > I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
> > strings in Big5 encoding and stores them in PostgreSQL (via JDBC). However,
> > every time I run an INSERT, the inserted strings turn into a series of '?'
> > (question marks), and when I retrieve them via JDBC I get those question
> > marks back.
> >
> > Is the problem due to the Unicode encoding that Java Strings use, or must I
> > enable multibyte support in my PostgreSQL installation? If I enable
> > multibyte support, should I create my table with Unicode or with Big5?
>
> First of all, you cannot store Big5 data directly in PostgreSQL. You need to
> convert it from Big5 to either EUC_TW or UTF-8 before storing it in a
> PostgreSQL database. There are several ways to accomplish this.
>
> The easiest way would be to upgrade to 7.1 with multibyte support enabled
> and create a database with UNICODE (actually UTF-8) or EUC_TW
> encoding. In this environment, 7.1's JDBC driver would recognize the
> database encoding correctly and convert automatically between the
> database encoding and Unicode (UTF-16), which Java strings use internally.
>
> Ask the Java experts on this list for more details.
> --
> Tatsuo Ishii