Discussion: Inserting Unicode into Postgre
Hi,

I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives strings in Big5 encoding and stores them in PostgreSQL (via JDBC). However, every time I run an insert command, the inserted strings turn into a run of '?' (question marks). When I retrieve them via JDBC, I get those question marks back.

Is the problem due to the Unicode encoding that Java Strings use, or must I enable multibyte support in my PostgreSQL installation? If I enable multibyte support, should I create my table with Unicode support, or Big5?

Thanks in advance.

Firestar
> I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
> strings in Big5 encoding and will store them in PostgreSQL (via JDBC).
> However, the inserted strings become multiple '?' (question marks)
> every time I do an insert command. And when I retrieve them via JDBC,
> the string becomes those question marks.
>
> Is the problem due to the Unicode encoding that Java String uses, or
> must I enable multibyte support in my postgre installation? If I enable
> multibyte support, should I create my table with Unicode support, or
> Big5?

First of all, you cannot store Big5 data directly in PostgreSQL. You need to convert Big5 to either EUC_TW or UTF-8 before storing it in a PostgreSQL database. There are several ways to accomplish this.

The easiest way would be to upgrade to 7.1 with multibyte support enabled and create a database with UNICODE (actually UTF-8) or EUC_TW encoding. In this environment, 7.1's JDBC driver should recognize the database encoding correctly and convert automatically between the database encoding and Unicode, which is what Java strings use internally.

Ask the Java experts on this list for more details.

--
Tatsuo Ishii
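To make the conversion step concrete, here is a minimal Java sketch of turning Big5 bytes into UTF-8 bytes by going through a Java String. It also shows one plausible way the question marks can appear: encoding a String into a charset that cannot represent its characters substitutes '?' for each one. Whether that is what happened in the original poster's setup is an assumption. The class and method names are made up for illustration, and the "Big5" charset is assumed to be available in the JDK in use (it is an extended charset, not part of StandardCharsets).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Big5ToUtf8 {
    // Assumption: the running JDK ships the extended "Big5" charset.
    static final Charset BIG5 = Charset.forName("Big5");

    /** Decode Big5 bytes into a Java String, then re-encode that String as UTF-8. */
    static byte[] big5ToUtf8(byte[] big5Bytes) {
        String text = new String(big5Bytes, BIG5);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Encode to US-ASCII and back; characters ASCII cannot represent become '?'. */
    static String lossyAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String sample = "\u4e2d\u6587"; // two Han characters
        byte[] big5 = sample.getBytes(BIG5);
        byte[] utf8 = big5ToUtf8(big5);
        System.out.println(big5.length + " Big5 bytes, " + utf8.length + " UTF-8 bytes");
        System.out.println(lossyAscii(sample));
    }
}
```

The point of the sketch is that the String itself carries no Big5 or UTF-8 bytes. The encoding only matters at the boundaries where bytes are read or written, which is why the database encoding and the driver's handling of it decide what ends up stored.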
Firestar wrote:
> Hi,
>
> I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
> strings in Big5 encoding and will store them in PostgreSQL (via JDBC).
> However, the inserted strings become multiple '?' (question marks)
> every time I do an insert command. And when I retrieve them via JDBC,
> the string becomes those question marks.
>
> Is the problem due to the Unicode encoding that Java String uses, or
> must I enable multibyte support in my postgre installation? If I enable
> multibyte support, should I create my table with Unicode support, or
> Big5?
>

Upgrade to the just-released 7.1; PostgreSQL can now do the Unicode conversion for you (thanks to Mr. Tatsuo Ishii). I think you should enable both the enable-multibyte and enable-unicode-conversion switches when building PostgreSQL.

regards
Laser
Hi Tatsuo,

Thanks for your fast reply. My string (which contains Big5 characters) is originally read from an InputStream and created by:

    insertStmt = new String(bytes, "big5")

Since all strings in Java are Unicode, if I enable Unicode support with PostgreSQL 7.1, should JDBC now be able to insert the string correctly into the database?

By the way, I can't seem to find the JDBC driver for PostgreSQL 7.1 on the website. I guess I have to build it myself during the installation (as suggested by the README file)?

Thanks in advance,
Firestar

"Tatsuo Ishii" <t-ishii@sra.co.jp> wrote in message news:20010417161538B.t-ishii@sra.co.jp...
> > I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
> > strings in Big5 encoding and will store them in PostgreSQL (via JDBC).
> > However, the inserted strings become multiple '?' (question marks)
> > every time I do an insert command. And when I retrieve them via JDBC,
> > the string becomes those question marks.
> >
> > Is the problem due to the Unicode encoding that Java String uses, or
> > must I enable multibyte support in my postgre installation? If I enable
> > multibyte support, should I create my table with Unicode support, or
> > Big5?
>
> First of all, you cannot store Big5 data into PostgreSQL. You need to
> convert Big5 to either EUC_TW or UTF-8 before storing them into
> PostgreSQL database. There are several ways to accomplish this.
>
> The easiest way would be upgrade to 7.1 with multibyte support enabled
> and create a database with UNICODE (actually UTF-8) or EUC_TW
> encoding. In this environment, 7.1's JDBC driver would recognize the
> database encoding correctly, and do an automatic conversion between
> database encodings and UTF-8, that is Java's internal encoding.
>
> Ask Java experts on this list for more details.
> --
> Tatsuo Ishii
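For the follow-up question, a rough sketch of the read-then-insert path may help. It reads all bytes from a stream, decodes them as Big5 in one step, and binds the resulting String as a statement parameter so that the JDBC driver handles the encoding toward the server. Everything named here is an assumption for illustration: the class and method names, the hypothetical table "messages" with a text column "body", the availability of the "Big5" charset in the JDK, and a driver that converts to the database encoding as described in the reply above. It is written against a modern JDK, not the one that was current for PostgreSQL 7.1.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class Big5Insert {
    // Assumption: the running JDK ships the extended "Big5" charset.
    static final Charset BIG5 = Charset.forName("Big5");

    /** Read the whole stream first, then decode once, so no 2-byte character is split. */
    static String readBig5(InputStream in) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), BIG5);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Bind the String as a parameter; table "messages" and column "body" are hypothetical. */
    static void insert(Connection conn, String text) throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO messages (body) VALUES (?)")) {
            ps.setString(1, text);
            ps.executeUpdate();
        }
    }

    public static void main(String[] args) {
        // Demonstrates only the decoding half; insert() needs a live Connection.
        byte[] raw = "\u4e2d\u6587".getBytes(BIG5);
        String decoded = readBig5(new ByteArrayInputStream(raw));
        System.out.println(decoded.length() + " characters decoded");
    }
}
```

Decoding after the full read, instead of per chunk, avoids cutting a two-byte Big5 character in half at a buffer boundary. Binding with setString keeps the text as a String all the way to the driver, so the application never has to pick a byte encoding for the database itself.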