Thread: How to restore a SQL-ASCII encoded database to a new UTF-8 db?

How to restore a SQL-ASCII encoded database to a new UTF-8 db?

From: Postgres User
Date:

Hi,

I have a database that was created with SQL-ASCII encoding
(unfortunately).  I ran pg_restore to load the structure and data into a
new database with UTF-8 encoding but, no surprise, I'm seeing this
error for a number of tables:

pg_restore: [archiver (db)] COPY failed: ERROR:  invalid byte sequence for encoding "UTF8"

Any idea on how I can copy the data between these databases without
any data loss?  For some reason I thought that a conversion to Unicode
would be easy.

Thanks

Re: How to restore a SQL-ASCII encoded database to a new UTF-8 db?

From: "Albe Laurenz"
Date:

> I have a database that was created with SQL-ASCII encoding
> (unfortunately).  I ran pg_restore to load the structure and data into a
> new database with UTF-8 encoding but, no surprise, I'm seeing this
> error for a number of tables:
>
> pg_restore: [archiver (db)] COPY failed: ERROR:  invalid byte sequence for encoding "UTF8"
>
> Any idea on how I can copy the data between these databases without
> any data loss?  For some reason I thought that a conversion to Unicode
> would be easy.

Conversion to Unicode is easy if you know the encoding of your data
and that encoding is used consistently :^)

Try to figure out the encoding of your data.

Then dump in plain-text format and change the "SET client_encoding"
line in the dump to name that encoding.

Yours,
Laurenz Albe
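
As a rough aid for the "figure out the encoding" step, here is a minimal Python sketch that lists the lines of a plain-text dump that are not valid UTF-8 and shows how each would read under a few candidate encodings. The function names, the candidate list, and the sample bytes are all hypothetical; the sketch cannot decide the encoding for you, because single-byte encodings such as Latin-1 accept any byte, so a person still has to judge which rendering looks right.

```python
def non_utf8_lines(data: bytes):
    """Yield (line_number, raw_line) for every line that is not valid UTF-8."""
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            yield number, line


def preview(raw: bytes, candidates=("latin-1", "cp1252")):
    """Decode one raw line under each candidate encoding for visual inspection."""
    return {name: raw.decode(name, errors="replace") for name in candidates}


# Hypothetical sample standing in for the bytes of a plain-text dump.
# In practice you would read the dump with open(path, "rb").read().
sample = b"1\tcaf\xe9\n2\tplain ascii\n3\tna\xc3\xafve\n"

suspects = list(non_utf8_lines(sample))
for number, raw in suspects:
    print(number, preview(raw))
```

If the suspect lines read sensibly under exactly one candidate, that is a reasonable guess for the value to put in the SET client_encoding line. If different lines need different candidates, the data is mixed and a single setting will not convert it cleanly.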

Re: How to restore a SQL-ASCII encoded database to a new UTF-8 db?

From: Tommy Gildseth
Date:

Postgres User wrote:
> Hi,
>
> I have a database that was created with SQL-ASCII encoding
> (unfortunately).  I ran pg_restore to load the structure and data into a
> new database with UTF-8 encoding but, no surprise, I'm seeing this
> error for a number of tables:
>
> pg_restore: [archiver (db)] COPY failed: ERROR:  invalid byte sequence for encoding "UTF8"
>
> Any idea on how I can copy the data between these databases without
> any data loss?  For some reason I thought that a conversion to Unicode
> would be easy.


Provided the dump doesn't contain characters from several different
character sets, or invalid characters, you may be able to import it
just by changing the client encoding in the dump. There's probably a
line saying something like
"SET client_encoding = 'SQL_ASCII';"
If you change that to
"SET client_encoding = 'whatever_encoding_your_data_is_in';"
you may be able to import it. IIRC, PostgreSQL doesn't do any
conversion between SQL_ASCII and any other encoding, but if you declare
the correct encoding, PostgreSQL will handle the conversion automatically.

--
Tommy Gildseth
DBA, Gruppe for databasedrift
Universitetet i Oslo, USIT
m: +47 45 86 38 50
t: +47 22 85 29 39
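
The edit to the SET client_encoding line can also be scripted. The following is a sketch only, assuming a plain-text dump held in memory as bytes; the helper name set_client_encoding and the sample dump are hypothetical. It works at the byte level on purpose, so the data rows are left exactly as they are and the server does the conversion on import.

```python
import re

# Matches lines such as:  SET client_encoding = 'SQL_ASCII';
_SET_LINE = re.compile(
    rb"^(SET\s+client_encoding\s*(?:=|TO)\s*)'?[\w-]+'?;",
    re.IGNORECASE | re.MULTILINE,
)


def set_client_encoding(dump: bytes, encoding: str) -> bytes:
    """Return the dump with its first SET client_encoding line rewritten."""
    new_value = encoding.encode("ascii")
    new_dump, count = _SET_LINE.subn(
        lambda m: m.group(1) + b"'" + new_value + b"';", dump, count=1
    )
    if count == 0:
        raise ValueError("no SET client_encoding line found in dump")
    return new_dump


# Hypothetical dump fragment; \xe9 is an e-acute in Latin-1.
dump = (
    b"SET statement_timeout = 0;\n"
    b"SET client_encoding = 'SQL_ASCII';\n"
    b"COPY t (a) FROM stdin;\n"
    b"caf\xe9\n"
    b"\\.\n"
)
fixed = set_client_encoding(dump, "LATIN1")
```

LATIN1 is used here purely as an example value; substitute whatever encoding the data is really in.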

Re: How to restore a SQL-ASCII encoded database to a new UTF-8 db?

From: skmanji
Date:

You can do this by converting the characters in the raw dump file directly, assuming the data is actually ISO-8859-1 (Latin-1):

    iconv -f 8859_1 -t UTF-8 backup.db.psql > backup.db.psql.utf8

Then change the line in backup.db.psql.utf8 from:

    SET client_encoding = 'SQL_ASCII';

to:

    SET client_encoding = 'UTF8';
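
For readers without iconv at hand, roughly the same two steps can be sketched in Python. This is an illustration under the same assumption that every byte in the dump is Latin-1; the function name transcode_dump and the sample bytes are hypothetical. Note that decoding as Latin-1 never fails, so wrongly guessed input is converted silently into wrong characters rather than rejected.

```python
def transcode_dump(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode a plain-text dump as UTF-8 and update its client_encoding line."""
    text = raw.decode(source_encoding)
    # Only the first occurrence is changed; pg_dump writes this line near the top.
    text = text.replace(
        "SET client_encoding = 'SQL_ASCII';",
        "SET client_encoding = 'UTF8';",
        1,
    )
    return text.encode("utf-8")


# Hypothetical dump fragment; with real files you would read backup.db.psql
# in binary mode and write the result to backup.db.psql.utf8.
raw = b"SET client_encoding = 'SQL_ASCII';\ncaf\xe9\n"
converted = transcode_dump(raw)
```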
