Thread: A client and server encoding question

A client and server encoding question

From
Amit Langote
Date:
Hi,

With a server initdb'd with UTF8 encoding, if I create a table from a
client using LATIN1 encoding and later try to work with that table
from a client using UTF8 encoding (both cases simulated in a single
psql session by changing client_encoding), I get an error. The
following session illustrates the problem:

psql (9.2.4)
Type "help" for help.

postgres=# SHOW server_encoding;
 server_encoding
-----------------
 UTF8
(1 row)
Time: 0.761 ms

postgres=# SET client_encoding TO LATIN1;
SET
Time: 1.382 ms

postgres=# create table id_äß(ID int);
CREATE TABLE
Time: 31.344 ms

postgres=# \dt
        List of relations
 Schema |  Name   | Type  | Owner
--------+---------+-------+-------
 public | id_äß | table | amit
(1 row)

postgres=# SET client_encoding TO UTF8;
SET
Time: 1.007 ms

postgres=# \dt
           List of relations
 Schema |     Name     | Type  | Owner
--------+--------------+-------+-------
 public | id_Ã¤Ã\u009F | table | amit
(1 row)

postgres=# drop table id_äß;
ERROR:  table "id_äß" does not exist
Time: 1.668 ms

postgres=# SET client_encoding TO LATIN1;
SET
Time: 0.745 ms

postgres=# drop table id_äß;
DROP TABLE
Time: 16.954 ms

I was under the impression that the above should not cause any problem.
Shouldn't UTF8 handle this situation gracefully? Or am I missing
something?

--
Amit Langote


Re: A client and server encoding question

From
Albe Laurenz
Date:
You are missing that your terminal is still running with a UTF8 locale.

So when you create the table, you are feeding psql the bytes \x69645fc3a4c39f:
69 ...... "i"
64 ...... "d"
5f ...... "_"
c3a4 .... "ä"
c39f .... "ß"

But you told psql that you are going to feed it LATIN1, so these
7 bytes are interpreted as 7 LATIN1 characters, converted to UTF8,
and the table actually has this name: \x69645fc383c2a4c383c29f
because the server uses UTF8.
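
The byte arithmetic above can be checked outside the database. The following is a minimal Python sketch, not anything PostgreSQL runs; it assumes only that the terminal sends UTF-8 and that the incoming bytes are treated as LATIN1 before being stored as UTF8. The variable names are illustrative.

```python
# Identifier as typed; escapes are used so the example does not depend on
# the encoding of this file (\u00e4 is "ä", \u00df is "ß").
typed = "id_\u00e4\u00df"

# A UTF-8 terminal sends these 7 bytes to psql.
sent_bytes = typed.encode("utf-8")
print(sent_bytes.hex())        # 69645fc3a4c39f

# With client_encoding = LATIN1, each byte is taken as one LATIN1 character ...
as_latin1 = sent_bytes.decode("latin-1")
print(len(as_latin1))          # 7

# ... and those 7 characters are then encoded in the server encoding, UTF8.
stored_bytes = as_latin1.encode("utf-8")
print(stored_bytes.hex())      # 69645fc383c2a4c383c29f
```

The result is an 11-byte name: the two non-ASCII characters were each encoded twice.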

If you change your client encoding back to UTF8, no conversion
between client and server will take place, and it's hardly
surprising that the server complains if you tell it to drop
the table with the name \x69645fc3a4c39f.
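
Why the DROP TABLE fails under one client encoding and succeeds under the other can be sketched the same way. In the Python snippet below, `to_server` is a hypothetical stand-in for the client-to-server conversion, not a PostgreSQL function; it only assumes the server encoding is UTF8.

```python
# Name as actually stored by the server (from the explanation above).
stored = bytes.fromhex("69645fc383c2a4c383c29f")

# Bytes a UTF-8 terminal sends for "id_äß".
typed = "id_\u00e4\u00df".encode("utf-8")

def to_server(client_bytes: bytes, client_encoding: str) -> bytes:
    """Hypothetical helper: interpret the client's bytes in the declared
    client encoding and re-encode them as UTF8 for the server."""
    return client_bytes.decode(client_encoding).encode("utf-8")

# client_encoding = UTF8: bytes pass through unchanged, so no table matches.
print(to_server(typed, "utf-8") == stored)     # False

# client_encoding = LATIN1: the same mis-conversion happens again,
# so the name matches and the DROP succeeds.
print(to_server(typed, "latin-1") == stored)   # True
```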

Yours,
Laurenz Albe

Re: A client and server encoding question

From
Amit Langote
Date:
On Tue, Oct 22, 2013 at 7:00 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> You are missing that your terminal is still running with a UTF8 locale.

You are right; I missed that my terminal emulator was still
feeding UTF8 into psql.

--
Amit