Re: inserts bypass encoding conversion

From Tom Lane
Subject Re: inserts bypass encoding conversion
Date
Msg-id 1727535.1692240032@sss.pgh.pa.us
In reply to RE: inserts bypass encoding conversion  ("James Pang (chaolpan)" <chaolpan@cisco.com>)
List pgsql-admin
"James Pang (chaolpan)" <chaolpan@cisco.com> writes:
> So, insert into values(chr(226)||chr(128)||chr(166)) actually got stored in database with LATIN1 with single byte
> sequence, but when query select * from testutf8, it got converted to UTF8 three byte sequence first?

There are no LATIN1 characters that have longer than 2-byte UTF8
representations, so no.
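
The 2-byte limit can be checked outside the database. The following is a small Python sketch, not anything PostgreSQL runs; it assumes Python's "latin-1" codec as a stand-in for LATIN1 (ISO-8859-1), and the variable name is only illustrative:

```python
# Every LATIN1 byte maps to a code point in U+0000..U+00FF, and UTF8
# needs at most two bytes for any code point below U+0800.
longest = max(
    len(bytes([b]).decode("latin-1").encode("utf-8"))
    for b in range(256)
)
print(longest)  # 2
```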

I think your fundamental misunderstanding is supposing that this:

    chr(226)||chr(128)||chr(166)

produces something equivalent to the UTF8 sequence 0xe2 0x80 0xa6.
It will not, no matter which server encoding you are dealing with.
It will produce something that is three separate characters
according to the server encoding.  In LATIN1, that could well be
the byte sequence 0xe2 0x80 0xa6, but *that byte sequence does not
mean the same thing that it would mean in UTF8 encoding*.

You also seem not to grasp the fact that an encoding conversion
will happen between your client and the server if client_encoding
is different from server_encoding.  Because of that, the output of
a SELECT command doesn't prove much of anything here.
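
As a rough model of why the SELECT output is inconclusive, the Python sketch below imitates the conversion step. send_to_client is a hypothetical helper, not a PostgreSQL function, and the stored bytes are an assumed example of what a LATIN1 database might hold:

```python
def send_to_client(stored: bytes, server_encoding: str, client_encoding: str) -> bytes:
    """Hypothetical stand-in for the conversion done on the way to the client."""
    if server_encoding == client_encoding:
        return stored  # encodings match, bytes pass through unchanged
    return stored.decode(server_encoding).encode(client_encoding)

stored = bytes([0xE2, 0x80, 0xA6])  # assumed contents of a LATIN1 column

same = send_to_client(stored, "latin-1", "latin-1")
converted = send_to_client(stored, "latin-1", "utf-8")

print(same.hex(" "))       # e2 80 a6
print(converted.hex(" "))  # c3 a2 c2 80 c2 a6
```

What the client sees depends on client_encoding, so the bytes shown by a query need not match the bytes stored on disk.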

            regards, tom lane


