Обсуждение: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Поиск
Список
Период
Сортировка

lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
Alexander Farber
Дата:
Hello fellow PostgreSQL users,

does anybody else observe the problem, that calling lower() method on UTF8 cyrillic strings works on Mac and Linux for version 9.5.3, but fails on Windows 7 / 64 bit (I am using the unzippable version w/o installer)?

I am probably not providing enough information here... not sure what else to write.

Here is my code (trying to lowercase a string before saving it in 2-dim. array):


        FOR _tile IN SELECT * FROM JSONB_ARRAY_ELEMENTS(in_tiles)
        LOOP
                _letter :=  _tile->>'letter';
                _value  := (_tile->>'value')::int;
                _col    := (_tile->>'col')::int + 1;
                _row    := (_tile->>'row')::int + 1;

                RAISE NOTICE 'tile = %', _tile;

                IF NOT words_valid_tile(_letter,
                                        _value,
                                        _col,
                                        _row) THEN
                        RAISE EXCEPTION 'Invalid tile = %', _tile;
                END IF;

                IF (_letters[_col][_row] IS NOT NULL) THEN
                        RAISE EXCEPTION 'Cell already occupied %', _tile;
                END IF;
                 
                _letters[_col][_row] := lower(_letter);
                RAISE NOTICE 'letter = %', _letters[_col][_row];  -- STILL UPPERCASE
                _values[_col][_row]  := _value;

        END LOOP;

I wonder if there is a workaround for this problem on Windows

Thank you
Alex


Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
Alexander Farber
Дата:
Here the Windows-log excerpt (the 5 cyrillic letters stay uppercased):

LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
LOG:  statement: SET client_encoding = 'UTF8';

LOG:  execute <unnamed>: SELECT out_gid AS gid FROM words_play_game($1, $2, $3::jsonb)
DETAIL:  parameters: $1 = '1', $2 = '3', $3 = '[{"col":7,"letter":"П","row":11,"value":2},{"col":7,"letter":"И","row":10,"value":1},{"col":7,"letter":"Л","row":9,"value":2},{"col":7,"letter":"О","row":8,"value":1},{"col":7,"letter":"П","row":7,"value":2}]'
NOTICE:  tile = {"col": 7, "row": 11, "value": 2, "letter": "П"}
NOTICE:  letter = П
NOTICE:  tile = {"col": 7, "row": 10, "value": 1, "letter": "И"}
NOTICE:  letter = И
NOTICE:  tile = {"col": 7, "row": 9, "value": 2, "letter": "Л"}
NOTICE:  letter = Л
NOTICE:  tile = {"col": 7, "row": 8, "value": 1, "letter": "О"}
NOTICE:  letter = О
NOTICE:  tile = {"col": 7, "row": 7, "value": 2, "letter": "П"}
NOTICE:  letter = П

Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
Alexander Farber
Дата:
More info:

# \l+
                                                  
   Name    |  Owner  | Encoding | Collate | Ctype |
-----------+---------+----------+---------+-------+
 postgres  | user1   | UTF8     | C       | C     |
 template0 | user1   | UTF8     | C       | C     |
           |         |          |         |       |
 template1 | user1   | UTF8     | C       | C     |
           |         |          |         |       |
 words     | user1   | UTF8     | C       | C     |

Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
"Charles Clavadetscher"
Дата:
Hello

> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
> Sent: Montag, 8. August 2016 09:10
> To: pgsql-general <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux
>
> More info:
>
> # \l+
>
>   Name    |  Owner  | Encoding | Collate | Ctype |
> -----------+---------+----------+---------+-------+
> postgres  | user1   | UTF8     | C       | C     |
> template0 | user1   | UTF8     | C       | C     |
>           |         |          |         |       |
> template1 | user1   | UTF8     | C       | C     |
>           |         |          |         |       |
> words     | user1   | UTF8     | C       | C     |


I cannot test on 9.5, which you are using, but I was able to reproduce the problem on 9.3:

kofadmin@kofdb.localhost=> CREATE DATABASE test TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE 'C' LC_CTYPE 'C';
CREATE DATABASE
kofadmin@kofdb.localhost=> \l
                                                 List of databases
   Name    |  Owner   | Encoding |          Collate           |           Ctype            |   Access privileges
-----------+----------+----------+----------------------------+----------------------------+-----------------------
 kofdb     | kofadmin | UTF8     | English_United States.1252 | English_United States.1252 |
 test      | kofadmin | UTF8     | C                          | C                          |

kofadmin@kofdb.localhost=> \c test
psql (9.6devel, server 9.3.6)
You are now connected to database "test" as user "kofadmin".
kofadmin@test.localhost=> select lower('P');  <-- This works (latin letters)
 lower
-------
 p
(1 row)

kofadmin@test.localhost=> select lower('И'); <-- This does not work (cyrillic)
 lower
-------
 И
(1 row)

kofadmin@test.localhost=> \c kofdb
psql (9.6devel, server 9.3.6)
You are now connected to database "kofdb" as user "kofadmin".
kofadmin@kofdb.localhost=> select lower('И'); <-- This works on a DB with another collation and ctype
 lower
-------
 и
(1 row)

It seems to be a problem with collation and or ctype.

What are the settings of the database on your Linux system where all works correct?

Bye
Charles




Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
"Charles Clavadetscher"
Дата:
Hello

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Charles
> Clavadetscher
> Sent: Montag, 8. August 2016 09:30
> To: 'Alexander Farber' <alexander.farber@gmail.com>; 'pgsql-general' <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux
>
> Hello
>
> > From: pgsql-general-owner@postgresql.org
> > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander
> > Farber
> > Sent: Montag, 8. August 2016 09:10
> > To: pgsql-general <pgsql-general@postgresql.org>
> > Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows,
> > but works on Mac, Linux
> >
> > More info:
> >
> > # \l+
> >
> >   Name    |  Owner  | Encoding | Collate | Ctype |
> > -----------+---------+----------+---------+-------+
> > postgres  | user1   | UTF8     | C       | C     |
> > template0 | user1   | UTF8     | C       | C     |
> >           |         |          |         |       |
> > template1 | user1   | UTF8     | C       | C     |
> >           |         |          |         |       |
> > words     | user1   | UTF8     | C       | C     |
>
>
> I cannot test on 9.5, which you are using, but I was able to reproduce the problem on 9.3:
>
> kofadmin@kofdb.localhost=> CREATE DATABASE test TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE 'C' LC_CTYPE 'C';
> CREATE DATABASE kofadmin@kofdb.localhost=> \l
>                                                  List of databases
>    Name    |  Owner   | Encoding |          Collate           |           Ctype            |   Access privileges
> -----------+----------+----------+----------------------------+----------------------------+-----------------------
>  kofdb     | kofadmin | UTF8     | English_United States.1252 | English_United States.1252 |
>  test      | kofadmin | UTF8     | C                          | C                          |
>
> kofadmin@kofdb.localhost=> \c test
> psql (9.6devel, server 9.3.6)
> You are now connected to database "test" as user "kofadmin".
> kofadmin@test.localhost=> select lower('P');  <-- This works (latin letters)  lower
> -------
>  p
> (1 row)
>
> kofadmin@test.localhost=> select lower('И'); <-- This does not work (cyrillic)  lower
> -------
>  И
> (1 row)
>
> kofadmin@test.localhost=> \c kofdb
> psql (9.6devel, server 9.3.6)
> You are now connected to database "kofdb" as user "kofadmin".
> kofadmin@kofdb.localhost=> select lower('И'); <-- This works on a DB with another collation and ctype  lower
> -------
>  и
> (1 row)
>
> It seems to be a problem with collation and or ctype.
>
> What are the settings of the database on your Linux system where all works correct?

A possible workaround:

kofadmin@test.localhost=> CREATE COLLATION "en_US" (LOCALE = 'English_United States.1252');
CREATE COLLATION

kofadmin@test.localhost=> select * from pg_collation;
 collname | collnamespace | collowner | collencoding |        collcollate         |         collctype
----------+---------------+-----------+--------------+----------------------------+----------------------------
 default  |            11 |        10 |           -1 |                            |
 C        |            11 |        10 |           -1 | C                          | C
 POSIX    |            11 |        10 |           -1 | POSIX                      | POSIX
 en_US    |          2200 |     16394 |            6 | English_United States.1252 | English_United States.1252
(4 rows)

kofadmin@test.localhost=> with x as (select 'И'::text collate "en_US" as letter) select lower(letter) from x ;
 lower
-------
 и
(1 row)

Sure, bit uncomfortable.
Bye
Charles

>
> Bye
> Charles
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
Alexander Farber
Дата:
Thank you for the replies.

On CentOS 7 Linux with pgdg 9.5.3 the lower() method works and I have:

 # \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |
-----------+----------+----------+-------------+-------------+
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
           |          |          |             |             |
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
           |          |          |             |             |
 words     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |


Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
"Charles Clavadetscher"
Дата:
Hello

> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
> Sent: Montag, 8. August 2016 10:12
> Cc: pgsql-general <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux
>
> Thank you for the replies.
> On CentOS 7 Linux with pgdg 9.5.3 the lower() method works and I have:
>
> # \l
>                                  List of databases
>   Name    |  Owner   | Encoding |   Collate   |    Ctype    |
> -----------+----------+----------+-------------+-------------+
> postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
> template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>           |          |          |             |             |
> template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>           |          |          |             |             |
> words     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |

Well yes, anogher collation and ctype.

I found an easier way to use the workaround after create collation:

kofadmin@test.localhost=> select lower(('И'::text collate "en_US")) ;
 lower
-------
 и
(1 row)

Maybe other more expert than me on this topic will suggest better solution.
Bye
Charles




Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
Alexander Farber
Дата:
Hello Charles, unfortunately on Windows 7 this fails:

psql (9.5.3)
Type "help" for help.

# select lower(('И'::text collate "en_US")) ;
ERROR:  collation "en_US" for encoding "UTF8" does not exist
LINE 1: select lower(('?'::text collate "en_US")) ;
                                ^
By the way I the following code works well for me on all 3 platforms:

CREATE TABLE words_verbs (
        word varchar(255) PRIMARY KEY CHECK (
                word ~ '^[А-Я]{2,}$' AND
                word !~ '[ЖШ]Ы' AND
                word !~ '[ЧЩ]Я' AND
                word !~ 'Ц[ЮЯ]' AND
                (word ~ '[ТЧ]ЬСЯ$' OR
                 word ~ '[ТЧ]Ь$' OR 
                 word ~ 'ТИ$')),
        hashed varchar(32) NOT NULL
);

but I understand that it is probably different methods on the lower layer (pcre instead of some collating functions?)....

Regards
Alex

Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
"Charles Clavadetscher"
Дата:
Hello Alexander

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
> Sent: Montag, 8. August 2016 10:21
> Cc: pgsql-general <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux
>
> Hello Charles, unfortunately on Windows 7 this fails:
>
> psql (9.5.3)
> Type "help" for help.
>
> # select lower(('И'::text collate "en_US")) ;
> ERROR:  collation "en_US" for encoding "UTF8" does not exist LINE 1: select lower(('?'::text collate "en_US")) ;

I assume that you did not create the collation yet as I mentioned in a previous mail.

kofadmin@test.localhost=> CREATE COLLATION "en_US" (LOCALE = 'English_United States.1252');
CREATE COLLATION

Which locale can be created depends on those available on your OS.

Bye
Charles

> By the way I the following code works well for me on all 3 platforms:
>
> CREATE TABLE words_verbs (
>         word varchar(255) PRIMARY KEY CHECK (
>                 word ~ '^[А-Я]{2,}$' AND
>                 word !~ '[ЖШ]Ы' AND
>                 word !~ '[ЧЩ]Я' AND
>                 word !~ 'Ц[ЮЯ]' AND
>                 (word ~ '[ТЧ]ЬСЯ$' OR
>                  word ~ '[ТЧ]Ь$' OR
>                  word ~ 'ТИ$')),
>         hashed varchar(32) NOT NULL
> );
>
>
> but I understand that it is probably different methods on the lower layer (pcre instead of some collating
> functions?)....
>
>
> Regards
>
> Alex




Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

От
Tom Lane
Дата:
Alexander Farber <alexander.farber@gmail.com> writes:
> More info:
> # \l+

>    Name    |  Owner  | Encoding | Collate | Ctype |
> -----------+---------+----------+---------+-------+
>  postgres  | user1   | UTF8     | C       | C     |
>  template0 | user1   | UTF8     | C       | C     |
>            |         |          |         |       |
>  template1 | user1   | UTF8     | C       | C     |
>            |         |          |         |       |
>  words     | user1   | UTF8     | C       | C     |

Well, there's your problem: in C locale, only the standard ASCII letters
will be transformed by upper/lower.

            regards, tom lane