Discussion: BUG #17676: Text comparison appears to be wrong

BUG #17676: Text comparison appears to be wrong

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17676
Logged by:          Rob Johnson
Email address:      robj@hightouchinc.com
PostgreSQL version: 14.5
Operating system:   Ubuntu
Description:

No tables are needed.  Just ran this, comparing strings with lower-case 'x'
and period '.' characters.  The first two columns are false as expected, the
last column is true, which appears to be wrong.

=> select '.' > 'x' as first, '.x' > 'x.' as second, '.xx' > 'x..' as third;

 first | second | third 
-------+--------+-------
 f     | f      | t
(1 row)

My Postgres version:
=> select version();
                                                             version
---------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 14.5 (Ubuntu 14.5-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0, 64-bit
(1 row)

I am located in the United States and haven't done anything to change
character sets, collations, or anything like that.  The \l+ psql command
shows this for my database, which is called nigeldb:

=> \l+ nigeldb
                                                List of databases
  Name   |  Owner   | Encoding |   Collate   |    Ctype    | Access privileges | Size  | Tablespace | Description 
---------+----------+----------+-------------+-------------+-------------------+-------+------------+-------------
 nigeldb | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                   | 48 MB | pg_default | 
(1 row)


Re: BUG #17676: Text comparison appears to be wrong

From
"David G. Johnston"
Date:
On Thu, Nov 3, 2022 at 12:56 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      17676
Logged by:          Rob Johnson
Email address:      robj@hightouchinc.com
PostgreSQL version: 14.5
Operating system:   Ubuntu
Description:       

No tables are needed.  Just ran this, comparing strings with lower-case 'x'
and period '.' characters.  The first two columns are false as expected, the
last column is true, which appears to be wrong.

=> select '.' > 'x' as first, '.x' > 'x.' as second, '.xx' > 'x..' as third;

                                                List of databases
  Name   |  Owner   | Encoding |   Collate   |    Ctype    | Access privileges | Size  | Tablespace | Description
---------+----------+----------+-------------+-------------+-------------------+-------+------------+-------------
 nigeldb | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                   | 48 MB | pg_default |


Not a bug.

This just seems to be how the en_US.UTF-8 collation works; punctuation produces non-obvious sorting outcomes.

https://superuser.com/questions/227925/in-utf-8-collation-why-11-is-less-then-1

I confirmed that explicitly adding COLLATE "C" produces the expected outcome for third.
 
David J.
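
For readers who want to reproduce the check David describes, here is a minimal sketch of the same query with an explicit COLLATE "C" on each comparison. It assumes only the built-in "C" collation, which compares strings byte by byte; the column aliases are the ones from the original report. Since '.' (0x2E) sorts before 'x' (0x78) in byte order, all three columns should come back false under this collation.

```sql
-- Force byte-order comparison instead of the database default (en_US.UTF-8).
-- Attaching COLLATE "C" to one operand is enough to set the collation
-- used by that comparison.
select '.'   collate "C" > 'x'   as first,
       '.x'  collate "C" > 'x.'  as second,
       '.xx' collate "C" > 'x..' as third;
```

If byte-order results are wanted everywhere rather than per query, the collation can also be attached to a column definition or chosen when the database is created; whether that is appropriate depends on how the application expects text to sort.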