Discussion: BUG #17676: Text comparison appears to be wrong
The following bug has been logged on the website:

Bug reference: 17676
Logged by: Rob Johnson
Email address: robj@hightouchinc.com
PostgreSQL version: 14.5
Operating system: Ubuntu
Description:

No tables are needed. Just ran this, comparing strings with lower-case 'x'
and period '.' characters. The first two columns are false as expected, the
last column is true, which appears to be wrong.

=> select '.' > 'x' as first, '.x' > 'x.' as second, '.xx' > 'x..' as third;
 first | second | third
-------+--------+-------
 f     | f      | t
(1 row)

My Postgres version:

=> select version();
 version
---------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 14.5 (Ubuntu 14.5-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0, 64-bit
(1 row)

I am located in the United States and haven't done anything to change
character sets, collations, or anything like that. The \l+ psql command
shows this for my database, which is called nigeldb:

=> \l+ nigeldb
 List of databases
  Name   |  Owner   | Encoding |   Collate   |    Ctype    | Access privileges | Size  | Tablespace | Description
---------+----------+----------+-------------+-------------+-------------------+-------+------------+-------------
 nigeldb | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                   | 48 MB | pg_default |
(1 row)
On Thu, Nov 3, 2022 at 12:56 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 17676
Logged by: Rob Johnson
Email address: robj@hightouchinc.com
PostgreSQL version: 14.5
Operating system: Ubuntu
Description:
No tables are needed. Just ran this, comparing strings with lower-case 'x'
and period '.' characters. The first two columns are false as expected, the
last column is true, which appears to be wrong.
=> select '.' > 'x' as first, '.x' > 'x.' as second, '.xx' > 'x..' as third;
 List of databases
  Name   |  Owner   | Encoding |   Collate   |    Ctype    | Access privileges | Size  | Tablespace | Description
---------+----------+----------+-------------+-------------+-------------------+-------+------------+-------------
 nigeldb | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                   | 48 MB | pg_default |
Not a bug.
This just seems to be how the en_US.UTF-8 collation works; punctuation produces non-obvious sorting outcomes.
https://superuser.com/questions/227925/in-utf-8-collation-why-11-is-less-then-1
I confirmed that explicitly adding COLLATE "C" produces the expected outcome for third.
David J.
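
The COLLATE "C" check mentioned in the reply can be reproduced with a query along these lines. This is a sketch, not the exact statement that was run, which the thread does not show. It reuses the reporter's query and attaches the built-in "C" collation to one operand of each comparison, so the strings are compared by byte value instead of by the en_US.UTF-8 locale rules.

```sql
-- Same comparisons as in the report, but forced to the "C" collation.
-- An explicit COLLATE on one operand sets the collation for the whole comparison.
select '.'   > 'x'   collate "C" as first,
       '.x'  > 'x.'  collate "C" as second,
       '.xx' > 'x..' collate "C" as third;

-- Under "C", '.' (byte 0x2E) sorts before 'x' (byte 0x78),
-- so every comparison is decided at the first character:
--  first | second | third
-- -------+--------+-------
--  f     | f      | f
-- (1 row)
```

If byte-wise ordering is wanted permanently rather than per query, the collation can also be set on a column or chosen when the database is created; which option fits depends on how the rest of the application expects text to sort.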