Обсуждение: varchar sort ordering ignore blanks

Поиск
Список
Период
Сортировка

varchar sort ordering ignore blanks

От
Luca Arzeni
Дата:
Hi there,
I have a table with a single column, pk of varchar type

The table contains few names, say:

XXXX A
XXXX C
XXXXB

In the first two records there is a between the XXXX and the following letter
A and C  while, the third one has a B immediately following the XXXX (without
blanks).

In postgres 7.4.7 (debian sarge), if I issue a select to sort the record I
(correctly) obtain:
XXXX A
XXXX C
XXXXB

In postgres 8.1.9 (debian etch), if I issue a select to sort the record I
(mistakenly) obtain:
XXXX A
XXXXB
XXXX C

That is: the sort order in postgres 8.1.9 seems to ignore the blank.

In all cases I'm using locale LATIN9 during DB creation, but I tested also
with ASCII, UTF8 and LATIN1 encoding.

Can someone help me to get the correct order in postgres 8.1.9 ?

=== Sample code  ===

CREATE TABLE t_table
(
  c_column varchar(30) NOT NULL,
  CONSTRAINT t_table_pk PRIMARY KEY (c_column)
)
WITHOUT OIDS;

INSERT INTO t_table(c_column) VALUES ('XXXX A');
INSERT INTO t_table(c_column) VALUES ('XXXXB');
INSERT INTO t_table(c_column) VALUES ('XXXX C');

select * from t_table order by c_column asc;

=============

Thanks, Luca Arzeni


Re: varchar sort ordering ignore blanks

От
Tom Lane
Дата:
Luca Arzeni <l.arzeni@amadego.com> writes:
> That is: the sort order in postgres 8.1.9 seems to ignore the blank.

This is expected behavior in most non-C locales.

> In all cases I'm using locale LATIN9 during DB creation, but I tested also
> with ASCII, UTF8 and LATIN1 encoding.

LATIN9 isn't a locale, it's an encoding.  Try "initdb --locale=C".

            regards, tom lane

Re: varchar sort ordering ignore blanks

От
"Luca Arzeni"
Дата:


On Jan 15, 2008 6:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Luca Arzeni <l.arzeni@amadego.com> writes:
> That is: the sort order in postgres 8.1.9 seems to ignore the blank.

This is expected behavior in most non-C locales.

> In all cases I'm using locale LATIN9 during DB creation, but I tested also
> with ASCII, UTF8 and LATIN1 encoding.

LATIN9 isn't a locale, it's an encoding.  Try "initdb --locale=C".

                       regards, tom lane
 
--------------------
I guess this has nothing to do with the encoding, but with the collation
rules used, which is governed by "lc_collate" parameter. See what you
get on both DBs for:

SHOW lc_collate ;

HTH,
Csaba.
-------------------

Thanks Tom, and Csaba

both of you hit the problem: actually Postgres 7.4.7 has a C locale and Postgres 8.1 has US.UTF8 locale. Setting locale to locale=C or locale=POSIX for release 8.1 solved this issue, but it opens another one: if I use locale=C, I get

XXXX A
XXXX C
XXXXB

as sort order, but this setting gives me an error when it cames to:

XXXX d
XXXX e
XXXX f
XXXX è

because the right sort ordering should be:

XXXX d
XXXX e
XXXX è
XXXX f

So the problem is:

- C or POSIX locale is OK with blanks but fails on locale specific vowels
- LATIN9 locale is OK with vowels but ignores blanks

Is there any way to consider blanks meaningfull AND sort properly locale specific vowels ?

I don't know what SQL standard says about this issue, but I'm sure that in Italy you sort names considering vowels AND blanks!

Thanks, Luca