Обсуждение: Having the issue in while selecting the data and feltering in order by.

Поиск
Список
Период
Сортировка

Having the issue in while selecting the data and feltering in order by.

От
nikhil raj
Дата:
Hi ALL,

I have recently migrated to windows postgres server to linux postgres server.

while i run this query in windows PG server
SELECT *
FROM (VALUES ('a'), ('___b1'), ('_a1'),('a2'),('a3'),('a5'), ('a2')) t (val) order by val

I am getting the out put in this sort order
10.250.200.184 - Remote Desktop Connection 2019-08-08 14.46.46.png

Same when i run this in linux machine i am getting this out in  different sort order on the same query.
10.250.200.184 - Remote Desktop Connection 2019-08-08 14.42.27.png


Вложения

Re: Having the issue in while selecting the data and feltering inorder by.

От
Luca Ferrari
Дата:
On Thu, Aug 8, 2019 at 11:20 AM nikhil raj <nikhilraj474@gmail.com> wrote:
> Same when i run this in linux machine i am getting this out in  different sort order on the same query.

A collation problem?
What does this query do?
SELECT *
FROM (VALUES ('a'), ('___b1'), ('_a1'),('a2'),('a3'),('a5'), ('a2')) t
(val) order by val COLLATE "C";



Re: Having the issue in while selecting the data and feltering inorder by.

От
Francisco Olarte
Дата:
Nikhil ( or is it Raj? ):

On Thu, Aug 8, 2019 at 11:47 AM nikhil raj <nikhilraj474@gmail.com> wrote:
> Here are the detailes.
> Windows :
Zapped image showing windows locale "English-united-states.1252"
> linux :
Zapped image showing Linux locale "En-us.utf-8"
> both are in same version 11.3
> How can i get the sorting order by like windows how to change the lc_collate  Please can you help me over here  OR
elseis there an other way we get the sort order like window through code 

1st, it is a locale mismatch problem. I do not own windows, so I can
not test it, but IIRC win1252 is a latin-1 variant. I do not know how
their locales sort, but it seems they are doing something like the
C-utf-8 locale.

You could try to recreate the database using the same locale as in
windows. If you are able to do this ( dump, recreate, restore) it may
be the easier way. I do not know if there are windows-like locales for
linux, of how does that locale sort ( I mean, how does it process
things like á or ö or ñ, your example show only the most basic stuff
). You could probably use C locale, in ascii, win1252 or utf-8
depending on your data content to get good results ( locale comes from
the OS, but pg has the charsets ). With the restricted data in your
examples, just switching to C locale will suffice, but I do not know
if you have some more complex examples / sort orders ( someone
knowledgeable in windows may be able to help. It seems like windows
locales are not as sophisticated as Linux one, and IIRC this was true
20 years ago when I used them ).

Other option is to modify either your columns, to include a collate
option in the relevant ones, or the order-by clauses in the relevant
queries, which support the collate clause to. You may want to read
https://www.postgresql.org/docs/11/charset.html and investigate a bit
with some test dbs.

I would test something like creating a database with C locale, charset
win1252, this may be a good starting point. But test some more queries
in windows, i.e., things like this:

$ echo -e 'a\nA\nb\nB1' | LC_ALL=en_US.UTF-8 sort
a
A
b
B1
$ echo -e 'a\nA\nb\nB1' | LC_ALL=C.UTF-8 sort
A
B1
a
b

Because it may matter for you, and I doubt the interactions can be
properly tested without a proper dataset AND a windos pg server.

Francisco Olarte.



Re: Having the issue in while selecting the data and feltering inorder by.

От
Francisco Olarte
Дата:
Luca:

On Thu, Aug 8, 2019 at 12:33 PM Luca Ferrari <fluca1978@gmail.com> wrote:
> On Thu, Aug 8, 2019 at 11:20 AM nikhil raj <nikhilraj474@gmail.com> wrote:
> > Same when i run this in linux machine i am getting this out in  different sort order on the same query.
> A collation problem?
> What does this query do?
> SELECT *
> FROM (VALUES ('a'), ('___b1'), ('_a1'),('a2'),('a3'),('a5'), ('a2')) t
> (val) order by val COLLATE "C";

It will probacly do the expected thing for him, but I think another
problem may be what does this query:

SELECT * FROM (VALUES
('0'),('1'),('a'),('A'),('á'),('à'),('b'),('B'),('ñ'),('Ñ')) order by
1;

Does on the Windows server, what he is trying to replicate. Finding a
locale which does not ignore _ as the windows one does is easy, but
replicating all the doodahs may be difficult without knowing how the
windows locale sorts all the chars ( and I'm not getting at 1a vs a1,
or _ vs a vs A).

I suspect C locale may be ok though.

Francisco Olarte.