Re: regex match and special characters

From Adrian Klaver
Subject Re: regex match and special characters
Date
Msg-id d438c15e-f960-3705-459c-53b0e3f3366a@aklaver.com
In response to regex match and special characters  (Alex Kliukin <alexk@hintbits.com>)
Responses Sv: Re: regex match and special characters  (Andreas Joseph Krogh <andreas@visena.com>)
List pgsql-general
On 08/16/2018 03:59 AM, Alex Kliukin wrote:
> Hi,
> 
> Here is a simple SQL statement that gives different results on PostgreSQL 9.6 and PostgreSQL 10+. The space character
> at the end of the string is actually U+2006 SIX-PER-EM SPACE
> (http://www.fileformat.info/info/unicode/char/2006/index.htm)
> 
> test=# select 'abcd ' ~ 'abcd\s';
>   ?column?
> ----------
>   t
> (1 row)
> 
> test=# select version();
>                                               version
> -------------------------------------------------------------------------------------------------
>   PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (Gentoo 6.4.0-r1 p1.3) 6.4.0, 64-bit
> (1 row)
> 
> 
> On another server (running on the same system on a different port)
> 
> postgres=# select version();
>                                              version
> -----------------------------------------------------------------------------------------------
>   PostgreSQL 9.6.9 on x86_64-pc-linux-gnu, compiled by gcc (Gentoo 6.4.0-r1 p1.3) 6.4.0, 64-bit
> (1 row)
> 
> postgres=# select 'abcd ' ~ 'abcd\s';
>   ?column?
> ----------
>   f
> (1 row)
> 
> For both clusters, the client encoding is UTF8, the database encoding and collation are UTF8 and en_US.utf8
> respectively, and the lc_ctype is en_US.utf8. I am accessing the databases running locally by ssh-ing first to the
> host.
> 
> I observed similar issues with other Linux-based servers running Ubuntu; in all cases the regex resulted in true on
> PostgreSQL 10+ and false on earlier versions (down to 9.3). The query comes from a table check that suddenly stopped
> accepting rows valid in the older version during the migration. Making it select 'abcd ' ~ E'abcd\\s' doesn't modify
> the outcome, unsurprisingly.
> 
> Is it reproducible for others here as well? If it is, is there a way to make both versions behave the same?

select version();
                                       version
------------------------------------------------------------------------------------
  PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 4.8.5, 64-bit

lc_collate                          | en_US.UTF-8
lc_ctype                            | en_US.UTF-8

test=# select 'abcd'||chr(2006) ~ E'abcd\s';
  ?column?
----------
  f
(1 row)
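
A side note on the test above, offered as a sanity check rather than a verdict: chr() takes the code point as a decimal integer, so chr(2006) is most likely not the character from the original report. U+2006 is hexadecimal, which is 8198 in decimal, while decimal 2006 is U+07D6. The small Python sketch below only illustrates that mix-up. It uses Python's own Unicode tables and regex engine, not PostgreSQL's, and the helper name matches_trailing_space is made up for the illustration.

```python
import re
import unicodedata

# U+2006 as written in the original report (hexadecimal code point).
six_per_em = chr(0x2006)
# What you get when 2006 is read as a decimal code point instead.
decimal_2006 = chr(2006)


def matches_trailing_space(text: str) -> bool:
    """Hypothetical helper: does 'abcd' followed by a whitespace char occur in text?"""
    return re.search(r"abcd\s", text) is not None


print(ord(six_per_em))                # 8198
print(hex(ord(decimal_2006)))         # 0x7d6
print(unicodedata.name(six_per_em))   # SIX-PER-EM SPACE
print(six_per_em.isspace())           # True
print(decimal_2006.isspace())         # False
print(matches_trailing_space("abcd" + six_per_em))    # True
print(matches_trailing_space("abcd" + decimal_2006))  # False
```

If that reading is right, the string would have to be built with chr(8198) or U&'abcd\2006' to test the reported character. Separately, in an E'' literal a single backslash before s is consumed by the string parser, so E'abcd\s' reaches the regex engine as abcds; 'abcd\s' (with standard_conforming_strings on) or E'abcd\\s' keeps the \s escape. Either point alone could explain the f result here.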

In your example you are working on Postgres devel. Have you tried it on 
Postgres 10 and/or 11?

> 
> Cheers,
> Alex
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com

