Обсуждение: Re: Question on replace function [solved]
Hi Rob On 09/25/2016 01:39 PM, rob stone wrote: > > On Sun, 2016-09-25 at 10:29 +0200, Charles Clavadetscher wrote: >> Hello >> >> I am using PostgreSQL 9.4.7 on x86_64-unknown-linux-gnu, compiled by >> gcc >> (Debian 4.7.2-5) 4.7.2, 64-bit >> >> I imported data from a MariaDB table into PostgreSQL and noticed >> that >> the content of a field was not correct, but I was not able to change >> it. >> The field is called vcard and is of datatye text. >> >> The structure of the table: >> >> roundcubemail=# \d contacts >> Tabelle „public.contacts“ >> Spalte | Typ | >> Attribute >> ------------+--------------------------+----------------------------- >> ----------------------------------- >> contact_id | integer | not null Vorgabewert >> nextval(('contacts_seq'::text)::regclass) >> changed | timestamp with time zone | not null Vorgabewert now() >> del | smallint | not null Vorgabewert 0 >> name | character varying(128) | not null Vorgabewert >> ''::character varying >> email | text | not null Vorgabewert >> ''::text >> firstname | character varying(128) | not null Vorgabewert >> ''::character varying >> surname | character varying(128) | not null Vorgabewert >> ''::character varying >> vcard | text | >> words | text | >> user_id | integer | not null >> >> The content of vcard looks as follows (replaced real names with >> placeholders): >> >> BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname >> ;;;\r\\rFN:Firstname >> Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD >> >> My target is to replace all occurences of '\r\\r' with E'\r\n' to >> comply >> with RFC 6350. >> >> I tried using the function replace and I am faced with a strange >> behaviour. If I use the function with a string as shown above I get >> the >> expected result: >> >> elect replace('BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname >> ;;;\r\\rFN:Firstname >> Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD','\r\ >> \r',E'\r\n') >> ; >> >> replace >> ---------------------------------------- >> BEGIN:VCARD\r + >> VERSION:3.0\r + >> N:;Firstname Lastname ;;;\r + >> FN:Firstname Lastname\r + >> EMAIL;TYPE=INTERNET;TYPE=OTHER:email\r+ >> END:VCARD >> (1 row) >> >> However, if I use the function on the vcard field nothing is >> replaced: >> >> select replace(vcard,'\r\\r',E'\r\n') from contacts; >> >> >> replace >> ------------------------------------------------------------------- >> ------------------------------------------------------------------- >> ---- >> BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname >> ;;;\r\\rFN:Firstname >> Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD >> (1 row) >> >> Does anybody have an idea what I am doing wrong? >> Thank you for your help. >> >> Charles >> >> -- >> Swiss PostgreSQL Users Group >> c/o Charles Clavadetscher >> Treasurer >> Motorenstrasse 18 >> CH – 8005 Zürich >> >> http://www.swisspug.org >> >> +-----------------------+ >>> >>> ____ ______ ___ | >>> / )/ \/ \ | >>> ( / __ _\ ) | >>> \ (/ o) ( o) ) | >>> \_ (_ ) \ ) _/ | >>> \ /\_/ \)/ | >>> \/ <//| |\\> | >>> _| | | >>> \|_/ | >>> | >>> PostgreSQL 1996-2016 | >>> 20 Years of Success | >>> | >> +-----------------------+ >> >> > > > > Tested this on 9.6beta3 on a test database and it appears to work fine. > > Inserted one row. > > dinkumerp=> select * from contacts; > LOG: duration: 0.571 ms statement: select * from contacts; > contact_id | changed | del | name | email | > firstname | s > urname > | vcard | > words > | user_id > ------------+-------------------------------+-----+------+-------+----- > ------+-- > -------+--------------------------------------------------------------- > --+------ > -+--------- > 1 | 2016-09-25 21:30:54.788442+10 | 0 > | | | | > | BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname > Lastname +| > | > | | | | | > | > | > ;;;\r\\rFN:Firstname +| > | > | | | | | > | > | > Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD | > | > (1 row) > > Replace select. > > dinkumerp=> select replace(vcard,'\r\\r',E'\r\n') from contacts; > LOG: duration: 0.400 ms statement: select > replace(vcard,'\r\\r',E'\r\n') from contacts; > replace > ---------------------------------------- > BEGIN:VCARD\r + > VERSION:3.0\r + > N:;Firstname Lastname + > ;;;\r + > FN:Firstname + > Lastname\r + > EMAIL;TYPE=INTERNET;TYPE=OTHER:email\r+ > END:VCARD > (1 row) > > > HTH, > Rob > Thank you. Unfortunately this did not help. But it was a confirmation that there must be a difference in what I see in the console and what is actually stored in the DB. I found a way to check that and with that a solution to my problem. First I created a table: create table test (txt text); Then I inserted two rows: One using a value from the table: insert into test values ((select vcard from contacts limit 1)); And one using the string as it appears in the console: insert into test values ('BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD'); In the console they look exactly the same: roundcubemail2=> select * from test; txt ------------------------------------------------------------------------------------------------------------------------------------------ BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname ;;;\r\\rFN:Firstname Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname ;;;\r\\rFN:Firstname Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD Next, I wrote the content of the table to a file: \copy test to test.txt And compared the two rows in the file: charles@as11:~$ cat test.txt BEGIN:VCARD\r\\\rVERSION:3.0\r\\\rN:;Firstname Lastname ;;;\r\\\rFN:Firstname Lastname\r\\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\\rEND:VCARD BEGIN:VCARD\\r\\\\rVERSION:3.0\\r\\\\rN:;Firstname Lastname ;;;\\r\\\\rFN:Firstname Lastname\\r\\\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\\r\\\\rEND:VCARD They differ. The string from the original table contains '\r\\\r' while the string inserted as such contains '\\r\\\\r' Based on that I could eventually transform the content of the fields: roundcubemail=> update contacts set vcard = replace(vcard,E'\r\\\r',E'\r\n') ; UPDATE 623 SELECT vcard FROM contacts LIMIT 1; replace ---------------------------------------- BEGIN:VCARD\r + VERSION:3.0\r + N:;Firstname Lastname ;;;\r + FN:Firstname Lastname\r + EMAIL;TYPE=INTERNET;TYPE=OTHER:email\r+ END:VCARD (1 row) Honestly I still don't understand why this happened this way. Charles
On 09/25/2016 05:45 AM, Charles Clavadetscher wrote: > Hi Rob > > On 09/25/2016 01:39 PM, rob stone wrote: >> >> On Sun, 2016-09-25 at 10:29 +0200, Charles Clavadetscher wrote: >>> Hello >>> >>> I am using PostgreSQL 9.4.7 on x86_64-unknown-linux-gnu, compiled by >>> gcc >>> (Debian 4.7.2-5) 4.7.2, 64-bit >>> >>> I imported data from a MariaDB table into PostgreSQL and noticed >>> that >>> the content of a field was not correct, but I was not able to change >>> it. >>> The field is called vcard and is of datatye text. >>> >>> The structure of the table: >>> >>> roundcubemail=# \d contacts >>> Tabelle „public.contacts“ >>> Spalte | Typ | >>> Attribute >>> ------------+--------------------------+----------------------------- >>> ----------------------------------- >>> contact_id | integer | not null Vorgabewert >>> nextval(('contacts_seq'::text)::regclass) >>> changed | timestamp with time zone | not null Vorgabewert now() >>> del | smallint | not null Vorgabewert 0 >>> name | character varying(128) | not null Vorgabewert >>> ''::character varying >>> email | text | not null Vorgabewert >>> ''::text >>> firstname | character varying(128) | not null Vorgabewert >>> ''::character varying >>> surname | character varying(128) | not null Vorgabewert >>> ''::character varying >>> vcard | text | >>> words | text | >>> user_id | integer | not null >>> >>> The content of vcard looks as follows (replaced real names with >>> placeholders): >>> >>> BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname >>> ;;;\r\\rFN:Firstname >>> Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD >>> >>> My target is to replace all occurences of '\r\\r' with E'\r\n' to >>> comply >>> with RFC 6350. >>> >>> I tried using the function replace and I am faced with a strange >>> behaviour. If I use the function with a string as shown above I get >>> the >>> expected result: >>> >>> elect replace('BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname >>> ;;;\r\\rFN:Firstname >>> Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD','\r\ >>> \r',E'\r\n') >>> ; >>> >>> replace >>> ---------------------------------------- >>> BEGIN:VCARD\r + >>> VERSION:3.0\r + >>> N:;Firstname Lastname ;;;\r + >>> FN:Firstname Lastname\r + >>> EMAIL;TYPE=INTERNET;TYPE=OTHER:email\r+ >>> END:VCARD >>> (1 row) >>> >>> However, if I use the function on the vcard field nothing is >>> replaced: >>> >>> select replace(vcard,'\r\\r',E'\r\n') from contacts; >>> >>> >>> replace >>> ------------------------------------------------------------------- >>> ------------------------------------------------------------------- >>> ---- >>> BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname >>> ;;;\r\\rFN:Firstname >>> Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD >>> (1 row) >>> >>> Does anybody have an idea what I am doing wrong? >>> Thank you for your help. >>> >>> Charles >>> >>> -- >>> Swiss PostgreSQL Users Group >>> c/o Charles Clavadetscher >>> Treasurer >>> Motorenstrasse 18 >>> CH – 8005 Zürich >>> >>> http://www.swisspug.org >>> >>> +-----------------------+ >>>> >>>> ____ ______ ___ | >>>> / )/ \/ \ | >>>> ( / __ _\ ) | >>>> \ (/ o) ( o) ) | >>>> \_ (_ ) \ ) _/ | >>>> \ /\_/ \)/ | >>>> \/ <//| |\\> | >>>> _| | | >>>> \|_/ | >>>> | >>>> PostgreSQL 1996-2016 | >>>> 20 Years of Success | >>>> | >>> +-----------------------+ >>> >>> >> >> >> >> Tested this on 9.6beta3 on a test database and it appears to work fine. >> >> Inserted one row. >> >> dinkumerp=> select * from contacts; >> LOG: duration: 0.571 ms statement: select * from contacts; >> contact_id | changed | del | name | email | >> firstname | s >> urname >> | vcard | >> words >> | user_id >> ------------+-------------------------------+-----+------+-------+----- >> ------+-- >> -------+--------------------------------------------------------------- >> --+------ >> -+--------- >> 1 | 2016-09-25 21:30:54.788442+10 | 0 >> | | | | >> | BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname >> Lastname +| >> | >> | | | | | >> | >> | >> ;;;\r\\rFN:Firstname +| >> | >> | | | | | >> | >> | >> Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD | >> | >> (1 row) >> >> Replace select. >> >> dinkumerp=> select replace(vcard,'\r\\r',E'\r\n') from contacts; >> LOG: duration: 0.400 ms statement: select >> replace(vcard,'\r\\r',E'\r\n') from contacts; >> replace >> ---------------------------------------- >> BEGIN:VCARD\r + >> VERSION:3.0\r + >> N:;Firstname Lastname + >> ;;;\r + >> FN:Firstname + >> Lastname\r + >> EMAIL;TYPE=INTERNET;TYPE=OTHER:email\r+ >> END:VCARD >> (1 row) >> >> >> HTH, >> Rob >> > > Thank you. Unfortunately this did not help. But it was a confirmation > that there must be a difference in what I see in the console and what is > actually stored in the DB. > > I found a way to check that and with that a solution to my problem. > > First I created a table: > > create table test (txt text); > > Then I inserted two rows: > One using a value from the table: > > insert into test values ((select vcard from contacts limit 1)); > > And one using the string as it appears in the console: > > insert into test values > ('BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula > Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD'); Short version try the above as: insert into test values (E'BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD'); Long version: https://www.postgresql.org/docs/9.5/static/sql-syntax-lexical.html 4.1.2.2. String Constants with C-style Escape or 4.1.2.4. Dollar-quoted String Constants > > In the console they look exactly the same: > > roundcubemail2=> select * from test; > txt > ------------------------------------------------------------------------------------------------------------------------------------------ > > BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname > ;;;\r\\rFN:Firstname > Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD > BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:;Firstname Lastname > ;;;\r\\rFN:Firstname > Lastname\r\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\rEND:VCARD > > Next, I wrote the content of the table to a file: > > \copy test to test.txt > > And compared the two rows in the file: > > charles@as11:~$ cat test.txt > BEGIN:VCARD\r\\\rVERSION:3.0\r\\\rN:;Firstname Lastname > ;;;\r\\\rFN:Firstname > Lastname\r\\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\r\\\rEND:VCARD > BEGIN:VCARD\\r\\\\rVERSION:3.0\\r\\\\rN:;Firstname Lastname > ;;;\\r\\\\rFN:Firstname > Lastname\\r\\\\rEMAIL;TYPE=INTERNET;TYPE=OTHER:email\\r\\\\rEND:VCARD > > They differ. The string from the original table contains '\r\\\r' while > the string inserted as such contains '\\r\\\\r' Based on that I could > eventually transform the content of the fields: > > roundcubemail=> update contacts set vcard = > replace(vcard,E'\r\\\r',E'\r\n') ; > UPDATE 623 > > SELECT vcard FROM contacts LIMIT 1; > replace > ---------------------------------------- > BEGIN:VCARD\r + > VERSION:3.0\r + > N:;Firstname Lastname ;;;\r + > FN:Firstname Lastname\r + > EMAIL;TYPE=INTERNET;TYPE=OTHER:email\r+ > END:VCARD > (1 row) > > Honestly I still don't understand why this happened this way. > > Charles > > -- Adrian Klaver adrian.klaver@aklaver.com
Charles Clavadetscher <clavadetscher@swisspug.org> writes: > Honestly I still don't understand why this happened this way. I wonder if you have standard_conforming_strings turned off, or did when that data was inserted. That would change the behavior of backslashes in string literals. regards, tom lane
On 09/25/2016 08:39 AM, Tom Lane wrote: > Charles Clavadetscher <clavadetscher@swisspug.org> writes: >> Honestly I still don't understand why this happened this way. > > I wonder if you have standard_conforming_strings turned off, or > did when that data was inserted. That would change the behavior > of backslashes in string literals. That got me to thinking: standard_conforming_strings = on test=# create table test (txt text); CREATE TABLE test=# insert into test values ('BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD'); INSERT 0 1 test=# insert into test values (E'BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD'); INSERT 0 1 test=# \copy test to test.txt COPY 2 cat test.txt BEGIN:VCARD\\r\\\\rVERSION:3.0\\r\\\\rN:Halbritter;Ursula;;;\\r\\\\rFN:Ursula Halbritter\\r\\\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\\r\\\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\\r\\\\rEND:VCARD BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD standard_conforming_strings = off test=# truncate test; TRUNCATE TABLE test=# insert into test values ('BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD'); WARNING: nonstandard use of escape in a string literal LINE 1: insert into test values ('BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:... ^ HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. INSERT 0 1 test=# insert into test values (E'BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD'); INSERT 0 1 test=# \copy test to test_off.txt COPY 2 cat test_off.txt BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD BEGIN:VCARD\r\\rVERSION:3.0\r\\rN:Halbritter;Ursula;;;\r\\rFN:Ursula Halbritter\r\\rEMAIL;TYPE=INTERNET;TYPE=HOME:ursula.halbritter@zweiernet.ch\r\\rADR;TYPE=home:;;strasse;schwerzenbach;;;ooooooooooooooooooooooooooo\r\\rEND:VCARD > > regards, tom lane > > -- Adrian Klaver adrian.klaver@aklaver.com
Hello Tom and Adrian > -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Adrian Klaver > Sent: Sonntag, 25. September 2016 18:38 > To: Tom Lane <tgl@sss.pgh.pa.us>; Charles Clavadetscher <clavadetscher@swisspug.org> > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Question on replace function [solved] > > On 09/25/2016 08:39 AM, Tom Lane wrote: > > Charles Clavadetscher <clavadetscher@swisspug.org> writes: > >> Honestly I still don't understand why this happened this way. > > > > I wonder if you have standard_conforming_strings turned off, or did > > when that data was inserted. That would change the behavior of > > backslashes in string literals. > > That got me to thinking: > > standard_conforming_strings = on I checked the configuration and standard_conforming_strings is on and I did not change it before reading the data. The resultlooks the same as you showed in the first test. In a short test, trying to read the file with this setting off leads to other errors. Possibly I'll be able to make somemore tests later or in the evening and report. Thank you for you input. Charles