Обсуждение: trimming functions.
Hi all. I've got a table with a field called "code". This field is a code of asociated crops and vegetation in one area. As this code is too complex for our purposes. I need to trim the text to the main crop in every row. Sometimes the first character is a '('. So I need to remove this first '(' and extract the first code. This first code is formed by alphabet character and can be up to 3 haracters in length. So, at the moment I've done: SELECT cod_grass,code,substring(ltrim(code,'(') FROM 1 FOR 3) AS code_trimmed FROM landuses WHERE code LIKE '(%'; An extract of the result is: cod_grass | code | code_trimmed -----------+-----+-------------------------+---------------- 1539 | (NJ/LI)+NJ{10:}+LI{10:} | NJ/ 1847 | (AG/L)+AL{10:} | AG/ 2767 | (OL/AL)+L{20:} | OL/ 19 | LI+NJ | LI+ 20 | I | I 29 | NJ | NJ 106 | PH{:LZ40} | PH{ 111 | AG^ | AG^ 112 | PD | PD 187 | L+AL | L+A 189 | M | M 195 | 1 | MD | MD 196 | 2 | L+AL{40:} | L+A ... So I still need to improve the SELECT to remove all possible symbols after the first group of alphabet characters to get a 'code_trimmed' column with just the characters:(NJ, AG, OL, LI, I, NJ, PH, AG, PD, L, M, MD, L). Possible 'non alphabet' symbols are '{+/^(' Any idea? Thanks and regards, Javier
On Mon, 23 Jun 2003 12:41:19 +0200 javier garcia - CEBAS <rn001@cebas.csic.es> wrote: > So I still need to improve the SELECT to remove all possible symbols after > the first group of alphabet characters to get a 'code_trimmed' column with > just the characters:(NJ, AG, OL, LI, I, NJ, PH, AG, PD, L, M, MD, L). > Possible 'non alphabet' symbols are '{+/^(' > Any idea? simple pl/perl function should do the trick. maybe something like this: create function d_trim(text) returns text as ' my $sString = shift; $sString =~ s/^\\(*([A-Z]+).*$/$1/; return $sString; ' language 'plperl'; i'm not entirelly sure if this function will work - i wrote it without anytesting, but it is so simple, one shouldn't haveany problems correcting/modifying it's behaviour. depesz
javier garcia - CEBAS wrote: > I've got a table with a field called "code". This field is a code of > asociated crops and vegetation in one area. As this code is too complex for > our purposes. I need to trim the text to the main crop in every row. > Sometimes the first character is a '('. So I need to remove this first '(' > and extract the first code. This first code is formed by alphabet character > and can be up to 3 haracters in length. > So, at the moment I've done: Your example and the rest of the problem desciption are not consistent, but I'm guessing something like this is at least close: create table landuses(cod_grass int, code text); insert into landuses values(1539,'(NJ/LI)+NJ{10:}+LI{10:}'); insert into landuses values(1847,'(AG/L)+AL{10:}'); insert into landuses values(2767,'(OL/AL)+L{20:}'); insert into landuses values(19,'LI+NJ'); insert into landuses values(20,'I'); insert into landuses values(29,'NJ'); insert into landuses values(106,'PH{:LZ40}'); insert into landuses values(111,'AG^'); insert into landuses values(112,'PD'); insert into landuses values(187,'L+AL'); insert into landuses values(189,'M'); insert into landuses values(195,'MD'); insert into landuses values(196,'L+AL{40:}'); regression=# SELECT cod_grass, code, split_part(translate(ltrim(code,'('),'{}+/^():','\t\t\t\t\t\t\t'),'\t',1) AS code_fixed FROM landuses; cod_grass | code | code_fixed -----------+-------------------------+------------ 1539 | (NJ/LI)+NJ{10:}+LI{10:} | NJ 1847 | (AG/L)+AL{10:} | AG 2767 | (OL/AL)+L{20:} | OL 19 | LI+NJ | LI 20 | I | I 29 | NJ | NJ 106 | PH{:LZ40} | PH 111 | AG^ | AG 112 | PD | PD 187 | L+AL | L 189 | M | M 195 | MD | MD 196 | L+AL{40:} | L (13 rows) Also, you didn't mention a version -- this will work on 7.3.x but not 7.2.x or before (split_part is new in 7.3). HTH, Joe