Обсуждение: trimming functions.

Поиск
Список
Период
Сортировка

trimming functions.

От
javier garcia - CEBAS
Дата:
Hi all.
I've got a table with a field called "code". This field is a code of
asociated crops and vegetation in one area. As this code is too complex for
our purposes. I need to trim the text to the main crop in every row.
Sometimes the first character is a '('. So I need to remove this first '('
and extract the first code. This first code is formed by alphabet character
and can be up to 3 haracters in length.
So, at the moment I've done:

 SELECT cod_grass,code,substring(ltrim(code,'(') FROM 1 FOR 3) AS
code_trimmed FROM landuses WHERE code LIKE '(%';

An extract of the result is:

 cod_grass |         code                       | code_trimmed
-----------+-----+-------------------------+----------------
      1539     | (NJ/LI)+NJ{10:}+LI{10:} | NJ/
      1847     | (AG/L)+AL{10:}              | AG/
      2767     | (OL/AL)+L{20:}              | OL/
        19       |  LI+NJ                            | LI+
        20       |  I                                    | I
        29       | NJ                                  | NJ
       106      | PH{:LZ40}                      | PH{
       111      | AG^                               | AG^
       112      | PD                                 | PD
       187      | L+AL                             | L+A
       189      | M                                   | M
       195 |   1 | MD                              | MD
       196 |   2 | L+AL{40:}                    | L+A
...

So I still need to improve the SELECT to remove all possible symbols after
the first group of alphabet characters to get a 'code_trimmed' column with
just the characters:(NJ, AG, OL, LI, I, NJ, PH, AG, PD, L, M, MD, L).
Possible 'non alphabet' symbols are '{+/^('
Any idea?

Thanks and regards,
Javier

Re: trimming functions.

От
Hubert Lubaczewski
Дата:
On Mon, 23 Jun 2003 12:41:19 +0200
javier garcia - CEBAS <rn001@cebas.csic.es> wrote:

> So I still need to improve the SELECT to remove all possible symbols after
> the first group of alphabet characters to get a 'code_trimmed' column with
> just the characters:(NJ, AG, OL, LI, I, NJ, PH, AG, PD, L, M, MD, L).
> Possible 'non alphabet' symbols are '{+/^('
> Any idea?

simple pl/perl function should do the trick.

maybe something like this:
create function d_trim(text) returns text as '
    my $sString = shift;
    $sString =~ s/^\\(*([A-Z]+).*$/$1/;
    return $sString;
' language 'plperl';

i'm not entirelly sure if this function will work - i wrote it without anytesting, but it is so simple, one shouldn't
haveany problems correcting/modifying it's behaviour. 

depesz

Re: trimming functions.

От
Joe Conway
Дата:
javier garcia - CEBAS wrote:
> I've got a table with a field called "code". This field is a code of
> asociated crops and vegetation in one area. As this code is too complex for
> our purposes. I need to trim the text to the main crop in every row.
> Sometimes the first character is a '('. So I need to remove this first '('
> and extract the first code. This first code is formed by alphabet character
> and can be up to 3 haracters in length.
> So, at the moment I've done:

Your example and the rest of the problem desciption are not consistent,
but I'm guessing something like this is at least close:

create table landuses(cod_grass int, code text);
insert into landuses values(1539,'(NJ/LI)+NJ{10:}+LI{10:}');
insert into landuses values(1847,'(AG/L)+AL{10:}');
insert into landuses values(2767,'(OL/AL)+L{20:}');
insert into landuses values(19,'LI+NJ');
insert into landuses values(20,'I');
insert into landuses values(29,'NJ');
insert into landuses values(106,'PH{:LZ40}');
insert into landuses values(111,'AG^');
insert into landuses values(112,'PD');
insert into landuses values(187,'L+AL');
insert into landuses values(189,'M');
insert into landuses values(195,'MD');
insert into landuses values(196,'L+AL{40:}');

regression=# SELECT cod_grass, code,
split_part(translate(ltrim(code,'('),'{}+/^():','\t\t\t\t\t\t\t'),'\t',1)
AS code_fixed FROM landuses;
  cod_grass |          code           | code_fixed
-----------+-------------------------+------------
       1539 | (NJ/LI)+NJ{10:}+LI{10:} | NJ
       1847 | (AG/L)+AL{10:}          | AG
       2767 | (OL/AL)+L{20:}          | OL
         19 | LI+NJ                   | LI
         20 | I                       | I
         29 | NJ                      | NJ
        106 | PH{:LZ40}               | PH
        111 | AG^                     | AG
        112 | PD                      | PD
        187 | L+AL                    | L
        189 | M                       | M
        195 | MD                      | MD
        196 | L+AL{40:}               | L
(13 rows)

Also, you didn't mention a version -- this will work on 7.3.x but not
7.2.x or before (split_part is new in 7.3).

HTH,

Joe