Обсуждение: regexp_replace puzzle

Поиск
Список
Период
Сортировка

regexp_replace puzzle

От
Harald Fuchs
Дата:
I've got a problem with regexp_replace which I could reduce to the following:

  CREATE FUNCTION digest(text, text) RETURNS bytea
      LANGUAGE c IMMUTABLE STRICT
      AS '$libdir/pgcrypto', 'pg_digest';

  CREATE FUNCTION sha224enc(text) RETURNS text AS $$
  BEGIN
    RAISE WARNING 'arg=»%«', $1;
    RETURN encode(digest($1, 'sha224'), 'hex');
  END;
  $$ LANGUAGE plpgsql IMMUTABLE;

  CREATE TABLE t1 (
    id serial NOT NULL,
    val text NOT NULL,
    PRIMARY KEY (id)
  );

  INSERT INTO t1 (val) VALUES ('d111111');
  INSERT INTO t1 (val) VALUES ('xd222222');
  INSERT INTO t1 (val) VALUES ('x d333333');

  SELECT val,
         regexp_replace(val,
          '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$',
          '\1' || '»\2«='|| sha224enc('\2') || '\3', 'i')
  FROM t1
  WHERE val ~*
          '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$';

(I want to replace patterns within a string by their SHA-224 hash.)
However, when I run this example I get:

  WARNING:  arg=»\2«
      val    |                            regexp_replace
  -----------+----------------------------------------------------------------------
   d111111   | »d111111«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27
   x d333333 | x »d333333«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27
  (2 rows)

i.e. the first '\2' gets properly expanded by the second paren match,
but the second '\2' doesn't get expanded.

What am I overlooking?

Re: regexp_replace puzzle

От
Osvaldo Kussama
Дата:
2010/3/10 Harald Fuchs <hari.fuchs@gmail.com>:
> I've got a problem with regexp_replace which I could reduce to the following:
>
>  CREATE FUNCTION digest(text, text) RETURNS bytea
>      LANGUAGE c IMMUTABLE STRICT
>      AS '$libdir/pgcrypto', 'pg_digest';
>
>  CREATE FUNCTION sha224enc(text) RETURNS text AS $$
>  BEGIN
>    RAISE WARNING 'arg=»%«', $1;
>    RETURN encode(digest($1, 'sha224'), 'hex');
>  END;
>  $$ LANGUAGE plpgsql IMMUTABLE;
>
>  CREATE TABLE t1 (
>    id serial NOT NULL,
>    val text NOT NULL,
>    PRIMARY KEY (id)
>  );
>
>  INSERT INTO t1 (val) VALUES ('d111111');
>  INSERT INTO t1 (val) VALUES ('xd222222');
>  INSERT INTO t1 (val) VALUES ('x d333333');
>
>  SELECT val,
>         regexp_replace(val,
>          '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$',
>          '\1' || '»\2«='|| sha224enc('\2') || '\3', 'i')
>  FROM t1
>  WHERE val ~*
>          '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$';
>
> (I want to replace patterns within a string by their SHA-224 hash.)
> However, when I run this example I get:
>
>  WARNING:  arg=»\2«
>      val    |                            regexp_replace
>  -----------+----------------------------------------------------------------------
>   d111111   | »d111111«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27
>   x d333333 | x »d333333«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27
>  (2 rows)
>
> i.e. the first '\2' gets properly expanded by the second paren match,
> but the second '\2' doesn't get expanded.
>
> What am I overlooking?
>


Use g flag.
"Flag g causes the function to find each match in the string, not only
the first one, and return a row for each such match." [1]

Osvaldo

[1] http://www.postgresql.org/docs/current/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP

Re: regexp_replace puzzle

От
David W Noon
Дата:
On Wed, 10 Mar 2010 13:41:54 +0100, Harald Fuchs wrote about [GENERAL]
regexp_replace puzzle:

[snip]
>  SELECT val,
>         regexp_replace(val,
>          '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$',
>          '\1' || '»\2«='|| sha224enc('\2') || '\3', 'i')
[snip]
>i.e. the first '\2' gets properly expanded by the second paren match,
>but the second '\2' doesn't get expanded.

The second instance of '\2' is first passed to sha224enc(), then that
function's result is passed to regexp_replace.
--
Regards,

Dave  [RLU #314465]
=======================================================================
david.w.noon@ntlworld.com (David W Noon)
=======================================================================