Обсуждение: table partitioning pl/pgsql helpers

Поиск
Список
Период
Сортировка

table partitioning pl/pgsql helpers

От
Enrico Sirola
Дата:
Hello,

I'm trying to write a trigger on insert which should insert the row in another
table. The table on which to insert the row should be selected at runtime and 
it is not know in advance. For example, let's say we have a table with
two columns, a date and an integer. a row is inserted into table XXX and 

CREATE TABLE XXX
(
refdate date;
x2 integer;
)

when the statement 

insert into XXX VALUES ('2007-11-11', 1);

is executed, a trigger (or rule) should be fired to insert the row into 
table XXX_20071111 (having the same schema). If the XXX_* tables are created beforehand
this is not a problem because you can set up a rule for each date performing

The problem arises when you try to extend the trigger in order to also dinamically
perform table creation is the XXX_20071111 doesn't exist:

I have, for example:


CREATE OR REPLACE FUNCTION add_child_table(parent_name varchar, d date)
RETURNS varchar AS $$
DECLARE
        new_table_name varchar;
BEGIN
      raise notice '%', 'creating table';
        new_table_name := date2tblname(parent_name, d); -- converts table name and date into child table name
        execute 'CREATE TABLE ' || new_table_name || ' ( ) INHERITS (' || parent_name || ')';
        execute 'CREATE RULE '
                || new_table_name
                || '_insert AS ON INSERT TO '
                || parent_name
                || ' WHERE ( refdate = DATE ' /* refdate is the field we use to partition */
                || '''' || d || ''''
                || ' ) DO INSTEAD INSERT INTO '
                || new_table_name
                || ' VALUES ( NEW.* )';
        return new_table_name;
END;
$$ LANGUAGE plpgsql;

I can use the above to add a child table and the rule to implement partitioning

CREATE OR REPLACE FUNCTION child_creation_trigger()
RETURNS "trigger" AS $$
BEGIN
        IF ( child_exist(TG_RELNAME::text, NEW.refdate) = false ) THEN
              raise notice '%', 'creating ' || TG_RELNAME::text || ' for ' || NEW.refdate::text;
              perform add_child_table(TG_RELNAME::text, NEW.refdate);
              --insert into ' || child_table_name || ' values ( NEW.* );
              RETURN NEW;
        ELSE
              raise notice '%', 'NOT creating ' || TG_RELNAME::text || ' for ' || NEW.refdate::text;
              RETURN NEW;
        END IF;
END;
$$ LANGUAGE plpgsql;

The trigger function is hooked to the to-be-partitioned table 


CREATE TRIGGER XXX_trigger BEFORE INSERT
ON XXX FOR EACH ROW
execute procedure child_creation_trigger ();


now, every time I insert a tuple into XXX, the trigger is fired and checks if the needed table exists or
not. If not, it creates the table and rule and goes on. The problem is that in this case the first
row is inserted into the XXX table, not in the (just created) XXX_<refdate> . Then I tried to insert
the row myself from the trigger body (and return null in order to skip the original insertion), but
I'm not able to do it (see the commented insert in the above IF clause) because I can't properly
quote the target table name.
I shoud perform a 

insert into child_table_name values (NEW.*);

obviously written like this the plpgsql complains at runtime because child_table_name is not a table name.
If, on the other side, I dynamically create the query like in

execute 'insert into ' || child_table_name || ' values (NEW.*)';

it complains because NEW in the execution context is unknown.

This should be a rather common problem... Isn't it? Is there a canonical way to
solve it? Maybe there's a trivial answer, but I have no plpgsql programming 
experience.

Thanks a lot in advance, 

Re: table partitioning pl/pgsql helpers

От
Robert Treat
Дата:
On Wednesday 30 May 2007 12:55, Enrico Sirola wrote:
> Hello,
>
> I'm trying to write a trigger on insert which should insert the row
> in another
> table. The table on which to insert the row should be selected at
> runtime and
> it is not know in advance. For example, let's say we have a table with
> two columns, a date and an integer. a row is inserted into table XXX and
>
> CREATE TABLE XXX
> (
>     refdate date;
>     x2 integer;
> )
>
> when the statement
>
> insert into XXX VALUES ('2007-11-11', 1);
>
> is executed, a trigger (or rule) should be fired to insert the row into
> table XXX_20071111 (having the same schema). If the XXX_* tables are
> created beforehand
> this is not a problem because you can set up a rule for each date
> performing
> the needed insert (as documented in http://www.postgresql.org/docs/
> 8.1/interactive/ddl-partitioning.html)
>
> The problem arises when you try to extend the trigger in order to
> also dinamically
> perform table creation is the XXX_20071111 doesn't exist:
>
> I have, for example:
>
>
> CREATE OR REPLACE FUNCTION add_child_table(parent_name varchar, d date)
> RETURNS varchar AS $$
> DECLARE
>          new_table_name varchar;
> BEGIN
>        raise notice '%', 'creating table';
>          new_table_name := date2tblname(parent_name, d); -- converts
> table name and date into child table name
>          execute 'CREATE TABLE ' || new_table_name || ' ( ) INHERITS
> (' || parent_name || ')';
>          execute 'CREATE RULE '
>
>                  || new_table_name
>                  || '_insert AS ON INSERT TO '
>                  || parent_name
>                  || ' WHERE ( refdate = DATE ' /* refdate is the
>
> field we use to partition */
>
>                  || '''' || d || ''''
>                  || ' ) DO INSTEAD INSERT INTO '
>                  || new_table_name
>                  || ' VALUES ( NEW.* )';
>
>          return new_table_name;
> END;
> $$ LANGUAGE plpgsql;
>
> I can use the above to add a child table and the rule to implement
> partitioning
>
> CREATE OR REPLACE FUNCTION child_creation_trigger()
> RETURNS "trigger" AS $$
> BEGIN
>          IF ( child_exist(TG_RELNAME::text, NEW.refdate) = false ) THEN
>                raise notice '%', 'creating ' || TG_RELNAME::text || '
> for ' || NEW.refdate::text;
>                perform add_child_table(TG_RELNAME::text, NEW.refdate);
>                --insert into ' || child_table_name || ' values
> ( NEW.* );
>                RETURN NEW;
>          ELSE
>                raise notice '%', 'NOT creating ' || TG_RELNAME::text
>
> || ' for ' || NEW.refdate::text;
>
>                RETURN NEW;
>          END IF;
> END;
> $$ LANGUAGE plpgsql;
>
> The trigger function is hooked to the to-be-partitioned table
>
>
> CREATE TRIGGER XXX_trigger BEFORE INSERT
> ON XXX FOR EACH ROW
> execute procedure child_creation_trigger ();
>
>
> now, every time I insert a tuple into XXX, the trigger is fired and
> checks if the needed table exists or
> not. If not, it creates the table and rule and goes on. The problem
> is that in this case the first
> row is inserted into the XXX table, not in the (just created)
> XXX_<refdate> . Then I tried to insert
> the row myself from the trigger body (and return null in order to
> skip the original insertion), but
> I'm not able to do it (see the commented insert in the above IF
> clause) because I can't properly
> quote the target table name.
> I shoud perform a
>
> insert into child_table_name values (NEW.*);
>
> obviously written like this the plpgsql complains at runtime because
> child_table_name is not a table name.
> If, on the other side, I dynamically create the query like in
>
> execute 'insert into ' || child_table_name || ' values (NEW.*)';
>
> it complains because NEW in the execution context is unknown.
>
> This should be a rather common problem... Isn't it? Is there a
> canonical way to
> solve it? Maybe there's a trivial answer, but I have no plpgsql
> programming
> experience.
>

We I set these up for our clients, I typically seperate the partition creation
piece from the data insertion piece.  (Mostly as partition creation,
especially with rules, is a table locking event, which is better done in a
non-critical path).  If you really must do it all in one go, you'll have a
much better chance of accomplishing this using all triggers (and triggers are
better anyway)... i think you could do it with a combination of rules and a
trigger (on insert to parent, create new parition and insert into it and
delete from parent) but it would certainly need testing to make sure you dont
have multi-rule evaluation... course since your making a trigger anyway...

--
Robert Treat
Database Architect
http://www.omniti.com

Re: [HACKERS] table partitioning pl/pgsql helpers

От
Jim Nasby
Дата:
Dropping -hackers; that list is for development of the database
engine itself.

The problem is that rules will happen before triggers, so what you're
trying to do will never work. Instead, just have the trigger insert
the data into the appropriate table.

On May 30, 2007, at 9:55 AM, Enrico Sirola wrote:

> Hello,
>
> I'm trying to write a trigger on insert which should insert the row
> in another
> table. The table on which to insert the row should be selected at
> runtime and
> it is not know in advance. For example, let's say we have a table with
> two columns, a date and an integer. a row is inserted into table
> XXX and
>
> CREATE TABLE XXX
> (
>     refdate date;
>     x2 integer;
> )
>
> when the statement
>
> insert into XXX VALUES ('2007-11-11', 1);
>
> is executed, a trigger (or rule) should be fired to insert the row
> into
> table XXX_20071111 (having the same schema). If the XXX_* tables
> are created beforehand
> this is not a problem because you can set up a rule for each date
> performing
> the needed insert (as documented in http://www.postgresql.org/docs/
> 8.1/interactive/ddl-partitioning.html)
>
> The problem arises when you try to extend the trigger in order to
> also dinamically
> perform table creation is the XXX_20071111 doesn't exist:
>
> I have, for example:
>
>
> CREATE OR REPLACE FUNCTION add_child_table(parent_name varchar, d
> date)
> RETURNS varchar AS $$
> DECLARE
>         new_table_name varchar;
> BEGIN
>       raise notice '%', 'creating table';
>         new_table_name := date2tblname(parent_name, d); -- converts
> table name and date into child table name
>         execute 'CREATE TABLE ' || new_table_name || ' ( ) INHERITS
> (' || parent_name || ')';
>         execute 'CREATE RULE '
>                 || new_table_name
>                 || '_insert AS ON INSERT TO '
>                 || parent_name
>                 || ' WHERE ( refdate = DATE ' /* refdate is the
> field we use to partition */
>                 || '''' || d || ''''
>                 || ' ) DO INSTEAD INSERT INTO '
>                 || new_table_name
>                 || ' VALUES ( NEW.* )';
>         return new_table_name;
> END;
> $$ LANGUAGE plpgsql;
>
> I can use the above to add a child table and the rule to implement
> partitioning
>
> CREATE OR REPLACE FUNCTION child_creation_trigger()
> RETURNS "trigger" AS $$
> BEGIN
>         IF ( child_exist(TG_RELNAME::text, NEW.refdate) = false ) THEN
>               raise notice '%', 'creating ' || TG_RELNAME::text ||
> ' for ' || NEW.refdate::text;
>               perform add_child_table(TG_RELNAME::text, NEW.refdate);
>               --insert into ' || child_table_name || ' values
> ( NEW.* );
>               RETURN NEW;
>         ELSE
>               raise notice '%', 'NOT creating ' || TG_RELNAME::text
> || ' for ' || NEW.refdate::text;
>               RETURN NEW;
>         END IF;
> END;
> $$ LANGUAGE plpgsql;
>
> The trigger function is hooked to the to-be-partitioned table
>
>
> CREATE TRIGGER XXX_trigger BEFORE INSERT
> ON XXX FOR EACH ROW
> execute procedure child_creation_trigger ();
>
>
> now, every time I insert a tuple into XXX, the trigger is fired and
> checks if the needed table exists or
> not. If not, it creates the table and rule and goes on. The problem
> is that in this case the first
> row is inserted into the XXX table, not in the (just created)
> XXX_<refdate> . Then I tried to insert
> the row myself from the trigger body (and return null in order to
> skip the original insertion), but
> I'm not able to do it (see the commented insert in the above IF
> clause) because I can't properly
> quote the target table name.
> I shoud perform a
>
> insert into child_table_name values (NEW.*);
>
> obviously written like this the plpgsql complains at runtime
> because child_table_name is not a table name.
> If, on the other side, I dynamically create the query like in
>
> execute 'insert into ' || child_table_name || ' values (NEW.*)';
>
> it complains because NEW in the execution context is unknown.
>
> This should be a rather common problem... Isn't it? Is there a
> canonical way to
> solve it? Maybe there's a trivial answer, but I have no plpgsql
> programming
> experience.
>
> Thanks a lot in advance,
>
> Enrico Sirola
> enrico.sirola@gmail.com
>
>
>

--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



Re: [HACKERS] table partitioning pl/pgsql helpers

От
Enrico Sirola
Дата:
Il giorno 02/giu/07, alle ore 00:53, Jim Nasby ha scritto:

> Dropping -hackers; that list is for development of the database
> engine itself.

ok, sorry
e.


Re: table partitioning pl/pgsql helpers

От
Enrico Sirola
Дата:
Hi Robert,

Il giorno 01/giu/07, alle ore 04:08, Robert Treat ha scritto:

[...]

We I set these up for our clients, I typically seperate the partition creation 
piece from the data insertion piece.  (Mostly as partition creation, 
especially with rules, is a table locking event, which is better done in a 
non-critical path).  If you really must do it all in one go, you'll have a 

I see, and now I agree with you it's better to decouple partition creation
and data insertion.

much better chance of accomplishing this using all triggers (and triggers are 
better anyway)... i think you could do it with a combination of rules and a 
trigger (on insert to parent, create new parition and insert into it and 
delete from parent) but it would certainly need testing to make sure you dont 
have multi-rule evaluation... course since your making a trigger anyway...  

Even more importantly, I just discovered (trying and then reading pgsql docs) that the rule
system is completely bypassed by the COPY FROM statement, so I think I'll rewrite
everything using some sort of trigger-generating procedure because I want this stuff
to work transparently (and we do lots of copy from).
Thanks for the advices,

Enrico Sirola