Обсуждение: SQL design pattern for a delta trigger?

Поиск
Список
Период
Сортировка

SQL design pattern for a delta trigger?

От
Ted Byers
Дата:
IS there such a thing?  I can be the first to consider
this.

What I am aiming for is a solution with a couple
coupled tables, one of which represents state through
time and the other represents transactions or deltas
on the state.  With one field (a floating point
number) in the state table (or should I say a number
for each id field), it obviously has a time interval
for which it is valid: a start time and an end time.

What I am after is a situation where the moment a
record is inserted in the deltas table, a trigger
function first looks to see if the id provided
presently has a state in the state table.  If not,
then it creates one.  Then, the  end date for that
state record gets set to the current time and a new
record is inserted with the new state (computed by
applying the delta to the value in the previous record
for the state), the current date as the start date and
null for the end date.

This seems like an obvious thing to try, but I am
floundering a little and am therefore wondering if
anyone has seen an SQL design pattern that talks about
this, and an url where I can see such a discussion
and, better, an example.  The first concern is to
ensure that every record inserted into the deltas
table is immediately reflected in the state table, and
the second is that the history of state can be
reconstructed from a suitable query on the state
table.

I can do this easily in client code, but isn't this
the sort of thing best suited to living within the
database itself?

Thanks

Ted

Re: SQL design pattern for a delta trigger?

От
Ted Byers
Дата:
--- Ted Byers <r.ted.byers@rogers.com> wrote:

> IS there such a thing?  I can be the first to
> consider
> this.
>
OOPS.  The mind is faster than the fingers.  That
should have been "I can NOT be the first to consider
this.

Ted

Re: SQL design pattern for a delta trigger?

От
Erik Jones
Дата:
On Dec 6, 2007, at 2:36 PM, Ted Byers wrote:

> IS there such a thing?  I can be the first to consider
> this.

> What I am aiming for is a solution with a couple
> coupled tables, one of which represents state through
> time and the other represents transactions or deltas
> on the state.  With one field (a floating point
> number) in the state table (or should I say a number
> for each id field), it obviously has a time interval
> for which it is valid: a start time and an end time.
>
> What I am after is a situation where the moment a
> record is inserted in the deltas table, a trigger
> function first looks to see if the id provided
> presently has a state in the state table.  If not,
> then it creates one.  Then, the  end date for that
> state record gets set to the current time and a new
> record is inserted with the new state (computed by
> applying the delta to the value in the previous record
> for the state), the current date as the start date and
> null for the end date.
>
> This seems like an obvious thing to try, but I am
> floundering a little and am therefore wondering if
> anyone has seen an SQL design pattern that talks about
> this, and an url where I can see such a discussion
> and, better, an example.  The first concern is to
> ensure that every record inserted into the deltas
> table is immediately reflected in the state table, and
> the second is that the history of state can be
> reconstructed from a suitable query on the state
> table.
>
> I can do this easily in client code, but isn't this
> the sort of thing best suited to living within the
> database itself?

What you want to do here for handling the update v. insert is called
an "UPSERT".  Basically, what you do is run the update as if the row
exists and catch the exception that is thrown if it doesn't at which
point you insert the record with the end date = now().  After that
you can proceed normally with creating the new record with start date
= now() and end date = NULL.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



Re: SQL design pattern for a delta trigger?

От
Ted Byers
Дата:
--- Erik Jones <erik@myemma.com> wrote:

>
> On Dec 6, 2007, at 2:36 PM, Ted Byers wrote:
>
> [snip]
> What you want to do here for handling the update v.
> insert is called
> an "UPSERT".  Basically, what you do is run the
> update as if the row
> exists and catch the exception that is thrown if it
> doesn't at which
> point you insert the record with the end date =
> now().  After that
> you can proceed normally with creating the new
> record with start date
> = now() and end date = NULL.
>
Thanks Eric.  Do you know of an URL where this is
discussed or where I can find an example.  None of my
books discuss this, and my search using google has so
far produced only noise.

Thanks again.

Ted

Re: SQL design pattern for a delta trigger?

От
Colin Wetherbee
Дата:
Ted Byers wrote:
> --- Erik Jones <erik@myemma.com> wrote:
>
>> On Dec 6, 2007, at 2:36 PM, Ted Byers wrote:
>>
>> [snip]
>> What you want to do here for handling the update v.
>> insert is called
>> an "UPSERT".  Basically, what you do is run the
>> update as if the row
>> exists and catch the exception that is thrown if it
>> doesn't at which
>> point you insert the record with the end date =
>> now().  After that
>> you can proceed normally with creating the new
>> record with start date
>> = now() and end date = NULL.
>>
> Thanks Eric.  Do you know of an URL where this is
> discussed or where I can find an example.  None of my
> books discuss this, and my search using google has so
> far produced only noise.

You can do this with a conditional.  Something like the following should
work.

IF
   NOT (a query matching your data returns rows)
THEN
   INSERT (your new data)

AFAIK, the developers are working on implementing the {MERGE, UPDATE OR
ON FAILURE INSERT, UPSERT} statement.  Until then, you have to build
your upsert manually.

Colin

Re: SQL design pattern for a delta trigger?

От
Steve Atkins
Дата:
On Dec 7, 2007, at 6:29 AM, Ted Byers wrote:

>
> --- Erik Jones <erik@myemma.com> wrote:
>
>>
>> On Dec 6, 2007, at 2:36 PM, Ted Byers wrote:
>>
>> [snip]
>> What you want to do here for handling the update v.
>> insert is called
>> an "UPSERT".  Basically, what you do is run the
>> update as if the row
>> exists and catch the exception that is thrown if it
>> doesn't at which
>> point you insert the record with the end date =
>> now().  After that
>> you can proceed normally with creating the new
>> record with start date
>> = now() and end date = NULL.
>>
> Thanks Eric.  Do you know of an URL where this is
> discussed or where I can find an example.  None of my
> books discuss this, and my search using google has so
> far produced only noise.

http://www.postgresql.org/docs/current/static/plpgsql-control-
structures.html#PLPGSQL-UPSERT-EXAMPLE might be a good place to start.

Cheers,
   Steve


Re: SQL design pattern for a delta trigger?

От
Erik Jones
Дата:
On Dec 7, 2007, at 8:29 AM, Ted Byers wrote:

>
> --- Erik Jones <erik@myemma.com> wrote:
>
>>
>> On Dec 6, 2007, at 2:36 PM, Ted Byers wrote:
>>
>> [snip]
>> What you want to do here for handling the update v.
>> insert is called
>> an "UPSERT".  Basically, what you do is run the
>> update as if the row
>> exists and catch the exception that is thrown if it
>> doesn't at which
>> point you insert the record with the end date =
>> now().  After that
>> you can proceed normally with creating the new
>> record with start date
>> = now() and end date = NULL.
>>
> Thanks Eric.  Do you know of an URL where this is
> discussed or where I can find an example.  None of my
> books discuss this, and my search using google has so
> far produced only noise.
>
> Thanks again.

The basic pseudo-code (not really SQL) is something like:

TRY:
    run UPDATE
EXCEPT not found:
    make INSERT

If you can give me some table layouts I can probably work out a
better example for you.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



Re: SQL design pattern for a delta trigger?

От
"John D. Burger"
Дата:
So two design patterns for a makeshift UPSERT have been presented -
one is to check beforehand, and only insert if the item isn't present
already, the other is to do the insert blindly and let PG check for
you, and catch any exceptions.

I'm also wondering what people's ideas are for a sort of BULK
UPSERT.  I often find myself inserting the results of a SELECT and
wanting a similar check for already existing rows.  The idiom I've
stumbled upon looks like this:

insert into foo (x, y, z)
    select a, b, c from bar join bax ...
    EXCEPT
    select x, y, z from foo;

Namely, I subtract from the results to be inserted any rows that are
already present in the target table.  This can actually even be used
for UPSERTing a single row, and has the virtue of being pure SQL, but
I've wondered about its efficiency.  One alternative would be to
iterate over the SELECT result with a procedural language, and do a
series of UPSERTS, but that seems unlikely to be as efficient for a
large result set.  Any comments about the relative merits of these or
other alternatives?

Thanks.

- John Burger
   MITRE

Re: SQL design pattern for a delta trigger?

От
Steve Atkins
Дата:
On Dec 8, 2007, at 7:54 AM, John D. Burger wrote:

> So two design patterns for a makeshift UPSERT have been presented -
> one is to check beforehand, and only insert if the item isn't
> present already

... which will give the wrong results if there's any concurrent
updates...

> , the other is to do the insert blindly and let PG check for you,
> and catch any exceptions.
>
> I'm also wondering what people's ideas are for a sort of BULK
> UPSERT.  I often find myself inserting the results of a SELECT and
> wanting a similar check for already existing rows.  The idiom I've
> stumbled upon looks like this:
>
> insert into foo (x, y, z)
>     select a, b, c from bar join bax ...
>     EXCEPT
>     select x, y, z from foo;
>
> Namely, I subtract from the results to be inserted any rows that
> are already present in the target table.


> This can actually even be used for UPSERTing a single row, and has
> the virtue of being pure SQL, but I've wondered about its efficiency.

Worry more about it's correctness. Doing entirely the wrong thing,
quickly, isn't always what you want. If there's any concurrency
involved at all, this is likely to do the wrong thing.

> One alternative would be to iterate over the SELECT result with a
> procedural language, and do a series of UPSERTS, but that seems
> unlikely to be as efficient for a large result set.

Just take the idiom that's been pointed out in the documentation and
wrap a loop around it.

Cheers,
   Steve


Re: SQL design pattern for a delta trigger?

От
Ted Byers
Дата:
Thanks all.  I tried the appended code in a trigger
function, but postgresql won't take it.

It complains that assets.quantity is not a scalar.
However, the WHERE clause in that select statement
guarantees that at most only one record will be
returned.  An open position on a given kind of asset
is represented by null in the end_valid_time field,
and the combination of asset_type_id, portfolio_id and
end_valid_time is certain to be unique, if there is a
record for that asset type in that porfolio at all.

I thought I'd try checking for an open position first
because the manual indicated that exception handling
is quite expensive.  But I must have missed something,
because it doesn't like how I tried to define my
trigger function.

I have four sequences, one each for four tables.  Two
of the tables are just look up tables, for asset types
and portfolios; trivial for test case with only an
autoincrementing integer primary key and a "name".
The other two are the ones of interest.  Assets is
treated as read only as far as the user is concerned.
The user's data in the assets table is mediated
through transactions inserted (and NEVER deleted or
updated), into the transactions table.  Assets has the
minimal suite of columns (autoincrementing integer
primary key, asset_typeID, portfolio_id, all integers,
quantity with is a floating point number and two
dates: start_valid_time and end_valid_time).
Transactions has only a transaction_id, portfolio_id,
asset_type_id, quantity and transaction_date.  There
are of course foreign keys connectin the assets and
transactions tables to the lookup tables, and a
composite index on assets to make looking up records
based on portfolio_id, asset_id and end_valid_time as
quick as possible.  It couldn't be simpler,
conceptually!  yet I must have missed something, cause
postgresql won't accept the function body I show
below.

If I can't get this working quickly, I may just resort
to creating a stored procedure that takes the
transaction details as arguments and processes both
tables appropriately without relying on a trigger.
:-(

Thanks for everyone's help.

Ted



===========================================
DECLARE
  id BIGINT;
  q DOUBLE PRECISION;
BEGIN
  SELECT assets.id INTO id, assets.quantity INTO q
    FROM assets
    WHERE assets.asset_type_id = NEW.asset_type_id
      AND assets.portfolio_id = NEW.portfolio_id
      AND assets.end_valid_time IS NULL;
  IF (id IS NULL) THEN
    INSERT INTO assets (asset_type_id,
portfolio_id,quantity,start_valid_stime,end_valid_time)
      VALUES (NEW.asset_type_id,NEW.portfolio_id,
NEW.quantity, NEW.transaction_date,NULL);
  ELSE
    UPDATE assets SET end_valid_time =
NEW.transaction_date WHERE id = id;
    INSERT INTO assets (asset_type_id,
portfolio_id,quantity,start_valid_stime,end_valid_time)
      VALUES (NEW.asset_type_id,NEW.portfolio_id, q +
NEW.quantity, NEW.transaction_date,NULL);
  END
END


Re: SQL design pattern for a delta trigger?

От
Vivek Khera
Дата:
On Dec 7, 2007, at 11:42 AM, Colin Wetherbee wrote:

> You can do this with a conditional.  Something like the following
> should work.
>
> IF
>  NOT (a query matching your data returns rows)
> THEN
>  INSERT (your new data)

There exists a race condition here unless you've locked your tables.

Re: SQL design pattern for a delta trigger?

От
Colin Wetherbee
Дата:
Vivek Khera wrote:
> On Dec 7, 2007, at 11:42 AM, Colin Wetherbee wrote:
>
>> You can do this with a conditional.  Something like the following
>> should work.
>>
>> IF
>>  NOT (a query matching your data returns rows)
>> THEN
>>  INSERT (your new data)
>
> There exists a race condition here unless you've locked your tables.

Yes, clearly.  In the context of the thread, I was assuming my algorithm
would be implemented as an atomic transaction.

For what it's worth, the real algorithm would be as follows.  I hadn't
had enough coffee yet, and I forgot the UPDATE bit.

IF
   (a query matching your old data returns rows)
THEN
   UPDATE with your new data
ELSE
   INSERT your new data

Colin

Re: SQL design pattern for a delta trigger?

От
Vivek Khera
Дата:
On Dec 10, 2007, at 5:04 PM, Colin Wetherbee wrote:

> For what it's worth, the real algorithm would be as follows.  I
> hadn't had enough coffee yet, and I forgot the UPDATE bit.
>
> IF
>  (a query matching your old data returns rows)
> THEN
>  UPDATE with your new data
> ELSE
>  INSERT your new data

Still exists race condition.  Your race comes from testing existence,
then creating/modifying data afterwards.  You need to make the test/
set atomic else you have race.


Re: SQL design pattern for a delta trigger?

От
Ted Byers
Дата:
--- Vivek Khera <khera@kcilink.com> wrote:

>
> On Dec 10, 2007, at 5:04 PM, Colin Wetherbee wrote:
>
> > For what it's worth, the real algorithm would be
> as follows.  I
> > hadn't had enough coffee yet, and I forgot the
> UPDATE bit.
> >
> > IF
> >  (a query matching your old data returns rows)
> > THEN
> >  UPDATE with your new data
> > ELSE
> >  INSERT your new data
>
> Still exists race condition.  Your race comes from
> testing existence,
> then creating/modifying data afterwards.  You need
> to make the test/
> set atomic else you have race.
>

Yes, but how do you do that in a stored function or
procedure or in a trigger.  It would be obvious to me
if I were writing this in C++ or Java, but how do you
do it using SQL in an RDBMS?

I saw something about table locks, but that doesn't
seem wise, WRT performance.

The classic example of a race condition, involving a
bank account, was used in the manual to introduce the
idea of a transaction, but we can't use a transaction
in a trigger, can we?

It is one thing to point out a race condition, but a
pointer to a solution that would work in the context
of the problem at hand would be useful and
appreciated.

Thanks all.

Ted

Re: SQL design pattern for a delta trigger?

От
Erik Jones
Дата:
On Dec 10, 2007, at 4:48 PM, Ted Byers wrote:

>
> --- Vivek Khera <khera@kcilink.com> wrote:
>
>>
>> On Dec 10, 2007, at 5:04 PM, Colin Wetherbee wrote:
>>
>>> For what it's worth, the real algorithm would be
>> as follows.  I
>>> hadn't had enough coffee yet, and I forgot the
>> UPDATE bit.
>>>
>>> IF
>>>  (a query matching your old data returns rows)
>>> THEN
>>>  UPDATE with your new data
>>> ELSE
>>>  INSERT your new data
>>
>> Still exists race condition.  Your race comes from
>> testing existence,
>> then creating/modifying data afterwards.  You need
>> to make the test/
>> set atomic else you have race.
>>
>
> Yes, but how do you do that in a stored function or
> procedure or in a trigger.  It would be obvious to me
> if I were writing this in C++ or Java, but how do you
> do it using SQL in an RDBMS?
>
> I saw something about table locks, but that doesn't
> seem wise, WRT performance.
>
> The classic example of a race condition, involving a
> bank account, was used in the manual to introduce the
> idea of a transaction, but we can't use a transaction
> in a trigger, can we?
>
> It is one thing to point out a race condition, but a
> pointer to a solution that would work in the context
> of the problem at hand would be useful and
> appreciated.
>
> Thanks all.

In a stored procedure you'd just execute the UPDATE and then check
the FOUND variable to see if it found a row to update:

UPDATE table_name SET foo='bar' WHERE id=5;

IF NOT FOUND THEN
    INSERT INTO table_name (id, foo) VALUES (5, 'bar');
END IF;

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



Re: SQL design pattern for a delta trigger?

От
Ted Byers
Дата:
Thanks Erik
>
> In a stored procedure you'd just execute the UPDATE
> and then check
> the FOUND variable to see if it found a row to
> update:
>
> UPDATE table_name SET foo='bar' WHERE id=5;
>
> IF NOT FOUND THEN
>     INSERT INTO table_name (id, foo) VALUES (5, 'bar');
> END IF;
>
To be clear, if I understand you correctly, with your
example, if there is no record where id=5, nothing
happens except FOUND is set to false?  Can I, then,
declare a variable prior to your update statement, and
then modify your update statement so that the value in
a particular field on the row where id=5 can be
captured?  Bearing in mind this is to be in a row
level trigger after an insert into table_name,
something like:

DECLARE q DOUBLE;
UPDATE  table_name
   SET foo='bar',
       q = table_name.quantity
     WHERE id=5;

And then follow that with something like:
IF FOUND THEN
  INSERT INTO another_table (baz,quantity)
    VALUES (foo,q+NEW.quantity);
ELSE
  INSERT INTO another_table (baz,quantity)
    VALUES (foo,NEW.quantity);
END IF

Thanks again,

Ted

Re: SQL design pattern for a delta trigger?

От
Richard Broersma Jr
Дата:
--- On Mon, 12/10/07, Ted Byers <r.ted.byers@rogers.com> wrote:

> but how do you
> do it using SQL in an RDBMS?

I believe that there is an ANSI SQL command "MERGE" that is yet to be implemented into PostgreSQL.

Regards,
Richard Broersma Jr.

Re: SQL design pattern for a delta trigger?

От
Alvaro Herrera
Дата:
Richard Broersma Jr wrote:
> --- On Mon, 12/10/07, Ted Byers <r.ted.byers@rogers.com> wrote:
>
> > but how do you
> > do it using SQL in an RDBMS?
>
> I believe that there is an ANSI SQL command "MERGE" that is yet to be implemented into PostgreSQL.

IIRC the standard's definition of MERGE is still subject to the race
condition :-)  It seems mostly defined for OLAP, and assumes rather
static data.

--
Alvaro Herrera                 http://www.amazon.com/gp/registry/DXLWNGRJD34J
"La tristeza es un muro entre dos jardines" (Khalil Gibran)

Re: SQL design pattern for a delta trigger?

От
Colin Wetherbee
Дата:
Vivek Khera wrote:
> On Dec 10, 2007, at 5:04 PM, Colin Wetherbee wrote:
>
>> For what it's worth, the real algorithm would be as follows.  I
>> hadn't had enough coffee yet, and I forgot the UPDATE bit.
>>
>> IF (a query matching your old data returns rows) THEN UPDATE with
>> your new data ELSE INSERT your new data
>
> Still exists race condition.  Your race comes from testing existence,
>  then creating/modifying data afterwards.  You need to make the
> test/set atomic else you have race.

I guess when I wrote that the algorithm would have to be implemented in
an atomic manner, it fell on deaf ears.

That said, perhaps implementing a good MERGE would be not such a bad
idea for PostgreSQL 8.4.

Colin


Re: SQL design pattern for a delta trigger?

От
"Trevor Talbot"
Дата:
On 12/10/07, Colin Wetherbee <cww@denterprises.org> wrote:
> Vivek Khera wrote:
> > On Dec 10, 2007, at 5:04 PM, Colin Wetherbee wrote:

> >> IF (a query matching your old data returns rows) THEN UPDATE with
> >> your new data ELSE INSERT your new data

> > Still exists race condition.  Your race comes from testing existence,
> >  then creating/modifying data afterwards.  You need to make the
> > test/set atomic else you have race.

> I guess when I wrote that the algorithm would have to be implemented in
> an atomic manner, it fell on deaf ears.

The problem is that there isn't a good atomic method for that order of
operations, short of locking the entire table first. A concurrent
transaction might insert a row after your test but before your own
INSERT. Even a SERIALIZABLE transaction won't help, as PostgreSQL
doesn't implement predicate locking.

That's why the example in the docs is a loop with result checking on
both operations, and requires a UNIQUE constraint to work correctly.

If high concurrency isn't a concern, table locking is the simpler approach.