Обсуждение: Postgresql long transaction support

Поиск
Список
Период
Сортировка

Postgresql long transaction support

От
Kirmo Uusitalo
Дата:

Hi,

I'm looking for a solution for version managed (or long transaction) data in PostgreSQL (and Oracle).

This means database objects having different properties in different versions of the data set. Versions could be organized in tree-like hierarchy (each versios can have subversions). Objects can even be deleted in some of the versions of the data.

I am looking for something which would require minimal change in the application logic, like

    create table my_obj(id serial primary key,prop_1 text);        insert into my_obj (prop_1) values ('object 1');        insert into my_obj (prop_1) values ('object 2 (to be deleted)');


select * from my_obj;

1   object 1
2   object 2 (to be deleted)
create version 'xx'; //this functionality i'm looking for
switch to version 'xx'; //this functionality i'm looking for
select * from my_obj; //same data as in top version as we haven't changed anything yet

1   object 1
2   object 2 (to be deleted)
update my_obj set prop_1='updated in version xx' where id = 1;
delete from my_obj where id=2;
select * from my_obj;

1   'updated in version xx'

switch to version 'top'; //let's go back to top version

select * from my_obj; //same data as before as we haven't changed posted our changes made in 'xx' yet to top version

1   object 1
2   object 2 (to be deleted)
insert into my_obj (prop_1) values ('object 3 (inserted after creation of xx)');

switch to version 'xx'; //this functionality i'm looking for

merge; //brings changes made in parent 'top' version

1   updated in version xx
3   object 3 (inserted after creation of xx)

post; //sends all changes upwards

This functionality exists in GE Smallworld spatial database (see https://www.ge.com/content/dam/gepower-new/global/en_US/downloads/gas-new-site/resources/reference/ger-4231-smallworld-4-managing-change-spatial-data.pdf) and the use case is quite common (plan some changes in data which will happen in real life in far future, much further away than when current database session ends). Also the conflict management should exist (same object being changed in multiple versions).

Does Postgresql have anything like this?

Re: Postgresql long transaction support

От
Samed YILDIRIM
Дата:
Hi Kirmo,

This is a pretty interesting feature. I'm not sure if I've seen something like that before. I recommend you to check the pg_bitemporal[1] project. It may be useful for you.


Best regards.
Samed YILDIRIM


On Wed, 14 Dec 2022 at 13:55, Kirmo Uusitalo <kirmo.uusitalo@gmail.com> wrote:

Hi,

I'm looking for a solution for version managed (or long transaction) data in PostgreSQL (and Oracle).

This means database objects having different properties in different versions of the data set. Versions could be organized in tree-like hierarchy (each versios can have subversions). Objects can even be deleted in some of the versions of the data.

I am looking for something which would require minimal change in the application logic, like

    create table my_obj(id serial primary key,prop_1 text);        insert into my_obj (prop_1) values ('object 1');        insert into my_obj (prop_1) values ('object 2 (to be deleted)');


select * from my_obj;

1   object 1
2   object 2 (to be deleted)
create version 'xx'; //this functionality i'm looking for
switch to version 'xx'; //this functionality i'm looking for
select * from my_obj; //same data as in top version as we haven't changed anything yet

1   object 1
2   object 2 (to be deleted)
update my_obj set prop_1='updated in version xx' where id = 1;
delete from my_obj where id=2;
select * from my_obj;

1   'updated in version xx'

switch to version 'top'; //let's go back to top version

select * from my_obj; //same data as before as we haven't changed posted our changes made in 'xx' yet to top version

1   object 1
2   object 2 (to be deleted)
insert into my_obj (prop_1) values ('object 3 (inserted after creation of xx)');

switch to version 'xx'; //this functionality i'm looking for

merge; //brings changes made in parent 'top' version

1   updated in version xx
3   object 3 (inserted after creation of xx)

post; //sends all changes upwards

This functionality exists in GE Smallworld spatial database (see https://www.ge.com/content/dam/gepower-new/global/en_US/downloads/gas-new-site/resources/reference/ger-4231-smallworld-4-managing-change-spatial-data.pdf) and the use case is quite common (plan some changes in data which will happen in real life in far future, much further away than when current database session ends). Also the conflict management should exist (same object being changed in multiple versions).

Does Postgresql have anything like this?

Re: Postgresql long transaction support

От
Kirmo Uusitalo
Дата:
Hi Samed,

thanks. The bitemporal tables are somewhat similar except that there is not a date range to distinguish between versions of objects. One could use this long transaction feature for comparing the results of different plans of implementing something - and these could be affecting many objects from the parent version (or it's parent and so forth).

Similarly as with bitemporal tables the FK constraints are problematic. If an object is to be deleted in parent version, any created child object of this in child version cannot be posted to parent version later. That's why I believe it is best first merge the changes done in parent version to current version before you are allowed to post.

To solve this in application layer for an existing application is quite complex and this is why I am looking for a more generic solution within the database.

On Wed, Dec 14, 2022 at 2:02 PM Samed YILDIRIM <samed@reddoc.net> wrote:
Hi Kirmo,

This is a pretty interesting feature. I'm not sure if I've seen something like that before. I recommend you to check the pg_bitemporal[1] project. It may be useful for you.


Best regards.
Samed YILDIRIM


On Wed, 14 Dec 2022 at 13:55, Kirmo Uusitalo <kirmo.uusitalo@gmail.com> wrote:

Hi,

I'm looking for a solution for version managed (or long transaction) data in PostgreSQL (and Oracle).

This means database objects having different properties in different versions of the data set. Versions could be organized in tree-like hierarchy (each versios can have subversions). Objects can even be deleted in some of the versions of the data.

I am looking for something which would require minimal change in the application logic, like

    create table my_obj(id serial primary key,prop_1 text);        insert into my_obj (prop_1) values ('object 1');        insert into my_obj (prop_1) values ('object 2 (to be deleted)');


select * from my_obj;

1   object 1
2   object 2 (to be deleted)
create version 'xx'; //this functionality i'm looking for
switch to version 'xx'; //this functionality i'm looking for
select * from my_obj; //same data as in top version as we haven't changed anything yet

1   object 1
2   object 2 (to be deleted)
update my_obj set prop_1='updated in version xx' where id = 1;
delete from my_obj where id=2;
select * from my_obj;

1   'updated in version xx'

switch to version 'top'; //let's go back to top version

select * from my_obj; //same data as before as we haven't changed posted our changes made in 'xx' yet to top version

1   object 1
2   object 2 (to be deleted)
insert into my_obj (prop_1) values ('object 3 (inserted after creation of xx)');

switch to version 'xx'; //this functionality i'm looking for

merge; //brings changes made in parent 'top' version

1   updated in version xx
3   object 3 (inserted after creation of xx)

post; //sends all changes upwards

This functionality exists in GE Smallworld spatial database (see https://www.ge.com/content/dam/gepower-new/global/en_US/downloads/gas-new-site/resources/reference/ger-4231-smallworld-4-managing-change-spatial-data.pdf) and the use case is quite common (plan some changes in data which will happen in real life in far future, much further away than when current database session ends). Also the conflict management should exist (same object being changed in multiple versions).

Does Postgresql have anything like this?

Re: Postgresql long transaction support

От
Steve Midgley
Дата:


On Wed, Dec 14, 2022 at 4:46 AM Kirmo Uusitalo <kirmo.uusitalo@gmail.com> wrote:
Hi Samed,

thanks. The bitemporal tables are somewhat similar except that there is not a date range to distinguish between versions of objects. One could use this long transaction feature for comparing the results of different plans of implementing something - and these could be affecting many objects from the parent version (or it's parent and so forth).

Similarly as with bitemporal tables the FK constraints are problematic. If an object is to be deleted in parent version, any created child object of this in child version cannot be posted to parent version later. That's why I believe it is best first merge the changes done in parent version to current version before you are allowed to post.

To solve this in application layer for an existing application is quite complex and this is why I am looking for a more generic solution within the database.


This seems pretty complex to address at any layer! But it would be a useful feature for some situations, for sure. It seems like the approach taken by Ruby/Rails and similar frameworks for "db migrations" might have some useful design patterns for you. Each transaction has a link to its future and past transaction. If the system wants to traverse from "migration 6 to migration 3" and the DB is currently at migration 6 (stored a version number in the db itself), it runs the exit function to downgrade to migration 5, then runs the logic for migration 5, then runs the downgrade to 4, etc.

It can be slow, as implemented, but it allows you to attach logic to every step forward and backward in time that ensures you can handle even structure changes in the tables, etc (as well as changing lookup values in tables, etc).

Might be worth considering as a design model?
Steve

Re: Postgresql long transaction support

От
Tom Lane
Дата:
Steve Midgley <science@misuse.org> writes:
> This seems pretty complex to address at any layer! But it would be a useful
> feature for some situations, for sure. It seems like the approach taken by
> Ruby/Rails and similar frameworks for "db migrations" might have some
> useful design patterns for you. Each transaction has a link to its future
> and past transaction. If the system wants to traverse from "migration 6 to
> migration 3" and the DB is currently at migration 6 (stored a version
> number in the db itself), it runs the exit function to downgrade to
> migration 5, then runs the logic for migration 5, then runs the downgrade
> to 4, etc.

> It can be slow, as implemented, but it allows you to attach logic to every
> step forward and backward in time that ensures you can handle even
> structure changes in the tables, etc (as well as changing lookup values in
> tables, etc).

> Might be worth considering as a design model?

If you end up building things that way, PG's "extensions" could be a
useful way to package it.  What Steve is calling a "migration" corresponds
to an extension version, and you can provide scripts to go in either
direction between versions.

            regards, tom lane