Обсуждение: SQL:2011 PERIODS vs Postgres Ranges?

Поиск

Список

Период

Сортировка

SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

21 октября 2018 г., 21:17:44

Hello,

I'm interested in contributing some temporal database functionality to
Postgres, starting with temporal primary and foreign keys. I know some
other folks nearby interested in helping out, too. But before we begin
I'd like to ask the community about complying with the SQL:2011
standard [1] for these things.

In SQL:2011, temporal features all build upon PERIODs, which are a new
concept you can attach to tables. Each PERIOD is composed of a start
column and an end column (both of some date/time type). You define
PERIODs when you CREATE TABLE or ALTER TABLE. Then you refer to the
periods when you create primary keys or foreign keys to make them
temporal. There are also a handful of new operators for testing two
ranges for overlap/succession/etc.[2] Most PERIODs are for tracking
the history of a *thing* over time, but if the PERIOD is named
SYSTEM_TIME it instead tracks the history of changes to *your
database*.[3] (Google for "bitemporal" to read more about this.)

Personally I think PERIODs are quite disappointing. They are not part
of relational theory. They are not a column, but something else. If
you say `SELECT * FROM t` you don't get `PERIODs` (as far as I can
tell). But you can mention PERIODs approximately wherever you can
mention columns [4], so now we have to support them when projecting,
selecting, joining, aggregating, etc. (Or if we are permitted to not
support them in some of those places, isn't that even worse?)

You can see that PERIODs share a lot with Postgres's own range types.
But ranges are a real column, requiring no special-case behavior,
either for RDBMS implementers or SQL users. They have a richer set of
operators.[5] They don't require any special declarations to put them
in a table. They aren't limited to just date/time types. You can even
define new range types yourself (e.g. I've found it helpful before to
define inetrange and floatrange). Also the start/end columns of a
PERIOD must be not nullable,[6] so that unbounded ranges must use
sentinels like `01 JAN 0000` or `01 JAN 3000` instead. Also there is
no way (as far as I can tell) to define and use a period within a
subquery or CTE or view. Many of these criticisms of PERIODs you can
find in [7], pages 403 - 410 (where "interval" means basically our own
range types), plus others: for example PERIODs are always closed/open,
you can only have a single application PERIOD per table, they are
wordy, etc.

I expect that any Postgres implementation of the standard would wind
up using ranges internally. For example a temporal primary key would
use an exclusion constraint based on a range expression, so if you had
a PERIOD defined on columns named `valid_start` and `valid_end`, the
PK would use something like `EXCLUDE USING gist (id WITH =,
tstzrange(valid_start, valid_end) WITH &&)`. Also the new SQL:2011
operators would be easy to implement on top of our range operators.
And then a temporal foreign key implementation would use either those
or raw range operators.

So is there any way for Postgres to offer the same temporal features,
but give users the choice of using either PERIODs or ranges? If we
built that, would the community be interested in it? I think there are
several possible ways to go about it:

1. Permit defining PERIODs on either a start/end column pair, or an
existing range column. Then everything else continues to use PERIODs.
This seems tidy to implement, although since it acquiesces to the
PERIOD-based approach for temporal functionality, it doesn't solve all
the problems above. Also as [9] points out, it would lead to
incompatibilities in the new `information_schema` views. E.g.
`periods` is supposed to have `start_column_name` and
`end_column_name` columns.[8]

2. Permit either ranges or PERIODs in the new syntax, e.g. `PRIMARY
KEY (id, valid_at WITHOUT OVERLAPS)` where `valid_at` is either a
PERIOD or a range column. Similarly with foreign keys. There is
probably some `information_schema` messiness here too, but perhaps
less than with #1. This seems like a great alternative to
application-time PERIODs, but I'm not sure how you'd tell Postgres to
use a range column for the system-time dimension.[3] Perhaps just a
function, and then the PERIOD of `SYSTEM_TIME` would call that
function (with a range expression).

3. Build our own abstractions on top of ranges, and then use those to
implement PERIOD-based features. This is the least clear option, and I
imagine it would require a lot more design effort. Our range types are
already a step in this direction. Does anyone think this approach has
promise? If so I can start thinking about how we'd do it. I imagine we
could use a lot of the ideas in [7].

4. Just give up and follow the standard to the letter. I'm not
enthusiastic about this, but I also really want temporal features, so
I might still do the work if that's what folks preferred.

Left to my own devices I would probably go with a mix of #2 & #3,
where temporal functionality is exposed by a layer of public functions
that use ranges (maybe accepting PERIODs too), and then implement the
PERIOD-based syntax by calling those functions. Using functions for
the range-based layer isn't as "strong" an abstraction as designing
SQL syntax, so it should be less effort, and also reduce risk of
future conflicts. I'd still personally really appreciate *also* doing
some #2 though, so that I could access those features via SQL syntax
(not functions), but with ranges instead of PERIODs. What do the rest
of you think?

Also, just how strictly do we have to follow the standard? Requiring
sentinels like '01 JAN 3000` just seems so silly. Could Postgres
permit nullable start/end PERIOD columns, and give them the same
meaning as ranges (unbounded)? Even if I forgot about ranges
altogether, I'd sure love to avoid these sentinels.

Finally: I know Vik Fearing already made a start at defining
PERIODs.[9] I don't know if he's gone any further, but perhaps he can
chime in if so. I'd be happy to build on what he's done already.

I'm eager to start work on this, but I also want to get some community
buy-in before I go too far. Temporal is such a massive set of
concepts, I believe it's important to have some discussion before just
jumping in. (Btw I've written an annotated bibliography about temporal
databases at [10] if anyone wants to read more.) If I can do anything
to facilitate a fuller plan, let me know. I can write up a more
detailed proposal, etc. Thanks for your feedback!

Yours,
Paul

[1] I'm using the draft docs at
https://www.wiscorp.com/SQLStandards.html at the link titled "SQL:20nn
Working Draft Documents". Several of the PDFs in that zip file mention
the new temporal features, but by far the most important is Part 2
(7IWD2-02-Foundation-2011-12.pdf). If you search for "period" you
should find lots of results.

[2] 4.14.2: The operators are overlaps, equals, contains, precedes,
succeeds, immediately precedes, and immediate succeeds.

[3] See 4.14.1 for more about the special PERIOD named SYSTEM_TIME.
Whereas application-time PERIODs store a history of a *thing*, the
SYSTEM_TIME PERIOD stores a history of changes to the *database*
itself.

[4] In Part 12 the `<identifier chain>` of SQL syntax is amended to
include `PERIODs` as well as columns.

[5] https://www.postgresql.org/docs/current/static/functions-range.html

[6] 4.6.5.3: "The columns shall both be of a datetime data type and
known not nullable."

[7] C. J. Date, Hugh Darwen, Nikos Lorentzos. Time and Relational
Theory, Second Edition: Temporal Databases in the Relational Model and
SQL. 2nd edition, 2014.

[8] 5.38 (7IWD2-11-Schemata-2011-12.pdf).

[9] https://www.postgresql-archive.org/Periods-td6022563.html

[10] https://illuminatedcomputing.com/posts/2017/12/temporal-databases-bibliography/

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Isaac Morland

Дата:

21 октября 2018 г., 21:56:08

On Sun, 21 Oct 2018 at 14:18, Paul A Jungwirth <pj@illuminatedcomputing.com> wrote:

Also, just how strictly do we have to follow the standard? Requiring
sentinels like '01 JAN 3000` just seems so silly. Could Postgres
permit nullable start/end PERIOD columns, and give them the same
meaning as ranges (unbounded)? Even if I forgot about ranges
altogether, I'd sure love to avoid these sentinels.

We have "infinity" and "-infinity" values in our date and timestamp types:

https://www.postgresql.org/docs/current/static/datatype-datetime.html

I think this avoids the silliness with sentinel values.

For myself, I don't care about PERIOD etc. one bit. The "every new capability gets its own syntax" model that SQL follows is very old-fashioned, and for good reason. I'm happy with ranges and exclusion constraints. But if we can provide an implementation of PERIOD that makes it easier to port applications written for legacy database systems, it might be worthwhile.

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Heikki Linnakangas

Дата:

21 октября 2018 г., 22:10:55

On 21/10/2018 21:17, Paul A Jungwirth wrote:
> 3. Build our own abstractions on top of ranges, and then use those to
> implement PERIOD-based features. This is the least clear option, and I
> imagine it would require a lot more design effort. Our range types are
> already a step in this direction. Does anyone think this approach has
> promise? If so I can start thinking about how we'd do it. I imagine we
> could use a lot of the ideas in [7].
> ...
> [7] C. J. Date, Hugh Darwen, Nikos Lorentzos. Time and Relational
> Theory, Second Edition: Temporal Databases in the Relational Model and
> SQL. 2nd edition, 2014.

+1 on this approach. I think [7] got the model right. If we can 
implement SQL-standard PERIODs on top of it, then that's a bonus, but 
having sane, flexible, coherent set of range operators is more important 
to me.

What are we missing? It's been years since I read that book, but IIRC 
temporal joins is one thing, at least. What features do you have in mind?

- Heikki

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

21 октября 2018 г., 22:46:58

On Sun, Oct 21, 2018 at 12:11 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 21/10/2018 21:17, Paul A Jungwirth wrote:
> > 3. Build our own abstractions on top of ranges, and then use those to
> > implement PERIOD-based features.
> +1 on this approach. I think [7] got the model right. If we can
> implement SQL-standard PERIODs on top of it, then that's a bonus, but
> having sane, flexible, coherent set of range operators is more important
> to me.

Okay, I'm surprised to hear from you and Isaac that following the
standard isn't as important as I thought, but I'm certainly pleased
not to make it the focus. I just thought that Postgres's reputation
was to be pretty careful about sticking to it. (I think we could still
add a standard-compliant layer, but like you I don't feel a duty to
suffer from it.) It sounds like I should work out some proposed
function signatures and write up how to use them, and see what people
think. Is that a useful approach?

> What are we missing?

Here are a few big ones:

1. Define temporal primary keys and foreign keys that are known to the
database catalog and controlled as higher-level objects. For instance
I wrote an extension at https://github.com/pjungwir/time_for_keys to
create temporal foreign keys, but the database isn't "aware" of them.
That means they are more cluttered in `\d foo` than necessary (you see
the trigger constraints instead of something about a foreign key),
they don't automatically disappear if you drop the column, it is hard
to make them "polymorphic" (My extension supports only
int+tstzrange.), they don't validate that the referenced table has a
declared temporal PK, they probably have slightly different
locking/transaction semantics than the real RI code, etc. This is what
I'd like to implement right now.

2. System time: automatically track DML changes to the table, and let
you query "as of" a given time.

3. Temporal joins. I don't want to tackle this myself, because there
is already an amazing proposed patch that does everything we could ask
for at https://www.postgresql-archive.org/PROPOSAL-Temporal-query-processing-with-range-types-tt5913058.html
(recently updated btw, so I hope someone will look at it!).

4. Temporal UPDATE/DELETE: these should be converted to instead change
the end time of old rows and insert new rows with the changed
attributes. I'm interested in implementing this too, but one thing at
a time. . . .

I really appreciate your sharing your thoughts!

Paul

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Pavel Stehule

Дата:

22 октября 2018 г., 07:31:53

ne 21. 10. 2018 v 21:47 odesílatel Paul A Jungwirth <pj@illuminatedcomputing.com> napsal:

On Sun, Oct 21, 2018 at 12:11 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 21/10/2018 21:17, Paul A Jungwirth wrote:
> > 3. Build our own abstractions on top of ranges, and then use those to
> > implement PERIOD-based features.
> +1 on this approach. I think [7] got the model right. If we can
> implement SQL-standard PERIODs on top of it, then that's a bonus, but
> having sane, flexible, coherent set of range operators is more important
> to me.

Okay, I'm surprised to hear from you and Isaac that following the
standard isn't as important as I thought, but I'm certainly pleased
not to make it the focus. I just thought that Postgres's reputation
was to be pretty careful about sticking to it. (I think we could still
add a standard-compliant layer, but like you I don't feel a duty to
suffer from it.) It sounds like I should work out some proposed
function signatures and write up how to use them, and see what people
think. Is that a useful approach?

It can be very unhappy if we cannot to implement standard syntax and behave. The implementation behind or another is not too important. We should not to accept any design that don't allow implement standard.

The world is 10 years after standards (maybe more). Now, this feature is implemented in MySQL/MariaDB, and I expecting a press to have standardized syntax after 5 years.

Regards

Pavel

> What are we missing?

Here are a few big ones:

1. Define temporal primary keys and foreign keys that are known to the
database catalog and controlled as higher-level objects. For instance
I wrote an extension at https://github.com/pjungwir/time_for_keys to
create temporal foreign keys, but the database isn't "aware" of them.
That means they are more cluttered in `\d foo` than necessary (you see
the trigger constraints instead of something about a foreign key),
they don't automatically disappear if you drop the column, it is hard
to make them "polymorphic" (My extension supports only
int+tstzrange.), they don't validate that the referenced table has a
declared temporal PK, they probably have slightly different
locking/transaction semantics than the real RI code, etc. This is what
I'd like to implement right now.

2. System time: automatically track DML changes to the table, and let
you query "as of" a given time.

3. Temporal joins. I don't want to tackle this myself, because there
is already an amazing proposed patch that does everything we could ask
for at https://www.postgresql-archive.org/PROPOSAL-Temporal-query-processing-with-range-types-tt5913058.html
(recently updated btw, so I hope someone will look at it!).

4. Temporal UPDATE/DELETE: these should be converted to instead change
the end time of old rows and insert new rows with the changed
attributes. I'm interested in implementing this too, but one thing at
a time. . . .

I really appreciate your sharing your thoughts!

Paul

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Jeff Davis

Дата:

28 октября 2018 г., 23:24:21

On Sun, 2018-10-21 at 22:10 +0300, Heikki Linnakangas wrote:
> On 21/10/2018 21:17, Paul A Jungwirth wrote:
> > 3. Build our own abstractions on top of ranges, and then use those
> > to
> > implement PERIOD-based features. This is the least clear option,
> > and I
> > imagine it would require a lot more design effort. Our range types
> > are
> > already a step in this direction. Does anyone think this approach
> > has
> > promise? If so I can start thinking about how we'd do it. I imagine
> > we
> > could use a lot of the ideas in [7].
> > ...
> > [7] C. J. Date, Hugh Darwen, Nikos Lorentzos. Time and Relational
> > Theory, Second Edition: Temporal Databases in the Relational Model
> > and
> > SQL. 2nd edition, 2014.
> 
> +1 on this approach. I think [7] got the model right. If we can 
> implement SQL-standard PERIODs on top of it, then that's a bonus,
> but 
> having sane, flexible, coherent set of range operators is more
> important 
> to me.

+1 for approach #3 from me as well. It was my original intention for
range types, though my first priority was utility and not the standard.
I think we are likely to run into a few areas where they aren't a
perfect fit to the standard, but I think it's a promising approach and
we can probably work around those issues by using special operators.

> What are we missing? It's been years since I read that book, but
> IIRC 
> temporal joins is one thing, at least. What features do you have in
> mind?

We do support temporal joins, just not as efficiently as I'd like, and
the language doesn't make it quite as clear as it could be.

I look at that book as a source of inspiration, but I don't think it's
simple to map features one-to-one. For instance, the model in [7] is
based heavily on pack/unpack operators, and it's hard for me to see how
those fit into SQL. Also, the pack/unpack operators have some
theoretical weirdness that the book does not make clear*.

Regards,
    Jeff Davis

*: I asked in a temporal discussion group (that was unfortunately a
part of LinkedIn circa 2011 and I can't find any reference to the
discussion outside my mailbox). My question was about the significance
of the order when packing on two intervals. Hugh Darwen was kind enough
to reply at length, and offered a lot of insight, but was still
somewhat inconclusive.

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul Jungwirth

Дата:

29 октября 2018 г., 00:29:11

Hi Jeff,

Thanks for sharing your thoughts and encouragement! :-)

 > The model in [7] is
 > based heavily on pack/unpack operators, and it's hard for me to see
 > how those fit into SQL. Also, the pack/unpack operators have some
 > theoretical weirdness that the book does not make clear*.
 >
 > *: My question was about the significance
 > of the order when packing on two intervals. Hugh Darwen was kind
 > enough to reply at length, and offered a lot of insight, but was still
 > somewhat inconclusive.

I'd be interested in seeing that conversation if you ever find it again.

I really like how Date/Darwen/Lorentzos use pack/unpack to explain 
temporal operations as operating on every concurrent "instant" 
separately, and then bringing the adjacent instants back together into 
ranges again. Even if you don't materialize that approach, conceptually 
it makes it easy to understand what's going on.

So what is great about the patch from Anton Dignös 
(https://www.postgresql-archive.org/PROPOSAL-Temporal-query-processing-with-range-types-tt5913058.html) 
is that (like Date/Darwen/Lorentzos) you still have temporal variants 
for every operator in the relational algebra, but they give 
straightforward & efficient implementations of each based on traditional 
operators plus just their two new "normalize" and "align" operations. (I 
think they renamed these in later papers/patches though?) Their main 
paper is at https://files.ifi.uzh.ch/boehlen/Papers/modf174-dignoes.pdf 
if anyone wants to read it. It's short! :-)

The biggest challenge implementing temporal operators in plain SQL is 
merging/splitting ranges from the left & right sides of an operator so 
they line up. A single row can get split into multiple rows, or several 
rows might be merged into one, etc. You can see how tricky Snodgrass's 
"coalesce" operation is in his book. I gave some example SQL to 
implement coalesce with UNNEST plus a range_agg function at 
https://github.com/pjungwir/range_agg but with the Dignös approach I 
don't think you'd need that. Normalize/align targets roughly the same 
problem.

Anyway I'd be curious whether the theoretical weirdness you found in 
pack/unpack also applies to normalize/align.

Yours,

-- 
Paul              ~{:-)
pj@illuminatedcomputing.com

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

24 ноября 2018 г., 02:41:25

Here is a patch for my progress on this so far. I'd love some comments
on the general approach, as I've never contributed anything this
involved before. It's not ready for a commitfest, but it would help me
to have some feedback. There are TODO comments with my major
questions.

This patch lets you say `CONSTRAINT foo PRIMARY KEY (cols, WITHOUT
OVERLAPS some_range_col)`, both in `CREATE TABLE` and `ALTER TABLE`.
It doesn't support foreign keys yet, and it only supports range
columns, not PERIODs. (I'm starting to realize that adding PERIODs
will be a lot of work, although I'm still up for it. :-) The approach
isn't exactly the #2+#3 approach I suggested previously, since
user-exposed functions seem like an odd fit with how things normally
flow out of the grammar, but it follows the goal of permitting either
ranges or PERIODs for temporal keys without breaking the SQL:2011
standard.

It adds regression and pg_dump tests, although no documentation yet. A
few of my new regress tests fail, but only the ones for PERIODs. I
don't know if I need to do anything for pg_dump's custom format. For
the SQL format it exports correct `ALTER TABLE ... ADD CONSTRAINT ...
(... WITHOUT OVERLAPS ...)` statements. Also I left a question in
bin/psql/describe.c about how to make \d show a PK WITHOUT OVERLAPS.

It is based on 3be97b97ed37b966173f027091f21d8a7605e2a5 from Nov 14,
but I can rebase it if you like.

If it's easier to read this in smaller bits, you can find my (somewhat
messy) commit history here:
https://github.com/pjungwir/postgresql/commits/temporal-pks

For a next step (assuming what I've done already isn't too bad): I
could either work on PERIODs (building on Vik Fearing's patch from a
few months ago), or add range-based temporal foreign keys. Any
suggestions?

Thanks!
Paul
On Sun, Oct 28, 2018 at 2:29 PM Paul Jungwirth
<pj@illuminatedcomputing.com> wrote:
>
> Hi Jeff,
>
> Thanks for sharing your thoughts and encouragement! :-)
>
>  > The model in [7] is
>  > based heavily on pack/unpack operators, and it's hard for me to see
>  > how those fit into SQL. Also, the pack/unpack operators have some
>  > theoretical weirdness that the book does not make clear*.
>  >
>  > *: My question was about the significance
>  > of the order when packing on two intervals. Hugh Darwen was kind
>  > enough to reply at length, and offered a lot of insight, but was still
>  > somewhat inconclusive.
>
> I'd be interested in seeing that conversation if you ever find it again.
>
> I really like how Date/Darwen/Lorentzos use pack/unpack to explain
> temporal operations as operating on every concurrent "instant"
> separately, and then bringing the adjacent instants back together into
> ranges again. Even if you don't materialize that approach, conceptually
> it makes it easy to understand what's going on.
>
> So what is great about the patch from Anton Dignös
> (https://www.postgresql-archive.org/PROPOSAL-Temporal-query-processing-with-range-types-tt5913058.html)
> is that (like Date/Darwen/Lorentzos) you still have temporal variants
> for every operator in the relational algebra, but they give
> straightforward & efficient implementations of each based on traditional
> operators plus just their two new "normalize" and "align" operations. (I
> think they renamed these in later papers/patches though?) Their main
> paper is at https://files.ifi.uzh.ch/boehlen/Papers/modf174-dignoes.pdf
> if anyone wants to read it. It's short! :-)
>
> The biggest challenge implementing temporal operators in plain SQL is
> merging/splitting ranges from the left & right sides of an operator so
> they line up. A single row can get split into multiple rows, or several
> rows might be merged into one, etc. You can see how tricky Snodgrass's
> "coalesce" operation is in his book. I gave some example SQL to
> implement coalesce with UNNEST plus a range_agg function at
> https://github.com/pjungwir/range_agg but with the Dignös approach I
> don't think you'd need that. Normalize/align targets roughly the same
> problem.
>
> Anyway I'd be curious whether the theoretical weirdness you found in
> pack/unpack also applies to normalize/align.
>
> Yours,
>
> --
> Paul              ~{:-)
> pj@illuminatedcomputing.com

Вложения

temporal_pks_v0001.patch

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

24 ноября 2018 г., 05:55:54

On Fri, Nov 23, 2018 at 3:41 PM Paul A Jungwirth
<pj@illuminatedcomputing.com> wrote:
> Here is a patch for my progress on this so far.

Well this is embarrassing, but my last patch used the mistaken syntax
`PRIMARY KEY (cols, WITHOUT OVERLAPS col)`. Here is a new patch which
uses the correct syntax `PRIMARY KEY (cols, col WITHOUT OVERLAPS)`.
Sorry about that! Also I went ahead and rebased it off current master.

Yours,
Paul

Вложения

temporal_pks_v0002.patch

Re: Re: SQL:2011 PERIODS vs Postgres Ranges?

От

David Steele

Дата:

05 марта 2019 г., 11:35:13

Hi Paul,

On 11/24/18 4:55 AM, Paul A Jungwirth wrote:
> On Fri, Nov 23, 2018 at 3:41 PM Paul A Jungwirth
> <pj@illuminatedcomputing.com> wrote:
>> Here is a patch for my progress on this so far.
> 
> Well this is embarrassing, but my last patch used the mistaken syntax
> `PRIMARY KEY (cols, WITHOUT OVERLAPS col)`. Here is a new patch which
> uses the correct syntax `PRIMARY KEY (cols, col WITHOUT OVERLAPS)`.
> Sorry about that! Also I went ahead and rebased it off current master.

I have marked this patch as targeting PG13 since it is clearly not 
material for PG12.  I also added you as the patch author.

Regards,
-- 
-David
david@pgmasters.net

Re: Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

10 марта 2019 г., 03:41:44

On Tue, Mar 5, 2019 at 12:35 AM David Steele <david@pgmasters.net> wrote:
> I have marked this patch as targeting PG13 since it is clearly not
> material for PG12.  I also added you as the patch author.

Thanks David! Targeting PG13 was my intention, so sorry if I messed up
the commitfest entry.

Here is a new patch rebased on top of master. My questions are inline
as TODO comments for whoever does the review. I'm pretty far along
with an add-on patch to create temporal *foreign* keys too, which I
think should be part of this same bundle of work. If anyone happens to
review the PK patch soon, it might help me avoid the same mistakes in
the FK work, but if not that's fine too. :-)

Yours,
Paul

Вложения

temporal_pks_v0003.patch

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

David Steele

Дата:

10 марта 2019 г., 09:42:08

Hi Pail,

On 3/10/19 2:41 AM, Paul A Jungwirth wrote:
> On Tue, Mar 5, 2019 at 12:35 AM David Steele <david@pgmasters.net> wrote:
>> I have marked this patch as targeting PG13 since it is clearly not
>> material for PG12.  I also added you as the patch author.
> 
> Thanks David! Targeting PG13 was my intention, so sorry if I messed up
> the commitfest entry.

No worries, that's what I'm here for!

> Here is a new patch rebased on top of master. My questions are inline
> as TODO comments for whoever does the review. I'm pretty far along
> with an add-on patch to create temporal *foreign* keys too, which I
> think should be part of this same bundle of work. If anyone happens to
> review the PK patch soon, it might help me avoid the same mistakes in
> the FK work, but if not that's fine too. :-)

Don't worry if you don't attract review in this CF since most people are 
focused on PG12 items.  Even so, getting your patch in early helps 
because it will have history by the time the final CFs for PG13 come around.

Regards,
-- 
-David
david@pgmasters.net

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

19 мая 2019 г., 10:11:20

On Sat, Mar 9, 2019 at 10:42 PM David Steele <david@pgmasters.net> wrote:
> On 3/10/19 2:41 AM, Paul A Jungwirth wrote:
> > I'm pretty far along
> > with an add-on patch to create temporal *foreign* keys too, which I
> > think should be part of this same bundle of work.

Here is that patch. I've changed the title from "temporal_pks" to
"temporal_fks" but it includes both.

I've rebased on top of latest master, but also the patch assumes my
other range_agg patch is already included. (I use range_agg to help
implement the foreign key checks.)

This patch is hopefully getting close, but I'd still call it a WIP. It
has docs and tests, but it hasn't had any review yet (and neither has
range_agg). Here are a few limitations:

- It supports only range types, not SQL:2011 PERIODs. I'd like to add
PERIODs too, but that seems like a major undertaking. I would probably
need to partner with someone more experienced to get it done. Maybe it
should be a separate CF entry.
- The foreign keys support only NO ACTION and RESTRICT, not
CASCADE/SET NULL/SET DEFAULT. The latter options are hard to implement
now but would be trivial after adding UPDATE/DELETE FROM t FOR PORTION
OF r FROM t1 TO t2, so I'm thinking that will be next on my list. :-)
- I'm sure there are plenty of errors & infelicities people more
knowledgeable than me can point out.

Any feedback is gratefully welcomed. :-)

Yours,
Paul

Вложения

temporal_fks_v0001.patch

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

17 июня 2019 г., 07:54:40

> Here is that patch. I've changed the title from "temporal_pks" to
> "temporal_fks" but it includes both.

Here are rebased patches to add temporal PKs and FKs. Note they depend
on my other commitfest entry adding a range_agg function. The PKs
patch should be applied first, then the FKs patch. These are mostly
the same as before, but I've added a small optimization to the FK
patch. The traditional FK code skips the UPDATE triggers if the PK/FK
columns didn't change. For temporal RI, we can broaden the condition:
on a PK update, if the new PK range is a superset of the old (and the
other parts of the key are equal), we can skip the check. On an FK
update, if the new FK range is a subset of the old (and the other
parts of the key are equal), we can skip the check.

These are still very WIP patches. They include a bunch of TODO
comments where I'm hoping to get feedback. Also their "taste" is
surely questionable. In many cases I took the least-obtrusive path
even where that might not be the best one. If anyone wants to offer
some corrections, I would gratefully pay attention. :-)

My plan is to work on UPDATE/DELETE FOR PORTION OF next, then use that
to add CASCADE functionality to temporal FKs. I've started reading how
executor nodes work, but I'll probably reach out with some questions.
. . . :-) Also I need to decide how triggers should behave in a
temporal update/delete. I have my own opinion about what is sensible,
but I haven't found guidance in the standard, so I'm going to see what
other RDBMSes are doing. (If someone has a reference to a part of the
standard I overlooked, let me know! :-)

Yours,
Paul

Вложения

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Ibrar Ahmed

Дата:

26 июля 2019 г., 16:34:30

The patch requires to rebase on the master branch.

The new status of this patch is: Waiting on Author

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Ibrar Ahmed

Дата:

30 июля 2019 г., 16:00:46

Hi Paul,

I have rebased the patch to master (1e2fddfa33d3c7cc93ca3ee0f32852699bd3e012) and fixed some compilation warning. Now I am reviewing the actual code.

On Fri, Jul 26, 2019 at 6:35 PM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:

The patch requires to rebase on the master branch.

The new status of this patch is: Waiting on Author

Ibrar Ahmed

Вложения

temporal_pks_fks_v005.patch

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Thomas Munro

Дата:

01 августа 2019 г., 13:22:23

On Wed, Jul 31, 2019 at 1:01 AM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
> I have rebased the patch to master (1e2fddfa33d3c7cc93ca3ee0f32852699bd3e012) and fixed some compilation warning. Now
Iam reviewing the actual code.
 

Thanks for doing that Ibrar.  I think the right status for this CF
entry is now:  Needs review.  I have set it that way, in the September
CF.   By the way, there are some test failures:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.50280

-- 
Thomas Munro
https://enterprisedb.com

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Ibrar Ahmed

Дата:

02 августа 2019 г., 23:28:23

The patch does not work.

postgres=# CREATE TABLE foo (id int,r int4range, valid_at tsrange, CONSTRAINT bar_pk PRIMARY KEY (r, valid_at WITHOUT
OVERLAPS));
CREATE TABLE
postgres=# CREATE TABLE bar (id int,r int4range, valid_at tsrange, CONSTRAINT bar_fk FOREIGN KEY (r, PERIOD valid_at)
REFERENCESfoo);
 
ERROR:  cache lookup failed for type 0

The new status of this patch is: Waiting on Author

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Ibrar Ahmed

Дата:

02 августа 2019 г., 23:48:31

Hi Paul,

I did some clean-up on this patch. I have also refactored a small portion of the code

to reduce the footprint of the patch. For simplicity, I have divided the patch into 6

patches, now it is easy to review and debug.

001_temporal_table_grammer_v006.patch

002_temporal_table_doc_v006.patch

003_temporal_table_backend_v006.patch

004_temporal_table_pgdump_v006.patch

005_temporal_table_regression_v006.patch

006_temporal_table_psql_v006.patch

Please follow the PostgreSQL coding guidelines. I have found places where you missed that, secondly code even in WIP stage must not have WARNING because it looks ugly.

On Sat, Aug 3, 2019 at 1:29 AM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:

The patch does not work.

postgres=# CREATE TABLE foo (id int,r int4range, valid_at tsrange, CONSTRAINT bar_pk PRIMARY KEY (r, valid_at WITHOUT OVERLAPS));
CREATE TABLE
postgres=# CREATE TABLE bar (id int,r int4range, valid_at tsrange, CONSTRAINT bar_fk FOREIGN KEY (r, PERIOD valid_at) REFERENCES foo);
ERROR: cache lookup failed for type 0

The new status of this patch is: Waiting on Author

Ibrar Ahmed

Вложения

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

05 августа 2019 г., 01:11:21

On Fri, Aug 2, 2019 at 1:49 PM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
> I did some clean-up on this patch. I have also refactored a small portion of the code
> to reduce the footprint of the patch. For simplicity, I have divided the patch into 6
> patches, now it is easy to review and debug.
> Please follow the PostgreSQL coding guidelines. I have found places where you missed that, secondly code even in WIP
stagemust not have WARNING because it looks ugly.

Thank you for the cleanup Ibrar! I'll try to stick to the coding
standards more closely going forward. If you have any review comments
I would certainly appreciate them, especially about the overall
approach. I know that the implementation in its current form is not
very tasteful, but I wanted to get some feedback on the ideas.

Also just to reiterate: this patch depends on my other CF entry
(range_agg), whose scope has expanded considerably. Right now I'm
focusing on that. And if you're trying to make this code work, it's
important to apply the range_agg patch first, since the temporal
foreign key implementation calls that function.

Also: since this patch raises the question of how to conform to
SQL:2011 while still supporting Postgres range types, I wrote an
article that surveys SQL:2011 temporal features in MariaDB, DB2,
Oracle, and MS SQL Server:

https://illuminatedcomputing.com/posts/2019/08/sql2011-survey/

A few highlights are:

- Everyone lets you define PERIODs, but what you can do with them is
still *very* limited.
- No one treats PERIODs as first-class types or expressions; they are
more like table metadata.
- Oracle PERIODs do permit NULL start/end values, and it interprets
them as "unbounded". That goes against the standard but since it's
what Postgres does with ranges, it suggests to me that maybe we should
follow their lead. Anyway I think a NULL is nicer than a sentinel for
this purpose.

Regards,
Paul

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Ibrar Ahmed

Дата:

06 августа 2019 г., 13:26:36

Hi Paul,

On Mon, Aug 5, 2019 at 3:11 AM Paul A Jungwirth <pj@illuminatedcomputing.com> wrote:

On Fri, Aug 2, 2019 at 1:49 PM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
> I did some clean-up on this patch. I have also refactored a small portion of the code
> to reduce the footprint of the patch. For simplicity, I have divided the patch into 6
> patches, now it is easy to review and debug.
> Please follow the PostgreSQL coding guidelines. I have found places where you missed that, secondly code even in WIP stage must not have WARNING because it looks ugly.

Thank you for the cleanup Ibrar! I'll try to stick to the coding
standards more closely going forward. If you have any review comments
I would certainly appreciate them, especially about the overall
approach. I know that the implementation in its current form is not
very tasteful, but I wanted to get some feedback on the ideas.

I have reviewed the main design, and in my opinion, it is a good start.

- Why we are not allowing any other datatype other than ranges in the

primary key. Without that there is no purpose of a primary key.

- Thinking about some special token to differentiate between normal

primary key and temporal primary key

Also just to reiterate: this patch depends on my other CF entry
(range_agg), whose scope has expanded considerably. Right now I'm
focusing on that. And if you're trying to make this code work, it's
important to apply the range_agg patch first, since the temporal
foreign key implementation calls that function.

Also: since this patch raises the question of how to conform to
SQL:2011 while still supporting Postgres range types, I wrote an
article that surveys SQL:2011 temporal features in MariaDB, DB2,
Oracle, and MS SQL Server:

https://illuminatedcomputing.com/posts/2019/08/sql2011-survey/

A few highlights are:

- Everyone lets you define PERIODs, but what you can do with them is
still *very* limited.
- No one treats PERIODs as first-class types or expressions; they are
more like table metadata.

- Oracle PERIODs do permit NULL start/end values, and it interprets
them as "unbounded". That goes against the standard but since it's
what Postgres does with ranges, it suggests to me that maybe we should
follow their lead. Anyway I think a NULL is nicer than a sentinel for
this purpose.

That is an open debate, that we want to strictly follow the standard or map that

to PostgreSQL range type which allows NULL. But how you will define a primary

key on that?

Regards,
Paul

Ibrar Ahmed

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul Jungwirth

Дата:

06 августа 2019 г., 18:28:05

Hi Ibrar,

On 8/6/19 3:26 AM, Ibrar Ahmed wrote:
> - Why we are not allowing any other datatype other than ranges in the
> primary key. Without that there is no purpose of a primary key.

A temporal primary key always has at least one ordinary column (of any 
type), so it is just a traditional primary key *plus* a PERIOD and/or 
range column to indicate when the record was true.

> - Thinking about some special token to differentiate between normal
> primary key and temporal primary key

There is already some extra syntax. For the time part of a PK, you say 
`WITHOUT OVERLAPS`, like this:

     CONSTRAINT pk_on_t PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)

In this example `id` is an ordinary column, and `valid_at` is either a 
Postgres range or a SQL:2011 PERIOD. (The latter is not yet implemented 
in my patch but there are some placeholder comments.)

Similarly a foreign key has one or more traditional columns *plus* a 
range/PERIOD. It needs to have a range/PERIOD on both sides. It too has 
some special syntax, but instead of `WITHOUT OVERLAPS` it is `PERIOD`. 
(Don't blame me, I didn't write the standard.... :-) So here is an example:

     CONSTRAINT fk_t2_to_t FOREIGN KEY (id, PERIOD valid_at)
       REFERENCES t (id, PERIOD valid_at)

You should be able to see my changes to gram.y to support this new syntax.

I hope this clears up how it works! I'm happy to answer more questions 
if you have any. Also if you want to read more:

- This paper by Kulkarni & Michels is a 10-page overview of SQL:2011:

https://sigmodrecord.org/publications/sigmodRecord/1209/pdfs/07.industry.kulkarni.pdf

- This is a talk I gave at PGCon 2019 going over the concepts, with a 
lot of pictures. You can find text, slides, and a link to the video here:

https://github.com/pjungwir/postgres-temporal-talk

- This link is ostensibly an annotated bibliography but really tells a 
story about how the research has developed:

https://illuminatedcomputing.com/posts/2017/12/temporal-databases-bibliography/

- There is also some discussion about PERIODs vs ranges upthread here, 
as well as here:

https://www.postgresql-archive.org/Periods-td6022563.html

Yours,

-- 
Paul              ~{:-)
pj@illuminatedcomputing.com

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Ibrar Ahmed

Дата:

06 августа 2019 г., 21:07:20

On Tue, Aug 6, 2019 at 8:28 PM Paul Jungwirth <pj@illuminatedcomputing.com> wrote:

Hi Ibrar,

On 8/6/19 3:26 AM, Ibrar Ahmed wrote:
> - Why we are not allowing any other datatype other than ranges in the
> primary key. Without that there is no purpose of a primary key.

A temporal primary key always has at least one ordinary column (of any
type), so it is just a traditional primary key *plus* a PERIOD and/or
range column to indicate when the record was true.

> - Thinking about some special token to differentiate between normal
> primary key and temporal primary key

There is already some extra syntax. For the time part of a PK, you say
`WITHOUT OVERLAPS`, like this:

CONSTRAINT pk_on_t PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)

In this example `id` is an ordinary column, and `valid_at` is either a
Postgres range or a SQL:2011 PERIOD. (The latter is not yet implemented
in my patch but there are some placeholder comments.)

Similarly a foreign key has one or more traditional columns *plus* a
range/PERIOD. It needs to have a range/PERIOD on both sides. It too has
some special syntax, but instead of `WITHOUT OVERLAPS` it is `PERIOD`.
(Don't blame me, I didn't write the standard.... :-) So here is an example:

CONSTRAINT fk_t2_to_t FOREIGN KEY (id, PERIOD valid_at)
REFERENCES t (id, PERIOD valid_at)

You should be able to see my changes to gram.y to support this new syntax.

I hope this clears up how it works! I'm happy to answer more questions
if you have any. Also if you want to read more:

- This paper by Kulkarni & Michels is a 10-page overview of SQL:2011:

https://sigmodrecord.org/publications/sigmodRecord/1209/pdfs/07.industry.kulkarni.pdf

- This is a talk I gave at PGCon 2019 going over the concepts, with a
lot of pictures. You can find text, slides, and a link to the video here:

https://github.com/pjungwir/postgres-temporal-talk

- This link is ostensibly an annotated bibliography but really tells a
story about how the research has developed:

https://illuminatedcomputing.com/posts/2017/12/temporal-databases-bibliography/

- There is also some discussion about PERIODs vs ranges upthread here,
as well as here:

https://www.postgresql-archive.org/Periods-td6022563.html

Thanks, Paul for the explanation. I think its good start, now I am looking at the

range_agg patch to integrate that with that and test that.

Yours,

--
Paul ~{:-)
pj@illuminatedcomputing.com

Ibrar Ahmed

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

07 ноября 2019 г., 08:31:51

On Tue, Aug 6, 2019 at 11:07 AM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
> Thanks, Paul for the explanation.  I think its good start, now I am looking at the
> range_agg patch to integrate that with that and test that.

Since we've started another commitfest, here is an updated version of
this patch. I've rebased it on the latest multirange patch (which is
rebased on the latest master). I've incorporated your feedback. I've
also added some progress on adding FOR PORTION OF to UPDATE and DELETE
(mostly UPDATE). That is even more WIP than the PK/FK work, because
I'm still working on the executor phase, so feel free to ignore it or
offer feedback. I've put the DML work in a separate patch file. (I'm
planning to roughly follow ON CONFLICT DO UPDATE for the ModifyTable
changes, since temporal DML is also an extra clause that transforms
your commands into something else. For example a temporal UPDATE could
become an UPDATE plus two INSERTs. I'm probably going to need some
help eventually getting the concurrency stuff right here though.) (I
think temporal DML makes sense to include in this patch because it is
required for cascading FKs. I think once everything is working I'll
give you a patch series that goes PKs - DML - FKs. Or even better
PERIODs - PKs - DML - FKs.)

Thanks!
Paul

Вложения

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

11 ноября 2019 г., 23:13:20

On Wed, Nov 6, 2019 at 9:31 PM Paul A Jungwirth
<pj@illuminatedcomputing.com> wrote:
> I've also added some progress on adding FOR PORTION OF to UPDATE and DELETE
> (mostly UPDATE).

I could use some guidance on where in the query-processing pipeline I
should implement some things here. Basically if you say

    UPDATE t FOR PORTION OF valid_at FROM t1 TO t2

then we need to do several things:

- Add a qual like `valid_at && tsrange(t1, t2)`. (I'll assume valid_at
is a tsrange column for example's sake, but really it can be any range
type. Also valid_at may be a PERIOD instead of a range, which means
the start/end are two concrete columns instead, but that doesn't
change anything notable here.)
- Add a target entry like `SET valid_at = valid_at * tsrange(t1, t2)`.
(* = intersection. Basically each bound should be truncated to fit
within t1/t2.)
- If either bound was "cut" then also do an INSERT to restore the
cut-off part, leaving all columns unchanged except for the time part.

(DELETE t FOR PORTION OF is very similar.)

I think I understand the ModifyTable executor node enough to be able
to add the optional INSERTs there when necessary. Adding the qual and
the target entry is where I want advice.

So far I've been able to add a ForPortionOfClause when parsing and a
ForPortionOfExpr when analyzing (much like how we handle ON CONFLICT).
I could use those to add a qual and a target list entry during
analysis (in fact I've tried that and it seems to work), but I'm
pretty sure that's wrong. I recall a long post to pgsql-hackers a
month or three back lamenting how new contributors often do work in
the analysis phase that should happen later. (I can't find that now,
but if anyone has a link I'd appreciate it!) Some considerations (not
an exhaustive list):

- FOR PORTION OF should work on partitioned tables.
- It should work on automatically-updateable views.
- It should work on views with CHECK OPTION.
- It should work on views with an UPDATE rule.
- It should do the right thing for EXPLAIN output (whatever that is).
- If a function does a FOR PORTION OF command, then printing the
function definition should show that clause (and nothing extra).
- Same for printing a rule definition.
- Probably if you give a FOR PORTION OF we should forbid you from
SETting the time column(s) at the same time, since we want to set them
automatically.
- Triggers should work normally. (We *should* fire ROW triggers for
the INSERTs of the "cut off" bits. Mariadb fires them in this order,
which seems correct to me: BEFORE UPDATE, BEFORE INSERT, AFTER INSERT,
BEFORE INSERT, AFTER INSERT, AFTER UPDATE. I guess we probably want to
fire STATEMENT triggers too, probably once for each INSERT. I'll check
what other systems do there.)

So I'm thinking the right place to add the quals & target entry is
either the end of the rewriting phase or the beginning of the planning
phase. (I can still build the expressions in the analysis phase, but I
need to keep them "off to the side" in a new forPortionOf attribute
until the right time.) We definitely want the extra qual soon enough
to help choose indexes. Perhaps we even want to see it in EXPLAIN
output (which happens if I add it during analysis); personally I kind
of find that helpful. Do we want to add it after processing rewrite
rules (and will that change EXPLAIN output)?

For adding the target entry, if we are forbidding the user from
SETting things, that check needs to happen after processing rewrite
rules, right? (Of course it doesn't hurt to check in several places if
there is some reason to do that.)

Btw I thought about whether we could implement this feature completely
on top of either triggers or rules, but I don't think it's quite that
simple. Basically: because you could also UPDATE/DELETE the table
*without* a FOR PORTION OF, sometimes we need to do the extra things
and sometimes not, and we need a way of knowing which is which. And
then supporting PERIODs requires a little extra "magic" beyond that.
But if someone has a great idea I'm open to hearing about it. :-)

Thanks,
Paul

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Michael Paquier

Дата:

28 ноября 2019 г., 07:14:14

On Mon, Nov 11, 2019 at 12:13:20PM -0800, Paul A Jungwirth wrote:
> I could use some guidance on where in the query-processing pipeline I
> should implement some things here. Basically if you say
> [...]

Paul, please be careful to update correctly the entry of the patch in
the CF app.  This was marked as waiting on author, but you are
obviously looking for reviews.  I have updated the status of the patch
accordingly, then moved it again.
--
Michael

Вложения

signature.asc

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

12 марта 2020 г., 02:27:53

Here is a patch rebasing on master (meant to be applied on top of my
other multirange patch) and newly including UPDATE/DELETE FOR PORTION
OF. FOR PORTION OF works on any table with a temporal primary key. It
restricts the UPDATE/DELETE to the given time frame, and then if the
affected row(s) had any "leftovers" above or below the targeted range,
it INSERTs new rows to preserve the untouched intervals.

I put the implementation into execModifyTable.c (mostly), which I
think is the preferred approach. (There was a great message on
-hackers a year or two ago lamenting how new contributors want to do
all the work in the parse/analysis phase, so I tried to avoid that. I
wish I could find the thread. I want to say Robert Haas wrote it, but
it could have been anyone.)

The executor is new territory for me, so even though this is WIP I'd
love to have someone look at it now. I'm sure I'm doing all kinds of
bad things re locking, transaction isolation, snapshots, etc. (There
are some comments in the .patch for specific worries.)

Also, since I have to do range calculations to handle the leftovers,
this adds knowledge about a specific type (well a specific type of
types) to the executor. I felt like that was mixing abstraction layers
a bit, so perhaps someone will have an opinion/suggestion there. I
could probably build an Expr earlier in the pipeline if I had a way to
feed it the pre-UPDATE values of the row (suggestions anyone?), and
then the executor could just evaluate it without knowing anything
about ranges.

I'm also not yet handling FDW tables, updatable views, or partitioned
tables. Perhaps we don't support FDWs at all here, or leave it up to
the FDW implementation to decide.

I haven't thought much about updatable views yet....

For partitioned tables, I think I can add support without too much
trouble; I'll give it a try soon.

Possibly the temporal PK requirement rules out using this for all
three (FDWs, updatable views, partitioned tables), at least for now.

This is mostly "happy path" so there is probably some error handling to add.

Previously a DELETE never updated indexes, and now I *do* update
indexes if a DELETE has a FOR PORTION OF clause, so that they see the
potential INSERTs. I don't know if that adds any risks around deadlock
etc.

I have a test verifying that we do fire triggers on the implicit
INSERTs (which I think is what the spec requires and is what MariaDB
and IBM DB2 do). Right now my triggers are firing in this order:

    BEFORE UPDATE/DELETE
    BEFORE INSERT
    BEFORE INSERT
    AFTER INSERT
    AFTER INSERT
    AFTER UPDATE/DELETE

In MariaDB they fire in this order:

    BEFORE UPDATE/DELETE
    BEFORE INSERT
    AFTER INSERT
    BEFORE INSERT
    AFTER INSERT
    AFTER UPDATE/DELETE

I haven't yet tested DB2 to see which order it uses. (It does fire the
INSERT triggers though.) I don't know if the spec has an opinion (I've
never found anything explicit, but it talks about "primary" vs
"secondary" operations), and I'm not actually sure how to get
MariaDB's order if we wanted to.

Instead of implementing so much in the executor, I could *almost* have
built FOR PORTION OF based on hidden triggers (sort of like how we
implement FKs). Probably AFTER ROW triggers. Then I'd hardly have to
touch the executor at all, and I wouldn't need to worry as much about
locking/isolation/snapshots. I would just need to add something to the
TriggerData struct giving the FOR PORTION OF bounds. (For an UPDATE I
could pull this out of NEW.valid_at (or whatever you call your
column), but for a DELETE that is NULL.) Basically a trigger needs to
know (1) if the query had a FOR PORTION OF clause (2) what the bounds
were. My triggers would be in C, so I think adding a new field to the
struct is sufficient. Of course we could still expose the values to
user-defined triggers if we wanted with e.g. TG_TARGET_INTERVAL.

With a trigger-based implementation, I'd add the update/delete
triggers whenever someone adds a temporal primary key. That means you
can't use FOR PORTION OF on arbitrary ranges, which is a little sad,
but no worse than the spec. And for now I was only supporting it on PK
columns anyway.

Also with a trigger-based implementation I don't think there is a way
to force our hidden trigger to fire before user-defined triggers,
which might be nice. But if that's not a problem with FKs then it's
probably not a problem here.

A trigger-based implementation also is less flexible in how it
interacts with GENERATED columns, but I think an AFTER trigger would
still do the right thing there.

A trigger-based implementation should give us the same firing order as
MariaDB (IIUC), which might be nice.

So let me know if anyone thinks this would be better implemented as
triggers instead. I'm kind of leaning that way myself to be honest.

One weird issue I noticed concerns dealing with unbounded endpoints.
According to the spec a PERIOD's endpoints must be not-NULL, so you
have to use sentinel values like 3000-JAN-01. Oracle ignores this
though, and since our own ranges already interpret a null bound to
mean infinite, I think we should accept a non-sentinel way of saying
"from now on" or "since the beginning". Right now I accept 'Infinity'
and '-Infinity' in FOR PORTION OF, because those are familiar and
valid values for timestamps/dates/floats. But I translate them to
`NULL` to get proper range behavior. That's because to ranges,
'Infinity' is "right before" an upper null, and '-Infinity' is "right
after" a lower null. For example:

=# select tsrange('2020-01-01', null) - tsrange('2020-01-01', 'Infinity');
  ?column?
-------------
 [infinity,)

That will leave a lot of ugly records in your table if you don't take
care to avoid it.

I guess alternately I could let people say FOR PORTION OF FROM
'2020-01-01' TO NULL. But that is less obvious, and it leaves the
footgun still there. So right now I'm translating +/-Infinity into
NULL to prevent the problem.

Anyway, now that FOR PORTION OF works, I want to add CASCADE support
to temporal FKs. I don't think that will take long.

Then I'd like to bring in Vik Fearing's old patch adding PERIODs, and
start supporting those too. Vik, do you have any objection or advice
about that?

Yours,

Вложения

v0003-temporal-pk-fk-update-delete.patch

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Michael Paquier

Дата:

17 сентября 2020 г., 10:51:01

On Wed, Mar 11, 2020 at 04:27:53PM -0700, Paul A Jungwirth wrote:
> Here is a patch rebasing on master (meant to be applied on top of my
> other multirange patch) and newly including UPDATE/DELETE FOR PORTION
> OF. FOR PORTION OF works on any table with a temporal primary key. It
> restricts the UPDATE/DELETE to the given time frame, and then if the
> affected row(s) had any "leftovers" above or below the targeted range,
> it INSERTs new rows to preserve the untouched intervals.

This patch had no reviews, unfortunately it cannot be applied
cleanly.  Could you send a rebase please?
--
Michael

Вложения

signature.asc

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Michael Paquier

Дата:

30 сентября 2020 г., 10:39:08

On Thu, Sep 17, 2020 at 04:51:01PM +0900, Michael Paquier wrote:
> This patch had no reviews, unfortunately it cannot be applied
> cleanly.  Could you send a rebase please?

This had no replies after two weeks, so I have marked the patch as
RwF.  Please feel free to resubmit if you are planning to work more on
that.
--
Michael

Вложения

signature.asc

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Ibrar Ahmed

Дата:

27 октября 2020 г., 17:11:02

On Wed, Sep 30, 2020 at 12:39 PM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Sep 17, 2020 at 04:51:01PM +0900, Michael Paquier wrote:
> This patch had no reviews, unfortunately it cannot be applied
> cleanly. Could you send a rebase please?

This had no replies after two weeks, so I have marked the patch as
RwF. Please feel free to resubmit if you are planning to work more on
that.
--
Michael

I have spent some more time on the patch and did a lot of cleanup along with some fixes, compilation errors, and warnings. There is a need for another round of clean up to make that patch perfect. Currently, it based on master (0525572860335d050a1bea194a5278c8833304d1). Regression is not passing because it is dependent on the range_agg patch. Currently, I am collecting all the TODO's and try to finish all of these.

The patch requires some reviews, so I am putting that again in the current commitfest.

Ibrar Ahmed

Вложения

v0004-temporal-pk-fk-update-delete.patch

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul Jungwirth

Дата:

27 октября 2020 г., 19:34:48

On 10/27/20 7:11 AM, Ibrar Ahmed wrote:
> I have spent some more time on the patch and did a lot of cleanup along 
> with some fixes, compilation errors, and warnings.

Thank you for taking a look at this! I've been swamped with ordinary 
work and haven't had a chance to focus on it for a while, but I'm hoping 
to make some improvements over the coming holidays, especially based on 
feedback from my talk at PgCon. There are a handful of small specific 
things I'd like to do, and then one big thing: add support for PERIODs. 
Vik said I could include his old patch for PERIODs, so I'd like to get 
that working on the latest master, and then rebase my own work on top of 
it. Then we can accept either ranges or PERIODs in various places 
(marked by TODOs in the code).

Vik also pointed out a way to check foreign keys without using 
range_agg. He thinks it may even be more efficient. On the other hand 
it's a much more complicated SQL statement. I'd like to do a performance 
comparison to get concrete numbers, but if we did use his query, then 
this patch wouldn't depend on multiranges anymore---which seems like a 
big aid to moving it forward. Assuming multiranges gets committed, we 
can always swap in the range_agg query depending on the performance 
comparison results.

I apologize for the slow progress here, and thank you for your help!

Yours,

-- 
Paul              ~{:-)
pj@illuminatedcomputing.com

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

David Steele

Дата:

08 апреля 2021 г., 17:22:15

On 10/27/20 12:34 PM, Paul Jungwirth wrote:
> On 10/27/20 7:11 AM, Ibrar Ahmed wrote:
>> I have spent some more time on the patch and did a lot of 
>> cleanup along with some fixes, compilation errors, and warnings.
> 
> Thank you for taking a look at this! I've been swamped with ordinary 
> work and haven't had a chance to focus on it for a while, but I'm hoping 
> to make some improvements over the coming holidays, especially based on 
> feedback from my talk at PgCon. There are a handful of small specific 
> things I'd like to do, and then one big thing: add support for PERIODs. 
> Vik said I could include his old patch for PERIODs, so I'd like to get 
> that working on the latest master, and then rebase my own work on top of 
> it. Then we can accept either ranges or PERIODs in various places 
> (marked by TODOs in the code).
> 
> Vik also pointed out a way to check foreign keys without using 
> range_agg. He thinks it may even be more efficient. On the other hand 
> it's a much more complicated SQL statement. I'd like to do a performance 
> comparison to get concrete numbers, but if we did use his query, then 
> this patch wouldn't depend on multiranges anymore---which seems like a 
> big aid to moving it forward. Assuming multiranges gets committed, we 
> can always swap in the range_agg query depending on the performance 
> comparison results.
> 
> I apologize for the slow progress here, and thank you for your help!

Looks like Ibrar reopened this patch in the 2020-09 CF rather than 
moving it to a new one. Given that Paul has not had a chance to look at 
it since then I'm setting it back to RwF.

Paul, you can submit to the next CF when you are ready with a new patch.

Regards,
-- 
-David
david@pgmasters.net

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

09 апреля 2021 г., 02:40:51

On Thu, Apr 8, 2021 at 7:22 AM David Steele <david@pgmasters.net> wrote:
>
> Paul, you can submit to the next CF when you are ready with a new patch.

Thanks David! I've made a lot of progress but still need to finish
support for CASCADE on temporal foreign keys. I've been swamped with
other things, but hopefully I can get something during this current
CF.

Paul

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

David Steele

Дата:

09 апреля 2021 г., 14:54:29

On 4/8/21 7:40 PM, Paul A Jungwirth wrote:
> On Thu, Apr 8, 2021 at 7:22 AM David Steele <david@pgmasters.net> wrote:
>>
>> Paul, you can submit to the next CF when you are ready with a new patch.
> 
> Thanks David! I've made a lot of progress but still need to finish
> support for CASCADE on temporal foreign keys. I've been swamped with
> other things, but hopefully I can get something during this current
> CF.

The next CF starts on July 1 so you have some time.

Regards,
-- 
-David
david@pgmasters.net

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Ibrar Ahmed

Дата:

15 июля 2021 г., 16:20:27

On Fri, Apr 9, 2021 at 4:54 PM David Steele <david@pgmasters.net> wrote:

On 4/8/21 7:40 PM, Paul A Jungwirth wrote:
> On Thu, Apr 8, 2021 at 7:22 AM David Steele <david@pgmasters.net> wrote:
>>
>> Paul, you can submit to the next CF when you are ready with a new patch.
>
> Thanks David! I've made a lot of progress but still need to finish
> support for CASCADE on temporal foreign keys. I've been swamped with
> other things, but hopefully I can get something during this current
> CF.

The next CF starts on July 1 so you have some time.

Regards,
--
-David
david@pgmasters.net

Based on last comments of Paul and David S I am changing the status to "Waiting on Author".

Ibrar Ahmed

Re: SQL:2011 PERIODS vs Postgres Ranges?

От

Paul A Jungwirth

Дата:

15 июля 2021 г., 17:29:42

On Thu, Jul 15, 2021 at 6:21 AM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
> Based on last comments of Paul and David S I am changing the status to "Waiting on Author".

I thought the subject was quite out of date, so I sent my last patch
here:
https://www.postgresql.org/message-id/CA%2BrenyUApHgSZF9-nd-a0%2BOPGharLQLO%3DmDHcY4_qQ0%2BnoCUVg%40mail.gmail.com

I also added that thread to the commitfest item.

I'm going to change the commitfest entry back to Needs Review, but
please let me know if you disagree. Sorry for the confusion!

Yours,
Paul

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: SQL:2011 PERIODS vs Postgres Ranges?

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения