Discussion: SSI predicate locking on heap -- tuple or row?
As background, for most of SSI development I had a TODO comment in the source which was basically around the question of whether a predicate lock on a visible heap tuple should propagate to later versions of the row as long as the original lock was material. In early February Heikki (quite reasonably) insisted that I resolve that and either add code to do so or a comment about why it wasn't necessary. After looking at it for many hours without getting to a logical proof, I thought I had a proof by example:

http://archives.postgresql.org/pgsql-hackers/2011-02/msg00325.php

We added code for this, but it has been problematic; probably over half the SSI bugs which have been found since SSI went into the code tree have been related to this, and Robert Haas has just found a couple more. In discussing how to address this with Jeff Davis (over beers at PGCon), he provided good advice about how to properly fix these issues, but posed some very good questions about whether it was really necessary. Rather than fix this aspect of the code, we might be able to eliminate it, which would reduce the code size and some overhead. Since this code is more fragile than the rest of SSI, that is particularly appealing -- my favorite way to deal with fragile code is to remove it.

I went back to the example which persuaded me and took another look. On review I see that it didn't prove the point, because there was a dangerous structure with T1 as a pivot which should have caused SSI to break the cycle. Since it didn't, either I got careless in my testing methodology (which I will recheck when I get back to Wisconsin) or there was a bug -- but either way, this example can't prove that the predicate locks need to follow the row to new versions. I haven't been able to come up with an example where propagation actually *is* needed, but failing to find one doesn't prove that it isn't needed. Here's my attempt at a logical proof of whether it is needed.
It would be great if anyone with a grasp of SSI concepts could confirm its validity or shoot it down. (Hopefully Friday's presentation expanded the number of those who can do so.)

The issue can be rephrased to this: If transaction T1 reads a row (thus acquiring a predicate lock on it) and a second transaction T2 updates that row, must a third transaction T3 which updates the new version of the row have a rw-conflict in from T1?

Now, the first thing which jumps out is the timing requirements -- T3 must overlap T1 or there is no conflict (and therefore no need to replicate the predicate locks), but T3 must *not* overlap T2 or there would be a ww-conflict and one of them would roll back. If T2 committed before T1 acquired its snapshot, T1 would have gotten its predicate lock on the second version of the row, so for a situation where this could possibly matter, the following actual timings must exist (read "starts" as meaning "acquires a snapshot"):

T1 and T2 start, in either order.
T1 reads the row and T2 updates the row, in either order.
T2 commits.
T3 starts.
T3 updates the row.

So far, using broken lines for rw-conflicts and solid for wr-dependencies, we have this for apparent order of execution:

T1 - - -> T2 -----> T3

Not on the slides for the presentation, but briefly mentioned, is that Alan Fekete and others have proven that serialization anomalies can only occur if the transaction which appears on the rw-conflict *out* side of the pivot (the transaction which appears to have executed third) actually commits first. T2 committed before T1, so there can't be a cycle involving a rw-conflict *in* to T1, because that would complete the dangerous structure -- unless the transaction on the conflict-in side commits before T2, or is read only and gets its snapshot before the commit of T2 (another special case which we left out of the presentation in the interest of time).
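The timing constraints above can be sketched as a small standalone check. This is illustrative Python only (not PostgreSQL code); the transaction names match the scenario, and the timestamps are invented to satisfy the stated ordering:

```python
# Illustrative model of the required timings; only the ordering matters.
start = {"T1": 1, "T2": 2, "T3": 6}   # "starts" = acquires a snapshot
commit = {"T2": 5}                     # T1 and T3 are still in flight

def overlaps(a, b):
    """Two transactions overlap if each starts before the other commits.

    An uncommitted transaction is treated as committing at infinity.
    """
    inf = float("inf")
    return start[a] < commit.get(b, inf) and start[b] < commit.get(a, inf)

# T3 gets its snapshot after T2 commits, so they cannot overlap and no
# rw-conflict between them is possible (only a wr-dependency).
assert not overlaps("T2", "T3")

# T1 overlaps both, so T1 - - -> T2 exists and T1 - - -> T3 is possible:
# apparent order of execution  T1 - - -> T2 -----> T3
assert overlaps("T1", "T2") and overlaps("T1", "T3")
```

Any timestamps satisfying the listed event order give the same answers, which is why the argument can treat the timings abstractly.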
Since T3 must get its snapshot after the commit of T2, if it develops a rw-conflict with T1 there will be an SSI serialization failure without needing the lock propagation.

So now the question breaks down to whether we can have other transactions in the mix which, given the above, cause T3 to appear to have executed before T1. Since T3 cannot have committed before T1 (given what we know about the actual timings required), that has to involve a rw-conflict somewhere in the graph. Since a wr-dependency is caused by the writer actually committing before its work is read by another transaction, causing apparent order of execution at that point to match actual serial order of execution, such transactions would be benign in the transaction graph -- they might exist in a problem graph but would not help to cause a later-starting transaction to appear to have executed first. We can look at just rw-conflicts to cause the apparent order of execution to loop back in time. Since the time-arrow for a rw-conflict points in the direction of the write, we can ignore read only transactions.

So to complete the cycle back to T1, the transaction T0 with the rw-conflict to T1 must have committed before T2 committed. This means that it can't overlap with T3, because T3 started after the commit of T2. So the question is now whether T3 can have a rw-conflict (or chain of conflicts) to T0.

The timings required to make it necessary to replicate predicate locks to later versions of the row now are:

T0, T1, and T2 start, in any order.
T1 reads the row and T2 updates the row, in either order; also T1 updates some other data which conflicts with a T0 read, in any order.
T0 commits.
T2 commits.
T3 starts.
T3 updates the row.

Apparent order of execution is now:

T0 - - -> T1 - - -> T2 -----> T3

We need at least one more transaction to complete the cycle here.
For only one to do it, it must overlap both T0 and T3; it must do some read which conflicts with a write of T0 and some write which conflicts with a read of T3. Listing the timings at this point would be pretty chaotic -- T4 starts and develops a rw-conflict with T0 anytime before T0 commits, and develops a rw-conflict with T3 at any point after T3 starts.

Apparent order of execution is now:

T0 - - -> T1 - - -> T2 -----> T3 - - -> T4 - - -> T0

But wait -- T4 is now a pivot with T0 committing first, and T3 is not read only. That part of the cycle is a dangerous structure, and one of those transactions will be rolled back. So far we have now proven that you can't cause a problem with only five transactions. The question is now whether T4 can be replaced with multiple transactions forming a rw-conflict chain without one of them committing at a time which would create a dangerous structure and break the cycle.

[Anyone who has followed along this far has earned my undying gratitude.]

Since the chain of transactions has to overlap T0 and T3, I don't see how that can happen. We established that T0 has to commit before T3 can start, so the chain will ultimately have to get to that T0 commit.

I don't want to start ripping out the apparently useless code without someone checking my logic here. One big gaffe in this area is quite enough for me. :-/ Anyone?

-Kevin
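For what it's worth, the five-transaction argument can be checked mechanically. The sketch below is illustrative Python (not SSI code); the timestamps are invented but chosen to satisfy every constraint listed in the message above:

```python
# Invented timestamps consistent with the stated timings; only order matters.
start  = {"T0": 1, "T1": 2, "T2": 3, "T4": 4, "T3": 8}
commit = {"T0": 5, "T2": 6, "T3": 9, "T4": 10}   # T1 still in flight

def overlaps(a, b):
    # Two transactions overlap if each starts before the other commits;
    # an uncommitted transaction is treated as committing at infinity.
    inf = float("inf")
    return start[a] < commit.get(b, inf) and start[b] < commit.get(a, inf)

# Proposed cycle:  T0 - - -> T1 - - -> T2 -----> T3 - - -> T4 - - -> T0
# For its two rw-conflicts to exist, T4 must overlap both T3 and T0:
assert overlaps("T3", "T4") and overlaps("T4", "T0")

# But then T4 is a pivot (rw in from T3, rw out to T0) whose out-side
# transaction T0 commits first, and T3 is not read only -- a dangerous
# structure, so SSI rolls one of these back and the cycle never completes.
assert commit["T0"] < commit["T4"] and commit["T0"] < commit["T3"]
```

Since T0 must commit before T2 does, and T3 cannot start until after T2's commit, any choice of timestamps meeting the constraints forces the same conclusion.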
On Sat, May 21, 2011 at 4:09 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> It would be great if anyone with a grasp
> of SSI concepts could confirm its validity or shoot it down. (Hopefully
> Friday's presentation expanded the number of those who can do so.)
As a first step, it would be great if you can upload the slides on the conference website. To expect that the attendees would have understood the nitty-gritties of SSI just listening to the presentation is so unhuman :-)
Thanks,
Pavan
Pavan Deolasee wrote: > As a first step, it would be great if you can upload the slides on > the conference website. To expect that the attendees would have > understood the nitty-gritties of SSI just listening to the > presentation is so unhuman :-) I'm sure Dan will be doing that soon; meanwhile, maybe this page will help: http://wiki.postgresql.org/wiki/Serializable -Kevin
On Sat, May 21, 2011 at 04:45:15PM -0400, Pavan Deolasee wrote: > As a first step, it would be great if you can upload the slides on the > conference website. To expect that the attendees would have understood the > nitty-gritties of SSI just listening to the presentation is so unhuman :-) I just posted them at http://drkp.net/drkp/papers/ssi-pgcon11-slides.pdf ...and they'll be linked from the Serializable wiki page as soon as I remember how to edit it. :-) Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
On Sat, May 21, 2011 at 4:09 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > [Anyone who has followed along this far has earned my undying > gratitude.] > > Since the chain of transactions has to overlap T0 and T3, I don't see > how that can happen. We established that T0 has to commit before T3 can > start, so the chain will ultimately have to get to that T0 commit. > > I don't want to start ripping out the apparently useless code without > someone checking my logic here. One big gaff in this area is quite > enough for me. :-/ Anyone? How is an UPDATE different from a DELETE and an INSERT? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote:

> How is an UPDATE different from a DELETE and an INSERT?

That's the question Jeff asked which caused us to revisit this. There are two answers -- conceptual and implementation.

Conceptually, an updated row is the same row, and a row inserted after a delete is a new row. Note that READ COMMITTED doesn't treat them the same on a write conflict. To give a practical example, police departments in Wisconsin typically reassign a badge number to a new officer after an existing officer leaves. Updating an officer record keyed by badge number (say, with a new address or a name change) would be qualitatively different from deleting an old officer record and inserting a new one for a different person now getting the badge number. (OK, so this is somewhat artificial, because they track who had the number in what temporal ranges, but you get the idea.)

In the implementation, the only difference between an UPDATE in a table and a DELETE and INSERT in the same table in the same transaction (besides concurrency handling) is the ctid linkage, at least as far as I can see. So I think that you can't just treat the two things the same in SSI just because the PostgreSQL implementation details make them similar; but I think that you can treat the two things the same for the reasons I worked out at the start of the thread.

-Kevin
On Sat, May 21, 2011 at 8:50 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Robert Haas wrote:
>> How is an UPDATE different from a DELETE and an INSERT?
>
> That's the question Jeff asked which caused us to revisit this.
>
> There are two answers -- conceptual and implementation.
>
> Conceptually, an updated row is the same row, and a row inserted after a
> delete is a new row. Note that READ COMMITTED doesn't treat them the
> same on a write conflict. To give a practical example, police
> departments in Wisconsin typically reassign a badge number to a new
> officer after an existing officer leaves. Updating an officer record
> keyed by badge number (say, with a new address or a name change) would
> be qualitatively different from deleting an old officer record and
> inserting a new one for a different person now getting the badge number.
> (OK, so this is somewhat artificial, because they track who had the
> number in what temporal ranges, but you get the idea.)
>
> In the implementation the only difference between an UPDATE in a table
> and a DELETE and INSERT in the same table in the same transaction
> (besides concurrency handling) is the ctid linkage, at least as far as I
> can see.

I think the implementation is what matters here. I understand that there are certain situations in which the user might choose to UPDATE a row and other situations in which they might choose to DELETE and then INSERT: but the user's intent is basically irrelevant. If the system treats those operations in basically the same way, then it shouldn't be necessary to follow the CTID chain in one case given that there is no CTID chain in the other case.

Or to put that another way, if it is necessary to follow the CTID chain, then we should be able to articulate a reason for that necessity -- something that is materially different in the UPDATE case. Otherwise, we're just following the chain "because it's there".
It seems to me that we can actually state with some degree of precision what that "material difference" would need to be. The goal of SSI is to prevent serialization anomalies that would not be prevented by snapshot isolation. Let's suppose that it successfully does that in the DELETE/INSERT case. Suppose further that we change SSI so that it handles the UPDATE case in the same way that it handles the DELETE/INSERT case. This change will be incorrect only if there is a serialization anomaly that snapshot isolation *would have prevented* in the DELETE/INSERT case that *it does not prevent* in the UPDATE case.

In other words, if SSI needs to be more rigorous in the UPDATE case, it can only be because snapshot isolation is less rigorous in that case, and the additional rigor that SSI must apply there must be exactly equal to whatever snapshot isolation isn't picking up (as compared with the DELETE/INSERT case). Does that make any sense? It seems logical to me, but IJWH.

> So I think that you can't just treat the two things the same in SSI just
> because the PostgreSQL implementation details make them similar; but I
> think that you can treat the two things the same for the reasons I
> worked out at the start of the thread.

Your argument seems reasonable to me; but it would be nice if we could find a simpler one, because simpler arguments are less likely to be incorrect. :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:

> I think the implementation is what matters here. I understand that
> there are certain situations in which the user might choose to
> UPDATE a row and other situations in which they might choose to
> DELETE and then INSERT: but the user's intent is basically
> irrelevant.

We don't consider it irrelevant when we decide which triggers to fire. We do have update triggers distinct from the insert and delete triggers. We also consider it relevant when dealing with a write conflict in READ COMMITTED mode. Those facts make me very reluctant to move based on a simple assertion that it doesn't matter.

> If the system treats those operations in basically the same way,
> then it shouldn't be necessary to follow the CTID chain in one case
> given that there is no CTID chain in the other case. Or to put that
> another way, if it is necessary to follow the CTID chain, then we
> should be able to articulate a reason for that necessity --
> something that is materially different in the UPDATE case.

There is a wealth of research on which SSI is based. I've read many (although by no means *all*) of the papers on the topic, and all of the ones I've read have been based around the concept of a row which can be updated and retain its identity. I trust the research, but I think it is incumbent on us to prove, rather than just assert, that it can be applied just as well to a row-version tuple. I sure hope it can, because we can have faster, leaner, less fragile code that way. I've attempted to write out a proof, although I won't trust that without further review -- by me and by others.

> Otherwise, we're just following the chain "because it's there".

Why would you say it *is* there?

> It seems to me that we can actually state with some degree of
> precision what that "material difference" would need to be. The
> goal of SSI is to prevent serialization anomalies that would not be
> prevented by snapshot isolation.
> Let's suppose that it successfully
> does that in the DELETE/INSERT case. Suppose further that we change
> SSI so that it handles the UPDATE case in the same way that it
> handles the DELETE/INSERT case. This change will be incorrect only
> if there is a serialization anomaly that snapshot isolation *would
> have prevented* in the DELETE/INSERT case that *it does not
> prevent* in the update case.

I don't see that -- it could be correct because of the conceptual difference between an UPDATE and a DELETE/INSERT pair.

> In other words, if SSI needs to be more rigorous in the UPDATE
> case, it can only be because snapshot isolation is less rigorous in
> that case, and the additional rigor that SSI must apply there must
> be exactly equal to whatever snapshot isolation isn't picking up
> (as compared with the DELETE/INSERT case).
>
> Does that make any sense? It seems logical to me, but IJWH.

I've always loved logic, but one of the most intriguing aspects is identifying the unproven assumptions in an argument. You have a built-in premise that there is no significant difference between an UPDATE and a DELETE/INSERT pair, in which case the logic is flawless, which is leading you to the conclusion that a lock on the visible tuple is enough. I'm not confident in that premise, so the simple argument doesn't persuade me.

> Your argument seems reasonable to me;

Thanks much for fighting through it. It is heartening that you couldn't punch any holes in it.

> but it would be nice if we could find a simpler one, because
> simpler arguments are less likely to be incorrect. :-)

All generalizations are false. :-)

-Kevin
On Mon, May 23, 2011 at 2:26 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> I don't see that -- it could be correct because of the conceptual
> difference between an UPDATE and a DELETE/INSERT pair.
>
>> In other words, if SSI needs to be more rigorous in the UPDATE
>> case, it can only be because snapshot isolation is less rigorous in
>> that case, and the additional rigor that SSI must apply there must
>> be exactly equal to whatever snapshot isolation isn't picking up
>> (as compared with the DELETE/INSERT case).
>>
>> Does that make any sense? It seems logical to me, but IJWH.
>
> I've always loved logic, but one of the most intriguing aspects is
> identifying the unproven assumptions in an argument. You have a
> built-in premise that there is no significant difference between an
> UPDATE and a DELETE/INSERT pair, in which case the logic is flawless
> which is leading you to the conclusion that a lock on the visible
> tuple is enough. I'm not confident in that premise, so the simple
> argument doesn't persuade me.

I *think* (but am by no means familiar with SSI, or an expert on the problems it's trying to solve) that Robert was only arguing that SSI is only relevant for problems that plain snapshot isolation wouldn't catch.

And the "sameness" of UPDATE vs DELETE+INSERT is simply that if you can only see the data as it was *completely before* or *completely after* a transaction (not as it was after the delete, before the insert), then to you it doesn't matter whether the transaction did an UPDATE or a DELETE+INSERT. All you see is either $OLDROW or $NEWROW, depending on whether you look before or after, not the transformation from $OLDROW to $NEWROW. So, if SSI conflicts something in the UPDATE case, it would necessarily have to conflict the DELETE+INSERT case as well, and vice versa.

a.

--
Aidan Van Dyk                  Create like a god,
aidan@highrise.ca              command like a king,
http://www.highrise.ca/        work like a slave.
Aidan Van Dyk <aidan@highrise.ca> writes: > So, if SSI conflicts something on the UPDATE case, it would necessrily > have to conflict the DELETE+INSERT case as well, and vice-versa. This argument is fundamentally bogus, because an UPDATE is not the same thing as DELETE followed by INSERT. In the former case the new tuple will have a ctid link from the old one; in the latter case not. And that will affect the behavior of subsequent operations. Another user-visible difference is that if the table has OIDs, the latter case will (usually) give the new tuple a different OID. Or if you don't like OIDs, think about a serial column. Or an ON INSERT trigger with side-effects. It may be that the result Kevin seeks to prove is in fact true, but an argument that requires the assumption that UPDATE is indistinguishable from DELETE+INSERT isn't going to persuade me, because I don't have any trouble whatsoever in distinguishing them. regards, tom lane
On Mon, May 23, 2011 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Aidan Van Dyk <aidan@highrise.ca> writes: >> So, if SSI conflicts something on the UPDATE case, it would necessrily >> have to conflict the DELETE+INSERT case as well, and vice-versa. > > This argument is fundamentally bogus, because an UPDATE is not the same > thing as DELETE followed by INSERT. In the former case the new tuple > will have a ctid link from the old one; in the latter case not. And > that will affect the behavior of subsequent operations. Right. The point I was driving at is - in what way will that affect the behavior of subsequent operations? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> wrote: > The point I was driving at is - in what way will that affect the > behavior of subsequent operations? You haven't answered why you think it should matter for OID or for write conflict on READ COMMITTED but not here. The logical concept of a row which is modified is a meaningful one regardless of any particular database's internal implementation. The fact that the proofs of SSI techniques use row-oriented terminology shouldn't be simply ignored. The fact that the two prototype implementations in the papers on the topic used databases with "update in place with a rollback log" for their MVCC implementations, and took their predicate locks on the "in place" row, reinforce that. No flaws jumped out at me in a review of the longer logical argument after sleeping on it, Robert said it looked good to him, and Dan said he would look at it soon. If it looks good to Dan, and nobody else pokes a hole in the logic, I'll have enough confidence to proceed. -Kevin
On Sat, May 21, 2011 at 03:09:12PM -0500, Kevin Grittner wrote: > I went back to the example which persuaded me and took another look. On > review I see that this didn't prove the point because there was a > dangerous structure with T1 as a pivot which should have caused SSI to > break the cycle. Since it didn't, either I got careless in my testing > methodology (which I will recheck when I get back to Wisconsin) or there > was a bug -- but either way, this example can't prove that the predicate > locks need to follow the row to new versions. I'm still working through the more general question, but I wanted to see what was going on with this example. It appears there's a bug, because this doesn't cause a rollback without the version linking. Specifically, the problem is a missing check in OnConflict_CheckForSerializationFailure. We check whether the conflict has caused the writer to become a pivot, but only if neither the reader or writer is committed. Why is that last condition there? In this case, the reader (T4) has committed but the writer (T1) hasn't. It would be worth making sure there aren't any other cases we're missing here, although I don't see any right now. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Dan Ports <drkp@csail.mit.edu> wrote: > On Sat, May 21, 2011 at 03:09:12PM -0500, Kevin Grittner wrote: >> I went back to the example which persuaded me and took another >> look. On review I see that this didn't prove the point because >> there was a dangerous structure with T1 as a pivot which should >> have caused SSI to break the cycle. Since it didn't, either I >> got careless in my testing methodology (which I will recheck when >> I get back to Wisconsin) or there was a bug -- but either way, >> this example can't prove that the predicate locks need to follow >> the row to new versions. > > I'm still working through the more general question, but I wanted > to see what was going on with this example. It appears there's a > bug, because this doesn't cause a rollback without the version > linking. > > Specifically, the problem is a missing check in > OnConflict_CheckForSerializationFailure. We check whether the > conflict has caused the writer to become a pivot, but only if > neither the reader or writer is committed. Why is that last > condition there? Because Fekete et al proved that the pivot doesn't cause an anomaly unless the transaction read by the pivot commits before the pivot or the transaction reading the pivot. The problem has to be somewhere that fails to detect the T1 pivot. > In this case, the reader (T4) has committed but the writer (T1) > hasn't. The T4 commit-first makes this safe. Everything should be OK until the session 1 activity at the end, which makes T1 a pivot with T2 committing first. I believe we should get the error on this statement: update t set txt = 'a' where id = 1; I was too wiped out from travel to look at it last night, and can't spend any time on it during business hours today, but after 18:00 CDT today this has my attention. -Kevin
Dan Ports <drkp@csail.mit.edu> wrote:

> Specifically, the problem is a missing check in
> OnConflict_CheckForSerializationFailure. We check whether the
> conflict has caused the writer to become a pivot, but only if
> neither the reader or writer is committed. Why is that last
> condition there? In this case, the reader (T4) has committed but
> the writer (T1) hasn't.

OK, I misread your post -- you are looking at T1 as the pivot, and that test *is* the problem. When T1 becomes the pivot the reader (T4) is committed, but it committed *after* T2. I can submit a patch for that this evening, after testing to confirm that it finds the T1 pivot, unless you want to get it.

Sorry for the misunderstanding. I'm sneaking peeks at this during compiles of other stuff....

-Kevin
I have thought about this quite a bit and am fairly certain we do not need to track this linkage between row versions. One strong hint is that all the work I've seen on multiversion serializability theory defines a rw-conflict to be one transaction reading an object and the other writing *the next version* of the same object. But maybe that doesn't answer the question conclusively -- after all, the differences between an object, a tuple, and a row are subtle. So see the argument below. [I think it's similar to the argument you were making.]

On Sat, May 21, 2011 at 03:09:12PM -0500, Kevin Grittner wrote:
> The issue can be rephrased to this: If transaction T1 reads a row (thus
> acquiring a predicate lock on it) and a second transaction T2 updates
> that row, must a third transaction T3 which updates the new version of
> the row have a rw-conflict in from T1?

OK, that's a good way to phrase the question. Does it matter whether this edge T1 -> T3 is there?

If T1 has a conflict in, it certainly doesn't. Adding the edge T1 -> T3 would create a dangerous structure, but we already had one from the edge T1 -> T2, so we would have aborted something anyway.

Now let's consider the case where T1 doesn't have a conflict in. If that's the case, for this edge T1 -> T3 to make a difference, T3 must have a rw-conflict out that induces a cycle in the dependency graph, i.e. a conflict out to some transaction preceding T1 in the serial order. (A conflict out to T1 would work too, but that would mean T1 has a conflict in and we would have rolled back.)

So now we're trying to figure out if there can be an rw-conflict edge T3 -> T0, where T0 is some transaction that precedes T1. For T0 to precede T1, there has to be some edge, or sequence of edges, from T0 to T1. At least the last edge has to be a wr-dependency or ww-dependency rather than a rw-conflict, because T1 doesn't have a rw-conflict in.
And that gives us enough information about the order of transactions to see that T3 can't have a rw-dependency to T0:

- T0 committed before T1 started (the wr/ww-dependency implies this)
- T1 started before T2 committed (the T1->T2 rw-conflict implies this)
- T2 committed before T3 started (otherwise, T3 would be aborted because of an update conflict)

That means T0 committed before T3 started, and therefore there can't be a rw-conflict from T3 to T0.

In both cases, we didn't need the T1 -> T3 edge.

Does that make sense to you? Unless anyone can poke a hole in that reasoning, I think we can get rid of the code to handle later versions, which would make me happy for many reasons. It will not be missed.

Dan

--
Dan R. K. Ports  MIT CSAIL  http://drkp.net/
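The three ordering facts above chain together transitively; a trivial standalone check (illustrative Python, with arbitrary symbolic timestamps -- nothing from the actual SSI code) makes the final deduction explicit:

```python
# Arbitrary timestamps chosen only to satisfy the three stated facts;
# the deduction holds for any values that satisfy them.
t0_commit, t1_start, t2_commit, t3_start = 10, 20, 30, 40

# - T0 committed before T1 started (wr/ww-dependency into T1)
# - T1 started before T2 committed (the T1 -> T2 rw-conflict needs overlap)
# - T2 committed before T3 started (else T3 hits an update conflict)
assert t0_commit < t1_start < t2_commit < t3_start

# Transitively, T0 committed before T3 started: the two never overlap,
# so no rw-conflict T3 -> T0 is possible and the T1 -> T3 edge is redundant.
assert t0_commit < t3_start
```

The point of the sketch is just that the conclusion is forced by transitivity of the orderings, independent of any particular schedule.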
Dan Ports <drkp@csail.mit.edu> wrote: > [I think it's similar to the argument you were making.] Similar, but you found a simpler, shorter way to the end, which should make Robert happy. ;-) > On Sat, May 21, 2011 at 03:09:12PM -0500, Kevin Grittner wrote: >> The issue can be rephrased to this: If transaction T1 reads a row >> (thus acquiring a predicate lock on it) and a second transaction >> T2 updates that row, must a third transaction T3 which updates >> the new version of the row have a rw-conflict in from T1? > > OK, that's a good way to phrase the question. Does it matter > whether this edge T1 -> T3 is there? > > If T1 has a conflict in, it certainly doesn't. Adding the edge T1 > -> T3 would create a dangerous structure, but we already had one > from the edge T1 -> T2, so we would have aborted something anyway. I had to pause a moment on that because you didn't mention the "early enough commit" aspects; but certainly if T3 commits early enough to cause a problem then T2 does, so this is solid. > Now let's consider the case where T1 doesn't have a conflict in. > If that's the case, for this edge T1 -> T3 to make a difference, > T3 must have a rw-conflict out that induces a cycle in the > dependency graph, i.e. a conflict out to some transaction > preceding T1 in the serial order. (A conflict out to T1 would work > too, but that would mean T1 has a conflict in and we would have > rolled back.) Agreed. > So now we're trying to figure out if there can be an rw-conflict > edge T3 -> T0, where T0 is some transaction that precedes T1. For > T0 to precede T1, there has to be has to be some edge, or sequence > of edges, from T0 to T1. At least the last edge has to be a > wr-dependency or ww-dependency rather than a rw-conflict, because > T1 doesn't have a rw-conflict in. 
> > And that gives us enough information about the order of > transactions to see that T3 can't have a rw-dependency to T0: > - T0 committed before T1 started (the wr/ww-dependency implies > this) > - T1 started before T2 committed (the T1->T2 rw-conflict implies > this) > - T2 committed before T3 started (otherwise, T3 would be aborted > because of an update conflict) > > That means T0 committed before T3 started, [scribbles diagrams] Yep. > and therefore there can't be a rw-conflict from T3 to T0. No overlap means no rw-conflict; so that has to be true. > In both cases, we didn't need the T1 -> T3 edge. > > > Does that make sense to you? Makes sense to me. Like the proof I offered, you have shown that there is no cycle which can develop with the locks copied which isn't there anyway if we don't copy the locks. > Unless anyone can poke a hole in that reasoning, I think we can > get rid of the code to handle later versions, Actually, we now have two independent proofs which don't have much in common beyond the initial statement of the problem. Someone would need to show a flaw in that premise or punch holes in both proofs. I think we can get by with just the shorter proof (yours) in the README file, though. -Kevin
"Kevin Grittner" wrote: > Dan Ports wrote: >> Does that make sense to you? > > Makes sense to me. Like the proof I offered, you have shown that > there is no cycle which can develop with the locks copied which > isn't there anyway if we don't copy the locks. I woke up with the nagging thought that while the above is completely accurate, it deserves some slight elaboration. These proofs show that there is no legitimate cycle which could cause an anomaly which the move from row-based to tuple-based logic will miss. They don't prove that the change will generate all the same serialization failures; and in fact, some false positives are eliminated by the change. That's a good thing. In addition to the benefits mentioned in prior posts, there will be a reduction in the rate of rollbacks (in particular corner cases) from what people see in beta1 without a loss of correctness. -Kevin
On Tue, May 24, 2011 at 04:18:37AM -0500, Kevin Grittner wrote:
> These proofs show that
> there is no legitimate cycle which could cause an anomaly which the
> move from row-based to tuple-based logic will miss. They don't prove
> that the change will generate all the same serialization failures;
> and in fact, some false positives are eliminated by the change.

Yes, that's correct. That's related to the part in the proof where I claimed T3 couldn't have a conflict out *to some transaction T0 that precedes T1*.

I originally tried to show that T3 couldn't have any conflicts out that T2 didn't have, which would mean we got the same set of serialization failures, but that's not true. In fact, it's not too hard to come up with an example where there would be a serialization failure with the row version links, but not without. However, because the rw-conflict can't be pointing to a transaction that precedes T1 in the serial order, it won't create a cycle. In other words, there are serialization failures that won't happen anymore, but they were false positives.

Dan

--
Dan R. K. Ports        MIT CSAIL        http://drkp.net/
> Dan Ports wrote:
> On Tue, May 24, 2011 at 04:18:37AM -0500, Kevin Grittner wrote:
>> These proofs show that there is no legitimate cycle which could
>> cause an anomaly which the move from row-based to tuple-based
>> logic will miss. They don't prove that the change will generate
>> all the same serialization failures; and in fact, some false
>> positives are eliminated by the change.
>
> Yes, that's correct. That's related to the part in the proof where
> I claimed T3 couldn't have a conflict out *to some transaction T0
> that precedes T1*.
>
> I originally tried to show that T3 couldn't have any conflicts out
> that T2 didn't have, which would mean we got the same set of
> serialization failures, but that's not true. In fact, it's not too
> hard to come up with an example where there would be a
> serialization failure with the row version links, but not without.
> However, because the rw-conflict can't be pointing to a transaction
> that precedes T1 in the serial order, it won't create a cycle. In
> other words, there are serialization failures that won't happen
> anymore, but they were false positives.

Dan and I went around a couple times chasing down all code, comment, and patch changes needed, resulting in the attached patch. We found and fixed the bug which originally manifested in a way which I confused with a need for row locks, as well as another which was nearby in the code. We backed out the changes which were causing merge problems for Robert, as those were part of the attempt at the row locking (versus tuple locking). We removed a function which is no longer needed. We adjusted the comments and an affected isolation test.

As might be expected from removing an unnecessary feature, the lines of code went down -- a net decrease of 93 lines.

This patch applies against HEAD; `make check-world` passes, as does `make -C src/test/isolation installcheck` and `make installcheck-world` against a database with default_transaction_isolation = 'serializable'.
Dan is running stress testing against the patched version tonight with DBT-2. These changes generate merge conflicts with the work I've done on handling CLUSTER, DROP INDEX, etc. It seems to me that the best course would be to commit this, then I can rebase the other work and post it. Since these issues are orthogonal, it didn't seem like a good idea to combine them in one patch, and this one seems more urgent. -Kevin
Attachments
On 26.05.2011 06:19, Kevin Grittner wrote:
> Dan and I went around a couple times chasing down all code, comment,
> and patch changes needed, resulting in the attached patch. We found
> and fixed the bug which originally manifested in a way which I
> confused with a need for row locks, as well as another which was
> nearby in the code. We backed out the changes which were causing
> merge problems for Robert, as those were part of the attempt at the
> row locking (versus tuple locking). We removed a function which is
> no longer needed. We adjusted the comments and an affected isolation
> test.

Could you explain in the README, why it is safe to only take the lock on the visible row version, please? It's not quite obvious, as we've seen from this discussion, and if I understood correctly the academic papers don't touch that subject either.

> As might be expected from removing an unnecessary feature, the lines
> of code went down -- a net decrease of 93 lines.

That's the kind of patch I like :-).

> These changes generate merge conflicts with the work I've done on
> handling CLUSTER, DROP INDEX, etc. It seems to me that the best
> course would be to commit this, then I can rebase the other work and
> post it. Since these issues are orthogonal, it didn't seem like a
> good idea to combine them in one patch, and this one seems more
> urgent.

Agreed.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> Could you explain in the README, why it is safe to only take the
> lock on the visible row version, please?

Sure. I actually intended to do this last night but ran out of steam and posted what I had, planning on following up with that.

The place it seemed to fit best was in the "Innovations" section, since the SSI papers and their prototype implementations seemed oriented toward "rows" -- certainly the SIREAD locks were at the row level, versus a row version level. Since this doesn't touch any of the files in yesterday's patch, and it seems entirely within the realm of possibility that people will want to argue about how best to document this more than the actual fix, I'm posting it as a separate patch -- README-SSI only.

I mostly just copied from Dan's posted proof verbatim.

-Kevin
Attachments
[The first attempt to send this shows as undeliverable to the list. Resending for completeness and coherency of archives. Apologies to those getting duplicates.]

> Heikki Linnakangas wrote:
> On 26.05.2011 06:19, Kevin Grittner wrote:
>
>> /*
>>  * Check whether the writer has become a pivot with an
>>  * out-conflict committed transaction, while neither reader
>>  * nor writer is committed. If the reader is a READ ONLY
>>  * transaction, there is only a serialization failure if an
>>  * out-conflict transaction causing the pivot committed
>>  * before the reader acquired its snapshot. (That is, the
>>  * reader must not have been concurrent with the out-conflict
>>  * transaction.)
>>  */
>
> The comment is not in sync with the code. The code is not checking
> that "neither reader or writer has committed", but something more
> complicated.

True. We changed the code but not the comment. Sorry for missing that. Basically the quoted clause would be correct by changing it to "neither reader nor writer committed first." (Your rewrite, discussed below, is better, though.)

> I find it helps tremendously to draw the dangerous structures being
> checked, in addition to just explaining them in text.

Agreed -- I spent many an hour with the "dangerous structure" diagram in front of me when thinking through the mechanics of implementation.

> Ascii art is a bit clunky, but I think we have to do it here.

OK.

> I did some of that in the comments, and I think I understand it
> now. See attached patch. Does that look right to you?

! * A serialization failure can only occur if there is a dangerous
! * structure in the dependency graph:
! *
! *       Tin ------> Tpivot ------> Tout
! *             rw            rw
! *
! * Furthermore, Tout must commit first.

I agree that this is easier to read and understand than just straight text; but you dropped one other condition, which we use rather heavily.
We should probably add back something like:

 * If Tin is declared READ ONLY (or commits without writing), we can
 * only have a problem if Tout committed before Tin acquired its
 * snapshot.

> * XXX: Where does that last condition come from?

This optimization is an original one, not yet appearing in any academic papers; Dan and I are both convinced it is safe, and in off-list correspondence with Michael Cahill after he left Oracle, he said that he discussed this with Alan Fekete and they both concur that it is a safe and good optimization.

This optimization is mentioned in the README-SSI file in the "Innovations" section. Do you think that file needs to have more on the topic?

> * XXX: for an anomaly to occur, T2 must've committed before W.
> * Couldn't we easily check that here, or does the fact that W
> * committed already somehow imply it?

The flags and macros should probably be renamed to make it more obvious that this is covered. The flag which SxactHasConflictOut is based on is only set when the conflict out is to a transaction which committed ahead of the flagged transaction. Regarding the other test -- since we have two transactions (Tin and Tout) which have not been summarized, and transactions are summarized in order of commit, SxactHasSummaryConflictOut implies that the transaction with the flag set has not committed before the transaction to which it has a rw-conflict out.

The problem is coming up with a more accurate name which isn't really long. The best I can think of is SxactHasConflictOutToEarlierCommit, which is a little long. If nobody has a better idea, I guess it's better to have a long name than a misleading one. Do you think we need to rename SxactHasSummaryConflictOut or just add a comment on that use that a summarized transaction will always be committed ahead of any transactions which are not summarized?

> (note that I swapped the second and third check in the function, I
> think it's more straightforward that way).
It doesn't matter to me either way, so if it seems clearer to you, that's a win.

> Should we say "Cancelled on identification as pivot, during ...",
> or "Cancelled on identification as a pivot, during ..." ? We seem
> to use both in the error messages.

Consistency is good. I think it sounds better with the indefinite article in there.

Do you want me to post another patch based on this, or are these changes from what you posted small enough that you would prefer to just do it as part of the commit?

Thanks for taking the time to work through this. As always, your ideas improve things.

-Kevin
On 26.05.2011 06:19, Kevin Grittner wrote:
> /*
>  * Check whether the writer has become a pivot with an out-conflict
>  * committed transaction, while neither reader nor writer is committed. If
>  * the reader is a READ ONLY transaction, there is only a serialization
>  * failure if an out-conflict transaction causing the pivot committed
>  * before the reader acquired its snapshot. (That is, the reader must not
>  * have been concurrent with the out-conflict transaction.)
>  */
> if (!failure)
> {
>     if (SxactHasSummaryConflictOut(writer))
>     {
>         failure = true;
>         conflict = NULL;
>     }
>     else
>         conflict = (RWConflict)
>             SHMQueueNext(&writer->outConflicts,
>                          &writer->outConflicts,
>                          offsetof(RWConflictData, outLink));
>     while (conflict)
>     {
>         if (SxactIsCommitted(conflict->sxactIn)
>             && (!SxactIsCommitted(reader)
>                 || conflict->sxactIn->commitSeqNo <= reader->commitSeqNo)
>             && (!SxactIsCommitted(writer)
>                 || conflict->sxactIn->commitSeqNo <= writer->commitSeqNo)
>             && (!SxactIsReadOnly(reader)
>                 || conflict->sxactIn->commitSeqNo <= reader->SeqNo.lastCommitBeforeSnapshot))
>         {
>             failure = true;
>             break;
>         }
>         conflict = (RWConflict)
>             SHMQueueNext(&writer->outConflicts,
>                          &conflict->outLink,
>                          offsetof(RWConflictData, outLink));
>     }
> }

The comment is not in sync with the code. The code is not checking that "neither reader or writer has committed", but something more complicated.

Looking at OnConflict_CheckForSerializationFailure(), it's really hard to see how it works, and why the conditions it checks are sufficient. I find it helps tremendously to draw the dangerous structures being checked, in addition to just explaining them in text. Ascii art is a bit clunky, but I think we have to do it here. I did some of that in the comments, and I think I understand it now. See attached patch. Does that look right to you?

(note that I swapped the second and third check in the function, I think it's more straightforward that way).
I also added a couple of questions about the conditions, marked with XXX comments. Can you answer those, please? PS. Should we say "Cancelled on identification as pivot, during ...", or "Cancelled on identification as a pivot, during ..." ? We seem to use both in the error messages. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Attachments
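The loop Heikki quoted above can be modeled in a few lines outside the server, which may make the three commit-sequence conditions easier to see. This is an illustrative sketch only, not PostgreSQL code: the dicts stand in for SERIALIZABLEXACT structs, and the field names are invented.

```python
# Simplified model (not PostgreSQL code) of the scan above: walk the
# writer's rw-conflicts out and report a failure when some committed
# transaction T2 satisfies all three commit-sequence conditions.
# Field names (commit_seq, last_commit_before_snapshot, ...) are
# stand-ins for the real SERIALIZABLEXACT members.

def check_writer_pivot(reader, writer):
    for t2 in writer["out_conflicts"]:
        if not t2["committed"]:
            continue
        if ((not reader["committed"] or t2["commit_seq"] <= reader["commit_seq"])
                and (not writer["committed"] or t2["commit_seq"] <= writer["commit_seq"])
                and (not reader["read_only"]
                     or t2["commit_seq"] <= reader["last_commit_before_snapshot"])):
            return True  # dangerous structure: the writer is a pivot
    return False

t2 = {"committed": True, "commit_seq": 10}
writer = {"committed": False, "commit_seq": None, "out_conflicts": [t2]}

# An active read-write reader: all three conditions hold -> failure.
reader_rw = {"committed": False, "commit_seq": None,
             "read_only": False, "last_commit_before_snapshot": None}
print(check_writer_pivot(reader_rw, writer))  # True

# A READ ONLY reader whose snapshot predates T2's commit: no failure.
reader_ro = {"committed": False, "commit_seq": None,
             "read_only": True, "last_commit_before_snapshot": 5}
print(check_writer_pivot(reader_ro, writer))  # False
```

The second case is the READ ONLY relaxation discussed in this thread: the out-conflict transaction committed only after the reader acquired its snapshot, so the structure cannot complete a cycle.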
On 30.05.2011 17:10, Kevin Grittner wrote:
>> Heikki Linnakangas wrote:
>> On 26.05.2011 06:19, Kevin Grittner wrote:
>> I did some of that in the comments, and I think I understand it
>> now. See attached patch. Does that look right to you?
>
> ! * A serialization failure can only occur if there is a dangerous
> ! * structure in the dependency graph:
> ! *
> ! *       Tin ------> Tpivot ------> Tout
> ! *             rw            rw
> ! *
> ! * Furthermore, Tout must commit first.
>
> I agree that this is easier to read and understand than just straight
> text; but you dropped one other condition, which we use rather
> heavily. We should probably add back something like:
>
> * If Tin is declared READ ONLY (or commits without writing), we can
> * only have a problem if Tout committed before Tin acquired its
> * snapshot.
>
>> * XXX: Where does that last condition come from?
>
> This optimization is an original one, not yet appearing in any
> academic papers; Dan and I are both convinced it is safe, and in
> off-list correspondence with Michael Cahill after he left Oracle, he
> said that he discussed this with Alan Fekete and they both concur
> that it is a safe and good optimization.
>
> This optimization is mentioned in the README-SSI file in the
> "Innovations" section. Do you think that file needs to have more on
> the topic?

Oh, I see this:

> o We can avoid ever rolling back a transaction when the
> transaction on the conflict *in* side of the pivot is explicitly or
> implicitly READ ONLY unless the transaction on the conflict *out*
> side of the pivot committed before the READ ONLY transaction acquired
> its snapshot. (An implicit READ ONLY transaction is one which
> committed without writing, even though it was not explicitly declared
> to be READ ONLY.)

Since this isn't coming from the papers, it would be nice to explain why that is safe.

>> * XXX: for an anomaly to occur, T2 must've committed before W.
>> * Couldn't we easily check that here, or does the fact that W
>> * committed already somehow imply it?
>
> The flags and macros should probably be renamed to make it more
> obvious that this is covered. The flag which SxactHasConflictOut is
> based on is only set when the conflict out is to a transaction which
> committed ahead of the flagged transaction. Regarding the other
> test -- since we have two transactions (Tin and Tout) which have not
> been summarized, and transactions are summarized in order of commit,
> SxactHasSummaryConflictOut implies that the transaction with the flag
> set has not committed before the transaction to which it has a
> rw-conflict out.

Ah, got it.

> The problem is coming up with a more accurate name which isn't really
> long. The best I can think of is SxactHasConflictOutToEarlierCommit,
> which is a little long. If nobody has a better idea, I guess it's
> better to have a long name than a misleading one. Do you think we
> need to rename SxactHasSummaryConflictOut or just add a comment on
> that use that a summarized transaction will always be committed ahead
> of any transactions which are not summarized?

Dunno. I guess a comment would do. Can you write a separate patch for that, please?

>> (note that I swapped the second and third check in the function, I
>> think it's more straightforward that way).
>
> It doesn't matter to me either way, so if it seems clearer to you,
> that's a win.

FWIW, the reason I prefer it that way is that the function is checking three scenarios:

R ----> W ----> T2, where W committed after T2
R ----> W ----> T2, "else"
T0 ----> R ----> W

It seems more straightforward to keep both "R ----> W ----> T2" cases together.

>> Should we say "Cancelled on identification as pivot, during ...",
>> or "Cancelled on identification as a pivot, during ..." ? We seem
>> to use both in the error messages.
>
> Consistency is good. I think it sounds better with the indefinite
> article in there.

Ok, will do.
> Do you want me to post another patch based on this, or are these
> changes from what you posted small enough that you would prefer to
> just do it as part of the commit?

I've committed with the discussed changes.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 30.05.2011 17:10, Kevin Grittner wrote:
>> This optimization is an original one, not yet appearing in any
>> academic papers; Dan and I are both convinced it is safe, and in
>> off-list correspondence with Michael Cahill after he left Oracle,
>> he said that he discussed this with Alan Fekete and they both
>> concur that it is a safe and good optimization.
>>
>> This optimization is mentioned in the README-SSI file in the
>> "Innovations" section. Do you think that file needs to have more on
>> the topic?
>
> Oh, I see this:
>
>> o We can avoid ever rolling back a transaction when the
>> transaction on the conflict *in* side of the pivot is explicitly or
>> implicitly READ ONLY unless the transaction on the conflict *out*
>> side of the pivot committed before the READ ONLY transaction
>> acquired its snapshot. (An implicit READ ONLY transaction is one
>> which committed without writing, even though it was not
>> explicitly declared to be READ ONLY.)
>
> Since this isn't coming from the papers, it would be nice to
> explain why that is safe.

I see that this issue first surfaced on the Wiki page 2 April, 2010, and was never really described in detail on the -hackers list. To ensure that it has some documentation here (in case of any possible IP controversy), I will describe a proof now. For earlier references one could dig around in Wiki history, posted patches during CFs, and the git repository history. I have kept a copy of the repo before the official conversion from CVS.
From many academic papers, there is well-established proof that serialization anomalies can only occur under snapshot isolation when there is a cycle in the graph of apparent order of execution of the transactions, and that in such a cycle the following pattern of rw-dependencies always occurs:

Tin - - -> Tpivot - - -> Tout

A rw-dependency (also called a rw-conflict) exists when a read by one transaction doesn't see the write of another transaction because the two transactions overlap, regardless of whether the read or the write actually happens first. Since the reader doesn't see the work of the writer, the reader appears to have executed first, regardless of the actual order of snapshot acquisition or commits. The arrows show the apparent order of execution of the transactions -- Tin first, Tout last. Published papers have further proven that the transaction which appears to have executed last of these three must actually commit before either of the others for an anomaly to occur. Tin and Tout may be the same transaction (as in the case of simple write skew), or two distinct transactions.

SSI relies on recognition of this "dangerous structure" to decide when to cancel a transaction to preserve data integrity. No attempt is made to complete the cycle from Tout back to Tin. Partly this is because of the expense of doing so -- there could be a long chain of rw-dependencies, wr-dependencies (where the reader *can* see the work of another transaction because it *did* commit first), and ww-dependencies (where the writer is updating a row version left by another transaction, which must have committed first). There is also the uncomfortable possibility that a client application *outside* the database ran a transaction which made some change and, after observing the successful commit of this transaction, is proceeding with the knowledge that it ran before the next transaction is submitted.
The novel optimization we've used in the PostgreSQL implementation of SSI is based on the fact that of these four ways of determining which transaction appears to have executed first, all but the rw-dependency are based on the transaction which appears to have executed first committing before the apparent later transaction acquires its snapshot. When you combine this with the fact that the transaction which appears to execute later in a rw-dependency is the one which performed the write, you have the basis of an interesting optimization for READ ONLY transactions.

In the above diagram, if Tin is READ ONLY, it cannot have a rw-dependency *in* from some other transaction. The only way Tout can directly appear to have executed before Tin is if Tout committed before Tin acquired its snapshot, so that Tin can read something Tout wrote, or an external application can know that it successfully committed Tout before beginning Tin. The published conditions must all still hold -- Tin and Tout must both overlap Tpivot, the same rw-dependencies must exist, and Tout must still commit first of the three; but when Tin doesn't write, Tout must actually commit *even earlier* -- before Tin gets started -- to have an anomaly.

For a proof the question becomes: If Tin does not write and Tin and Tout overlap, can a dependency or chain of dependencies develop which makes it appear that Tout executed before Tin? If this can't happen without creating a dangerous structure around a different pivot, then no cycle can develop and the optimization is safe.

Since Tin doesn't write, no transaction can appear to come before it because of failure to read its writes -- no rw-dependency in to this transaction is possible. The only transactions Tin can appear to follow are transactions which committed in time to make their work visible to Tin.
Let's assume such a transaction exists, called T0:

T0 -----> Tin - - -> Tpivot - - -> Tout

It would be possible for Tout to overlap T0 and develop a rw-dependency out to it, but T0 must commit before Tin starts, so for Tout to overlap Tin, T0 must commit before Tout, so a rw-dependency from Tout to T0 would make Tout a pivot and cause a rollback which would break the cycle. Any other transaction to which Tout developed a rw-dependency out would have the same problem. Any *other* type of dependency out from Tout would require the transaction to acquire its snapshot after the commit of Tout. Since T0 commits before Tin begins and Tout overlaps Tin, the commit of Tout must come after the commit of T0, so no transaction which begins after Tout commits can overlap T0. This leaves no possible way to form a cycle when Tin is READ ONLY and Tout overlaps Tin.

I won't be shocked if Dan can come up with a shorter proof, but I'm confident this one is solid.

Patch to come soon, but I thought it best to post the proof here first to allow a chance to refine it based on discussion -- or shorter proofs.

-Kevin
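The dangerous-structure rule described in the proof above can be expressed as a toy check over a rw-dependency graph. This is only a sketch under stated assumptions: transaction names and the edge list are hypothetical, and the commit order is given explicitly rather than derived from running transactions.

```python
# Toy model of the dangerous structure Tin -rw-> Tpivot -rw-> Tout,
# with the published condition that Tout committed before the other
# two transactions (Tin and Tout may be the same transaction, as in
# simple write skew). Names and edges are hypothetical.

def dangerous_structures(rw_edges, commit_order):
    """Yield (tin, tpivot, tout) triples forming a dangerous structure."""
    rank = {t: i for i, t in enumerate(commit_order)}
    for tin, tpivot in rw_edges:
        for src, tout in rw_edges:
            if src != tpivot:
                continue
            # Tout must commit first of the three; if Tin is Tout,
            # that reduces to "Tout commits before Tpivot".
            if rank[tout] < rank[tpivot] and (tin == tout or rank[tout] < rank[tin]):
                yield (tin, tpivot, tout)

# Simple write skew: T1 and T2 each read what the other writes, and
# T2 commits first. Tin and Tout are both T2; T1 is the pivot.
edges = [("T1", "T2"), ("T2", "T1")]
print(list(dangerous_structures(edges, commit_order=["T2", "T1"])))
# -> [('T2', 'T1', 'T2')]
```

A single rw-edge with no edge out of its target never qualifies, matching the requirement that the pivot have both a rw-conflict in and a rw-conflict out.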
On Wed, Jun 01, 2011 at 05:09:09PM -0500, Kevin Grittner wrote:
> I won't be shocked if Dan can come up with a shorter proof, but I'm
> confident this one is solid.

Well, so happens I wrote a proof on the airplane today, before I saw your mail. It's actually quite straightforward... (well, at least more so than I was expecting)

> From many academic papers, there is well-established proof that
> serialization anomalies can only occur under snapshot isolation when
> there is a cycle in the graph of apparent order of execution of the
> transactions, and that in such a cycle the following pattern of
> rw-dependencies always occurs:
>
> Tin - - -> Tpivot - - -> Tout
>
> A rw-dependency (also called a rw-conflict) exists when a read by
> one transaction doesn't see the write of another transaction because
> the two transactions overlap, regardless of whether the read or the
> write actually happens first. Since the reader doesn't see the work
> of the writer, the reader appears to have executed first, regardless
> of the actual order of snapshot acquisition or commits. The arrows
> show the apparent order of execution of the transactions -- Tin
> first, Tout last. Published papers have further proven that the
> transaction which appears to have executed last of these three must
> actually commit before either of the others for an anomaly to occur.

We can actually say something slightly stronger than that last sentence: Tout has to commit before *any* other transaction in the cycle. That doesn't help us implement SSI, because we never try to look at an entire cycle, but it's still true and useful for proofs like this.

Now, supposing Tin is read-only...

Since there's a cycle, there must also be a transaction that precedes Tin in the serial order. Call it T0. (T0 might be the same transaction as Tout, but that doesn't matter.) There's an edge in the graph from T0 to Tin. It can't be a rw-conflict, because Tin was read-only, so it must be a ww- or wr-dependency.
Either means T0 committed before Tin started. Because Tout committed before any other transaction in the cycle, Tout has to commit before T0 commits -- and thus before Tin starts.

Dan

--
Dan R. K. Ports        MIT CSAIL        http://drkp.net/
Dan Ports <drkp@csail.mit.edu> wrote:
> On Wed, Jun 01, 2011 at 05:09:09PM -0500, Kevin Grittner wrote:
>> Published papers have further proven that the transaction which
>> appears to have executed last of these three must actually commit
>> before either of the others for an anomaly to occur.
>
> We can actually say something slightly stronger than that last
> sentence: Tout has to commit before *any* other transaction in the
> cycle. That doesn't help us implement SSI, because we never try to
> look at an entire cycle, but it's still true and useful for proofs
> like this.

I didn't know that, although it doesn't seem too surprising. With that as a given, the proof can be quite short and straightforward.

> Now, supposing Tin is read-only...
>
> Since there's a cycle, there must also be a transaction that
> precedes Tin in the serial order. Call it T0. (T0 might be the
> same transaction as Tout, but that doesn't matter.) There's an
> edge in the graph from T0 to Tin. It can't be a rw-conflict,
> because Tin was read-only, so it must be a ww- or wr-dependency.
> Either means T0 committed before Tin started.
>
> Because Tout committed before any other transaction in the cycle,
> Tout has to commit before T0 commits -- and thus before Tin
> starts.

If we're going to put this into the README-SSI as the proof of the validity of this optimization, I'd like to have a footnote pointing to a paper describing the "first commit in the cycle" aspect of a dangerous structure. Got any favorites, or should I fall back on a google search?

-Kevin
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 30.05.2011 17:10, Kevin Grittner wrote:
>> Heikki Linnakangas wrote:
>>> * XXX: for an anomaly to occur, T2 must've committed
>>> * before W. Couldn't we easily check that here, or does
>>> * the fact that W committed already somehow imply it?
>>
>> The flags and macros should probably be renamed to make it more
>> obvious that this is covered. The flag which SxactHasConflictOut
>> is based on is only set when the conflict out is to a transaction
>> which committed ahead of the flagged transaction. Regarding the
>> other test -- since we have two transactions (Tin and Tout) which
>> have not been summarized, and transactions are summarized in
>> order of commit, SxactHasSummaryConflictOut implies that the
>> transaction with the flag set has not committed before the
>> transaction to which it has a rw-conflict out.
>
> Ah, got it.
>
>> The problem is coming up with a more accurate name which isn't
>> really long. The best I can think of is
>> SxactHasConflictOutToEarlierCommit, which is a little long. If
>> nobody has a better idea, I guess it's better to have a long name
>> than a misleading one. Do you think we need to rename
>> SxactHasSummaryConflictOut or just add a comment on that use that
>> a summarized transaction will always be committed ahead of any
>> transactions which are not summarized?
>
> Dunno. I guess a comment would do. Can you write a separate patch
> for that, please?

Attached is a comments-only patch for this, along with one correction to the comments you added and a couple other typos.

I'll submit a patch for the README-SSI file once I find a reference I like to a paper describing what Dan's proof uses as a premise -- that the transaction on the rw-conflict *out* side of the pivot must not only be the first of the three transactions in the dangerous structure to commit, but the first in the entire cycle of which the dangerous structure is a part.
With that premise as a given, the proof is short, clear, and unassailable; but I think we need a cite to use that argument convincingly. -Kevin
Attachments
On Thu, Jun 02, 2011 at 01:01:05PM -0500, Kevin Grittner wrote:
> If we're going to put this into the README-SSI as the proof of the
> validity of this optimization, I'd like to have a footnote pointing
> to a paper describing the "first commit in the cycle" aspect of a
> dangerous structure. Got any favorites, or should I fall back on a
> google search?

Hmm. I don't see any that state that in so many words, but it's an obvious consequence of the proof of Theorem 2.1 in "Making Snapshot Isolation Serializable" -- note that T3 is chosen to be the transaction in the cycle with the earliest commit time.

Dan

--
Dan R. K. Ports        MIT CSAIL        http://drkp.net/
On 03.06.2011 00:14, Kevin Grittner wrote:
> Attached is a comments-only patch for this, along with one
> correction to the comments you added and a couple other typos.

Ok, committed.

> I'll submit a patch for the README-SSI file once I find a reference
> I like to a paper describing what Dan's proof uses as a premise --
> that the transaction on the rw-conflict *out* side of the pivot must
> not only be the first of the three transactions in the dangerous
> structure to commit, but the first in the entire cycle of which the
> dangerous structure is a part. With that premise as a given, the
> proof is short, clear, and unassailable; but I think we need a cite
> to use that argument convincingly.

Agreed.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com