Обсуждение: Re: SSI non-serializalbe UPDATE performance (was: getting to beta)
> Dan Ports wrote: > Even under these conditions I couldn't reliably see a slowdown. My > latest batch of results (16 backends, median of three 10 minute > runs) shows a difference well below 1%. In a couple of cases I saw > the code with the SSI checks running faster than with them removed, > so this difference seems in the noise. Now that Dan has finished the comparative performance tests, and the version with SSI sometimes tests faster, sometimes slower, with the difference always a fraction of a percent even on a fairly unusual worst case environment (tmpfs and dataset fits in cache buffers), I think we can take this off the 9.1 "must fix" list. We've seen much larger changes with no explanation other than an assumption that the executable code heppens to line up on line cache boundaries differently. Unless there is an objection, I'll move this item from "Blockers for Beta1" to "Not Blockers for Beta1" on the Wiki page, pending results of the point raised below. > we might as well bail out of these functions early if there are no > serializable transactions running. Kevin points out we can do this > by checking if PredXact->SxactGlobalXmin is invalid. I would add > that we can do this safely without taking any locks, even on > weak-memory-ordering machines. Even if a new serializable > transaction starts concurrently, we have the appropriate buffer > page locked, so it's not able to take any relevant SIREAD locks. Even though this didn't show any difference in Dan's performance tests, it seems like reasonable insurance against creating a new bottleneck in very high concurrency situations. Dan, do you have a patch for this, or should I create one? -Kevin
On Sat, Apr 23, 2011 at 08:54:31AM -0500, Kevin Grittner wrote: > Even though this didn't show any difference in Dan's performance > tests, it seems like reasonable insurance against creating a new > bottleneck in very high concurrency situations. > > Dan, do you have a patch for this, or should I create one? Sure, patch is attached. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Вложения
On Sun, Apr 24, 2011 at 11:33 PM, Dan Ports <drkp@csail.mit.edu> wrote: > On Sat, Apr 23, 2011 at 08:54:31AM -0500, Kevin Grittner wrote: >> Even though this didn't show any difference in Dan's performance >> tests, it seems like reasonable insurance against creating a new >> bottleneck in very high concurrency situations. >> >> Dan, do you have a patch for this, or should I create one? > > Sure, patch is attached. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Apr 25, 2011 at 4:33 AM, Dan Ports <drkp@csail.mit.edu> wrote: > On Sat, Apr 23, 2011 at 08:54:31AM -0500, Kevin Grittner wrote: >> Even though this didn't show any difference in Dan's performance >> tests, it seems like reasonable insurance against creating a new >> bottleneck in very high concurrency situations. >> >> Dan, do you have a patch for this, or should I create one? > > Sure, patch is attached. Reading the code, IIUC, we check for RW conflicts after each write but only if the writer is running a serializable transaction. Am I correct in thinking that there is zero impact of SSI if nobody is running a serializable transaction? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Apr 27, 2011 at 06:26:52PM +0100, Simon Riggs wrote: > Reading the code, IIUC, we check for RW conflicts after each write but > only if the writer is running a serializable transaction. > > Am I correct in thinking that there is zero impact of SSI if nobody is > running a serializable transaction? That is correct, now. (Well, other than having to check whether a serializable transaction is running, the cost of which is truly negligible.) Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Simon Riggs <simon@2ndQuadrant.com> wrote: > Reading the code, IIUC, we check for RW conflicts after each write > but only if the writer is running a serializable transaction. Correct as far as that statement goes. There are cases where predicate lock maintenance is needed when dealing with locked resources; see below. > Am I correct in thinking that there is zero impact of SSI if > nobody is running a serializable transaction? Benchmarks until a recent change actually were coming out slightly faster with the SSI patch. Since there's not other reason that should have been true, that pretty much had to be the random alignment of code on CPU cache boundaries. A recent change canceled that improvement; until we reverted SSI entirely for a comparative benchmark point we briefly thought SSI caused an overall fraction of a percent regression. We've considered changing the "call SSI method which does a quick return if not applicable" technique to "try to call SSI with a macro which skips the call if not applicable", but in the absence of any evidence of a performance hit with the simple, straightforward approach we have held off on that optimization attempt. There are cases where some minimal work is done in SSI for non-serializable transactions: (1) If a tuple which is predicate locked, or sits on a predicate- locked page, is updated, the predicate lock is duplicated for the new tuple. We have found patterns of updates involving four or more transactions where a non-serializable transaction can hide serialization anomalies among serializable transactions if we don't do this. Someone suggested that we could take out this call and just document that serializable transactions may not comply with the standard-defined behavior when there are concurrent non-serializable transactions. We were unable to show a measurable performance hit on this, although this was just with 32 clients hitting a 16 processor machine. There was at least a theoretical possibility that with higher levels of concurrency there could have been a new contention point for a LW lock here which could affect performance. We added a quick return which didn't need to check any locks at the front of this routine which is taken if there are no active serializable transactions on the cluster at the moment of update. (2) Page splits and page combines, even if they are from non-serializable transactions (like autovacuum) must take action if there are predicate locks on the pages involved. Again, there is a fast exit if no serializable transactions are active, and even before that check was added there was not a measurable impact on performance. If any of that isn't clear or leaves some concern, please let me know. -Kevin
On Wed, Apr 27, 2011 at 7:15 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Simon Riggs <simon@2ndQuadrant.com> wrote: > >> Reading the code, IIUC, we check for RW conflicts after each write >> but only if the writer is running a serializable transaction. > > Correct as far as that statement goes. Thanks. I'm surprised by that though, it seems weird. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Apr 27, 2011 at 7:15 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > (1) If a tuple which is predicate locked, or sits on a predicate- > locked page, is updated, the predicate lock is duplicated for the > new tuple. We have found patterns of updates involving four or more > transactions where a non-serializable transaction can hide > serialization anomalies among serializable transactions if we don't > do this. Someone suggested that we could take out this call and > just document that serializable transactions may not comply with the > standard-defined behavior when there are concurrent non-serializable > transactions. We were unable to show a measurable performance hit > on this, although this was just with 32 clients hitting a 16 > processor machine. There was at least a theoretical possibility > that with higher levels of concurrency there could have been a new > contention point for a LW lock here which could affect performance. > We added a quick return which didn't need to check any locks at the > front of this routine which is taken if there are no active > serializable transactions on the cluster at the moment of update. Surprised to hear nobody mentioning memory reordering issues about that, but I'm not running Itaniums anywhere. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Apr 28, 2011 at 08:43:30AM +0100, Simon Riggs wrote: > > We added a quick return which didn't need to check any locks at the > > front of this routine which is taken if there are no active > > serializable transactions on the cluster at the moment of update. > > Surprised to hear nobody mentioning memory reordering issues about > that, but I'm not running Itaniums anywhere. I did spend a while thinking about it. There aren't any memory reordering issues with that optimization (even on the Alpha, where just about anything goes). The memory barrier when acquiring the buffer page lwlock acts as the synchronization point we need. When we see that no serializable transactions are running, that could have been reordered, but that read still had to come after the lock was taken. That's all we need: even if another backend starts a serializable transaction after that, we know it can't take any SIREAD locks on the same target while we're holding the buffer page lock. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
On Apr 28, 2011, at 9:55 AM, Dan Ports <drkp@csail.mit.edu> wrote: > On Thu, Apr 28, 2011 at 08:43:30AM +0100, Simon Riggs wrote: >>> We added a quick return which didn't need to check any locks at the >>> front of this routine which is taken if there are no active >>> serializable transactions on the cluster at the moment of update. >> >> Surprised to hear nobody mentioning memory reordering issues about >> that, but I'm not running Itaniums anywhere. > > I did spend a while thinking about it. There aren't any memory > reordering issues with that optimization (even on the Alpha, where just > about anything goes). > > The memory barrier when acquiring the buffer page lwlock acts as the > synchronization point we need. When we see that no serializable > transactions are running, that could have been reordered, but that read > still had to come after the lock was taken. That's all we need: even if > another backend starts a serializable transaction after that, we know > it can't take any SIREAD locks on the same target while we're holding > the buffer page lock. Sounds like that might be worth a comment. ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > On Apr 28, 2011, at 9:55 AM, Dan Ports <drkp@csail.mit.edu> wrote: >> The memory barrier when acquiring the buffer page lwlock acts as >> the synchronization point we need. When we see that no >> serializable transactions are running, that could have been >> reordered, but that read still had to come after the lock was >> taken. That's all we need: even if another backend starts a >> serializable transaction after that, we know it can't take any >> SIREAD locks on the same target while we're holding the buffer >> page lock. > > Sounds like that might be worth a comment. There were comments; after reading that post, do you think they need to be expanded or reworded?: http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=02e6a115cc6149551527a45545fd1ef8d37e6aa0 -Kevin
On Apr 28, 2011, at 6:29 PM, "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > Robert Haas <robertmhaas@gmail.com> wrote: >> On Apr 28, 2011, at 9:55 AM, Dan Ports <drkp@csail.mit.edu> wrote: > >>> The memory barrier when acquiring the buffer page lwlock acts as >>> the synchronization point we need. When we see that no >>> serializable transactions are running, that could have been >>> reordered, but that read still had to come after the lock was >>> taken. That's all we need: even if another backend starts a >>> serializable transaction after that, we know it can't take any >>> SIREAD locks on the same target while we're holding the buffer >>> page lock. >> >> Sounds like that might be worth a comment. > > There were comments; after reading that post, do you think they need > to be expanded or reworded?: > > http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=02e6a115cc6149551527a45545fd1ef8d37e6aa0 Yeah, I think Dan's notes about memory ordering would be good to include. ...Robert
On Thu, Apr 28, 2011 at 06:45:54PM +0200, Robert Haas wrote: > Yeah, I think Dan's notes about memory ordering would be good to include. I left it out initially because I didn't want to make things more confusing. As far as memory ordering is concerned, this is the same story as anything else that uses lwlocks: the spinlock memory barrier prevents memory accesses from being reordered before the lock is acquired. The only unusual thing here is that the lock in question isn't the one that protects the variable we're reading. But I'm OK with adding a comment if you think it helps. Patch attached. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Вложения
On Fri, Apr 29, 2011 at 3:23 AM, Dan Ports <drkp@csail.mit.edu> wrote: > On Thu, Apr 28, 2011 at 06:45:54PM +0200, Robert Haas wrote: >> Yeah, I think Dan's notes about memory ordering would be good to include. > > I left it out initially because I didn't want to make things more > confusing. As far as memory ordering is concerned, this is the same > story as anything else that uses lwlocks: the spinlock memory barrier > prevents memory accesses from being reordered before the lock is > acquired. The only unusual thing here is that the lock in question > isn't the one that protects the variable we're reading. > > But I'm OK with adding a comment if you think it helps. Patch attached. Looks good. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company