Discussion: Storing hot members of PGPROC out of the band
Hi All,

While working on some performance issues on HP-UX, I noticed significant data cache misses when accessing PGPROC members. On closer inspection, it was quite evident that for a large number (even a few tens) of clients, the loop inside GetSnapshotData will cause a data cache miss for almost every PGPROC, because the PGPROC structure is quite heavy and no more than one structure fits in a single cache line. So I experimented with separating the most frequently and closely accessed members of PGPROC into an out-of-band array. I call it the PGPROC_MINIMAL structure, which contains xid, xmin, and vacuumFlags, among others. Basically, all the members commonly accessed by GetSnapshotData find a place in this minimal structure.

When the PGPROC array is allocated, we also allocate another array of PGPROC_MINIMAL structures of the same size. While accessing the ProcArray, simple pointer arithmetic can get us the corresponding PGPROC_MINIMAL structure. The only exception is the dummy PGPROC for prepared transactions. A first-cut version of the patch is attached. It looks big, but most of the changes are cosmetic, because of the added indirection. The patch also contains another change to keep the ProcArray sorted by (PGPROC *) to preserve locality of reference while traversing the array.

I did some tests on a 32-core IA HP-UX box and the results are quite good. With a scale factor of 100 and the -N option of pgbench (updates on only the accounts table), the numbers look something like this:

Clients   HEAD           PGPROC-Patched   Gain
1          1098.488663    1059.830369     -3.52%
4          3569.481435    3663.898254      2.65%
32        11627.059228   16419.864056     41.22%
48        11044.501244   15825.132582     43.29%
64        10432.206525   15408.50304      47.70%
80        10210.57835    15170.614435     48.58%

The numbers are quite reproducible, with a couple of percentage points of variance; even for a single client, I sometimes see no degradation. Here are some more numbers with the normal pgbench tests (without the -N option):
Clients   HEAD   PGPROC-Patched   Gain
1          743     771             3.77%
4         1821    2315            27.13%
32        8011    9166            14.42%
48        7282    8959            23.03%
64        6742    8937            32.56%
80        6316    8664            37.18%

It's quite possible that the effect of the patch is more evident on the particular hardware I am testing on, but the approach nevertheless seems reasonable. It will be very useful if someone else with access to a large box can test the effect of the patch. BTW, since I played with many versions of the patch, the exact numbers with this version might be a little different from what I posted above. I will conduct more tests, especially with more clients, and see if there is any difference in the improvement.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
Attachments
On Thu, Nov 3, 2011 at 6:12 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:

> When PGPROC array is allocated, we also allocate another array of
> PGPROC_MINIMAL structures of the same size. While accessing the
> ProcArray, a simple pointer mathematic can get us the corresponding
> PGPROC_MINIMAL structure. The only exception being the dummy PGPROC
> for prepared transaction. A first cut version of the patch is
> attached. It looks big, but most of the changes are cosmetic because
> of added indirection. The patch also contains another change to keep
> the ProcArray sorted by (PGPROC *) to preserve locality of references
> while traversing the array.

This is very good.

If you look at your PGPROC_MINIMAL, it's mostly transaction-related stuff, so I would rename it PGXACT or similar. Not sure why you talk about pointer math; it seems easy enough just to have two arrays protected by one lock, and have each proc use the same offset in both arrays.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Nov 4, 2011 at 4:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

> If you look at your PGPROC_MINIMAL, its mostly transaction related
> stuff, so I would rename it PGXACT or similar.

Yeah, that looks good too, though I am not sure whether all the fields are related to transaction state, or whether we will need to add more fields to the structure in the future. Having a more general name might help in that case.

> Not sure why you talk
> about pointer math, seems easy enough just to have two arrays
> protected by one lock, and have each proc use the same offset in both
> arrays.

Right now we store PGPROC pointers in the ProcArray, and the pointer math gets us the index to look into the other array. But we could instead just store indexes in the ProcArray to avoid that. A positive index could mean an offset into the normal PGPROC array, and a negative index could be used to get a dummy PGPROC for a prepared transaction.

Thanks,
Pavan
On 04.11.2011 00:43, Simon Riggs wrote:
> On Thu, Nov 3, 2011 at 6:12 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>
>> When PGPROC array is allocated, we also allocate another array of
>> PGPROC_MINIMAL structures of the same size. While accessing the
>> ProcArray, a simple pointer mathematic can get us the corresponding
>> PGPROC_MINIMAL structure. The only exception being the dummy PGPROC
>> for prepared transaction. A first cut version of the patch is
>> attached. It looks big, but most of the changes are cosmetic because
>> of added indirection. The patch also contains another change to keep
>> the ProcArray sorted by (PGPROC *) to preserve locality of references
>> while traversing the array.
>
> This is very good.
>
> If you look at your PGPROC_MINIMAL, its mostly transaction related
> stuff, so I would rename it PGXACT or similar. Not sure why you talk
> about pointer math, seems easy enough just to have two arrays
> protected by one lock, and have each proc use the same offset in both
> arrays.

Agreed, that seems cleaner. The PGPROCs for prepared transactions are currently allocated separately, so they're not in the same array as all the other PGPROCs, but that could be changed. Here's a WIP patch for that. I kept the PGPROC_MINIMAL nomenclature for now, but I agree with Simon's suggestion to rename it.

On 03.11.2011 20:12, Pavan Deolasee wrote:
> Its quite possible that the effect of the patch is more evident on the
> particular hardware that I am testing. But the approach nevertheless
> seems reasonable. It will very useful if someone else having access to
> a large box can test the effect of the patch.

I tested this on an 8-core x64 box, but couldn't see any measurable difference in pgbench performance. I tried with and without -N and -S, and with --unlogged-tables, but no difference.

While looking at GetSnapshotData(), it occurred to me that when a PGPROC entry does not have its xid set, i.e. xid == InvalidTransactionId, it's pointless to check the subxid array for that proc. If a transaction has no xid, none of its subtransactions can have an xid either. That's a trivial optimization; see attached. I tested this with "pgbench -S -M prepared -c 500" on the 8-core box, and it seemed to make a 1-2% difference (without the other patch). So, almost in the noise, but it also doesn't cost us anything in terms of readability or otherwise.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Attachments
On Mon, Nov 7, 2011 at 6:45 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> While looking at GetSnapshotData(), it occurred to me that when a PGPROC
> entry does not have its xid set, ie. xid == InvalidTransactionId, it's
> pointless to check the subxid array for that proc. If a transaction has no
> xid, none of its subtransactions can have an xid either. That's a trivial
> optimization, see attached. I tested this with "pgbench -S -M prepared -c
> 500" on the 8-core box, and it seemed to make a 1-2% difference (without the
> other patch). So, almost in the noise, but it also doesn't cost us anything
> in terms of readability or otherwise.

Oh, that's a good idea. +1 for doing that now, and then we can keep working on the rest of it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Nov 7, 2011 at 6:45 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> Agreed, that seems more clean. The PGPROCs for prepared transactions are
> currently allocated separately, so they're not currently in the same array
> as all other PGPROCs, but that could be changed. Here's a WIP patch for
> that. I kept the PGPROC_MINIMAL nomenclature for now, but I agree with
> Simon's suggestion to rename it.

All right, I did that in the attached version, using Simon's suggested name. I also fixed up various comments (especially in InitProcGlobal), fixed the --enable-cassert build (which was busted), and added code to save/restore PreparedXactProcs in EXEC_BACKEND mode (which I assume is necessary, though the regression tests failed to fail without it).

I'm wondering about the changes to how globalxmin is computed in GetSnapshotData(). That seems like it could hurt performance in the single-client case, and especially in the case where there is one active connection and lots of idle connections, and I'm wondering how thoroughly we've tested that particular bit apart from these other changes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments
On Tue, Nov 22, 2011 at 12:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> I'm wondering about the changes to how globalxmin is computed in
> GetSnapshotData(). That seems like it could hurt performance in the
> single-client case, and especially in the case where there is one
> active connection and lots of idle connections, and I'm wondering how
> thoroughly we've tested that particular bit apart from these other
> changes.

I separated this part out; see attached. pgxact-v2.patch is the same as what I posted on Tuesday with those changes factored out; recentglobalxmin.patch applies on top of it, and contains just the changes related to how RecentGlobalXmin is recomputed.

I tried a pgbench -S test, with five-minute runs, scale factor 100, shared_buffers = 8GB, on Nate Boley's 32-core box. I tried this on unpatched master, with just pgxact-v2.patch, and with both patches. At 1, 8, and 32 clients, pgxact-v2 is faster than master, but pgxact-v2 + recentglobalxmin is slower than just pgxact-v2. When I get up to 80 clients, the recentglobalxmin portion becomes a win. Just to make sure that I wasn't measuring random noise, I did seven runs with each configuration rather than my usual three. The table below shows the lowest, median, and highest tps results for each tested configuration.
== 1 client ==
master:                     low 4312.806032    median 4343.971972    high 4395.870212
pgxact-v2:                  low 4391.273988    median 4408.283875    high 4448.315804
pgxact-v2+recentglobalxmin: low 4328.697157    median 4360.581749    high 4399.763953

== 8 clients ==
master:                     low 33295.56753    median 33356.682726   high 33639.368929
pgxact-v2:                  low 33719.004950   median 33927.195872   high 34106.679176
pgxact-v2+recentglobalxmin: low 33565.708794   median 33694.140297   high 33795.072927

== 32 clients ==
master:                     low 210514.460228  median 213168.532917  high 214169.174032
pgxact-v2:                  low 216230.082361  median 221678.274469  high 231465.732256
pgxact-v2+recentglobalxmin: low 212202.985015  median 219390.075074  high 223353.421472

== 80 clients ==
master:                     low 208139.915780  median 209867.352113  high 214868.151835
pgxact-v2:                  low 217003.716877  median 218360.946541  high 222095.321825
pgxact-v2+recentglobalxmin: low 219390.617005  median 220912.168056  high 221779.348531

I'm going to run some more tests, but my thought is that we should probably leave the recentglobalxmin changes out for the time being, pending further study and consideration of other alternatives.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments
On Thu, Nov 24, 2011 at 1:30 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> I'm going to run some more tests, but my thought is that we should
> probably leave the recentglobalxmin changes out for the time being,
> pending further study and consideration of other alternatives.

Agreed

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Nov 24, 2011 at 7:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, Nov 24, 2011 at 1:30 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I'm going to run some more tests, but my thought is that we should
>> probably leave the recentglobalxmin changes out for the time being,
>> pending further study and consideration of other alternatives.
>
> Agreed

+1. These are independent patches and should be pursued as such. BTW, I reviewed pgxact-v2.patch and I have no objections to it; it looks good to go in. Thanks, Robert, for making the necessary changes and also running the benchmark tests.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
On Thu, Nov 24, 2011 at 11:54 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:

> +1. These are independent patches and should be pursued like that.
> BTW, I reviewed the pgxact-v2.patch and I have no objections to that
> and it looks good to go in. Thanks Robert for making the necessary
> changes and also running the benchmark tests.

I ran some write tests with master, pgxact-v2, and pgxact-v2+recentglobalxmin: shared_buffers = 8GB, maintenance_work_mem = 1GB, synchronous_commit = off, checkpoint_segments = 300, checkpoint_timeout = 15min, checkpoint_completion_target = 0.9, wal_writer_delay = 20ms. Seven five-minute runs at scale factor 100 for each configuration.

Here's the executive summary: on the read-only test, recentglobalxmin.patch was only a win at the very highest concurrency I tested (80 clients); on the read-write test, it starts to show a benefit at much lower concurrencies (32 clients, certainly; perhaps even 8 clients on unlogged tables). However, pgxact-v2.patch seems to be a win on both read and write tests and at any concurrency level, including the single-client case.
== 1 client, unlogged tables ==
master:                     low 671.861618    median 677.324867    high 765.824313
                            (but the second highest was only 679.491822)
pgxact-v2:                  low 663.901614    median 689.496716    high 696.812065
pgxact-v2+recentglobalxmin: low 665.554342    median 685.401979    high 688.832906

== 8 clients, unlogged tables ==
master:                     low 4722.011063   median 4758.201239   high 4920.130891
pgxact-v2:                  low 4684.759859   median 4840.081663   high 4979.036845
pgxact-v2+recentglobalxmin: low 4723.743270   median 4856.513904   high 4997.528399

== 32 clients, unlogged tables ==
master:                     low 10878.959662  median 10901.523672  high 10934.699151
pgxact-v2:                  low 17944.914228  median 18060.058996  high 19281.541088
pgxact-v2+recentglobalxmin: low 18894.860512  median 19637.190567  high 19817.089456

== 80 clients, unlogged tables ==
master:                     low 7872.934292   median 7897.811216   high 7909.410723
pgxact-v2:                  low 12032.684380  median 12397.316995  high 13279.998414
pgxact-v2+recentglobalxmin: low 16964.227483  median 17801.478747  high 18107.646170

== 1 client, permanent tables ==
master:                     low 625.502929    median 628.442284    high 677.451660
pgxact-v2:                  low 636.755782    median 640.083573    high 645.273888
pgxact-v2+recentglobalxmin: low 633.320412    median 636.898945    high 637.886099

== 8 clients, permanent tables ==
master:                     low 4497.012143   median 4624.844801   high 4849.233268
pgxact-v2:                  low 4561.914897   median 4625.443111   high 4776.095552
pgxact-v2+recentglobalxmin: low 4469.742226   median 4789.249847   high 4824.033794

== 32 clients, permanent tables ==
master:                     low 10468.362239  median 10511.425102  high 10531.069684
pgxact-v2:                  low 12821.732396  median 14500.067726  high 14546.538281
pgxact-v2+recentglobalxmin: low 14907.122746  median 15129.665408  high 15186.743199

== 80 clients, permanent tables ==
master:                     low 7601.067552   median 7612.898321   high 7631.487355
pgxact-v2:                  low 11712.895410  median 12004.807309  high 12512.078569
pgxact-v2+recentglobalxmin: low 15186.695057  median 15810.452158  high 16166.272699

I don't see much reason to wait around any further on the core patch (pgxact-v2.patch), so I'll commit that now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Nov 7, 2011 at 7:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Nov 7, 2011 at 6:45 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> While looking at GetSnapshotData(), it occurred to me that when a PGPROC
>> entry does not have its xid set, ie. xid == InvalidTransactionId, it's
>> pointless to check the subxid array for that proc. If a transaction has no
>> xid, none of its subtransactions can have an xid either. That's a trivial
>> optimization, see attached. I tested this with "pgbench -S -M prepared -c
>> 500" on the 8-core box, and it seemed to make a 1-2% difference (without the
>> other patch). So, almost in the noise, but it also doesn't cost us anything
>> in terms of readability or otherwise.
>
> Oh, that's a good idea. +1 for doing that now, and then we can keep
> working on the rest of it.

I spent some time looking at (and benchmarking) this idea today. I rearranged that section of code a little more into what seemed like the optimal order to avoid doing more work than necessary, but I wasn't getting much of a gain out of it even on unlogged tables, and on permanent tables it was looking like a small loss, though I didn't let the benchmarks run long enough to nail that down because it didn't seem to amount to much one way or the other.

I then added one more change that helped quite a lot: I introduced a macro NormalTransactionIdPrecedes() which works like TransactionIdPrecedes() but (a) is only guaranteed to work if the arguments are known to be normal transaction IDs (which it also asserts, for safety) and (b) is a macro rather than a function call. I found three places to use that inside GetSnapshotData(), and the results with that change look fairly promising.
Nate Boley's box, m = master, s = with the attached patch, median of three 5-minute runs at scale factor 100, config same as my other recent tests (all tps figures include connections establishing):

Permanent Tables
m01 tps = 617.734793     s01 tps = 620.330099
m08 tps = 4589.566969    s08 tps = 4545.942588
m16 tps = 7618.842377    s16 tps = 7596.759619
m24 tps = 11770.534809   s24 tps = 11789.424152
m32 tps = 10776.660575   s32 tps = 10643.361817
m80 tps = 11057.353339   s80 tps = 10598.254713

Unlogged Tables
m01 tps = 668.145495     s01 tps = 676.793337
m08 tps = 4715.214745    s08 tps = 4737.833913
m16 tps = 8110.607784    s16 tps = 8192.013414
m24 tps = 14120.753841   s24 tps = 14302.915040
m32 tps = 17886.032656   s32 tps = 18735.319468
m80 tps = 12711.930739   s80 tps = 17592.715471

Now, you might not initially find those numbers all that encouraging. The unlogged-table numbers are quite good, especially at 32 and 80 clients, where the gains are quite marked. But the permanent-table numbers are at best a wash, and maybe a small loss.
Interestingly enough, some recent benchmarking of the FlexLocks patch showed a similar (though more pronounced) effect: http://archives.postgresql.org/pgsql-hackers/2011-12/msg00095.php

Now, both the FlexLocks patch and this patch aim to mitigate contention problems around ProcArrayLock, but they do it in completely different ways. When I got a small regression on permanent tables with the FlexLocks patch, I thought the problem was with the patch itself, which is believable, since it was tinkering with the LWLock machinery, a fairly global sort of change that you can well imagine might break something. But it's hard to imagine how that could possibly be the case here, especially given the speedups on unlogged tables, because this patch is dead simple and entirely isolated to GetSnapshotData(). So I have a new theory: on permanent tables, *anything* that reduces ProcArrayLock contention causes an approximately equal increase in WALInsertLock contention (or maybe some other lock), and in some cases that increase in contention elsewhere can cost more than the amount we're saving here.

On that theory, I'm inclined to think that's not really a problem. We'll go nuts if we refuse to commit anything until it shows a meaningful win on every imaginable workload, and it seems like this can't really be worse than the status quo; any case where it is must be some kind of artifact. We're better off getting rid of as much ProcArrayLock contention as possible, rather than keeping it around because there are corner cases where it decreases contention on some other lock.

One small detail I'm noticing on further review is that I've slightly changed the semantics here:

    if (!TransactionIdIsNormal(xid)
        || NormalTransactionIdPrecedes(xmax, xid))
        continue;

That really ought to be testing <=, not just <. That doesn't seem like it should affect correctness, though: at worst, we unnecessarily include one or more XIDs in the snapshot that will be ignored anyway.
I'll fix that and rerun the tests, but I don't think it'll make any difference.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments
On Thu, Dec 15, 2011 at 11:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> One small detail I'm noticing on further review is that I've slightly
> changed the semantics here:
>
>     if (!TransactionIdIsNormal(xid)
>         || NormalTransactionIdPrecedes(xmax, xid))
>         continue;
>
> That really ought to be testing <=, not just <. That doesn't seem
> like it should affect correctness, though: at worst, we unnecessarily
> include one or more XIDs in the snapshot that will be ignored anyway.
> I'll fix that and rerun the tests but I don't think it'll make any
> difference.

New results with the attached, updated patch. As predicted, these are quite similar to the last batch (all tps figures include connections establishing)...

Permanent Tables:
m01 tps = 618.460567     s01 tps = 628.394270
m08 tps = 4558.930585    s08 tps = 4490.285074
m16 tps = 7577.677079    s16 tps = 7582.611380
m24 tps = 11556.680518   s24 tps = 11527.982307
m32 tps = 10807.216084   s32 tps = 10871.625992
m80 tps = 10818.092314   s80 tps = 10866.780660

Unlogged Tables:
m01 tps = 670.328782     s01 tps = 818.666971
m08 tps = 4793.337235    s08 tps = 4888.600945
m16 tps = 8016.092785    s16 tps = 8123.217451
m24 tps = 14082.694567   s24 tps = 14114.466246
m32 tps = 17881.340576   s32 tps = 18291.739818
m80 tps = 12767.535657   s80 tps = 17380.055381

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments
Robert Haas wrote:

> On that theory, I'm inclined to think that's not really a problem.
> We'll go nuts if we refuse to commit anything until it shows a
> meaningful win on every imaginable workload, and it seems like this
> can't really be worse than the status quo; any case where it is must
> be some kind of artifact. We're better of getting rid of as much
> ProcArrayLock contention as possible, rather than keeping it around
> because there are corner cases where it decreases contention on some
> other lock.

Interesting conclusion, and it makes sense. It seems that once this is applied we will have more places to look for contention improvements.

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB  http://enterprisedb.com

+ It's impossible for everything to be true. +
On Fri, Dec 16, 2011 at 8:25 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Robert Haas wrote:
>> On that theory, I'm inclined to think that's not really a problem.
>> We'll go nuts if we refuse to commit anything until it shows a
>> meaningful win on every imaginable workload, and it seems like this
>> can't really be worse than the status quo; any case where it is must
>> be some kind of artifact. We're better of getting rid of as much
>> ProcArrayLock contention as possible, rather than keeping it around
>> because there are corner cases where it decreases contention on some
>> other lock.
>
> Interesting conclusion, and it makes sense. Seems once this is applied
> we will have more places to look for contention improvements.

Yeah. The performance results I posted the other day seem to show that on some of these tests we're thrashing our CLOG buffers, and the difference between unlogged tables and permanent tables seems to indicate pretty clearly that WALInsertLock is a huge problem. I'm going to look more at the CLOG stuff next week, and also keep poking at ProcArrayLock, where I think there's still room for further improvement. I am leaving WALInsertLock to Heikki for now, since (1) I don't want to collide with what he's working on, (2) he knows more about it than I do anyway, and (3) it's a really hard problem and I don't have any particularly good ideas about how to fix it. :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Dec 16, 2011, at 7:25 PM, Bruce Momjian wrote:
> Robert Haas wrote:
>> On that theory, I'm inclined to think that's not really a problem.
>> We'll go nuts if we refuse to commit anything until it shows a
>> meaningful win on every imaginable workload, and it seems like this
>> can't really be worse than the status quo; any case where it is must
>> be some kind of artifact. We're better of getting rid of as much
>> ProcArrayLock contention as possible, rather than keeping it around
>> because there are corner cases where it decreases contention on some
>> other lock.
>
> Interesting conclusion, and it makes sense. Seems once this is applied
> we will have more places to look for contention improvements.

I also wonder how much this throws some previous performance tests into suspicion. If it's not uncommon for performance improvement attempts to shift a bottleneck to a different part of the system and marginally hurt performance, then we might be throwing away good performance improvement ideas before we should...

--
Jim C. Nasby, Database Architect   jim@nasby.net
512.569.9461 (cell)   http://jim.nasby.net
On Sat, Dec 17, 2011 at 1:00 AM, Jim Nasby <jim@nasby.net> wrote:

> I also wonder how much this throws some previous performance tests into
> suspicion. If it's not uncommon for performance improvement attempts to
> shift a bottleneck to a different part of the system and marginally hurt
> performance then we might be throwing away good performance improvement
> ideas before we should...

I think we are (mostly) OK on this point, at least as far as the work I've been doing. We've actually had a few previous instances of this phenomenon; e.g., when I first committed my fastlock patch, performance actually got worse if you had >40 cores doing read-only queries, because speeding up the lock manager made it possible for the spinlock protecting SInvalReadLock to mess things up royally. Nevertheless, we got it committed, and fixed the SInvalReadLock problem too. This one is/was somewhat more subtle, but I'm feeling pretty good about our chances of making at least some further progress in time for 9.2.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company