Discussion: Questions about checksum feature in 9.3
I am getting a new server ready for production and saw the release note on the new checksum feature. I thought it sounded like something we might want, and then after reading realized we have to initdb with the feature on. I figured I'd better check into it a little more, since changing later might be a bit of a hassle, and found notes on getting a vectorized version running for better performance.

My attempts to compile it vectorized on OS X seem to have failed, since I don't find a vector instruction in the .o file even though the options -msse4.1 -funroll-loops -ftree-vectorize should be supported according to the man page for Apple's llvm-gcc.

So, has anyone compiled checksum vectorized on OS X? Is there any performance data that would indicate whether or not I should worry about this in the first place?

So far we are pretty happy with the performance of 9.2.4. We have noticed a few situations where it's a little slower than we might like, but these instances are rare. I'd accept a small performance hit if we can get better reliability and awareness of potential problems.

Thanks,
Kevin
On Sun, Sep 15, 2013 at 8:13 AM, Kevin <kevo@gatorgraphics.com> wrote:
> My attempts to compile it vectorized on OS X seem to have failed, since I don't find a vector instruction in the .o file even though the options -msse4.1 -funroll-loops -ftree-vectorize should be supported according to the man page for Apple's llvm-gcc.

I'm not sure what version of LLVM Apple is using for llvm-gcc. I know that clang+llvm 3.3 can successfully vectorize the checksum algorithm when -O3 is used.

> So, has anyone compiled checksum vectorized on OS X? Is there any performance data that would indicate whether or not I should worry about this in the first place?

Even without vectorization the worst case performance hit is about 20%. This is for a workload that is fully bottlenecked on swapping pages in between shared buffers and OS cache. In real world cases it's hard to imagine it having any measurable effect. A single core can checksum several gigabytes per second of I/O without vectorization, and about 30GB/s with vectorization.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
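For readers wondering why this kind of checksum vectorizes at all, here is a minimal sketch, assuming an FNV-style mix applied to 32 independent partial sums over one 8 kB page. It is not copied from the PostgreSQL source; the names sketch_page_checksum, mix, SKETCH_BLOCK_SIZE and N_LANES, as well as the lane seeds, are made up for illustration. The point is only that the inner loop has no dependency between lanes, which is what lets a compiler turn it into SIMD multiplies when built with something like -O3 on clang, or with the -msse4.1 -funroll-loops -ftree-vectorize options mentioned above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 8192u      /* assumed page size (8 kB) */
#define N_LANES           32u        /* independent partial sums (assumption) */
#define FNV_PRIME         16777619u  /* 32-bit FNV prime */

/* Mix one 32-bit word into one lane: an FNV-1a style step, plus a
 * right shift so that high-order bits also influence low-order bits. */
static inline uint32_t mix(uint32_t sum, uint32_t value)
{
    uint32_t tmp = sum ^ value;
    return (tmp * FNV_PRIME) ^ (tmp >> 17);
}

/* Illustrative page checksum: the page is treated as rows of N_LANES
 * words, and each lane accumulates its own column independently. */
uint32_t sketch_page_checksum(const unsigned char *page)
{
    uint32_t words[SKETCH_BLOCK_SIZE / sizeof(uint32_t)];
    uint32_t sums[N_LANES];
    uint32_t result = 0;
    size_t   rows = SKETCH_BLOCK_SIZE / sizeof(uint32_t) / N_LANES;
    size_t   i, j;

    /* Copy into an aligned word array to avoid aliasing problems. */
    memcpy(words, page, SKETCH_BLOCK_SIZE);

    /* Arbitrary, distinct starting values for each lane (hypothetical). */
    for (j = 0; j < N_LANES; j++)
        sums[j] = FNV_PRIME * (uint32_t) (j + 1);

    /* Lanes never read each other's state inside this loop nest, so
     * the inner loop is a candidate for auto-vectorization. */
    for (i = 0; i < rows; i++)
        for (j = 0; j < N_LANES; j++)
            sums[j] = mix(sums[j], words[i * N_LANES + j]);

    /* Fold the partial sums into a single 32-bit value. */
    for (j = 0; j < N_LANES; j++)
        result ^= sums[j];

    return result;
}
```

To see whether a given compiler actually vectorized the loop, compile this file on its own and look through the disassembly of the object file for packed integer multiply instructions, the same check Kevin describes for the .o file. A loop that carried a single running sum through every word would not vectorize this way, because each step would depend on the previous one.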
Ants Aasma-2 wrote:
>> So, has anyone compiled checksum vectorized on OS X? Are there any
>> performance data that would indicate whether or not I should worry with
>> this in the first place?
>
> Even without vectorization the worst case performance hit is about
> 20%. This is for a workload that is fully bottlenecked on swapping
> pages in between shared buffers and OS cache. In real world cases it's
> hard to imagine it having any measurable effect. A single core can
> checksum several gigabytes per second of I/O without vectorization,
> and about 30GB/s with vectorization.

Any thoughts on how/where to provide guidance on this kind of concern? The single paragraph in the initdb documentation seems to be lacking. Would a destination page on the wiki, linked to from the documentation, where "current knowledge" regarding benchmarks and caveats can be stored, be appropriate?

To that end, Ants, do you actually have some resources and/or benchmarks which support your claim and that you can provide links to?

The "single" core aspect is interesting. Does the implementation have a dedicated core to perform these calculations, or must the same thread that handles the relevant query perform this work as well? How much additional impact/overhead does having to multitask have on the maximum throughput of a single core in processing checksums?

This whole vectorization angle also doesn't seem to be in the documentation... though I didn't look super hard.

David J.
On 9/16/13 10:14 AM, David Johnston wrote:
> The "single" core aspect is interesting. Does the implementation have a
> dedicated core to perform these calculations or must the same thread that
> handles the relevant query perform this work as well? How much additional
> impact/overhead does having to multitask have on the maximum throughput of a
> single core in processing checksums?

Postgres doesn't currently have any real kind of parallelism, so whatever process needs to do the checksum will be the process actually running the checksum.

That said, there are background processes that could potentially be involved here, depending on exactly where checksums are being calculated (I don't remember exactly when the checks are done). For example, if a buffer is being written out by the bgwriter, then it's the bgwriter process that will actually do the checksum, not a backend process.

--
Jim C. Nasby, Data Architect
jim@nasby.net
512.569.9461 (cell)
http://jim.nasby.net