Discussion: PostgreSQL performance on various distribution stock kernels
Is there a source comparing PostgreSQL performance (say, using pgbench) out of the box for various Linux distributions? Alternatively, is there an analysis anywhere of the potential gains from building a custom kernel, and of which customizations are most relevant to a PostgreSQL server?

Some background: in investigating the overhead of adopting OpenVZ virtualization, I ran pgbench tests on PostgreSQL running in a virtual environment (VE) and compared them to PostgreSQL running directly on the hardware node (HN) under the current stable OpenVZ kernel with no VE running. The results were roughly in line with expectations based on OpenVZ documentation (5% fewer transactions per second).

For completeness, I then ran the same tests with the current stock Fedora 8 kernel running natively on the same hardware (after all, this is the true non-virtual alternative). Surprisingly, this test performed markedly worse than under the OpenVZ kernel (either on HN or in VE), even though the latter is from the 2.6.18 series and has added baggage to support OpenVZ's OS virtualization. Multiple pgbench runs confirm this conclusion.

The PostgreSQL server version (8.2.5), configuration, hardware, etc. are identical (actually the same HD filesystem image mounted at /var/lib/pgsql) for each test. Similarly, other than the kernel, the OS is identical: stock Fedora 8 with up-to-date packages for each test. I double-checked the kernel architecture via uname:

Fedora 8:
Linux 2.6.23.1-49.fc8 #1 SMP Thu Nov 8 21:41:26 EST 2007 i686 i686 i386 GNU/Linux

OpenVZ:
Linux 2.6.18-8.1.15.el5.028stab049.1 #1 SMP Thu Nov 8 16:23:12 MSK 2007 i686 i686 i386 GNU/Linux

So, what's different between these tests? I'm seeing performance differences of between +65% to +90% transactions per second of the OpenVZ kernel running on the HN over the stock Fedora 8 kernel. Is this reflective of different emphasis between RHEL and Fedora kernel builds? Some OpenVZ optimization on top of the RHEL5 build? Something else? Where should I look?

Any insights much appreciated,

Damon Hart
On 11/26/07, Damon Hart <dhcom@sundial.com> wrote:
> So, what's different between these tests? I'm seeing performance
> differences of between +65% to +90% transactions per second of the
> OpenVZ kernel running on the HN over the stock Fedora 8 kernel. Is
> this reflective of different emphasis between RHEL and Fedora kernel
> builds? Some OpenVZ optimization on top of the RHEL5 build? Something
> else? Where should I look?

A recent FreeBSD benchmark (which also tested Linux performance) found major performance differences between recent versions of the kernel, possibly attributable to the new so-called completely fair scheduler:

http://archives.postgresql.org/pgsql-performance/2007-11/msg00132.php

No idea if it's relevant.

Alexander.
On Nov 26, 2007 4:50 PM, Damon Hart <dhcom@sundial.com> wrote:
>
> So, what's different between these tests? I'm seeing performance
> differences of between +65% to +90% transactions per second of the
> OpenVZ kernel running on the HN over the stock Fedora 8 kernel. Is
> this reflective of different emphasis between RHEL and Fedora kernel
> builds? Some OpenVZ optimization on top of the RHEL5 build? Something
> else? Where should I look?
>
> any insights much appreciated,

How many TPS are you seeing on each one? If you are running 10krpm drives and seeing more than 166.66 transactions per second, then your drives are likely lying to you and not actually fsyncing, and it could be that fsync() on IDE / SATA has been implemented in later kernels and it isn't lying.

Hard to say for sure.

What does vmstat 1 have to say on each system when it's under load?
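For reference, the 166.66 figure is simply the rotation rate of a 10,000 rpm disk expressed per second. A minimal Python sketch of that arithmetic, assuming at most one synchronous commit per platter rotation for a single client (the function name is made up for illustration and is not part of any PostgreSQL tool):

```python
def max_commits_per_second(rpm: float) -> float:
    """Rule-of-thumb ceiling on fsync'd commits per second for one client.

    Assumption: each commit has to wait for one full disk rotation,
    so the ceiling is the number of rotations per second.
    """
    return rpm / 60.0


# Print the ceiling for a few common spindle speeds.
for rpm in (7200, 10000, 15000):
    print(f"{rpm} rpm -> about {max_commits_per_second(rpm):.2f} commits/s")
```

Sustained single-client numbers well above this ceiling would suggest that a write cache somewhere is acknowledging writes before they reach the platter.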
Damon Hart <dhcom@sundial.com> writes:
> So, what's different between these tests? I'm seeing performance
> differences of between +65% to +90% transactions per second of the
> OpenVZ kernel running on the HN over the stock Fedora 8 kernel. Is
> this reflective of different emphasis between RHEL and Fedora kernel
> builds? Some OpenVZ optimization on top of the RHEL5 build? Something
> else? Where should I look?

Considering how raw Fedora 8 is, I think what you've probably found is a performance bug that should be reported to the kernel hackers.

Just to confirm: this *is* the same filesystem in both cases, right?

regards, tom lane
On Nov 26, 2007 5:00 PM, Alexander Staubo <alex@purefiction.net> wrote:
> On 11/26/07, Damon Hart <dhcom@sundial.com> wrote:
> > So, what's different between these tests? I'm seeing performance
> > differences of between +65% to +90% transactions per second of the
> > OpenVZ kernel running on the HN over the stock Fedora 8 kernel. Is
> > this reflective of different emphasis between RHEL and Fedora kernel
> > builds? Some OpenVZ optimization on top of the RHEL5 build? Something
> > else? Where should I look?
>
> A recent FreeBSD benchmark (which also tested Linux performance) found
> major performance differences between recent versions of the kernel,
> possibly attributable to the new so-called completely fair scheduler:
>
> http://archives.postgresql.org/pgsql-performance/2007-11/msg00132.php

Yeah, I wondered about that too, but thought the completely fair scheduler was not on by default so didn't mention it. Hmmm. I wonder.
On Mon, 2007-11-26 at 17:00 -0600, Scott Marlowe wrote:
> On Nov 26, 2007 4:50 PM, Damon Hart <dhcom@sundial.com> wrote:
> >
> > So, what's different between these tests? I'm seeing performance
> > differences of between +65% to +90% transactions per second of the
> > OpenVZ kernel running on the HN over the stock Fedora 8 kernel. Is
> > this reflective of different emphasis between RHEL and Fedora kernel
> > builds? Some OpenVZ optimization on top of the RHEL5 build? Something
> > else? Where should I look?
> >
> > any insights much appreciated,
>
> How many TPS are you seeing on each one? If you are running 10krpm
> drives and seeing more than 166.66 transactions per second, then your
> drives are likely lying to you and not actually fsyncing, and it could
> be that fsync() on IDE / SATA has been implemented in later kernels
> and it isn't lying.
>
> Hard to say for sure.
>
> What does vmstat 1 have to say on each system when it's under load?

I will have to repeat the tests to give you any vmstat info, but perhaps a little more raw input might be useful.

Test H/W: Dell Precision 650 Dual Intel
CPU: Dual XEON 2.4GHz 512k Cache
RAM: 4GB of DDR ECC
Hard Drive: 4 x 36GB 10K 68Pin SCSI Hard Drive

pgbench
scale: 50
clients: 50
transactions per client: 100

stats for 30 runs each kernel in TPS (excluding connections establishing)

OpenVZ (RHEL5 derived 2.6.18 series)
average: 446
maximum: 593
minimum: 95
stdev: 151
median: 507

stock Fedora 8 (2.6.23 series)
average: 270
maximum: 526
minimum: 83
stdev: 112
median: 268

Does your 10K RPM drive 166 TPS ceiling apply in this arrangement with multiple disks (the PostgreSQL volume spans three drives, segregated from the OS) and multiple pgbench clients? I'm fuzzy on whether these factors even enter into that rule of thumb. At least as far as the PostgreSQL configuration is concerned, fsync has not been changed from the default.

Damon
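The "+65% to +90%" range quoted at the start of the thread appears to correspond to the averages and medians reported in this message. A small Python sketch of that arithmetic, using only the summary figures posted here (the helper names are illustrative):

```python
# Summary figures copied from the message above (TPS over 30 runs per kernel).
openvz = {"average": 446, "median": 507, "stdev": 151}
fedora8 = {"average": 270, "median": 268, "stdev": 112}


def relative_gain(new: float, base: float) -> float:
    """Percentage by which `new` exceeds `base`."""
    return (new / base - 1.0) * 100.0


def coefficient_of_variation(stats: dict) -> float:
    """Standard deviation expressed as a fraction of the average."""
    return stats["stdev"] / stats["average"]


print(f"average gain: {relative_gain(openvz['average'], fedora8['average']):.0f}%")
print(f"median gain:  {relative_gain(openvz['median'], fedora8['median']):.0f}%")
print(f"OpenVZ spread:   {coefficient_of_variation(openvz):.2f}")
print(f"Fedora 8 spread: {coefficient_of_variation(fedora8):.2f}")
```

On these figures the standard deviation is a third or more of the average on both kernels, so individual runs are quite noisy.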
On Mon, 2007-11-26 at 18:06 -0500, Tom Lane wrote:
> Damon Hart <dhcom@sundial.com> writes:
> > So, what's different between these tests? I'm seeing performance
> > differences of between +65% to +90% transactions per second of the
> > OpenVZ kernel running on the HN over the stock Fedora 8 kernel. Is
> > this reflective of different emphasis between RHEL and Fedora kernel
> > builds? Some OpenVZ optimization on top of the RHEL5 build? Something
> > else? Where should I look?
>
> Considering how raw Fedora 8 is, I think what you've probably found is a
> performance bug that should be reported to the kernel hackers.
>

Not being a kernel hacker, any suggestions on how to provide more useful feedback than just a pgbench TPS comparison and hardware specs? What's the best forum, presuming this does boil down to kernel issues?

> Just to confirm: this *is* the same filesystem in both cases, right?
>
> regards, tom lane

Yes, same filesystem, simply booting different kernels.

Damon
On Mon, 26 Nov 2007, Damon Hart wrote:

> Fedora 8:
> Linux 2.6.23.1-49.fc8 #1 SMP Thu Nov 8 21:41:26 EST 2007 i686 i686 i386
> GNU/Linux
>
> OpenVZ:
> Linux 2.6.18-8.1.15.el5.028stab049.1 #1 SMP Thu Nov 8 16:23:12 MSK 2007
> i686 i686 i386 GNU/Linux

2.6.23 introduced a whole new scheduler: http://www.linux-watch.com/news/NS2939816251.html so it's rather different from earlier 2.6 releases, and so new that there could easily be performance bugs.

> Does your 10K RPM drive 166 TPS ceiling apply in this arrangement with
> multiple disks

Number of disks has nothing to do with it; it depends only on the rate the disk with the WAL volume is spinning at. But that's for a single client.

> pgbench
> scale: 50
> clients: 50
> transactions per client: 100

With this many clients, you can get far more transactions per second committed than the max for a single client (166@10K rpm). What you're seeing, somewhere around 500 per second, is reasonable.

Note that you're doing two things that make pgbench less useful than it can be:

1) The number of transactions you're committing is trivial, which is one reason why your test runs have such a huge variation. Try 10000 transactions/client if you want something that doesn't vary quite so much. If it doesn't run for a couple of minutes, you're not going to get good repeatability.

2) The way pgbench works, it takes a considerable amount of resources to simulate this many clients. You might get higher (and more realistic) numbers if you run the pgbench client on another system than the server.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
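One way to act on both suggestions is to script the runs from a separate client machine. The Python sketch below only builds and prints a pgbench command line and does not run it. The host name `dbserver` and database name `pgbench` are placeholders, not details from this thread; `-h`, `-c` and `-t` are the pgbench options for server host, number of clients and transactions per client.

```python
import shlex


def pgbench_command(host, dbname, clients, transactions):
    """Build a pgbench argument list for a run against a remote server.

    All argument values are supplied by the caller; nothing is executed here.
    """
    return [
        "pgbench",
        "-h", host,               # server host, so the client can run elsewhere
        "-c", str(clients),       # number of simulated clients
        "-t", str(transactions),  # transactions per client
        dbname,
    ]


# Placeholder host and database names; 10000 transactions per client as suggested.
cmd = pgbench_command("dbserver", "pgbench", clients=50, transactions=10000)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list could be passed to `subprocess.run` on the client machine, repeating the run several times per kernel to compare the spread.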