Обсуждение: access time performance problem
I have an IBM Xseries 300 single cpu with RH installed, 512Mb RAM and SCSI drive with hardware mirroring. Postgresql database is on a partition with ext3 (journalized file system). My greatest table contains about 30.000 records. Postgresql in my project is used to feed/get data from an external hardware as quick as possible. The external device ask the IBM for its configuration data, and the goal is to do a fetch on the database and to send backthe info as quick as possible. The second scenario is when the external device wants to back up its configuration. A mean time of 50ms between database accesses is foreseen. For both scenario I have chosen auto-commit mode, because every record has to be on disc as quick as possible. I have remarked very bad database access time performances. I have then tried with another computer : a common desktop PC(compaq), IDE drive, less memory and less CPU speed. I got better database access time. Here is the results: delete_records insert_records update_records Compaq mean access time: 2.7ms 4.5ms 4.8ms IBM mean access time: 22.9ms 24.6ms 25.9ms When browsing newsgroups, I found that playing with wal_sync_method parameter could give better results. I tried with wal_sync_method=open_sync and here are the results: delete_records insert_records update_records Compaq mean access time: 1.0ms 2.6ms 2.6ms IBM mean access time: 4.0ms 1.3ms 1.3ms My first question is: how is it possible to have such gain in time for the IBM between the case wal_sync_method=fsync andthe case wal_sync_method=open_sync ? Another problem is the following: about every 1000 database access (not regular), the database accesses are hanged during approximately 2500ms. I suppose that this time is used by the OS to flush the memory cache to hard disk. My second question is: how is it possible to avoid such hanging of the database ? Is it possible to flush a part of the cachewhile working on another part of it, the goal being not to interrupt the whole process ? Thanx for your future comments. --Louis Croisez.
Quick question, are you regularly vacuuming and analyzing your database? Also, ext3 can definitely slow things down. If your machine is stable and on a UPS it may be worth your while to just run ext2. Also, have you compared output from bonnie++ on the compaq against the IBM (run it on the same drive that hosts the database of course.) it's a free program you can download to test your drive subsystem's performance. A SCSI mirror set on 10k drives should be able to read at >30 Megs a second and an IDE drive should be in the 5 to 15 Megs a second range. Since Postgresql is designed more for integrity and transactions, it may not be your best choice here. I'm not sure what would be your best choice, but Postgresql is not known for being a real time system with performance guarantees on response times. Also, what processor speeds are these two machines? Just wondering. On Wed, 9 Oct 2002, Louis-Marie Croisez wrote: > I have an IBM Xseries 300 single cpu with RH installed, 512Mb RAM and SCSI drive with hardware mirroring. > Postgresql database is on a partition with ext3 (journalized file system). > My greatest table contains about 30.000 records. > > Postgresql in my project is used to feed/get data from an external hardware as quick as possible. > The external device ask the IBM for its configuration data, and the goal is to do a fetch on the database and to send backthe info > as quick as possible. > The second scenario is when the external device wants to back up its configuration. > A mean time of 50ms between database accesses is foreseen. > For both scenario I have chosen auto-commit mode, because every record has to be on disc as quick as possible. > > I have remarked very bad database access time performances. I have then tried with another computer : a common desktopPC (compaq), > IDE drive, less memory and less CPU speed. I got better database access time. > Here is the results: > > delete_records insert_records update_records > Compaq mean access time: 2.7ms 4.5ms 4.8ms > IBM mean access time: 22.9ms 24.6ms 25.9ms > > When browsing newsgroups, I found that playing with wal_sync_method parameter could give better results. > I tried with wal_sync_method=open_sync and here are the results: > > delete_records insert_records update_records > Compaq mean access time: 1.0ms 2.6ms 2.6ms > IBM mean access time: 4.0ms 1.3ms 1.3ms > > My first question is: how is it possible to have such gain in time for the IBM between the case wal_sync_method=fsync andthe case > wal_sync_method=open_sync ? > > Another problem is the following: > about every 1000 database access (not regular), the database accesses are hanged during approximately 2500ms. > I suppose that this time is used by the OS to flush the memory cache to hard disk. > > My second question is: how is it possible to avoid such hanging of the database ? Is it possible to flush a part of thecache while > working on another part of it, the goal being not to interrupt the whole process ? > > Thanx for your future comments. > > --Louis Croisez. > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
Thanx for your response Scott. In order to illustrate my problem, here is a capture of an Excel graph showing the access peaks I get: http://louis.croisez.free.fr/download/capture1.jpg The test program makes updates of a table (on 1000 records randomly) every 50ms. Approximately, every about 3 seconds, the update process is hanged for about 800ms (these data are measured on the Compaq.On the IBM, the results are worse). This is a great problem for us. Could you give me some help to correct that ? LM Croisez. ----- Original Message ----- From: "scott.marlowe" <scott.marlowe@ihs.com> To: "Louis-Marie Croisez" <louis-marie.croisez@etca.alcatel.be> Cc: <pgsql-general@postgresql.org> Sent: Wednesday, October 09, 2002 6:46 PM Subject: Re: [GENERAL] access time performance problem > Quick question, are you regularly vacuuming and analyzing your database? > > Also, ext3 can definitely slow things down. If your machine is stable and > on a UPS it may be worth your while to just run ext2. > > Also, have you compared output from bonnie++ on the compaq against the > IBM (run it on the same drive that hosts the database of course.) it's a > free program you can download to test your drive subsystem's performance. > A SCSI mirror set on 10k drives should be able to read at >30 Megs a > second and an IDE drive should be in the 5 to 15 Megs a second range. > > Since Postgresql is designed more for integrity and transactions, it may > not be your best choice here. I'm not sure what would be your best > choice, but Postgresql is not known for being a real time system with > performance guarantees on response times. > > Also, what processor speeds are these two machines? Just wondering. > > On Wed, 9 Oct 2002, Louis-Marie Croisez wrote: > > > I have an IBM Xseries 300 single cpu with RH installed, 512Mb RAM and SCSI drive with hardware mirroring. > > Postgresql database is on a partition with ext3 (journalized file system). > > My greatest table contains about 30.000 records. > > > > Postgresql in my project is used to feed/get data from an external hardware as quick as possible. > > The external device ask the IBM for its configuration data, and the goal is to do a fetch on the database and to sendback the info > > as quick as possible. > > The second scenario is when the external device wants to back up its configuration. > > A mean time of 50ms between database accesses is foreseen. > > For both scenario I have chosen auto-commit mode, because every record has to be on disc as quick as possible. > > > > I have remarked very bad database access time performances. I have then tried with another computer : a common desktopPC (compaq), > > IDE drive, less memory and less CPU speed. I got better database access time. > > Here is the results: > > > > delete_records insert_records update_records > > Compaq mean access time: 2.7ms 4.5ms 4.8ms > > IBM mean access time: 22.9ms 24.6ms 25.9ms > > > > When browsing newsgroups, I found that playing with wal_sync_method parameter could give better results. > > I tried with wal_sync_method=open_sync and here are the results: > > > > delete_records insert_records update_records > > Compaq mean access time: 1.0ms 2.6ms 2.6ms > > IBM mean access time: 4.0ms 1.3ms 1.3ms > > > > My first question is: how is it possible to have such gain in time for the IBM between the case wal_sync_method=fsyncand the case > > wal_sync_method=open_sync ? > > > > Another problem is the following: > > about every 1000 database access (not regular), the database accesses are hanged during approximately 2500ms. > > I suppose that this time is used by the OS to flush the memory cache to hard disk. > > > > My second question is: how is it possible to avoid such hanging of the database ? Is it possible to flush a part of thecache while > > working on another part of it, the goal being not to interrupt the whole process ? > > > > Thanx for your future comments. > > > > --Louis Croisez. > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 2: you can get off all lists at once with the unregister command > > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > > >
Sorry, bad url: use this instead: http://louis.croisez.free.fr/capture1.jpg LM Croisez ----- Original Message ----- From: "Louis-Marie Croisez" <louis-marie.croisez@etca.alcatel.be> To: "scott.marlowe" <scott.marlowe@ihs.com> Cc: <pgsql-general@postgresql.org> Sent: Thursday, October 10, 2002 10:41 AM Subject: Re: [GENERAL] access time performance problem > Thanx for your response Scott. > In order to illustrate my problem, here is a capture of an Excel graph showing the access peaks I get: > http://louis.croisez.free.fr/download/capture1.jpg > The test program makes updates of a table (on 1000 records randomly) every 50ms. > Approximately, every about 3 seconds, the update process is hanged for about 800ms (these data are measured on the Compaq.On the > IBM, the results are worse). > This is a great problem for us. > Could you give me some help to correct that ? > > LM Croisez. > > > > ----- Original Message ----- > From: "scott.marlowe" <scott.marlowe@ihs.com> > To: "Louis-Marie Croisez" <louis-marie.croisez@etca.alcatel.be> > Cc: <pgsql-general@postgresql.org> > Sent: Wednesday, October 09, 2002 6:46 PM > Subject: Re: [GENERAL] access time performance problem > > > > Quick question, are you regularly vacuuming and analyzing your database? > > > > Also, ext3 can definitely slow things down. If your machine is stable and > > on a UPS it may be worth your while to just run ext2. > > > > Also, have you compared output from bonnie++ on the compaq against the > > IBM (run it on the same drive that hosts the database of course.) it's a > > free program you can download to test your drive subsystem's performance. > > A SCSI mirror set on 10k drives should be able to read at >30 Megs a > > second and an IDE drive should be in the 5 to 15 Megs a second range. > > > > Since Postgresql is designed more for integrity and transactions, it may > > not be your best choice here. I'm not sure what would be your best > > choice, but Postgresql is not known for being a real time system with > > performance guarantees on response times. > > > > Also, what processor speeds are these two machines? Just wondering. > > > > On Wed, 9 Oct 2002, Louis-Marie Croisez wrote: > > > > > I have an IBM Xseries 300 single cpu with RH installed, 512Mb RAM and SCSI drive with hardware mirroring. > > > Postgresql database is on a partition with ext3 (journalized file system). > > > My greatest table contains about 30.000 records. > > > > > > Postgresql in my project is used to feed/get data from an external hardware as quick as possible. > > > The external device ask the IBM for its configuration data, and the goal is to do a fetch on the database and to sendback the > info > > > as quick as possible. > > > The second scenario is when the external device wants to back up its configuration. > > > A mean time of 50ms between database accesses is foreseen. > > > For both scenario I have chosen auto-commit mode, because every record has to be on disc as quick as possible. > > > > > > I have remarked very bad database access time performances. I have then tried with another computer : a common desktopPC > (compaq), > > > IDE drive, less memory and less CPU speed. I got better database access time. > > > Here is the results: > > > > > > delete_records insert_records update_records > > > Compaq mean access time: 2.7ms 4.5ms 4.8ms > > > IBM mean access time: 22.9ms 24.6ms 25.9ms > > > > > > When browsing newsgroups, I found that playing with wal_sync_method parameter could give better results. > > > I tried with wal_sync_method=open_sync and here are the results: > > > > > > delete_records insert_records update_records > > > Compaq mean access time: 1.0ms 2.6ms 2.6ms > > > IBM mean access time: 4.0ms 1.3ms 1.3ms > > > > > > My first question is: how is it possible to have such gain in time for the IBM between the case wal_sync_method=fsyncand the > case > > > wal_sync_method=open_sync ? > > > > > > Another problem is the following: > > > about every 1000 database access (not regular), the database accesses are hanged during approximately 2500ms. > > > I suppose that this time is used by the OS to flush the memory cache to hard disk. > > > > > > My second question is: how is it possible to avoid such hanging of the database ? Is it possible to flush a part ofthe cache > while > > > working on another part of it, the goal being not to interrupt the whole process ? > > > > > > Thanx for your future comments. > > > > > > --Louis Croisez. > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 2: you can get off all lists at once with the unregister command > > > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
On Thursday 10 Oct 2002 9:51 am, Louis-Marie Croisez wrote: > Sorry, bad url: > use this instead: > http://louis.croisez.free.fr/capture1.jpg > > > > Thanx for your response Scott. > > In order to illustrate my problem, here is a capture of an Excel graph > > showing the access peaks I get: > > http://louis.croisez.free.fr/download/capture1.jpg > > The test program makes updates of a table (on 1000 records randomly) > > every 50ms. Approximately, every about 3 seconds, the update process is > > hanged for about 800ms (these data are measured on the Compaq. On the > > IBM, the results are worse). > > This is a great problem for us. > > Could you give me some help to correct that ? I'm no expert on fs issues, but I do remember reading about bursts of write activity occuring with ext3 (which you mentioned using in a previous mail, and Scott remarked upon). IIRC it was something to do with ext3 buffering for a period and then performing a bunch of writes at the same time. This looks like a plausible candidate for your problem. The bad news is you'll have to do some googling, since I can't remember where I heard about it. The good news is that I seem to remember tuning that would help. You might find vmstat/iostat useful to identify where the slowdown is occuring. - Richard Huxton
Hi Richard, I have test it right now mounting the ext3 partition with ext2. The result is identical. LM Croisez ----- Original Message ----- From: "Richard Huxton" <dev@archonet.com> To: "Louis-Marie Croisez" <louis-marie.croisez@etca.alcatel.be> Cc: <pgsql-general@postgresql.org> Sent: Thursday, October 10, 2002 11:26 AM Subject: Re: [GENERAL] access time performance problem On Thursday 10 Oct 2002 9:51 am, Louis-Marie Croisez wrote: > Sorry, bad url: > use this instead: > http://louis.croisez.free.fr/capture1.jpg > > > > Thanx for your response Scott. > > In order to illustrate my problem, here is a capture of an Excel graph > > showing the access peaks I get: > > http://louis.croisez.free.fr/download/capture1.jpg > > The test program makes updates of a table (on 1000 records randomly) > > every 50ms. Approximately, every about 3 seconds, the update process is > > hanged for about 800ms (these data are measured on the Compaq. On the > > IBM, the results are worse). > > This is a great problem for us. > > Could you give me some help to correct that ? I'm no expert on fs issues, but I do remember reading about bursts of write activity occuring with ext3 (which you mentioned using in a previous mail, and Scott remarked upon). IIRC it was something to do with ext3 buffering for a period and then performing a bunch of writes at the same time. This looks like a plausible candidate for your problem. The bad news is you'll have to do some googling, since I can't remember where I heard about it. The good news is that I seem to remember tuning that would help. You might find vmstat/iostat useful to identify where the slowdown is occuring. - Richard Huxton ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
On Thursday 10 Oct 2002 12:40 pm, Louis-Marie Croisez wrote: > Hi Richard, > I have test it right now mounting the ext3 partition with ext2. > The result is identical. Hmm - that's ruled the journalling out then. Logically, it can only be two things: 1. PostgreSQL flushing buffers or similar. 2. Another process flushing buffers or similar. Can you try it updating batches of 200 records rather than 1000 - if the time between delays increases that would indicate it's PG doing something. Otherwise it could be related to logging, another process or the OS. Try tracing activity with vmstat/iostat - that should show you what's happening in a general sense. Then we can see if we can pin it down in detail. -- Richard Huxton
On Thu, 10 Oct 2002, Louis-Marie Croisez wrote: > Sorry, bad url: > use this instead: > http://louis.croisez.free.fr/capture1.jpg I'm gonna make a guess here that it's WAL activity, or log activity. So first off, are you saving postgresql's output to a log? if so, try logging to a different drive than your data drive or turning off logging for a quick test. If it's WAL writes that are causing the problem, you can either try to change the settings for commit_delay and commit_siblings or symlinking the pg_xlog directory somewhere else. If you aren't sure how to do that, don't learn how to on a live production server with real data.