Discussion: Continual Postgres headaches...
I've been trying for quite a while to get PostgreSQL tuned for use as an OLTP system. I have several PL/pgSQL functions that handle inserts and updates to the main table, and several near-real-time daemons that access the data and can take automated actions on it (email/page the people concerned, pull complementary information from a different system, etc.). I started with Postgres 8.1 and am now using 8.2.4 (and have been since its release). I'll try to provide enough information for a decent response, but since I obviously can't put my entire schema and database out there, I'm hoping to get some decent guidelines beyond what I've found through Google, etc. to get this thing tuned better.
Most of the data centers on a central table, which has 23 columns, 1 constraint, and 9 indexes. 4 of the indexes are partial. The table usually contains about 3-4 million rows, but I've cut it down to 1.2 million (cut out 2/3 of the data) in an effort to migrate the database to a second server for more testing. The two most heavily used partial indexes are 242MB (accessed nearly constantly) and 15MB (read every 5 seconds, but also updated constantly by the same inserts that use the 242MB index). Other than one other 25MB index, the rest average around 300MB each, but these aren't used quite as often (usually about every minute or so).
My problems really are with performance consistency. I have tweaked the execution so that everything should run with sub-second execution times, but even after everything is running well, I can get at most a week or two of steady running before things start to degrade.
In my efforts to correct these execution-consistency problems, I have gone from vacuuming (with analyze) twice a day to every 30 minutes - which is how long a VACUUM ANALYZE takes to run, itself a seeming problem, because it shouldn't take so long. I've done a VACUUM FULL and that sometimes helps, but it obviously is not a long-term solution for a true OLTP system. Why I said "sometimes helps": earlier this week I did a VACUUM FULL and a VACUUM ANALYZE on the entire schema. A query that was supposed to use the 25MB index above (and that matched the parameters of the partial index exactly) was still not doing so. I had to DROP and re-CREATE the index (and do another ANALYZE) to get the database to (correctly) use the index.
Another problem: sometimes I get unexplainable delays in calling a 'stored procedure' even though all of its SQL statements run just fine individually. For example, calling a particular function will take 1 minute to execute even though entering the same SQL commands individually through psql adds up to about 1 second. When I log 'long-duration queries', I only see the top-level function call in the log. When I put 'RAISE NOTICE' statements in the various PL/pgSQL functions, I only see them displayed for the function I call directly; any underlying functions called from that function do not show any of their NOTICE statements. Because of this I can't track down where the delays are occurring so that I can try to correct the problem (not that there should be one in the first place)! I should also note that there is no lock contention showing (double-checked with pgAdmin III) that would "artificially" delay the queries...
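One thing I can at least try is timing each step inside the function by hand. A rough sketch of the sort of instrumentation I mean - function, table, and column names here are made up (clock_timestamp() is new in 8.2 and, unlike now(), keeps advancing within a transaction):

```sql
-- Sketch only: time individual steps inside a PL/pgSQL function.
-- All object names below are hypothetical.
CREATE OR REPLACE FUNCTION process_alert(p_id integer) RETURNS void AS $$
DECLARE
    t0 timestamptz;
BEGIN
    t0 := clock_timestamp();
    UPDATE main_table SET last_checked = now() WHERE id = p_id;
    RAISE NOTICE 'update step took %', clock_timestamp() - t0;

    t0 := clock_timestamp();
    PERFORM other_function(p_id);   -- a suspect nested call
    RAISE NOTICE 'nested call took %', clock_timestamp() - t0;
END;
$$ LANGUAGE plpgsql;
```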
I've played with quite a few of the postgresql.conf parameters - vacuum_cost_delay, work_mem, shared_buffers; I've even turned fsync off; etc. The server I've been given to use is quite old (read: no longer the fastest), but normally shows a load average of 0.25 to 0.50 or so.
I guess my questions boil down to: what can I do, other than use EXPLAIN, EXPLAIN ANALYZE, VACUUM ANALYZE, etc., to get my schema performing acceptably across the board in a consistent manner? If there is more information I can give that would improve responses, please let me know.
Thanks in advance for any help you send my way!
On Dec 6, 2007 2:22 PM, Weber, Geoffrey M. <Geoffrey.Weber@mcleodusa.com> wrote:
> In my efforts to correct these consistency in execution problems, I have
> gone from vacuuming (with analyze) twice a day to every 30 minutes (how
> long it takes a vacuum analyze to run - another seeming problem because it
> shouldn't take so long?). I've done a VACUUM FULL and that sometimes
> helps, but obviously is not a long-term solution for a true OLTP system.
> Why I said "sometimes helps": earlier this week I did a VACUUM FULL,
> VACUUM ANALYZE on the entire schema. A query that was supposed to use the
> 25MB index above (that matched the parameters of the partial index
> exactly) was still not doing so. I had to DROP and re-CREATE the index
> (and do another ANALYZE) to get the database to (correctly) use the index.

you may want to consider using autovacuum. vacuum full is usually not recommended.

maybe you could post some explain analyze of some queries that are giving you problems?

merlin
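For anyone following along: in 8.2, autovacuum is off by default and depends on the row-level statistics collector, so enabling it takes a few postgresql.conf settings. The values below are illustrative only, not recommendations for this workload:

```
# postgresql.conf fragment for 8.2 -- illustrative values only
stats_start_collector = on          # prerequisite for autovacuum
stats_row_level = on                # prerequisite for autovacuum
autovacuum = on
autovacuum_naptime = 60             # seconds between checks
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_cost_delay = 20   # ms of throttling to limit I/O impact
```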
On Dec 6, 2007, at 1:22 PM, Weber, Geoffrey M. wrote:

> Most of the data centers in on a central table and has 23 columns, 1
> constraint, and 9 indexes. 4 of the indexes are partial. The table
> usually contains about 3-4 million rows [...] Other than one other 25MB
> index, the others seem to average around 300MB each, but these aren't
> used quite as often (usually about every minute or so).

Do you really have that many query patterns that you need 9 different indexes on the table? Without seeing the table layout, I'm suspicious that a 23-column table with 9 indexes could do with some normalization, vertical partitioning, or fewer indexes. There is definitely such a thing as too many indexes, especially if any of them share indexed columns.

> In my efforts to correct these consistency in execution problems, I have
> gone from vacuuming (with analyze) twice a day to every 30 minutes (how
> long it takes a vacuum analyze to run - another seeming problem because
> it shouldn't take so long?).

That really depends on the table size and the kind of traffic it sees.

> I've done a VACUUM FULL and that sometimes helps, but obviously is not a
> long-term solution for a true OLTP system. [...] I had to DROP and
> re-CREATE the index (and do another ANALYZE) to get the database to
> (correctly) use the index.

VACUUM FULL rewrites large portions of your table in order to compact space. Since you're physically moving the tuples around, it requires updating the indexes on the table and thus can be a fast track to bloated indexes. The VACUUM FULL docs even point this out and suggest running REINDEX after a VACUUM FULL. You do realize that we have a REINDEX command, right? No need to drop and recreate separately when they are consecutive actions; just use REINDEX.

> I guess my questions boil down to what I can do other than use the
> EXPLAIN, EXPLAIN ANALYZE, VACUUM ANALYZE, etc. to get my schema
> performing acceptably across the board in a consistent manner. [...]

As Merlin said, it would help to see some of those query plans along with those tables' definitions. In addition, your memory and disk/I/O related config settings along with the system specs would help a lot. Without concrete examples and numbers, the answers to all of your questions are "It depends -- on those concrete examples and numbers."

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
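A sketch of those two suggestions - rebuilding a suspect index in place, and checking the statistics views to see which of the 9 indexes are actually used. The index and table names are placeholders:

```sql
-- Rebuild an index in place instead of DROP/CREATE
-- (note: REINDEX takes an exclusive lock on the table):
REINDEX INDEX my_partial_idx;

-- See how often each index on the table is actually scanned
-- (requires the statistics collector; counters are cumulative):
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'main_table'
ORDER BY idx_scan DESC;
```

An index whose idx_scan stays near zero over a representative period is a candidate for dropping, which would also speed up the constant inserts.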
On Dec 6, 2007, at 2:22 PM, Weber, Geoffrey M. wrote:

> I've been trying for quite a while to get Postgresql tuned for use as an
> OLTP system. I have several PL/pgSQL functions that handle inserts and
> updates to the main table and several near-real-time daemons written
> that access the data and can take automated actions on it [...]
>
> My problems really are with performance consistency. [...] even after
> everything is running well, I can get at most a week or two of steady
> running before things start to degrade.

Without some examples of reproducible problematic behavior, you are likely to get only hazy responses. With your rate of database changes, you may need to be vacuuming more often (or certain tables more often and other tables less).

From your description above, it sounds like you are persistently polling the database for changes. Have you considered using asynchronous notifications?

http://www.postgresql.org/docs/8.2/interactive/sql-listen.html

Cheers,
M
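A minimal sketch of the LISTEN/NOTIFY pattern from that page; the channel name is made up, and note that in 8.2 a notification carries no payload, so the listener still re-queries for the details:

```sql
-- In each daemon's session, replacing the 5-second polling loop:
LISTEN main_table_changed;

-- In the PL/pgSQL insert/update function (or an AFTER trigger on the table):
NOTIFY main_table_changed;

-- The daemon wakes up on the notification and then queries for new rows.
```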
Sometimes breaking the query down using nested cursors can help, especially if the query has many joins. It usually makes behavior more predictable anyway.

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Merlin Moncure
Sent: Thursday, December 06, 2007 2:43 PM
To: Weber, Geoffrey M.
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Continual Postgres headaches...

> [...] I've done a VACUUM FULL and that sometimes helps, but obviously is
> not a long-term solution for a true OLTP system. [...]

you may want to consider using autovacuum. vacuum full is usually not recommended.

maybe you could post some explain analyze of some queries that are giving you problems?

merlin
"Weber, Geoffrey M." <Geoffrey.Weber@mcleodusa.com> writes:
> My problems really are with performance consistency. I have tweaked the
> execution so that everything should run with sub-second execution times,
> but even after everything is running well, I can get at most a week or
> two of steady running before things start to degrade.

Aside from the other responses: gradual degradation over a period of time sounds like a bloat problem to me. You should monitor table and index sizes to confirm or deny that and find out just what's bloating. Switching to autovacuum, or a better-designed manual vacuum schedule, might be the ultimate answer, but first you need some concrete data about what's going wrong. Otherwise you're just shooting in the dark.

regards, tom lane
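Table and index sizes can be sampled periodically with the built-in size functions (available since 8.1); comparing snapshots over a week or two shows exactly what is bloating. The table name is a placeholder:

```sql
-- Heap size alone, and heap plus all indexes, for the central table:
SELECT c.relname,
       pg_size_pretty(pg_relation_size(c.oid))       AS heap_size,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS with_indexes
FROM pg_class c
WHERE c.relname = 'main_table';

-- Per-index sizes for the same table:
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'main_table';
```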
On Thursday 06 December 2007 20:00, Tom Lane wrote:
> Aside from the other responses: gradual degradation over a period of
> time sounds like a bloat problem to me. You should monitor table and
> index sizes to confirm or deny that and find out just what's bloating.
> Switching to autovacuum, or a better-designed manual vacuum schedule,
> might be the ultimate answer, but first you need some concrete data
> about what's going wrong. Otherwise you're just shooting in the dark.

just to add, make sure your free_space_map settings are set adequately; otherwise you might not be able to find an adequate schedule.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
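The settings in question are max_fsm_pages and max_fsm_relations; the values below are illustrative only, and the right numbers depend on the database:

```
# postgresql.conf -- free space map sizing (pre-8.4); illustrative values
max_fsm_pages = 2000000      # must cover all pages with reclaimable space
max_fsm_relations = 1000     # relations whose free space is tracked
```

A database-wide VACUUM VERBOSE run as superuser ends with a summary along the lines of "free space map contains N pages in M relations", plus a warning if the current limits are too low, which is the concrete data to size these against.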