Discussion: Weird issue with planner choosing seq scan
I have a table that currently has a little over 3 million records in production. In development, the same table has about 10 million records (we cleaned production a few weeks ago). One of my queries joins this table with another; in development, the particular condition uses an index scan on the "stamp" column (the time the record was inserted), which makes it run fast. In production, however (a different machine with similar specs/postgresql settings), the planner uses a seq scan on that table, making query performance abysmal (sometimes over 10 seconds in production, around 1 second in development). What can I do to tweak/troubleshoot this? I have another table with a similar structure that has the same issue. Thanks!!!

Here is the query:

    SELECT node, count(*) AS counts
    FROM u_counts c, res r
    WHERE c.res_id = r.id
      AND stamp > (current_timestamp - interval '1 day')
      AND r.rtype = 'u'
      AND r.location = 1
    GROUP BY node;

The tables have indexes on u_counts.res_id, u_counts.stamp, res.location, and res.rtype.

Here is the production explain analyze:

    HashAggregate  (cost=472824.67..472824.77 rows=8 width=6) (actual time=12482.856..12482.872 rows=9 loops=1)
      ->  Hash Join  (cost=16.71..471847.28 rows=195479 width=6) (actual time=1217.532..10618.930 rows=1035998 loops=1)
            Hash Cond: (c.res_id = r.id)
            ->  Seq Scan on u_counts c  (cost=0.00..466319.96 rows=948218 width=4) (actual time=1217.183..7343.507 rows=1035998 loops=1)
                  Filter: (stamp > (now() - '1 day'::interval))
            ->  Hash  (cost=15.88..15.88 rows=67 width=10) (actual time=0.299..0.299 rows=60 loops=1)
                  ->  Seq Scan on res r  (cost=0.00..15.88 rows=67 width=10) (actual time=0.027..0.195 rows=60 loops=1)
                        Filter: (((rtype)::text = 'u'::text) AND (location = 1))
    Total runtime: 12482.961 ms

Here is the development explain analyze:

    HashAggregate  (cost=72.91..73.02 rows=9 width=6) (actual time=3108.793..3108.807 rows=9 loops=1)
      ->  Hash Join  (cost=10.42..71.27 rows=327 width=6) (actual time=0.608..2446.714 rows=392173 loops=1)
            Hash Cond: (c.res_id = r.id)
            ->  Index Scan using u_counts_i2 on u_counts c  (cost=0.00..53.53 rows=1082 width=4) (actual time=0.277..1224.582 rows=392173 loops=1)
                  Index Cond: (stamp > (now() - '1 day'::interval))
            ->  Hash  (cost=9.53..9.53 rows=71 width=10) (actual time=0.310..0.310 rows=78 loops=1)
                  ->  Seq Scan on res r  (cost=0.00..9.53 rows=71 width=10) (actual time=0.010..0.189 rows=78 loops=1)
                        Filter: (((rtype)::text = 'u'::text) AND (location = 1))
    Total runtime: 3108.891 ms
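As an aside, one quick way to see what the index plan would actually cost on production (a hedged sketch, using the table and column names above; not something suggested in the thread itself) is to disable seq scans for a single session and re-run the EXPLAIN:

```sql
-- Sketch: force the planner away from the seq scan in this session only,
-- then compare estimated vs. actual costs against the seq-scan plan.
BEGIN;
SET LOCAL enable_seqscan = off;   -- reverts automatically at transaction end
EXPLAIN ANALYZE
SELECT node, count(*) AS counts
FROM u_counts c, res r
WHERE c.res_id = r.id
  AND stamp > (current_timestamp - interval '1 day')
  AND r.rtype = 'u'
  AND r.location = 1
GROUP BY node;
ROLLBACK;
```

If the forced index plan is genuinely slower, the planner's choice of a seq scan was correct and the problem lies elsewhere (statistics, bloat, or cost parameters).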
Sean Leach <sleach@wiggum.com> writes:
> I have a table, that in production, currently has a little over 3
> million records in production. In development, the same table has
> about 10 million records (we have cleaned production a few weeks
> ago).

You mean the other way around, to judge by the rowcounts from EXPLAIN.

> -> Index Scan using u_counts_i2 on u_counts c
> (cost=0.00..53.53 rows=1082 width=4) (actual time=0.277..1224.582
> rows=392173 loops=1)

I kinda think the devel system wouldn't be using an indexscan either if it had up-to-date ANALYZE statistics. But even with the 1082-row estimate that seems a remarkably low cost estimate. Have you been playing games with random_page_cost? Maybe you forgot to duplicate the devel system's cost parameters onto the production system?

			regards, tom lane
Nope, seems like that would make sense, but dev is 10 million, prod is 3 million. Also including random_page_cost below. Thanks for any help.

Here is dev:

    db=> analyze u_counts;
    ANALYZE
    Time: 15775.161 ms
    db=> select count(1) from u_counts;
      count
    ----------
     10972078
    (1 row)

    db=> show random_page_cost;
     random_page_cost
    ------------------
     4
    (1 row)

    Time: 0.543 ms
    db=> explain analyze SELECT node,count(*) AS counts FROM u_counts c,res r WHERE c.res_id=r.id AND stamp > (current_timestamp - interval '1 day') AND r.rtype='udns' AND r.location=1 GROUP BY node;

    HashAggregate  (cost=12906.12..12906.24 rows=9 width=6) (actual time=3135.831..3135.845 rows=9 loops=1)
      ->  Hash Join  (cost=10.42..12538.88 rows=73449 width=6) (actual time=0.746..2475.632 rows=391380 loops=1)
            Hash Cond: (c.res_id = r.id)
            ->  Index Scan using u_counts_i2 on db c  (cost=0.00..10882.33 rows=243105 width=4) (actual time=0.287..1269.651 rows=391380 loops=1)
                  Index Cond: (stamp > (now() - '1 day'::interval))
            ->  Hash  (cost=9.53..9.53 rows=71 width=10) (actual time=0.430..0.430 rows=78 loops=1)
                  ->  Seq Scan on res r  (cost=0.00..9.53 rows=71 width=10) (actual time=0.021..0.203 rows=78 loops=1)
                        Filter: (((rtype)::text = 'udns'::text) AND (location = 1))
    Total runtime: 3136.000 ms

Now - here is prod:

    db=> show random_page_cost;
     random_page_cost
    ------------------
     4
    (1 row)

    Time: 0.434 ms
    db=> analyze u_counts;
    ANALYZE
    Time: 179.928 ms
    db=> select count(1) from u_counts;
      count
    ---------
     3292215
    (1 row)

    db=> explain analyze SELECT node,count(*) AS counts FROM u_counts c,res r WHERE c.res_id=r.id AND stamp > (current_timestamp - interval '1 day') AND r.rtype='udns' AND r.location=1 GROUP BY node;

    HashAggregate  (cost=452333.49..452333.59 rows=8 width=6) (actual time=13200.887..13200.902 rows=9 loops=1)
      ->  Hash Join  (cost=16.71..451192.74 rows=228149 width=6) (actual time=1430.458..11274.073 rows=1036015 loops=1)
            Hash Cond: (c.res_id = r.id)
            ->  Seq Scan on u_counts c  (cost=0.00..444744.45 rows=1106691 width=4) (actual time=1429.996..7893.178 rows=1036015 loops=1)
                  Filter: (stamp > (now() - '1 day'::interval))
            ->  Hash  (cost=15.88..15.88 rows=67 width=10) (actual time=0.363..0.363 rows=60 loops=1)
                  ->  Seq Scan on res r  (cost=0.00..15.88 rows=67 width=10) (actual time=0.046..0.258 rows=60 loops=1)
                        Filter: (((rtype)::text = 'udns'::text) AND (location = 1))
    Total runtime: 13201.046 ms
    (9 rows)

    Time: 13204.686 ms

On Feb 24, 2008, at 9:50 AM, Tom Lane wrote:
> Sean Leach <sleach@wiggum.com> writes:
>> I have a table, that in production, currently has a little over 3
>> million records in production. In development, the same table has
>> about 10 million records (we have cleaned production a few weeks
>> ago).
>
> You mean the other way around, to judge by the rowcounts from EXPLAIN.
>
>> -> Index Scan using u_counts_i2 on u_counts c
>> (cost=0.00..53.53 rows=1082 width=4) (actual time=0.277..1224.582
>> rows=392173 loops=1)
>
> I kinda think the devel system wouldn't be using an indexscan either
> if it had up-to-date ANALYZE statistics. But even with the 1082 row
> estimate that seems a remarkably low cost estimate. Have you been
> playing games with random_page_cost? Maybe you forgot to duplicate the
> devel system's cost parameters onto the production system?
>
> regards, tom lane
Sean Leach <sleach@wiggum.com> writes:
> Now - here is prod:

> db=> select count(1) from u_counts;
>   count
> ---------
>  3292215
> (1 row)

> -> Seq Scan on u_counts c (cost=0.00..444744.45
> rows=1106691 width=4) (actual time=1429.996..7893.178 rows=1036015
> loops=1)
> Filter: (stamp > (now() - '1 day'::interval))

Given that this scan actually is selecting about a third of the table, I'm not sure that the planner is doing the wrong thing. It's hard to see how an indexscan would be an improvement.

[ thinks for a bit... ] Actually, the problem might be the 3M executions of now() and interval subtraction that you get in the seqscan case. What results do you get if you write it with a sub-select like this:

    explain analyze SELECT node,count(*) AS counts FROM u_counts c,res r
    WHERE c.res_id=r.id AND stamp > (SELECT current_timestamp - interval '1 day')
    AND r.rtype='udns' AND r.location=1 GROUP BY node;

			regards, tom lane
On Feb 24, 2008, at 11:10 AM, Tom Lane wrote:
> Sean Leach <sleach@wiggum.com> writes:
>> Now - here is prod:
>
>> db=> select count(1) from u_counts;
>> count
>> ---------
>> 3292215
>> (1 row)
>
>> -> Seq Scan on u_counts c (cost=0.00..444744.45
>> rows=1106691 width=4) (actual time=1429.996..7893.178 rows=1036015
>> loops=1)
>> Filter: (stamp > (now() - '1 day'::interval))
>
> Given that this scan actually is selecting about a third of the table,
> I'm not sure that the planner is doing the wrong thing. It's hard to
> see how an indexscan would be an improvement.
>
> [ thinks for a bit... ] Actually, the problem might be the 3M
> executions of now() and interval subtraction that you get in the
> seqscan case. What results do you get if you write it with a
> sub-select like this:
>
> explain analyze SELECT node,count(*) AS counts FROM u_counts
> c,res r WHERE c.res_id=r.id AND stamp > (SELECT current_timestamp -
> interval '1 day') AND r.rtype='udns' AND r.location=1 GROUP BY node;

Unfortunately, the same: dev uses an index scan, prod uses a seq scan, and prod takes about 4x longer to do the query. Any other thoughts on the best way to proceed? Thanks again Tom.

> regards, tom lane
Tom Lane wrote:
> Sean Leach <sleach@wiggum.com> writes:
> > Now - here is prod:
>
> > db=> select count(1) from u_counts;
> > count
> > ---------
> > 3292215
> > (1 row)
>
> > -> Seq Scan on u_counts c (cost=0.00..444744.45
> > rows=1106691 width=4) (actual time=1429.996..7893.178 rows=1036015
> > loops=1)
> > Filter: (stamp > (now() - '1 day'::interval))
>
> Given that this scan actually is selecting about a third of the table,
> I'm not sure that the planner is doing the wrong thing. It's hard to
> see how an indexscan would be an improvement.

If you always get around a third of the rows in your table written in the last day, you've got to be deleting about a third of the rows in your table every day too. You might have a huge number of dead rows in your table, slowing down the sequential scan. (Likewise updating a third of the rows, changing an indexed field.)

What do you get from:

    VACUUM VERBOSE u_counts;

Regards,
Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.

__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________
On Feb 24, 2008, at 1:18 PM, Stephen Denne wrote:
> If you always get around a third of the rows in your table written
> in the last day, you've got to be deleting about a third of the rows
> in your table every day too. You might have a huge number of dead
> rows in your table, slowing down the sequential scan.
> (Likewise updating a third of the rows, changing an indexed field.)
>
> What do you get from:
> VACUUM VERBOSE u_counts;

This actually makes sense, as we aggregate the production rows (but not development); here is the output of VACUUM VERBOSE. We have the autovacuum daemon on, but after we do our aggregation (we aggregate rows down to a less granular time scale, i.e. similar to what rrdtool does), we should probably do a 'vacuum full analyze' after each aggregation run moving forward, right? I need to do one now, it appears, but I am assuming it will take a _long_ time... I might need to schedule some downtime if it will. Even without a full vacuum, the query seems to have come down from 20-30s to 5s.

    db=> VACUUM VERBOSE u_counts;
    INFO:  vacuuming "public.u_counts"
    INFO:  index "u_counts_pkey" now contains 5569556 row versions in 73992 pages
    DETAIL:  0 index row versions were removed.
    57922 index pages have been deleted, 57922 are currently reusable.
    CPU 0.59s/0.09u sec elapsed 3.73 sec.
    INFO:  index "u_counts_i1" now contains 5569556 row versions in 76820 pages
    DETAIL:  0 index row versions were removed.
    54860 index pages have been deleted, 54860 are currently reusable.
    CPU 1.04s/0.16u sec elapsed 20.10 sec.
    INFO:  index "u_counts_i2" now contains 5569556 row versions in 77489 pages
    DETAIL:  0 index row versions were removed.
    53708 index pages have been deleted, 53708 are currently reusable.
    CPU 0.70s/0.10u sec elapsed 5.41 sec.
    INFO:  index "u_counts_i3" now contains 5569556 row versions in 76900 pages
    DETAIL:  0 index row versions were removed.
    55564 index pages have been deleted, 55564 are currently reusable.
    CPU 0.94s/0.13u sec elapsed 20.34 sec.
    INFO:  "u_counts": found 0 removable, 5569556 nonremovable row versions in 382344 pages
    DETAIL:  2085075 dead row versions cannot be removed yet.
    There were 15567992 unused item pointers.
    281727 pages contain useful free space.
    0 pages are entirely empty.
    CPU 5.24s/1.77u sec elapsed 53.69 sec.
    WARNING:  relation "public.u_counts" contains more than "max_fsm_pages" pages with useful free space
    HINT:  Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".
    VACUUM
    Time: 53758.329 ms
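As a side note, the dead-row picture can also be watched between vacuums without running VACUUM VERBOSE; a sketch using the statistics views available in 8.3 (assuming the stats collector's row-level counters are enabled, which is the default):

```sql
-- Sketch: live vs. dead tuple counters and the last autovacuum time
-- for the table under discussion.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'u_counts';
```

A steadily climbing n_dead_tup that autovacuum never brings down is the same symptom the VACUUM VERBOSE output above shows.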
The fact that your indexes are bloated but your table is not makes me wonder if you're not running a really old version of pgsql that had problems with monotonically increasing indexes bloating over time and requiring reindexing. That problem has been (for the most part) solved by some hacking Tom Lane did some time back. What version pgsql is this? If it's pre 8.0 it might be worth looking into migrating for performance and maintenance reasons.
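If bloated indexes did turn out to be the culprit, rebuilding them is straightforward; a hedged sketch (note that REINDEX takes locks that block writes to the table, and reads that would use the index being rebuilt, so it may need a maintenance window):

```sql
-- Sketch: rebuild a single bloated index, or all indexes on the table.
REINDEX INDEX u_counts_i2;
REINDEX TABLE u_counts;
```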
On Feb 24, 2008, at 4:03 PM, Scott Marlowe wrote: > The fact that your indexes are bloated but your table is not makes me > wonder if you're not running a really old version of pgsql that had > problems with monotonically increasing indexes bloating over time and > requiring reindexing. > > That problem has been (for the most part) solved by some hacking Tom > Lane did some time back. > > What version pgsql is this? If it's pre 8.0 it might be worth looking > into migrating for performance and maintenance reasons. It's the latest 8.3.0 release :(
On Sun, Feb 24, 2008 at 6:05 PM, Sean Leach <sleach@wiggum.com> wrote: > On Feb 24, 2008, at 4:03 PM, Scott Marlowe wrote: > > > > > What version pgsql is this? If it's pre 8.0 it might be worth looking > > into migrating for performance and maintenance reasons. > > It's the latest 8.3.0 release :( Urg. Then I wonder how your indexes are bloating but your table is not... you got autovac running? No weird lock issues? It's a side issue right now since the table is showing as non-bloated (unless you've got a long running transaction and that number is WAY off from your vacuum)
On Feb 24, 2008, at 4:27 PM, Scott Marlowe wrote:
> On Sun, Feb 24, 2008 at 6:05 PM, Sean Leach <sleach@wiggum.com> wrote:
>
>> On Feb 24, 2008, at 4:03 PM, Scott Marlowe wrote:
>>
>>> What version pgsql is this? If it's pre 8.0 it might be worth
>>> looking
>>> into migrating for performance and maintenance reasons.
>>
>> It's the latest 8.3.0 release :(
>
> Urg. Then I wonder how your indexes are bloating but your table is
> not... you got autovac running? No weird lock issues? It's a side
> issue right now since the table is showing as non-bloated (unless
> you've got a long running transaction and that number is WAY off from
> your vacuum)

Autovac is running, but probably not tuned. I am looking at upping my max_fsm_pages setting as vacuum suggests, but I'm not sure which value to use (the posts on the web all refer to what looks like an old vacuum output format). Is this the line to look at?

    INFO: "u_counts": found 0 removable, 6214708 nonremovable row versions in 382344 pages
    DETAIL: 2085075 dead row versions cannot be removed yet.

I.e. I need 382344 max_fsm_pages? No weird lock issues that we have seen. So should I do a vacuum full and then hope this doesn't happen again? Or should I run a VACUUM FULL after each aggregation run? Thanks!

Sean
On Sun, 24 Feb 2008, Tom Lane wrote:
> Sean Leach <sleach@wiggum.com> writes:
>> I have a table, that in production, currently has a little over 3
>> million records in production. In development, the same table has
>> about 10 million records (we have cleaned production a few weeks
>> ago).
>
> You mean the other way around, to judge by the rowcounts from EXPLAIN.
>
>> -> Index Scan using u_counts_i2 on u_counts c
>> (cost=0.00..53.53 rows=1082 width=4) (actual time=0.277..1224.582
>> rows=392173 loops=1)
>
> I kinda think the devel system wouldn't be using an indexscan either
> if it had up-to-date ANALYZE statistics. But even with the 1082 row
> estimate that seems a remarkably low cost estimate.

Seems pretty obvious to me. The table is obviously going to be well ordered by the timestamp, if that's the time that the entries are inserted into the table. So the index is going to have a very good correlation with the order of the table, which is why the estimated cost for the index scan is so low.

The production table will be more active than the development table, so the entries in it will be more recent. The entries that were cleaned out a while ago are all irrelevant, because they will be old ones, and we are specifically searching for new entries. Because the production table is more active, even though it is smaller, the results of the search will be bigger (as seen in the explain analyse results), pushing it over the limit and making a sequential scan more attractive.

Matthew

--
Failure is not an option. It comes bundled with your Microsoft product.
 -- Ferenc Mantfeld
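The correlation point above can be checked directly; a sketch querying the planner's statistics view (assuming ANALYZE has been run on the table, so pg_stats is populated):

```sql
-- Sketch: correlation near +1 or -1 means the table's physical order
-- closely tracks this column, making index scans on it look cheap.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'u_counts' AND attname = 'stamp';
```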
Sean Leach wrote:
> On Feb 24, 2008, at 4:27 PM, Scott Marlowe wrote:
> > Urg. Then I wonder how your indexes are bloating but your table is
> > not... you got autovac running? No weird lock issues? It's a side
> > issue right now since the table is showing as non-bloated (unless
> > you've got a long running transaction and that number is WAY off from
> > your vacuum)
>
> Autovac is running, but probably not tuned. I am looking at my
> max_fsm_pages setting to up as vacuum says, but not sure which value
> to use (all the posts on the web refer to what looks like an old
> vacuum output format), is this the line to look at?
>
> INFO: "u_counts": found 0 removable, 6214708 nonremovable row
> versions in 382344 pages
> DETAIL: 2085075 dead row versions cannot be removed yet.
>
> I.e. I need 382344 max_fsm_pages? No weird lock issues that we have
> seen.

I think the hint and warning are referring to this line:
> 281727 pages contain useful free space.

But you're likely to have other relations in your database that have useful free space too. What this warning is saying is that at least some of the useful free space in that table will not be re-used for new rows or row versions, because it is impossible for the free space map to have references to all of the pages with usable space, since it is too small to hold that much information.

> So should I do a vacuum full and then hope this doesn't happen again?
> Or should I run a VACUUM FULL after each aggregation run?

If your usage pattern results in generating all of that unused space in one transaction, and no further inserts or updates to that table till the next time you run the same process, then my guess is that you probably should run a vacuum full on that table after each aggregation run. In that case you wouldn't have to increase max_fsm_pages solely to keep track of the large amount of unused space in that table, since you're cleaning it up as soon as you're generating it.
You earlier had 5.5 million row versions, 2 million of them dead but not yet removable, and you said (even earlier) that the table had 3.3 million rows in it.

You now say you've got 6.2 million row versions (with the same 2M dead). So it looks like you're creating new row versions at quite a pace, in which case increasing max_fsm_pages, and not worrying about doing a vacuum full _every_ time, is probably a good idea.

Have you checked Scott Marlowe's note:
> > unless you've got a long running transaction

How come those 2 million dead rows are not removable yet? My guess (based on a quick search of the mailing lists) would be that they were generated from your aggregation run, and that a long running transaction started before your aggregation run committed (possibly even before it started), and that transaction is still alive.

Alternatively, it may be a different 2 million dead row versions now than earlier, and may simply be a side effect of your particular usage, and nothing to worry about. (Though it is exactly the same number of rows, which strongly hints at being exactly the same rows.)

Regards,
Stephen Denne.
On Feb 25, 2008, at 1:19 PM, Stephen Denne wrote:
>
>> So should I do a vacuum full and then hope this doesn't
>> happen again?
>> Or should I run a VACUUM FULL after each aggregation run?
>
> If your usage pattern results in generating all of that unused space
> in one transaction, and no further inserts or updates to that table
> till next time you run the same process, then my guess is that you
> probably should run a vacuum full on that table after each
> aggregation run. In that case you wouldn't have to increase
> max_fsm_pages solely to keep track of large amount of unused space
> in that table, since you're cleaning it up as soon as you're
> generating it.
>
> You earlier had 5.5 million row versions, 2 million of them dead but
> not yet removable, and you said (even earlier) that the table had
> 3.3 million rows in it.
> You now say you've got 6.2 million row versions (with the same 2M
> dead). So it looks like you're creating new row versions at quite a
> pace, in which case increasing max_fsm_pages, and not worrying about
> doing a vacuum full _every_ time is probably a good idea.

So 281727 should be the minimum I bump it to, correct?

> Have you checked Scott Marlowe's note:
>
>>> unless you've got a long running transaction
>
> How come those 2 million dead rows are not removable yet? My guess
> (based on a quick search of the mailing lists) would be that they
> were generated from your aggregation run, and that a long running
> transaction started before your aggregation run committed (possibly
> even before it started), and that transaction is still alive.
>
> Alternatively, it may be a different 2 million dead row versions now
> than earlier, and may simply be a side effect of your particular
> usage, and nothing to worry about. (Though it is exactly the same
> number of rows, which strongly hints at being exactly the same rows.)

Great detective work, you are correct. We have a daemon that runs and is constantly adding new data to that table; we then aggregate it daily (I said weekly before, I was incorrect), which deletes several rows as it updates a bunch of others. So it sounds like upping max_fsm_pages is the best option.

Thanks again everyone!
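For reference, a sketch of the relevant postgresql.conf knobs in 8.3 (the values below are purely illustrative, not recommendations for this system, and changing max_fsm_pages requires a server restart):

```
# Free space map sizing (PostgreSQL 8.3; removed in 8.4's on-disk FSM).
# max_fsm_pages must cover every page with reusable space, cluster-wide,
# so it should exceed the "pages contain useful free space" totals
# reported by VACUUM VERBOSE across all tables.
max_fsm_pages = 500000        # illustrative value only
max_fsm_relations = 1000      # one slot per table/index tracked
```

Running `VACUUM VERBOSE;` database-wide prints, at the end, the total number of page slots needed, which is the number to size against rather than any single table's figure.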
Sean Leach wrote:
> On Feb 25, 2008, at 1:19 PM, Stephen Denne wrote:
> >
> >> So should I do a vacuum full and then hope this doesn't
> >> happen again?
> >> Or should I run a VACUUM FULL after each aggregation run?
> >
> > If your usage pattern results in generating all of that unused space
> > in one transaction, and no further inserts or updates to that table
> > till next time you run the same process, then my guess is that you
> > probably should run a vacuum full on that table after each
> > aggregation run. In that case you wouldn't have to increase
> > max_fsm_pages solely to keep track of large amount of unused space
> > in that table, since you're cleaning it up as soon as you're
> > generating it.
> >
> > You earlier had 5.5 million row versions, 2 million of them dead but
> > not yet removable, and you said (even earlier) that the table had
> > 3.3 million rows in it.
> > You now say you've got 6.2 million row versions (with the same 2M
> > dead). So it looks like you're creating new row versions at quite a
> > pace, in which case increasing max_fsm_pages, and not worrying about
> > doing a vacuum full _every_ time is probably a good idea.
>
> So 281727 should be the minimum I bump it to correct?

Please know that I'm very new at advising PostgreSQL users how they should tune their system...

My understanding of your vacuum verbose output was that it was pointing out that max_fsm_pages was currently smaller than 281727, so therefore there was no way it could contain mappings to all the reusable space. However I don't think it is hinting at, nor recommending, a value that you should be using.

If you do nothing, then this number of pages with reusable space will probably continue to grow; therefore, it probably has been growing.

So, for example, if your max_fsm_pages is currently only 20000, then perhaps 20000 of the 281727 pages with reusable space are in the free space map. The remaining 260000 pages _may_ have been generated through 20 different processes, each of which created 13000 more pages with reusable space than the map could reference. If that was the case, then a max_fsm_pages of 33000 might be large enough.

Do you see what I'm getting at?

I think that you should do a vacuum full of that table once, then monitor the number of pages in it with reusable space for a while (over a few iterations of your regular processes). That should give you information about how much larger your max_fsm_pages should be than it currently is.

Regards,
Stephen Denne.
Sean Leach wrote:
> On Feb 25, 2008, at 1:19 PM, Stephen Denne wrote:
> >
> > Have you checked Scott Marlowe's note:
> >
> >>> unless you've got a long running transaction
> >
> > How come those 2 million dead rows are not removable yet? My guess
> > (based on a quick search of the mailing lists) would be that they
> > were generated from your aggregation run, and that a long running
> > transaction started before your aggregation run committed (possibly
> > even before it started), and that transaction is still alive.
> >
> > Alternatively, it may be a different 2 million dead row versions now
> > than earlier, and may simply be a side effect of your particular
> > usage, and nothing to worry about. (Though it is exactly the same
> > number of rows, which strongly hints at being exactly the same rows.)
>
> Great detective work, you are correct. We have a daemon that runs and
> is constantly adding new data to that table, then we aggregated it
> daily (I said weekly before, I was incorrect) - which deletes several
> rows as it updates a bunch of others. So it sounds like upping
> max_fsm_pages is the best option.

but... do you have a long running transaction? Are you happy having 30% to 40% of your table unusable (needlessly?) and slowing down your sequential scans?

Regards,
Stephen Denne.
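A sketch for spotting such a long-running transaction on 8.3, where pg_stat_activity gained an xact_start column (the one-hour threshold below is arbitrary; these column names were later renamed in 9.2):

```sql
-- Sketch: sessions whose current transaction has been open long enough
-- to hold back VACUUM from reclaiming dead rows.
SELECT procpid, usename, xact_start, current_query
FROM pg_stat_activity
WHERE xact_start < now() - interval '1 hour'
ORDER BY xact_start;
```

Killing or committing whatever shows up here would let the next plain VACUUM actually remove those 2 million dead row versions.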
On Feb 25, 2008, at 2:59 PM, Stephen Denne wrote:
>
> Please know that I'm very new at advising PostgreSQL users how they
> should tune their system...

I'd never have known it if you hadn't said anything.

> My understanding of your vacuum verbose output was that it was
> pointing out that max_fsm_pages was currently smaller than 281727,
> so therefore there was no way it could contain mappings to all the
> reusable space. However I don't think it is hinting at, nor
> recommending a value that you should be using.
>
> If you do nothing, then this number of pages with reusable space
> will probably continue to grow, therefore, it probably has been
> growing.
>
> So, for example, if your max_fsm_pages is currently only 20000, then
> perhaps 20000 of the 281727 pages with reusable space are in the
> free space map. The remaining 260000 pages _may_ have been generated
> through 20 different processes each of which created 13000 more
> pages with reusable space than the map could reference. If that was
> the case, then a max_fsm_pages of 33000 might be large enough.
>
> Do you see what I'm getting at?
> I think that you should do a vacuum full of that table once, then
> monitor the number of pages in it with reusable space for a while
> (over a few iterations of your regular processes). That should give
> you information about how much larger your max_fsm_pages should be
> than it currently is.

This sounds sane to me, will do. Thanks again!