Discussion: obsessive-compulsive vacuum behavior
I've got an 8.4.2 database where it appears that vacuum keeps redoing the same table and indexes, never thinking it's finished:

auditor=# VACUUM analyze VERBOSE repair_queue ;
INFO: vacuuming "public.repair_queue"
INFO: scanned index "repair_queue_pkey" to remove 2795932 row versions
DETAIL: CPU 14.98s/15.29u sec elapsed 312.50 sec.
INFO: scanned index "repair_queue_auditor" to remove 2795932 row versions
DETAIL: CPU 0.74s/0.50u sec elapsed 10.49 sec.
INFO: scanned index "repair_queue_sort" to remove 2795932 row versions
DETAIL: CPU 2.99s/1.58u sec elapsed 45.14 sec.
INFO: scanned index "repair_queue_sort3" to remove 2795932 row versions
DETAIL: CPU 0.89s/0.48u sec elapsed 10.99 sec.
INFO: "repair_queue": removed 2795932 row versions in 43199 pages
DETAIL: CPU 1.04s/0.39u sec elapsed 17.93 sec.
INFO: scanned index "repair_queue_pkey" to remove 2795938 row versions
DETAIL: CPU 14.71s/15.06u sec elapsed 362.37 sec.
INFO: scanned index "repair_queue_auditor" to remove 2795938 row versions
DETAIL: CPU 0.62s/0.45u sec elapsed 14.36 sec.
INFO: scanned index "repair_queue_sort" to remove 2795938 row versions
DETAIL: CPU 2.97s/1.65u sec elapsed 56.94 sec.
INFO: scanned index "repair_queue_sort3" to remove 2795938 row versions
DETAIL: CPU 0.82s/0.44u sec elapsed 10.54 sec.
INFO: "repair_queue": removed 2795938 row versions in 41055 pages
DETAIL: CPU 0.75s/0.34u sec elapsed 7.59 sec.
INFO: scanned index "repair_queue_pkey" to remove 2795959 row versions
DETAIL: CPU 14.20s/14.56u sec elapsed 539.17 sec.
INFO: scanned index "repair_queue_auditor" to remove 2795959 row versions
DETAIL: CPU 0.75s/0.48u sec elapsed 13.76 sec.
INFO: scanned index "repair_queue_sort" to remove 2795959 row versions
DETAIL: CPU 3.07s/1.65u sec elapsed 44.29 sec.
INFO: scanned index "repair_queue_sort3" to remove 2795959 row versions
DETAIL: CPU 0.78s/0.44u sec elapsed 12.52 sec.
INFO: "repair_queue": removed 2795959 row versions in 41004 pages
DETAIL: CPU 0.88s/0.42u sec elapsed 12.49 sec.
...and so on. It's been running for an hour or so now, when it appears it shouldn't take 10 minutes. This seems pretty weird to me.... has anybody else seen this behavior? I'm not even sure what details I could report which would help figure out what's going on.
On Sun, Mar 7, 2010 at 11:58 PM, Ben Chobot <bench@silentmedia.com> wrote:
> I've got an 8.4.2 database where it appears that vacuum keeps redoing the same table and indexes, never thinking it's finished:
>
> [VACUUM VERBOSE output quoted above snipped]
>
> ...and so on. It's been running for an hour or so now, when it appears it shouldn't take 10 minutes. This seems pretty weird to me.... has anybody else seen this behavior? I'm not even sure what details I could report which would help figure out what's going on.

Those are all different relations, and it's reclaiming a good number of rows and pages. 41004 pages is ~320 Megs. Even if the rows are small it's gonna be 100 Megs or so per index reclaimed. Seems like vacuum is doing its job. Is it running often enough to prevent bloating?
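As a sanity check on the arithmetic above: 41004 heap pages at PostgreSQL's default 8 kB block size do come out to roughly 320 MB. Assuming a default-configured server, this can be verified from psql (a sketch, not from the thread itself):

```sql
-- Confirm the server's page size (8192 bytes unless compiled otherwise)
SHOW block_size;

-- 41004 pages * 8192 bytes/page = 335904768 bytes, i.e. about 320 MB
SELECT pg_size_pretty(41004::bigint * 8192);
```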
On Mon, Mar 8, 2010 at 12:26 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Sun, Mar 7, 2010 at 11:58 PM, Ben Chobot <bench@silentmedia.com> wrote:
>> [VACUUM VERBOSE output quoted above snipped]
>
> Those are all different relations, and it's reclaiming a good number
> of rows and pages. 41004 pages is ~320 Megs. Even if the rows are
> small it's gonna be 100 Megs or so per index reclaimed. Seems like
> vacuum is doing its job. Is it running often enough to prevent
> bloating?

OK, they're not ALL different relations, but only one seems to repeat much, and that's the _sort one.
Ben Chobot <bench@silentmedia.com> writes:
> ...and so on. It's been running for an hour or so now, when it appears it shouldn't take 10 minutes. This seems pretty weird to me.... has anybody else seen this behavior? I'm not even sure what details I could report which would help figure out what's going on.

You're apparently trying to vacuum a really large table with a whole lot of dead rows using a tiny maintenance_work_mem setting (16MB, looks like). Every time the list of dead tuples fills up, it has to perform a pass over the table and indexes to get rid of those tuples. Then it goes back to scanning for more dead tuples.

If it's still going, I'd suggest cancelling it and starting again with a more realistic maintenance_work_mem, maybe something in the several hundred MB range, depending on how much memory is free in your server.

After it's done you probably ought to look into your autovacuum settings. You shouldn't have gotten into a state with so many dead tuples in the first place, I would think, if autovac were functioning properly.

			regards, tom lane
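A minimal sketch of the suggestion above. The 512MB value is illustrative, not from the thread; it should be sized to the server's free RAM. Since maintenance_work_mem can be changed per session, no restart is needed:

```sql
-- Enlarge the dead-tuple workspace for this session only, so a single
-- vacuum pass can remember far more dead tuples before it has to stop
-- and clean the indexes.
SET maintenance_work_mem = '512MB';

VACUUM ANALYZE VERBOSE repair_queue;

-- Return to the server default afterwards
RESET maintenance_work_mem;
```

With 16MB, roughly 2.8 million dead-tuple pointers fit per pass (which matches the repeating ~2795932-row index scans in the log); a larger setting lets vacuum clear the whole table in one or two passes over the indexes instead of many.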
On Mar 8, 2010, at 7:16 AM, Tom Lane wrote:
> After it's done you probably ought to look into your autovacuum
> settings. You shouldn't have gotten into a state with so many dead
> tuples in the first place, I would think, if autovac were functioning
> properly.

Yeah, it turns out our autovacuums kept getting canceled because this table gets a lot of churn. Now that we've turned off the cost delay, things are much happier. Thanks!
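For a high-churn table like this, the cost delay can be disabled for just that table via per-table storage parameters (available in 8.4). A sketch of what "turning off the cost delay" might look like scoped to one table; the scale-factor value is illustrative:

```sql
-- Disable the cost-based sleep for autovacuum on this table only, so
-- vacuum runs finish quickly instead of lingering long enough to be
-- canceled by conflicting lock requests.
ALTER TABLE repair_queue SET (autovacuum_vacuum_cost_delay = 0);

-- Optionally trigger autovacuum sooner on this table (default is 0.2,
-- i.e. 20% of rows dead; 0.05 is an illustrative, more aggressive value).
ALTER TABLE repair_queue SET (autovacuum_vacuum_scale_factor = 0.05);
```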