Re: [PERFORM] encouraging index-only scans
От | Andres Freund |
---|---|
Тема | Re: [PERFORM] encouraging index-only scans |
Дата | |
Msg-id | 20130919225929.GC11116@awork2.anarazel.de обсуждение исходный текст |
Ответ на | Re: [PERFORM] encouraging index-only scans (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: [PERFORM] encouraging index-only scans
|
Список | pgsql-hackers |
On 2013-09-19 14:39:43 -0400, Robert Haas wrote: > On Tue, Sep 17, 2013 at 7:10 PM, Andres Freund <andres@2ndquadrant.com> wrote: > > I generally think the current logic for triggering VACUUMs via > > autovacuum doesn't really make all that much sense in the days where we > > have the visibility map. > > Right now, whether or not to autovacuum is the rest of a two-pronged > test. The first prong is based on number of updates and deletes > relative to table size; that triggers a regular autovacuum. The > second prong is based on age(relfrozenxid) and triggers a > non-page-skipping vacuum (colloquially, an anti-wraparound vacuum). And I have some hopes we can get rid of that in 9.4 (that alone would be worth a bump to 10.0 ;)). I really like Heikki's patch, even if I am envious that I didn't have the idea :P. Although it needs quite a bit of work to be ready. > The typical case in which this doesn't work out well is when the table > has a lot of inserts but few or no updates and deletes. So I propose > that we change the first prong to count inserts as well as updates and > deletes when deciding whether it needs to vacuum the table. We > already use that calculation to decide whether to auto-analyze, so it > wouldn't be very novel. We know that the work of marking pages > all-visible will need to be done at some point, and doing it sooner > will result in doing it in smaller batches, which seems generally > good. Yes, that's a desperately needed change. The reason I suggested keeping track of the xids of unremovable tuples is that the current logic doesn't handle that at all. We just unconditionally set n_dead_tuples to zero after a vacuum even if not a single row could actually be cleaned out. Which has the effect that we will not start a vacuum until enough bloat (or after changing this, new inserts) has collected to start vacuum anew. Which then will do twice the work. Resetting n_dead_tuples to the actual remaining dead tuples wouldn't do much good either - we would just immediately trigger a new vacuum the next time we check, even if the xmin horizon is still the same. > However, I do have one concern: it might lead to excessive > index-vacuuming. Right now, we skip the index vac step only if there > ZERO dead tuples are found during the heap scan. Even one dead tuple > (or line pointer) will cause an index vac cycle, which may easily be > excessive. So I further propose that we introduce a threshold for > index-vac; so that we only do index vac cycle if the number of dead > tuples exceeds, say 0.1% of the table size. Yes, that's a pretty valid concern. But we can't really do it that easily. a) We can only remove dead line pointers when we know there's no index pointing to it anymore. Which we only know after the index has been removed. b) We cannot check the validity of an index pointer if there's no heap tuple for it. Sure, we could check whether we're pointing to a dead line pointer, but the random io costs of that are prohibitive. Now, we could just mark line pointers as dead and not mark that page as all-visible and pick it up again on the next vacuum cycle. But that would suck long-term. I think the only real solution here is to store removed tuples tids (i.e. items where we've marked as dead) somewhere. Whenever we've found sufficient tuples to-be-removed from indexes we do phase 2. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
В списке pgsql-hackers по дате отправления:
Предыдущее
От: "MauMau"Дата:
Сообщение: Re: UTF8 national character data type support WIP patch and list of open issues.