Re: vacuum, performance, and MVCC
From | Tom Lane |
---|---|
Subject | Re: vacuum, performance, and MVCC |
Date | |
Msg-id | 16709.1151082529@sss.pgh.pa.us |
In reply to | Re: vacuum, performance, and MVCC (Csaba Nagy <nagy@ecircle-ag.com>) |
Responses | Re: vacuum, performance, and MVCC ("Mark Woodward" <pgsql@mohawksoft.com>); Re: vacuum, performance, and MVCC (Bruce Momjian <bruce@momjian.us>); Re: vacuum, performance, and MVCC (Hannu Krosing <hannu@skype.net>) |
List | pgsql-hackers |
Csaba Nagy <nagy@ecircle-ag.com> writes:
>> Surprisingly its mostly WAL traffic, the heap/index pages themselves are
>> often not yet synced to disk by time of vacuum, so no additional traffic
>> there. If you had made 5 updates per page and then vacuum it, then you
>> make effectively 1 extra WAL write meaning 20% increase in WAL traffic.

> Is this also holding about read traffic ? I thought vacuum will make a
> full table scan... for big tables a full table scan is always badly
> influencing the performance of the box. If the full table scan would be
> avoided, then I wouldn't mind running vacuum in a loop...

If you're doing heavy updates of a big table then it's likely to end up
visiting most of the table anyway, no? There is talk of keeping a map
of dirty pages, but I think it'd be a win for infrequently-updated
tables, not ones that need constant vacuuming.

I think a lot of our problems in this area could be solved with fairly
straightforward tuning efforts on the existing autovacuum
infrastructure. In particular, someone should be looking into
recommendable default vacuum-cost-delay settings so that a background
vacuum doesn't affect performance too much. Another problem with the
current autovac infrastructure is that it doesn't respond very well to
the case where there are individual tables that need constant attention
as well as many that don't. If you have N databases then you can visit
a particular table at most once every N*autovacuum_naptime seconds, and
*every* table in the entire cluster gets reconsidered at that same rate.
I'm not sure if we need the ability to have multiple autovac daemons
running at the same time, but we definitely could use something with a
more flexible table-visiting pattern. Perhaps it would be enough to
look through the per-table stats for each database before selecting the
database to autovacuum in each cycle, instead of going by "least
recently autovacuumed".

Bottom line: there's still lots of low-hanging fruit. Why are people
feeling that we need to abandon or massively complicate our basic
architecture to make progress?

			regards, tom lane
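
The N*autovacuum_naptime arithmetic in the message, and the contrast between
"least recently autovacuumed" and stats-driven database selection, can be
sketched in Python. This is an illustration only, not PostgreSQL's autovacuum
code: the DatabaseStats fields, the function names, and the 60-second naptime
are all assumptions made for the example.

```python
from dataclasses import dataclass


def revisit_interval_seconds(num_databases: int, naptime_seconds: float) -> float:
    """Shortest gap between two autovacuum visits to the same table when one
    database is processed per naptime cycle: N * autovacuum_naptime."""
    return num_databases * naptime_seconds


@dataclass
class DatabaseStats:
    # Hypothetical per-database summary of per-table stats (not a real catalog).
    name: str
    last_autovacuum_cycle: int    # cycle number in which it was last visited
    max_dead_tuple_ratio: float   # worst dead/live tuple ratio among its tables


def pick_least_recently_vacuumed(dbs):
    """Selection rule described in the message: oldest visit wins."""
    return min(dbs, key=lambda d: d.last_autovacuum_cycle)


def pick_by_table_stats(dbs):
    """Suggested alternative: look at per-table stats first and choose the
    database whose neediest table is in the worst shape."""
    return max(dbs, key=lambda d: d.max_dead_tuple_ratio)


if __name__ == "__main__":
    # Assumed values: 10 databases, 60-second naptime.
    print(revisit_interval_seconds(10, 60), "seconds between visits")

    dbs = [
        DatabaseStats("hot_db", last_autovacuum_cycle=9, max_dead_tuple_ratio=0.80),
        DatabaseStats("quiet_db", last_autovacuum_cycle=1, max_dead_tuple_ratio=0.01),
    ]
    print("least recently vacuumed:", pick_least_recently_vacuumed(dbs).name)
    print("by table stats:", pick_by_table_stats(dbs).name)
```

With these made-up numbers, the heavily updated table in hot_db has to wait
for its database's turn under the first rule, while the second rule returns
to it right away.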