Re: vacuum, performance, and MVCC

Поиск
Список
Период
Сортировка
От A.M.
Тема Re: vacuum, performance, and MVCC
Дата
Msg-id 54624.216.41.12.254.1151072992.squirrel@webmail.webopticon.org
обсуждение исходный текст
Ответ на Re: vacuum, performance, and MVCC  (Martijn van Oosterhout <kleptog@svana.org>)
Список pgsql-hackers
On Fri, June 23, 2006 9:56 am, Martijn van Oosterhout wrote:
> On Fri, Jun 23, 2006 at 03:30:45PM +0200, Csaba Nagy wrote:
>
>>> It sounds like you have a "big" problem and you need a "big"
>>> solution.
>>
>> Well, Postgres does a decent job as it is. The problem is under peek
>> load, sometimes it gets bogged down and the usual things like vacuum will
>> not help immediately. I think a few more features like the dead space
>> map for quick vacuum and even something like the original post's
>> proposition would make postgres fly under heavy load too...
>
> I know there have a been a number of suggestions in the past to deal
> with this thing. Some I don't remember being mentioned in this thread are:
>
>
> - Once a tuple has been determined to be invisible to everyone,
> truncate it to just the header. This would probably work wonders for
> frequently updated wide tables. However, this required keeping track of
> the oldest active xact, I'm not sure how that works at the moment.
>
> - Have the bgwriter do cleanup work before writing out a block. It
> could probably do the truncation bit above, but totally removing old tuples
> requires cleaning out the indexes too, which AIUI is the hard part of
> vacuuming.
>
> One totally whacked out idea I just thought of: Instead of just
> truncating tuples when they're invisible, mark them "dying" and make the
> data section store an array of CTIDs pointing to the index tuples pointing
> to it. Lookups that find the tuple via an index could store the CTID of
> the index tuple before continuing. If the bgwriter sees it has a full set,
> it can efficiently remove the tuple straight away.
>
> There are ofcourse drawbacks to this approach, you'd probably need
> something like the half-dirty pages to avoid a large increase in write
> load. If it's even beneficial at all given concurrency issues.

A lot of these recommendations sound like garbage collection ideas found
in modern programming languages. Perhaps it would be worth considering
allowing sessions to keep track of which pages they alter and spawn a
separate process per connection to sweep up slowly along the way.

Also, it's nice that vacuum now has slow-down settings, but why isn't
there a option to autovacuum during periods of idleness and pause when
busy?

-M




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: xlog viewer proposal
Следующее
От: "Larry Rosenman"
Дата:
Сообщение: Re: Anyone still care about Cygwin? (was Re: [CORE] GPL Source and Copyright Questions)