autovacuum next steps

From Alvaro Herrera
Subject autovacuum next steps
Date
Msg-id 20070216210057.GI870@alvh.no-ip.org
Replies Re: autovacuum next steps  ("Matthew T. O'Connor" <matthew@zeut.net>)
Re: autovacuum next steps  (Gregory Stark <stark@enterprisedb.com>)
Re: autovacuum next steps  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: autovacuum next steps  (Ron Mayer <rm_pg@cheapcomplexdevices.com>)
Re: autovacuum next steps, take 2  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
After staring at my previous notes for autovac scheduling, it has become
clear that the basics of it are not really going to work as specified.
So here is a more realistic plan:

First, we introduce an autovacuum_max_workers parameter, to limit the
total number of workers that can be running at any time.  Use this
number to create extra PGPROC entries, etc, similar to the way we handle
the prepared xacts stuff.  The default should be low, say 3 or 4.

The launcher sends a worker into a database just like it does currently.
This worker determines what tables need vacuuming per the pg_autovacuum
settings and pgstat data.  If it's more than one table, it puts the
number of tables in shared memory and sends a signal to the launcher.

The launcher then starts
min(autovacuum_max_workers - currently running workers, tables to vacuum - 1)
more workers to process that database.  Maybe we could have a
max-workers parameter per-database in pg_database to use as a limit here
as well.
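
To make the arithmetic above concrete, here is a minimal sketch in Python
(not PostgreSQL code).  The function and argument names are hypothetical,
and clamping the result at zero is an assumption; the formula as stated
does not say what happens when no slots are free.

```python
def extra_workers_to_start(max_workers: int,
                           running_workers: int,
                           tables_to_vacuum: int) -> int:
    """Number of additional workers the launcher would start for a database.

    running_workers counts every worker currently running, including the
    initial one that reported tables_to_vacuum.
    """
    # Slots still available under autovacuum_max_workers.
    free_slots = max_workers - running_workers
    # The initial worker already takes one table, hence the "- 1".
    helpers_wanted = tables_to_vacuum - 1
    # Assumption: never return a negative count.
    return max(0, min(free_slots, helpers_wanted))
```

With autovacuum_max_workers = 3, one worker running and five tables to
vacuum, this gives min(3 - 1, 5 - 1) = 2 more workers.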

Each worker, including the initial one, starts vacuuming tables
according to pgstat data.  They recheck the pgstat data after finishing
each table, so that a table vacuumed by another worker is not processed
twice.  (Maybe problematic: a table with a high update rate may be
vacuumed more than once.  Maybe this is a feature, not a bug.)
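
The recheck-after-each-table loop could look roughly like the Python
sketch below.  It is only an illustration: needs_vacuum and vacuum_table
are hypothetical callbacks standing in for the pgstat lookup and the
actual vacuum, and picking the first pending table is an arbitrary choice.

```python
def run_worker(needs_vacuum, vacuum_table):
    """Vacuum tables one at a time, rechecking the pending list each time.

    needs_vacuum() returns the tables that need vacuuming right now
    (a stand-in for rereading pgstat data); vacuum_table(name) does the work.
    """
    processed = []
    while True:
        # Recheck after every table, so tables finished by other workers
        # in the meantime have already dropped off the list.
        pending = needs_vacuum()
        if not pending:
            break
        table = pending[0]
        vacuum_table(table)
        processed.append(table)
    return processed
```

Note that a table that keeps reappearing in the pending list would be
vacuumed again by this loop, which is the high-update-rate case mentioned
above.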


Once autovacuum_naptime has passed, if the workers have not finished
yet, the launcher wants to vacuum another database.  At this point, the
launcher wants some of the workers processing the first database to exit
early as soon as they finish one table, so that they can help vacuum
the other database.  It can do this by setting a flag in shmem that the
workers can check when finished with a table; if the flag is set, they
exit instead of continuing with another table.  The launcher then starts
a worker in the second database.  The launcher repeats this until the
number of workers is even among both databases.  This can continue until
there is only one worker per database; so at most autovacuum_max_workers
databases can be under automatic vacuuming at any time, one worker each.
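
One way to picture the rebalancing is the Python sketch below.  It is a
simplified model, not the proposed implementation: the function name and
the dict of per-database worker counts are hypothetical, it assumes all
worker slots are already in use, and it always takes the next worker from
the database that currently has the most.

```python
def admit_database(workers, new_db):
    """Return per-database worker counts after making room for new_db.

    workers maps database name -> number of workers currently in it.
    """
    result = dict(workers)
    result[new_db] = 0
    while True:
        # Pick the database with the most workers as the donor.
        donor = max((db for db in result if db != new_db),
                    key=lambda db: result[db], default=None)
        # Stop once the counts are as even as they can get.
        if donor is None or result[donor] - result[new_db] < 2:
            break
        result[donor] -= 1    # a worker sees the flag and exits after its table
        result[new_db] += 1   # the launcher starts a worker in the new database
    if result[new_db] == 0:
        # Every database is already down to one worker; the launcher waits.
        del result[new_db]
    return result
```

For example, admit_database({"db1": 4}, "db2") ends with two workers in
each database, while a call where every database already has a single
worker returns the counts unchanged.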

When there are autovacuum_max_workers databases under vacuum, the
launcher doesn't have anything else to do until some worker exits on its
own.

When there is a single worker processing a database, it does not recheck
pgstat data after each table.  This is to prevent a high-update-rate
table from starving the vacuuming of other databases.


How does this sound?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

