Re: pgsql: autovacuum: handle analyze for partitioned tables

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: pgsql: autovacuum: handle analyze for partitioned tables
Дата
Msg-id 1901125.1617904665@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: pgsql: autovacuum: handle analyze for partitioned tables  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Ответы Re: pgsql: autovacuum: handle analyze for partitioned tables  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: pgsql: autovacuum: handle analyze for partitioned tables  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Список pgsql-hackers
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2021-Apr-08, Tom Lane wrote:
>> Looks like this has issues under EXEC_BACKEND:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2021-04-08%2005%3A50%3A08

> Hmm, I couldn't reproduce this under EXEC_BACKEND or otherwise, but I
> think this is unrelated to that, but rather a race condition.

Yeah.  I hit this on another machine that isn't using EXEC_BACKEND,
and I concur it looks more like a race condition.  I think the problem
is that autovacuum is calling find_all_inheritors() on a relation it
has no lock on, contrary to that function's API spec.  find_all_inheritors
assumes the OID it's given is valid and locked, and adds it to the
result list automatically.  Then it looks for children, and won't find
any in the race case where somebody else just dropped the table.
So we come back to relation_needs_vacanalyze with a list of just the
original rel OID, and since this loop believes that every syscache fetch
it does will succeed, kaboom.

I do not think it is sane to do find_all_inheritors() with no lock,
so I'd counsel doing something about that rather than band-aiding the
symptom.  On the other hand, it's also not really okay not to have
an if-test-and-elog after the SearchSysCache call.  "A cache lookup
cannot fail" is not an acceptable assumption in my book.

BTW, another thing that looks like a race condition is the
extract_autovac_opts() call that is done a little bit earlier,
also without lock.  I think this is actually safe, but it's ONLY
safe because we resisted the calls by certain people to add a
toast table to pg_class.  Otherwise, fetching reloptions could
have involved a toast pointer dereference, and it would then be
racy whether the toasted data was still there.  As-is, even if
the pg_class row we're looking at has been deleted, we can safely
disassemble its reloptions.  I think this matter is deserving
of a comment at least.

            regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: test runner (was Re: SQL-standard function body)
Следующее
От: Andres Freund
Дата:
Сообщение: Re: test runner (was Re: SQL-standard function body)