vac_truncate_clog()'s bogus check leads to bogusness

From Andres Freund
Subject vac_truncate_clog()'s bogus check leads to bogusness
Msg-id 20230621221208.vhsqgduwfpzwxnpg@awork3.anarazel.de
Responses Re: vac_truncate_clog()'s bogus check leads to bogusness  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

Hi,

When vac_truncate_clog() returns early, due to one of these paths:

    /*
     * Do not truncate CLOG if we seem to have suffered wraparound already;
     * the computed minimum XID might be bogus.  This case should now be
     * impossible due to the defenses in GetNewTransactionId, but we keep the
     * test anyway.
     */
    if (frozenAlreadyWrapped)
    {
        ereport(WARNING,
                (errmsg("some databases have not been vacuumed in over 2 billion transactions"),
                 errdetail("You might have already suffered transaction-wraparound data loss.")));
        return;
    }

    /* chicken out if data is bogus in any other way */
    if (bogus)
        return;

we haven't released the lwlock that we acquired earlier:

    /* Restrict task to one backend per cluster; see SimpleLruTruncate(). */
    LWLockAcquire(WrapLimitsVacuumLock, LW_EXCLUSIVE);

As this isn't a path that raises an error, the lock isn't released during abort.
Until something causes the session to call LWLockReleaseAll(), the lock stays
held, and in the meantime neither the process holding the lock nor any other
process can finish vacuuming.  We don't even have an assert against a
self-deadlock with an already held lock, oddly enough.
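
To make the control flow concrete, here is a minimal standalone sketch of the
leak and of one possible shape for a fix (release the lock before each early
return). It is not PostgreSQL code: MockLock, mock_acquire(), mock_release(),
truncate_leaky() and truncate_fixed() are hypothetical stand-ins for the LWLock
API and for vac_truncate_clog(), and the lock is modelled as a plain counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an LWLock: only counts how often it is held. */
typedef struct MockLock
{
    int held;
} MockLock;

static void
mock_acquire(MockLock *lock)
{
    lock->held++;
}

static void
mock_release(MockLock *lock)
{
    /* Releasing a lock that is not held would be a bug in the caller. */
    assert(lock->held > 0);
    lock->held--;
}

/*
 * Same shape as the code quoted above: acquire, two early-return checks,
 * then the real work.  Both early returns leave the lock held.
 * Returns true if truncation would have proceeded.
 */
static bool
truncate_leaky(MockLock *lock, bool already_wrapped, bool bogus)
{
    mock_acquire(lock);

    if (already_wrapped)
        return false;           /* lock is still held here */
    if (bogus)
        return false;           /* lock is still held here */

    /* ... truncation work would happen here ... */
    mock_release(lock);
    return true;
}

/* One possible fix: release the lock on every early-return path. */
static bool
truncate_fixed(MockLock *lock, bool already_wrapped, bool bogus)
{
    mock_acquire(lock);

    if (already_wrapped || bogus)
    {
        mock_release(lock);
        return false;
    }

    /* ... truncation work would happen here ... */
    mock_release(lock);
    return true;
}
```

Calling truncate_leaky() with bogus set leaves held at 1 after it returns,
which corresponds to the stuck lock described above; truncate_fixed() leaves
it at 0 on every path.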


This is somewhat nasty - there's no real way to get out of this without an
immediate restart, and it's hard to pinpoint the problem as well :(.


Ok, the subject line is not the most precise, but it was just too good an
opportunity.


To reproduce (only on a throwaway system please!):

CREATE DATABASE invalid;
UPDATE pg_database SET datfrozenxid = '10002' WHERE datname = 'invalid';
DROP TABLE IF EXISTS foo_tbl; CREATE TABLE foo_tbl(); DROP TABLE foo_tbl; VACUUM FREEZE;
DROP TABLE IF EXISTS foo_tbl; CREATE TABLE foo_tbl(); DROP TABLE foo_tbl; VACUUM FREEZE;
<hang>


Found this while writing a test for the fix for partial dropping of
databases [1].


Separately, I think it's quite bad that we *silently* return from
vac_truncate_clog() when finding a bogus xid. That's quite a severe condition;
we should at least tell the user about it.


Greetings,

Andres Freund

[1] https://postgr.es/m/20230621190204.nsaelabojxppiuix%40awork3.anarazel.de


