Re: POC: Cleaning up orphaned files using undo logs

Поиск
Список
Период
Сортировка
От Antonin Houska
Тема Re: POC: Cleaning up orphaned files using undo logs
Дата
Msg-id 87363.1611941415@antos
обсуждение исходный текст
Ответ на Re: POC: Cleaning up orphaned files using undo logs  (Antonin Houska <ah@cybertec.at>)
Ответы Re: POC: Cleaning up orphaned files using undo logs  (Bruce Momjian <bruce@momjian.us>)
Список pgsql-hackers
Antonin Houska <ah@cybertec.at> wrote:

> Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> 
> > Thanks for the updated patch. As I've mentioned off the list I'm slowly
> > looking through it with the intent to concentrate on undo progress
> > tracking. But before I will post anything I want to mention couple of
> > strange issues I see, otherwise I will forget for sure. Maybe it's
> > already known, but running several times 'make installcheck' against a
> > freshly build postgres with the patch applied from time to time I
> > observe various errors.
> > 
> > This one happens on a crash recovery, seems like
> > UndoRecordSetXLogBufData has usr_type = USRT_INVALID and is involved in
> > the replay process:
> > 
> >     TRAP: FailedAssertion("page_offset + this_page_bytes <= uph->ud_insertion_point", File: "undopage.c", Line:
300)
> >     postgres: startup recovering 000000010000000000000012(ExceptionalCondition+0xa1)[0x558b38b8a350]
> >     postgres: startup recovering 000000010000000000000012(UndoPageSkipOverwrite+0x0)[0x558b38761b7e]
> >     postgres: startup recovering 000000010000000000000012(UndoReplay+0xa1d)[0x558b38766f32]
> >     postgres: startup recovering 000000010000000000000012(XactUndoReplay+0x77)[0x558b38769281]
> >     postgres: startup recovering 000000010000000000000012(smgr_redo+0x1af)[0x558b387aa7bd]
> > 
> > This one is somewhat similar:
> > 
> >     TRAP: FailedAssertion("page_offset >= SizeOfUndoPageHeaderData", File: "undopage.c", Line: 287)
> >     postgres: undo worker for database 36893 (ExceptionalCondition+0xa1)[0x5559c90f1350]
> >     postgres: undo worker for database 36893 (UndoPageOverwrite+0xa6)[0x5559c8cc8ae3]
> >     postgres: undo worker for database 36893 (UpdateLastAppliedRecord+0xbe)[0x5559c8ccd008]
> >     postgres: undo worker for database 36893 (smgr_undo+0xa6)[0x5559c8d11989]
> 
> Well, on repeated run of the test I could also hit the first one. I could fix
> it and will post a new version of the patch (along with some other small
> changes) this week.

Attached is the next version. Changes done:

  * Removed the progress tracking and implemented undo discarding in a simpler
    way. Now, instead of maintaining the pointer to the last record applied,
    only a boolean field in the chunk header is set when ROLLBACK is
    done. This helps to determine whether the undo of a non-committed
    transaction can be discarded.

  * Removed the "undo worker" that the previous version only used to apply the
    undo after crash recovery. The startup process does the work now.

  * Umplemented cleanup after crashed CREATE DATABASE and ALTER DATABASE ... SET TABLESPACE.

    BTW, I wonder if this change allows these commands to be executed in a
    transaction block. I think the reason to prohibit that is to minimize the
    window between creation of the files and transaction commit - if the
    server crashes in that window, the new database files survive but the
    catalog changes don't. But maybe there are other reasons. (I don't claim
    it's terribly useful to create database in a transaction block though
    because the client cannot connect to it w/o leaving the current
    transaction.)

  * Reordered the diffs, i.e. moved the discarding in front of the actual
    features.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Dumping/restoring fails on inherited generated column
Следующее
От: Alexey Kondratov
Дата:
Сообщение: Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly