Re: error context for vacuum to include block number

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: error context for vacuum to include block number
Дата
Msg-id CA+fd4k7mabf_yD=YqCNPc5RKJPsgVP6e2zVx7gBp62Be7UGFQA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: error context for vacuum to include block number  (Justin Pryzby <pryzby@telsasoft.com>)
Ответы Re: error context for vacuum to include block number  (Justin Pryzby <pryzby@telsasoft.com>)
Список pgsql-hackers
On Mon, 17 Feb 2020 at 12:57, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Mon, Feb 17, 2020 at 10:47:47AM +0900, Masahiko Sawada wrote:
> > Thank you for updating the patch!
> >

Thank you for updating the patch.

> > 1.
> > The above lines need a new line.
>
> Done, thanks.
>
> > 2.
> > In lazy_vacuum_heap, we set the error context and then call
> > pg_rusage_init whereas lazy_vacuum_index and lazy_cleanup_index does
> > the opposite. And lazy_scan_heap also call pg_rusage first. I think
> > lazy_vacuum_heap should follow them for consistency. That is, we can
> > set error context after pages = 0.
>
> Right. Maybe I did it the other way because the two uses of
> PROGRESS_VACUUM_PHASE_VACUUM_HEAP were right next to each other.
>
> > 3.
> > We have 2 other phases: PROGRESS_VACUUM_PHASE_TRUNCATE and
> > PROGRESS_VACUUM_PHASE_FINAL_CLEANUP. I think it's better to set the
> > error context in lazy_truncate_heap as well. What do you think?
> >
> > I'm not sure it's worth to set the error context for FINAL_CLENAUP but
> > we should add the case of FINAL_CLENAUP to vacuum_error_callback as
> > no-op or explain it as a comment even if we don't.
>
> I don't have strong feelings either way.
>
> I looked a bit at the truncation phase.  It also truncates the FSM and VM
> forks, which could be misleading if the error was in one of those files and not
> the main filenode.
>
> I'd have to find a way to test it...
> ...and was pleasantly surprised to see that earlier phases don't choke if I
> (re)create a fake 4GB table like:
>
> postgres=# CREATE TABLE trunc(i int);
> CREATE TABLE
> postgres=# \x\t
> Expanded display is on.
> Tuples only is on.
> postgres=# SELECT relfilenode FROM pg_class WHERE oid='trunc'::regclass;
> relfilenode | 59068
>
> postgres=# \! bash -xc 'truncate -s 1G ./pgsql13.dat111/base/12689/59068{,.{1..3}}'
> + truncate -s 1G ./pgsql13.dat111/base/12689/59074 ./pgsql13.dat111/base/12689/59074.1
./pgsql13.dat111/base/12689/59074.2./pgsql13.dat111/base/12689/59074.3
 
>
> postgres=# \timing
> Timing is on.
> postgres=# SET client_min_messages=debug; SET statement_timeout='13s'; VACUUM VERBOSE trunc;
> INFO:  vacuuming "public.trunc"
> INFO:  "trunc": found 0 removable, 0 nonremovable row versions in 524288 out of 524288 pages
> DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2098
> There were 0 unused item identifiers.
> Skipped 0 pages due to buffer pins, 0 frozen pages.
> 524288 pages are entirely empty.
> CPU: user: 5.00 s, system: 1.50 s, elapsed: 6.52 s.
> ERROR:  canceling statement due to statement timeout
> CONTEXT:  while truncating relation "public.trunc" to 0 blocks
>

Yeah lazy_scan_heap deals with all dummy files as new empty pages.

> The callback surrounding RelationTruncate() seems hard to hit unless you add
> CHECK_FOR_INTERRUPTS(); the same was true for index cleanup.
>
> The truncation uses a prefetch, which is more likely to hit any lowlevel error,
> so I added callback there, too.
>
> BTW, for the index cases, I didn't like repeating the namespace here, but WDYT ?
> |CONTEXT:  while vacuuming index "public.t_i_idx3" of relation "t"

The current message looks good to me because we cannot have a table
and its index in the different schema.

1.
    pg_rusage_init(&ru0);
    npages = 0;

    /*
     * Setup error traceback support for ereport()
     */
    init_vacuum_error_callback(&errcallback, &errcbarg, onerel,
            PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
    error_context_stack = &errcallback;

    tupindex = 0;

Oops it seems to me that it's better to set error context after
tupindex = 0. Sorry for my bad.

And the above comment can be written in a single line as other same comments.

2.
@@ -2568,6 +2643,12 @@ count_nondeletable_pages(Relation onerel,
LVRelStats *vacrelstats)
    BlockNumber blkno;
    BlockNumber prefetchedUntil;
    instr_time  starttime;
+   ErrorContextCallback errcallback;
+   vacuum_error_callback_arg errcbarg;
+
+   /* Setup error traceback support for ereport() */
+   init_vacuum_error_callback(&errcallback, &errcbarg, onerel,
+           PROGRESS_VACUUM_PHASE_TRUNCATE);

Hmm I don't think it's a good idea to have count_nondeletable_pages
set the error context of PHASE_TRUNCATE. Because the patch sets the
error context during RelationTruncate that actually truncates the heap
but count_nondeletable_pages doesn't truncate it. I think setting the
error context only during RelationTruncate would be a good start. We
can hear other opinions from other hackers. Some hackers may want to
set the error context for whole lazy_truncate_heap.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: Collation versioning
Следующее
От: Kasahara Tatsuhito
Дата:
Сообщение: Re: small improvement of the elapsed time for truncating heap in vacuum