Re: shared-memory based stats collector

From Andres Freund
Subject Re: shared-memory based stats collector
Msg-id 20200309184754.yvrgzqpzs3iynszq@alap3.anarazel.de
In reply to Re: shared-memory based stats collector  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: shared-memory based stats collector  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2020-03-09 15:37:05 -0300, Alvaro Herrera wrote:
> Tom Lane wrote:
> 
> In patch 0003,
> 
> >          /*
> > -         * Was it the archiver?  If so, just try to start a new one; no need
> > -         * to force reset of the rest of the system.  (If fail, we'll try
> > -         * again in future cycles of the main loop.).  Unless we were waiting
> > -         * for it to shut down; don't restart it in that case, and
> > -         * PostmasterStateMachine() will advance to the next shutdown step.
> > +         * Was it the archiver?  Normal exit can be ignored; we'll start a new
> > +         * one at the next iteration of the postmaster's main loop, if
> > +         * necessary. Any other exit condition is treated as a crash.
> >           */
> >          if (pid == PgArchPID)
> >          {
> >              PgArchPID = 0;
> >              if (!EXIT_STATUS_0(exitstatus))
> > -                LogChildExit(LOG, _("archiver process"),
> > -                             pid, exitstatus);
> > -            if (PgArchStartupAllowed())
> > -                PgArchPID = pgarch_start();
> > +                HandleChildCrash(pid, exitstatus,
> > +                                 _("archiver process"));
> >              continue;
> >          }
> 
> I'm worried that we're causing all processes to terminate when an
> archiver dies in some ugly way; but in the current coding, it's pretty
> harmless and we'd just start a new one.  I think this needs to be
> reconsidered.  As far as I know, pgarchiver remains unconnected to
> shared memory so a crash-restart cycle is not necessary.  We should
> continue to just log the error message and move on.

Why is it worth having the archiver be "robust" that way? Apart from the
fact that random implementation details left it unconnected to shared
memory, which is what allows a restart after any exit code, I don't see a
need. It doesn't have exit paths that could validly trigger any other exit
code, as far as I can see.

Greetings,

Andres Freund


