Re: [HACKERS] parallel.c oblivion of worker-startup failures

From Peter Geoghegan
Subject Re: [HACKERS] parallel.c oblivion of worker-startup failures
Date
Msg-id CAH2-Wz=3aLj3FcneJBJqk3Qncs8VHHBsXpDJh8epDJ_CmjMgVw@mail.gmail.com
In response to Re: [HACKERS] parallel.c oblivion of worker-startup failures  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Wed, Jan 24, 2018 at 1:57 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Jan 24, 2018 at 5:25 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> If there were some way for the postmaster to cause reason
>> PROCSIG_PARALLEL_MESSAGE to be set in the leader process instead of
>> just notification via kill(SIGUSR1) when it fails to fork a parallel
>> worker, we'd get (1) for free in any latch/CFI loop code.  But I
>> understand that we can't do that by project edict.
>
> Based on the above observation, here is a terrible idea you'll all
> hate.  It is pessimistic and expensive: it thinks that every latch
> wake might be the postmaster telling us it's failed to fork() a
> parallel worker, until we've seen a sign of life on every worker's
> error queue.  Untested illustration code only.  This is the only way
> I've come up with to discover fork failure in any latch/CFI loop (ie
> without requiring client code to explicitly try to read either error
> or tuple queues).

The question, I suppose, is how expensive this is in the real world.
If it's actually not a cost that anybody is likely to notice, then I
think we should pursue this approach. I wouldn't put too much weight
on keeping this simple for users of the parallel infrastructure,
though, because something like Amit's WaitForParallelWorkersToAttach()
idea still seems acceptable. "Call this function before trusting the
finality of nworkers_launched" isn't too onerous a rule to have to
follow.
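
To make that rule concrete, here is a minimal, self-contained sketch of
what such a check could look like. It is not PostgreSQL code: the Toy*
types, the TOY_* constants and toy_check_workers_attached() are
hypothetical stand-ins invented for illustration, and the real
WaitForParallelWorkersToAttach() proposal would have to block on the
latch and consult the postmaster's view of each worker rather than read
a plain array.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_WORKERS 8

/* Hypothetical per-worker startup state as seen by the leader. */
typedef enum
{
	TOY_WORKER_NOT_YET_STARTED,	/* registered, no sign of life yet */
	TOY_WORKER_ATTACHED,		/* worker attached to its error queue */
	TOY_WORKER_FAILED_TO_START	/* e.g. fork() failed in the postmaster */
} ToyWorkerState;

/* Hypothetical stand-in for the leader's parallel context. */
typedef struct
{
	int			nworkers_launched;
	ToyWorkerState state[TOY_MAX_WORKERS];
} ToyParallelContext;

typedef enum
{
	TOY_ATTACH_PENDING,			/* keep waiting; nothing is final yet */
	TOY_ATTACH_OK,				/* nworkers_launched can now be trusted */
	TOY_ATTACH_FAILED			/* at least one worker never started */
} ToyAttachResult;

/*
 * One non-blocking pass over the launched workers.  A leader would call
 * this in its wait loop and treat nworkers_launched as final only once
 * it returns TOY_ATTACH_OK.
 */
ToyAttachResult
toy_check_workers_attached(const ToyParallelContext *pcxt)
{
	bool		pending = false;

	assert(pcxt->nworkers_launched >= 0 &&
		   pcxt->nworkers_launched <= TOY_MAX_WORKERS);

	for (int i = 0; i < pcxt->nworkers_launched; i++)
	{
		/* A startup failure trumps everything else. */
		if (pcxt->state[i] == TOY_WORKER_FAILED_TO_START)
			return TOY_ATTACH_FAILED;
		if (pcxt->state[i] == TOY_WORKER_NOT_YET_STARTED)
			pending = true;
	}

	return pending ? TOY_ATTACH_PENDING : TOY_ATTACH_OK;
}
```

In this toy model the caller's obligation is simply to keep waiting
while the result is TOY_ATTACH_PENDING and to raise an error on
TOY_ATTACH_FAILED, instead of waiting indefinitely for a worker that
will never show up.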

-- 
Peter Geoghegan

