Re: [sqlsmith] Parallel worker executor crash on master

From: Andreas Seltenreich
Subject: Re: [sqlsmith] Parallel worker executor crash on master
Msg-id: 87d13etft6.fsf@ansel.ydns.eu
In reply to: Re: [sqlsmith] Parallel worker executor crash on master (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses: Re: [sqlsmith] Parallel worker executor crash on master (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers

Thomas Munro writes:

> On Sat, Dec 16, 2017 at 10:13 PM, Andreas Seltenreich
> <seltenreich@gmx.de> wrote:
>> Core was generated by `postgres: smith regression [local] SELECT                        '.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  gather_getnext (gatherstate=0x555a5fff1350) at nodeGather.c:283
>> 283                             estate->es_query_dsa = gatherstate->pei->area;
>> #1  ExecGather (pstate=0x555a5fff1350) at nodeGather.c:216
>
> Hmm, thanks.  That's not good.  Do we know if gatherstate->pei is
> NULL, or if it's somehow pointing to garbage?

It was NULL in all the core dumps I looked into.  Below[1] is a full
dump of gatherstate.

> Not sure how either of those things could happen, since we only set it
> to NULL in ExecShutdownGather() after which point we shouldn't call
> ExecGather() again, and any MemoryContext problems with pei should
> have caused problems already without this patch (for example in
> ExecParallelCleanup).  Clearly I'm missing something.

FWIW, all backtraces collected so far are identical for the first nine
frames.  Beyond ExecProjectSet, the callers differ and are pretty random
executor innards.

,----
| #1  ExecGather at nodeGather.c:216
| #2  0x0000555bc9fb41ea in ExecProcNode at ../../../src/include/executor/executor.h:242
| #3  ExecutePlan at execMain.c:1718
| #4  standard_ExecutorRun at execMain.c:361
| #5  0x0000555bc9fc07cc in postquel_getnext at functions.c:865
| #6  fmgr_sql (fcinfo=0x555bcba07748) at functions.c:1161
| #7  0x0000555bc9fbc4f7 in ExecMakeFunctionResultSet at execSRF.c:604
| #8  0x0000555bc9fd7cbb in ExecProjectSRF at nodeProjectSet.c:175
| #9  0x0000560828dc8df5 in ExecProjectSet at nodeProjectSet.c:105
`----

regards,
Andreas

Footnotes: 
[1]
(gdb) p *gatherstate
$3 = {
  ps = {
    type = T_GatherState, 
    plan = 0x555bcb9faf30, 
    state = 0x555bcba3d098, 
    ExecProcNode = 0x555bc9fc9e30 <ExecGather>, 
    ExecProcNodeReal = 0x555bc9fc9e30 <ExecGather>, 
    instrument = 0x0, 
    worker_instrument = 0x0, 
    qual = 0x0, 
    lefttree = 0x555bcba3d678, 
    righttree = 0x0, 
    initPlan = 0x0, 
    subPlan = 0x0, 
    chgParam = 0x0, 
    ps_ResultTupleSlot = 0x555bcba3d5b8, 
    ps_ExprContext = 0x555bcba3d3c8, 
    ps_ProjInfo = 0x0
  }, 
  initialized = 1 '\001', 
  need_to_scan_locally = 1 '\001', 
  tuples_needed = -1, 
  funnel_slot = 0x555bcba3d4c0, 
  pei = 0x0, 
  nworkers_launched = 0, 
  nreaders = 0, 
  nextreader = 0, 
  reader = 0x0
}
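
To make the failure mode concrete, here is a minimal sketch of the
situation the dump shows: the statement at nodeGather.c:283 reads
gatherstate->pei->area while pei is NULL.  The struct definitions below
are hypothetical, heavily simplified stand-ins, not the real executor
types, and model_shutdown() and model_set_query_dsa() are illustrative
names that do not exist in PostgreSQL.  The NULL check is only one
possible defensive guard under these assumptions; it is not claimed to
be the fix that was eventually adopted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the executor structs. */
typedef struct { int dummy; } dsa_area;
typedef struct { dsa_area *area; } ParallelExecutorInfo;
typedef struct { dsa_area *es_query_dsa; } EState;

typedef struct
{
    EState               *state;   /* models ps.state */
    ParallelExecutorInfo *pei;     /* NULL in the core dumps */
} GatherState;

/* Models a shutdown step that clears pei (assumption: only the reset). */
void
model_shutdown(GatherState *gs)
{
    gs->pei = NULL;
}

/*
 * Guarded variant of the assignment seen at nodeGather.c:283.
 * Returns 1 if a DSA area was installed, 0 if pei was NULL.
 * Without the check, gs->pei->area would dereference NULL.
 */
int
model_set_query_dsa(GatherState *gs)
{
    if (gs->pei == NULL)
    {
        gs->state->es_query_dsa = NULL;
        return 0;
    }
    gs->state->es_query_dsa = gs->pei->area;
    return 1;
}
```

Calling model_set_query_dsa() after model_shutdown() corresponds to the
case under discussion, where the Gather node is executed again with
pei already reset.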



