Обсуждение: [parallel query] random server crash while running tpc-h query on power2

Поиск
Список
Период
Сортировка

[parallel query] random server crash while running tpc-h query on power2

От
Rushabh Lathia
Дата:
Hi All,

Recently while running tpc-h queries on postgresql master branch, I am noticed
random server crash. Most of the time server crash coming while turn tpch query
number 3 - (but its very random).

Call Stack of server crash:

(gdb) bt
#0  0x00000000102aa9ac in ExplainNode (planstate=0x1001a7471d8, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48)
    at explain.c:1516
#1  0x00000000102aaeb8 in ExplainNode (planstate=0x1001a746b50, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48)
    at explain.c:1599
#2  0x00000000102aaeb8 in ExplainNode (planstate=0x1001a7468b8, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48)
    at explain.c:1599
#3  0x00000000102aaeb8 in ExplainNode (planstate=0x1001a745e48, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48)
    at explain.c:1599
#4  0x00000000102aaeb8 in ExplainNode (planstate=0x1001a745bb0, ancestors=0x1001a745348, relationship=0x10913d10 "Outer", plan_name=0x0, es=0x1001a65ad48)
    at explain.c:1599
#5  0x00000000102aaeb8 in ExplainNode (planstate=0x1001a745870, ancestors=0x1001a745348, relationship=0x0, plan_name=0x0, es=0x1001a65ad48) at explain.c:1599
#6  0x00000000102a81d8 in ExplainPrintPlan (es=0x1001a65ad48, queryDesc=0x1001a7442a0) at explain.c:599
#7  0x00000000102a7e20 in ExplainOnePlan (plannedstmt=0x1001a744208, into=0x0, es=0x1001a65ad48,
    queryString=0x1001a6a8c08 "explain (analyze,buffers,verbose) select\n\tl_orderkey,\n\tsum(l_extendedprice * (1 - l_discount)) as revenue,\n\to_orderdate,\n\to_shippriority\nfrom\n\tcustomer,\n\torders,\n\tlineitem\nwhere\n\tc_mktsegment = 'BUILD"..., params=0x0, planduration=0x3fffe17c6488) at explain.c:515
#8  0x00000000102a7968 in ExplainOneQuery (query=0x1001a65ab98, into=0x0, es=0x1001a65ad48,
    queryString=0x1001a6a8c08 "explain (analyze,buffers,verbose) select\n\tl_orderkey,\n\tsum(l_extendedprice * (1 - l_discount)) as revenue,\n\to_orderdate,\n\to_shippriority\nfrom\n\tcustomer,\n\torders,\n\tlineitem\nwhere\n\tc_mktsegment = 'BUILD"..., params=0x0) at explain.c:357
#9  0x00000000102a74b8 in ExplainQuery (stmt=0x1001a6fe468,
    queryString=0x1001a6a8c08 "explain (analyze,buffers,verbose) select\n\tl_orderkey,\n\tsum(l_extendedprice * (1 - l_discount)) as revenue,\n\to_orderdate,\n\to_shippriority\nfrom\n\tcustomer,\n\torders,\n\tlineitem\nwhere\n\tc_mktsegment = 'BUILD"..., params=0x0, dest=0x1001a65acb0) at explain.c:245

(gdb) p *w
$2 = {num_workers = 2139062143, instrument = 0x1001af8a8d0}
(gdb) p planstate
$3 = (PlanState *) 0x1001a7471d8
(gdb) p *planstate
$4 = {type = T_HashJoinState, plan = 0x1001a740730, state = 0x1001a745758, instrument = 0x1001a7514a8, worker_instrument = 0x1001af8a8c8,
  targetlist = 0x1001a747448, qual = 0x0, lefttree = 0x1001a74a0e8, righttree = 0x1001a74f2f8, initPlan = 0x0, subPlan = 0x0, chgParam = 0x0,
  ps_ResultTupleSlot = 0x1001a750c10, ps_ExprContext = 0x1001a7472f0, ps_ProjInfo = 0x1001a751220, ps_TupFromTlist = 0 '\000'}
(gdb) p *planstate->worker_instrument
$5 = {num_workers = 2139062143, instrument = 0x1001af8a8d0}
(gdb) p n
$6 = 13055
(gdb) p n
$7 = 13055

Here its clear that work_instrument is either corrupted or Un-inililized that is the
reason its ending up with server crash.

With bit more debugging and looked at git history I found that issue started coming
with commit af33039317ddc4a0e38a02e2255c2bf453115fd2. gather_readnext() calls
ExecShutdownGatherWorkers() when nreaders == 0. ExecShutdownGatherWorkers()
calls ExecParallelFinish() which collects the instrumentation before marking
ParallelExecutorInfo to finish. ExecParallelRetrieveInstrumentation() do the allocation
of planstate->worker_instrument.

With commit af33039317 now we calling the gather_readnext() with per-tuple context,
but with nreader == 0 with ExecShutdownGatherWorkers() we end up with allocation
of planstate->worker_instrument into per-tuple context - which is wrong.

Now fix can be:

1) Avoid calling ExecShutdownGatherWorkers() from the gather_readnext() and let
ExecEndGather() do that things. But with this change, gather_readread() and
gather_getnext() depend on planstate->reader structure to continue reading tuple.
Now either we can change those condition to be depend on planstate->nreaders or
just pfree(planstate->reader) into gather_readnext() instead of calling
ExecShutdownGatherWorkers().

2) Teach ExecParallelRetrieveInstrumentation() to do allocation of
planstate->worker_instrument into proper memory context.

Attaching patch, which fix the issue with approach 1).


Regards.

Rushabh Lathia
Вложения

Re: [parallel query] random server crash while running tpc-h query on power2

От
Amit Kapila
Дата:
On Sat, Aug 13, 2016 at 11:10 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
> Hi All,
>
> Recently while running tpc-h queries on postgresql master branch, I am
> noticed
> random server crash. Most of the time server crash coming while turn tpch
> query
> number 3 - (but its very random).
>
>
> Here its clear that work_instrument is either corrupted or Un-inililized
> that is the
> reason its ending up with server crash.
>
> With bit more debugging and looked at git history I found that issue started
> coming
> with commit af33039317ddc4a0e38a02e2255c2bf453115fd2. gather_readnext()
> calls
> ExecShutdownGatherWorkers() when nreaders == 0. ExecShutdownGatherWorkers()
> calls ExecParallelFinish() which collects the instrumentation before marking
> ParallelExecutorInfo to finish. ExecParallelRetrieveInstrumentation() do the
> allocation
> of planstate->worker_instrument.
>
> With commit af33039317 now we calling the gather_readnext() with per-tuple
> context,
> but with nreader == 0 with ExecShutdownGatherWorkers() we end up with
> allocation
> of planstate->worker_instrument into per-tuple context - which is wrong.
>
> Now fix can be:
>
> 1) Avoid calling ExecShutdownGatherWorkers() from the gather_readnext() and
> let
> ExecEndGather() do that things.
>

I don't think we can wait till ExecEndGather() to collect statistics,
as we need it before that for explain path.  However, we do call
ExecShutdownNode() from ExecutePlan() when there are no more tuples
which can take care of ensuring the shutdown of Gather node.   I think
the advantage of calling it in
gather_readnext() is that it will resources to be released early and
populating the instrumentation/statistics as early as possible.

> But with this change, gather_readread() and
> gather_getnext() depend on planstate->reader structure to continue reading
> tuple.
> Now either we can change those condition to be depend on planstate->nreaders
> or
> just pfree(planstate->reader) into gather_readnext() instead of calling
> ExecShutdownGatherWorkers().
>
>
> Attaching patch, which fix the issue with approach 1).
>

AFAICS, your patch seems to be the right fix for this issue, unless we
need the instrumentation information during execution (other than for
explain) for some purpose.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [parallel query] random server crash while running tpc-h query on power2

От
Robert Haas
Дата:
On Sat, Aug 13, 2016 at 4:36 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> AFAICS, your patch seems to be the right fix for this issue, unless we
> need the instrumentation information during execution (other than for
> explain) for some purpose.

Hmm, I disagree.  It should be the job of
ExecParallelRetrieveInstrumentation to allocate its data in the
correct context, not the responsibility of nodeGather.c to work around
the fact that it doesn't.  The worker instrumentation should be
allocated in the same context as the regular instrumentation
information, which I assume is probably the per-query context.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [parallel query] random server crash while running tpc-h query on power2

От
Rushabh Lathia
Дата:


On Mon, Aug 15, 2016 at 6:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sat, Aug 13, 2016 at 4:36 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> AFAICS, your patch seems to be the right fix for this issue, unless we
> need the instrumentation information during execution (other than for
> explain) for some purpose.

Hmm, I disagree.  It should be the job of
ExecParallelRetrieveInstrumentation to allocate its data in the
correct context, not the responsibility of nodeGather.c to work around
the fact that it doesn't.  The worker instrumentation should be
allocated in the same context as the regular instrumentation
information, which I assume is probably the per-query context.

I agree, this make sense.

Here is the patch to allocate worker instrumentation into same context
as the regular instrumentation which is per-query context.

PFA patch.


--
Rushabh Lathia
Вложения

Re: [parallel query] random server crash while running tpc-h query on power2

От
Robert Haas
Дата:
On Tue, Aug 16, 2016 at 1:05 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
> I agree, this make sense.
>
> Here is the patch to allocate worker instrumentation into same context
> as the regular instrumentation which is per-query context.

Looks good, committed.  I am not sure it was a very good idea for
af33039317ddc4a0e38a02e2255c2bf453115fd2 by Tom Lane to change the
current memory context for the entire execution of gather_readnext();
this might not be the only or the last bug that results from that
decision.  However, I don't really want to get an argument about that
right now, and this at least fixes the problem we know about.  Thanks
for the report and patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company