Thread: Assertion failure with barriers in parallel hash join
Hi all,

prion, which uses -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, has just failed with an interesting failure:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2020-09-29%2005%3A24%3A11

The assertion failure happens in a parallel worker when attempting to attach to a barrier:

#2  0x00000000009027d2 in ExceptionalCondition (conditionName=conditionName@entry=0xa80846 "!barrier->static_party",
    errorType=errorType@entry=0x955e22 "FailedAssertion",
    fileName=fileName@entry=0xa807a8 "/home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/storage/ipc/barrier.c",
    lineNumber=lineNumber@entry=218) at /home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/utils/error/assert.c:67
#3  0x00000000007b6b1f in BarrierAttach (barrier=barrier@entry=0x7f73c9d76008)
    at /home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/storage/ipc/barrier.c:218
#4  0x0000000000682ebf in ExecParallelHashJoinNewBatch (hjstate=<optimized out>, hjstate=<optimized out>)
    at /home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/executor/nodeHashjoin.c:1132
#5  ExecHashJoinImpl (parallel=true, pstate=0x1248d88)
    at /home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/executor/nodeHashjoin.c:560
#6  ExecParallelHashJoin (pstate=0x1248d88)
    at /home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/executor/nodeHashjoin.c:607
#7  0x0000000000666d48 in ExecProcNodeInstr (node=0x1248d88)
    at /home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/backend/executor/execProcnode.c:466
#8  0x000000000065f322 in ExecProcNode (node=0x1248d88)
    at /home/ec2-user/bf/root/REL_13_STABLE/pgsql.build/../pgsql/src/include/executor/executor.h:245

Thanks,
--
Michael
Attachments
On Tue, Sep 29, 2020 at 7:11 PM Michael Paquier <michael@paquier.xyz> wrote:
> #2 0x00000000009027d2 in ExceptionalCondition
> (conditionName=conditionName@entry=0xa80846 "!barrier->static_party",

> #4 0x0000000000682ebf in ExecParallelHashJoinNewBatch

Thanks.  Ohhh.  I think I see how that condition was reached and what to do about it, but I'll need to look more closely.  I'm away on vacation right now, and will update in a couple of days when I'm back at a real computer.
On Tue, Sep 29, 2020 at 9:12 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Sep 29, 2020 at 7:11 PM Michael Paquier <michael@paquier.xyz> wrote:
> > #2 0x00000000009027d2 in ExceptionalCondition
> > (conditionName=conditionName@entry=0xa80846 "!barrier->static_party",
>
> > #4 0x0000000000682ebf in ExecParallelHashJoinNewBatch
>
> Thanks.  Ohhh.  I think I see how that condition was reached and what
> to do about it, but I'll need to look more closely.  I'm away on
> vacation right now, and will update in a couple of days when I'm back
> at a real computer.

Here's a throw-away patch to add some sleeps that trigger the problem, and a first draft fix.  I'll do some more testing of this next week and see if I can simplify it.
Attachments
On Thu, Oct 1, 2020 at 8:08 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Sep 29, 2020 at 9:12 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Sep 29, 2020 at 7:11 PM Michael Paquier <michael@paquier.xyz> wrote:
> > #2 0x00000000009027d2 in ExceptionalCondition
> > (conditionName=conditionName@entry=0xa80846 "!barrier->static_party",
>
> > #4 0x0000000000682ebf in ExecParallelHashJoinNewBatch
>
> Thanks. Ohhh. I think I see how that condition was reached and what
> to do about it, but I'll need to look more closely. I'm away on
> vacation right now, and will update in a couple of days when I'm back
> at a real computer.
Here's a throw-away patch to add some sleeps that trigger the problem,
and a first draft fix. I'll do some more testing of this next week
and see if I can simplify it.
I was just taking a look at the patch and noticed the commit message
says:
> With unlucky timing and parallel_leader_participation off...
Is parallel_leader_participation being off required to reproduce the
issue?
On Tue, Oct 13, 2020 at 12:15 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
> On Thu, Oct 1, 2020 at 8:08 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>> Here's a throw-away patch to add some sleeps that trigger the problem,
>> and a first draft fix.  I'll do some more testing of this next week
>> and see if I can simplify it.
>
> I was just taking a look at the patch and noticed the commit message
> says:
>
> > With unlucky timing and parallel_leader_participation off...
>
> Is parallel_leader_participation being off required to reproduce the
> issue?

Yeah, because otherwise the leader detaches last, so the problem doesn't arise.
On Wed, Mar 17, 2021 at 8:18 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Wed, Mar 17, 2021 at 6:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > According to BF animal elver there is something wrong with this
> > commit.  Looking into it.
>
> Assertion failure reproduced here and understood, but unfortunately
> it'll take some more time to fix this.  I've reverted the commit for
> now to unbreak the ~5 machines that are currently showing red in the
> build farm.

So, I think the premise of the patch v6-0001-Fix-race-condition-in-parallel-hash-join-batch-cl.patch makes sense: freeing the hashtable memory should happen atomically with updating the flag that indicates to all others that the memory is not to be accessed any longer.

As was likely reported by the buildfarm, leading to you reverting the patch: when the inner side is empty and we dump out before advancing the build barrier past PHJ_BUILD_HASHING_OUTER, we trip the new Asserts you've added in ExecHashTableDetach().

    Assert(!pstate ||
           BarrierPhase(&pstate->build_barrier) >= PHJ_BUILD_RUNNING);

Hmm.

Maybe if the inner side is empty, we can advance the build barrier to the end?  We help batch 0 along like this in ExecParallelHashJoinSetUpBatches().  But, I'm not sure we can expect the process executing this code to be attached to the build barrier, can we?

@@ -296,7 +304,19 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
      * outer relation.
      */
     if (hashtable->totalTuples == 0 && !HJ_FILL_OUTER(node))
+    {
+        if (parallel)
+        {
+            Barrier    *build_barrier;
+
+            build_barrier = &parallel_state->build_barrier;
+            while (BarrierPhase(build_barrier) < PHJ_BUILD_DONE)
+                BarrierArriveAndWait(build_barrier, 0);
+            BarrierDetach(build_barrier);
+        }
+
         return NULL;
+    }
On Wed, Mar 17, 2021 at 8:18 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Wed, Mar 17, 2021 at 6:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > According to BF animal elver there is something wrong with this
> > commit.  Looking into it.
>
> Assertion failure reproduced here and understood, but unfortunately
> it'll take some more time to fix this.  I've reverted the commit for
> now to unbreak the ~5 machines that are currently showing red in the
> build farm.

Also, silly question: why couldn't we just set the pstate->batches pointer to InvalidDsaPointer before doing dsa_free() (of course saving the pointer so that we can actually do the freeing with it)?  Is there still a race?
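[Editor's note: one way to see why clearing the pointer before freeing may not be sufficient on its own — a hypothetical interleaving, not taken from the actual code paths. A worker that has already read the old pointer value before it was cleared can still dereference freed memory; only making the read and the clear-and-free mutually exclusive closes the window.]

```
Worker A                                  Worker B (cleaning up)
--------                                  ----------------------
batches = pstate->batches;   /* valid */
                                          saved = pstate->batches;
                                          pstate->batches = InvalidDsaPointer;
                                          dsa_free(area, saved);
dsa_get_address(area, batches);  /* stale local copy: use after free */
```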
On Wed, Mar 31, 2021 at 6:25 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
> So, I think the premise of the patch
> v6-0001-Fix-race-condition-in-parallel-hash-join-batch-cl.patch makes
> sense: freeing the hashtable memory should happen atomically with
> updating the flag that indicates to all others that the memory is not to
> be accessed any longer.
>
> As was likely reported by the buildfarm leading to you reverting the
> patch, when the inner side is empty and we dump out before advancing the
> build barrier past PHJ_BUILD_HASHING_OUTER, we trip the new Asserts
> you've added in ExecHashTableDetach().
>
>     Assert(!pstate ||
>            BarrierPhase(&pstate->build_barrier) >= PHJ_BUILD_RUNNING);
>
> Hmm.
>
> Maybe if the inner side is empty, we can advance the build barrier to
> the end?
> We help batch 0 along like this in ExecParallelHashJoinSetUpBatches().
>
> But, I'm not sure we can expect the process executing this code to be
> attached to the build barrier, can we?
>
> @@ -296,7 +304,19 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
>       * outer relation.
>       */
>      if (hashtable->totalTuples == 0 && !HJ_FILL_OUTER(node))
> +    {
> +        if (parallel)
> +        {
> +            Barrier    *build_barrier;
> +
> +            build_barrier = &parallel_state->build_barrier;
> +            while (BarrierPhase(build_barrier) < PHJ_BUILD_DONE)
> +                BarrierArriveAndWait(build_barrier, 0);
> +            BarrierDetach(build_barrier);
> +        }
> +
>          return NULL;
> +    }

I've attached a new version of the patch which contains my suggested fix for the problem with the empty inner optimization.

If you apply Thomas' injected-sleeps original bug repro patch and use the following DDL and query, you can reproduce the issue with the empty inner optimization present in his v2 patch.  Then, if you apply my attached v3 patch, you can see that we no longer trip the assert.

drop table if exists empty_simple;
create table empty_simple(id int, col2 text);
update pg_class set reltuples = 10000 where relname = 'empty_simple';

drop table if exists simple;
create table simple as select generate_series(1, 20000) AS id, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
analyze simple;

set min_parallel_table_scan_size = 0;
set parallel_setup_cost = 0;
set enable_hashjoin = on;
set parallel_leader_participation to off;
set work_mem = '4MB';
set enable_parallel_hash = on;

select count(*) from empty_simple join simple using (id);

- Melanie
Attachments
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested

Hi all,

We recently encountered the same bug in the field.  Oleksii Kozlov managed to come up with reproduction steps which reliably trigger it.  Interestingly, the bug does not only manifest as a failing assertion, but also as a segmentation fault, in builds with disabled and with enabled (!) assertions.  So it can crash production environments.  We applied the proposed patch v3 from Melanie to the REL_14_3 branch and can confirm that with the patch neither the assertion failure nor the segmentation fault occurs.

I have also glanced at the code and the implementation looks fine.  However, I'm not an expert on the fairly involved hash join state machine.  There seems to be no need for additional documentation.

For completeness, here is the stack trace of the segmentation fault.  Like the stack trace from the assertion failure initially shared by Michael, and also encountered by us, the stack trace of the segmentation fault contains ExecParallelHashJoinNewBatch().
#9 | Source "/opt/src/backend/executor/execMain.c", line 361, in standard_ExecutorRun
   | Source "/opt/src/backend/executor/execMain.c", line 1551, in ExecutePlan
     Source "/opt/src/include/executor/executor.h", line 257, in ExecProcNode [0x657e4d]
#8 | Source "/opt/src/backend/executor/nodeAgg.c", line 2179, in ExecAgg
   | Source "/opt/src/backend/executor/nodeAgg.c", line 2364, in agg_retrieve_direct [0x66ba60]
#7 | Source "/opt/src/backend/executor/nodeAgg.c", line 581, in fetch_input_tuple
     Source "/opt/src/include/executor/executor.h", line 257, in ExecProcNode [0x66d585]
#6 | Source "/opt/src/backend/executor/nodeHashjoin.c", line 607, in ExecParallelHashJoin
   | Source "/opt/src/backend/executor/nodeHashjoin.c", line 560, in ExecHashJoinImpl
     Source "/opt/src/backend/executor/nodeHashjoin.c", line 1132, in ExecParallelHashJoinNewBatch [0x67a89b]
#5 | Source "/opt/src/backend/storage/ipc/barrier.c", line 242, in BarrierAttach
     Source "/opt/src/include/storage/s_lock.h", line 228, in tas [0x7c2a1b]
#4   Object "/lib/x86_64-linux-gnu/libpthread.so.0", at 0x7f4db364841f, in __funlockfile

--
David Geier (ServiceNow)

The new status of this patch is: Ready for Committer
On Thu, Jun 2, 2022 at 9:31 PM David Geier <geidav.pg@gmail.com> wrote:
> We recently encountered the same bug in the field.  Oleksii Kozlov
> managed to come up with reproduction steps which reliably trigger it.
> Interestingly, the bug does not only manifest as a failing assertion,
> but also as a segmentation fault, in builds with disabled and with
> enabled (!) assertions.  So it can crash production environments.  We
> applied the proposed patch v3 from Melanie to the REL_14_3 branch and
> can confirm that with the patch neither the assertion failure nor the
> segmentation fault occurs.

Thanks for the report, testing, and for creating the CF entry.  I assume you are using parallel_leader_participation=off, and the reason we haven't heard more about this is that few people do that.

By coincidence I was just about to restart a bunch of hash join projects and have been paging the topic area back into my brain, so I'll start with another round of testing/analysis of this bug/patch next week.
Hi Thomas,
Correct. We're running with disabled parallel leader participation and we have to do so, because another custom plan node we built relies on that.
That would be great. Anything else I can help with to get this patch in?
Thanks!
--
David
(ServiceNow)
On Fri, 3 Jun 2022 at 00:06, Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Jun 2, 2022 at 9:31 PM David Geier <geidav.pg@gmail.com> wrote:
> We recently encountered the same bug in the field. Oleksii Kozlov managed to come up with reproduction steps, which reliably trigger it. Interestingly, the bug does not only manifest as failing assertion, but also as segmentation fault; in builds with disabled and with enabled (!) assertions. So it can crash production environments. We applied the proposed patch v3 from Melanie to the REL_14_3 branch and can confirm that with the patch neither the assertion nor the segmentation fault still occur.
Thanks for the report, testing, and for creating the CF entry.
I assume you are using parallel_leader_participation=off, and the
reason we haven't heard more about this is because few people do that.
By coincidence I was just about to restart a bunch of hash join
projects and have been paging the topic area back into my brain, so
I'll start with another round of testing/analysis of this bug/patch
next week.
Hi Thomas,
Can we make progress with this patch in the current commit fest, or discuss what is still missing to bring this in?
Thanks!
--
David Geier
(ServiceNow)
On 6/6/22 17:01, David Geier wrote:
> Hi Thomas,
>
> Correct. We're running with disabled parallel leader participation and we have to do so, because another custom plan node we built relies on that.
>
> That would be great. Anything else I can help with to get this patch in?
>
> Thanks!
>
> --
> David
> (ServiceNow)
>
> On Fri, 3 Jun 2022 at 00:06, Thomas Munro <thomas.munro@gmail.com> wrote:
>> On Thu, Jun 2, 2022 at 9:31 PM David Geier <geidav.pg@gmail.com> wrote:
> We recently encountered the same bug in the field. Oleksii Kozlov managed to come up with reproduction steps, which reliably trigger it. Interestingly, the bug does not only manifest as failing assertion, but also as segmentation fault; in builds with disabled and with enabled (!) assertions. So it can crash production environments. We applied the proposed patch v3 from Melanie to the REL_14_3 branch and can confirm that with the patch neither the assertion nor the segmentation fault still occur.
Thanks for the report, testing, and for creating the CF entry.
I assume you are using parallel_leader_participation=off, and the
reason we haven't heard more about this is because few people do that.
By coincidence I was just about to restart a bunch of hash join
projects and have been paging the topic area back into my brain, so
I'll start with another round of testing/analysis of this bug/patch
next week.
On Thu, Nov 17, 2022 at 8:01 PM David Geier <geidav.pg@gmail.com> wrote:
> Can we make progress with this patch in the current commit fest, or discuss what is still missing to bring this in?

Hi David,

Sorry for the delay.  I'll aim to get this done in the next few days.
Thanks!  Please let me know if I can help out, e.g. with re-testing.

--
David Geier
(ServiceNow)

On 11/17/22 08:28, Thomas Munro wrote:
> On Thu, Nov 17, 2022 at 8:01 PM David Geier <geidav.pg@gmail.com> wrote:
>> Can we make progress with this patch in the current commit fest, or discuss what is still missing to bring this in?
>
> Hi David,
>
> Sorry for the delay.  I'll aim to get this done in the next few days.
Pushed and back-patched, with minor comment tweaks. Apologies for taking so long.