On Mon, Mar 7, 2016 at 3:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> The currently-committed code generates paths where nested loops and
> hash joins get pushed beneath the Gather node, but does not generate
> paths where merge joins have been pushed beneath the Gather node. And
> the reason I didn't try to generate those paths is because I believe
> they will almost always suck. As of now, what we know how to do is
> build a partial path for a join by joining a partial path for the
> outer input rel against an ordinary path for the inner rel. That
> means that the work of generating the inner rel has to be redone in
> each worker. That's not a problem if we've got something like a
> nested loop with a parameterized inner index scan, because that sort
> of plan redoes all the work for every row anyway. It is a problem for
> a hash join, but it's not too hard for it to be worthwhile anyway if
> the build table is small. For a merge join, though, it seems rather
> unpromising. It's really doubtful that we want each worker to
> independently sort the inner rel and then have them join their own
> subset of the outer rel against their own copy of the sort. *Maybe*
> it could win if the inner path is an index scan, but I wasn't really
> sure that would come up and be a win often enough to be worth the cost
> of generating the path. We tend to only use merge joins when both of
> the relations involved are large, and index-scanning a large relation
> tends to lose to sorting it. So it just seemed like a dead end.

This is the first message on this subthread that actually made me feel
I understood the issue under discussion. It explains the distinction
between plans that are merely parallel-safe and plans that would
actually do something different under a parallel worker.
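
A toy sketch of the argument quoted above, to make the asymmetry
concrete. It is not PostgreSQL's cost model: the function names, the
cost units, and the formulas (n*log2(n) for a sort, one unit per row
otherwise) are all assumptions made up for illustration. The only thing
it takes from the quoted text is that the outer side is a partial path
split across workers, while the inner side is an ordinary path that
every worker redoes in full.

```python
import math


def repeated_work(join, inner_rows):
    """Inner-side work that EVERY worker redoes (hypothetical units)."""
    if join == "nestloop_index":
        # Parameterized inner index scan: no up-front inner work at all;
        # the cost is paid per outer row instead.
        return 0.0
    if join == "hash":
        # Each worker builds its own copy of the hash table.
        return float(inner_rows)
    if join == "merge":
        # Each worker independently sorts the whole inner rel.
        return inner_rows * math.log2(max(inner_rows, 2))
    raise ValueError(join)


def elapsed(join, outer_rows, inner_rows, workers):
    """Per-worker elapsed cost: its share of the outer rel plus repeated work."""
    # A nested loop probes the index once per outer row; the other joins
    # are charged one unit per outer row in this toy model.
    per_row = math.log2(max(inner_rows, 2)) if join == "nestloop_index" else 1.0
    return outer_rows * per_row / workers + repeated_work(join, inner_rows)


def speedup(join, outer_rows, inner_rows, workers):
    """How much faster the join gets with `workers` workers than with one."""
    one = elapsed(join, outer_rows, inner_rows, 1)
    return one / elapsed(join, outer_rows, inner_rows, workers)


if __name__ == "__main__":
    cases = [
        ("nestloop_index", 1_000_000, 1_000_000),
        ("hash", 1_000_000, 1_000),        # small build table
        ("merge", 1_000_000, 1_000_000),   # both inputs large
    ]
    for join, outer, inner in cases:
        print(join, round(speedup(join, outer, inner, 4), 2))
```

Under these made-up numbers the nested loop and the small-build-table
hash join scale close to linearly with four workers, while the merge
join barely improves, because the repeated inner sort dominates.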
--
greg