Обсуждение: [HACKERS] Inadequate parallel-safety check for SubPlans

Поиск
Список
Период
Сортировка

[HACKERS] Inadequate parallel-safety check for SubPlans

От
Tom Lane
Дата:
While poking at the question of parallel_safe marking for Plans,
I noticed that max_parallel_hazard_walker() does this:
   /* We can push the subplans only if they are parallel-safe. */   else if (IsA(node, SubPlan))       return
!((SubPlan*) node)->parallel_safe; 

This is 100% wrong.  It's failing to recurse into the subexpressions of
the SubPlan, which could very easily include parallel-unsafe function
calls.  Example:

regression=# explain select * from tenk1 where int4(unique1 + random()) not in (select ten from tenk2);
              QUERY PLAN                                   
-----------------------------------------------------------------------------Gather  (cost=470.00..884.25 rows=5000
width=244) Workers Planned: 4  ->  Parallel Seq Scan on tenk1  (cost=470.00..884.25 rows=1250 width=244)        Filter:
(NOT(hashed SubPlan 1))        SubPlan 1          ->  Seq Scan on tenk2  (cost=0.00..445.00 rows=10000 width=4) 

random() is parallel-restricted so it's not cool that the SubPlan was
allowed to be passed down to workers.

I tried to fix it like this:
   else if (IsA(node, SubPlan))   {       if (!((SubPlan *) node)->parallel_safe &&
max_parallel_hazard_test(PROPARALLEL_RESTRICTED,context))           return true;   } 

but got some failures in the regression tests due to things not getting
parallelized when expected.  Poking into that, I remembered that for some
SubPlans, the testexpr contains Params representing the output columns
of the sub-select.  So the recursive call of max_parallel_hazard_walker
visits those and deems the expression parallel-restricted.

Basically, therefore, this logic is totally inadequate.  I think what
we need to do is improve matters so that while looking at the testexpr
of a SubPlan, we pass down a whitelist saying that the Params having
the numbers indicated by the SubLink's paramIds list are OK.

I'm also suspicious that the existing dumb treatment of Params here
may be blocking recognition that correlated subplans are parallel-safe.
But that would only be a failure to apply a possible optimization,
not a failure to generate a valid plan.
        regards, tom lane



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Robert Haas
Дата:
On Wed, Apr 12, 2017 at 2:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> While poking at the question of parallel_safe marking for Plans,
> I noticed that max_parallel_hazard_walker() does this:
>
>     /* We can push the subplans only if they are parallel-safe. */
>     else if (IsA(node, SubPlan))
>         return !((SubPlan *) node)->parallel_safe;
>
> This is 100% wrong.  It's failing to recurse into the subexpressions of
> the SubPlan, which could very easily include parallel-unsafe function
> calls.

My understanding (apparently flawed?) is that the parallel_safe flag
on the SubPlan is supposed to encapsulate whether than SubPlan is
entirely parallel safe, so that no recursion is needed.  If the
parallel_safe flag on the SubPlan is being set wrong, doesn't that
imply that the Path from which the subplan was created had the wrong
value of this flag, too?

...

Oh, I see.  The testexpr is separate from the Path.  Oops.

> Basically, therefore, this logic is totally inadequate.  I think what
> we need to do is improve matters so that while looking at the testexpr
> of a SubPlan, we pass down a whitelist saying that the Params having
> the numbers indicated by the SubLink's paramIds list are OK.

Sounds reasonable.  Do you want to have a go at that?

> I'm also suspicious that the existing dumb treatment of Params here
> may be blocking recognition that correlated subplans are parallel-safe.
> But that would only be a failure to apply a possible optimization,
> not a failure to generate a valid plan.

Smarter treatment of Params would be very nice, but possibly hard.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Apr 12, 2017 at 2:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This is 100% wrong.  It's failing to recurse into the subexpressions of
>> the SubPlan, which could very easily include parallel-unsafe function
>> calls.

> My understanding (apparently flawed?) is that the parallel_safe flag
> on the SubPlan is supposed to encapsulate whether than SubPlan is
> entirely parallel safe, so that no recursion is needed.  If the
> parallel_safe flag on the SubPlan is being set wrong, doesn't that
> imply that the Path from which the subplan was created had the wrong
> value of this flag, too?

> ...

> Oh, I see.  The testexpr is separate from the Path.  Oops.

Right.  Although you're nudging up against an alternate idea that
I just had: we could define the parallel_safe flag on the SubPlan as
encapsulating not only whether the child plan is safe, but whether
the contained testexpr (and args) are safe.  If we were to make
an is_parallel_safe() check on the testexpr before injecting
PARAM_EXEC Params into it, and record the results in the SubPlan,
maybe that would lead to a simpler answer.

OTOH, that might not work out nicely; and I have a feeling that
we'd end up needing a whitelist anyway later, to handle the
case of correlated subplans.

> Sounds reasonable.  Do you want to have a go at that?

Yeah.  I'm knocking off for the day a bit early today, but I'll have
a look at it tomorrow, unless anyone in the Far East beats me to it.
        regards, tom lane



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Amit Kapila
Дата:
On Thu, Apr 13, 2017 at 1:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Apr 12, 2017 at 2:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> This is 100% wrong.  It's failing to recurse into the subexpressions of
>>> the SubPlan, which could very easily include parallel-unsafe function
>>> calls.
>
>> My understanding (apparently flawed?) is that the parallel_safe flag
>> on the SubPlan is supposed to encapsulate whether than SubPlan is
>> entirely parallel safe, so that no recursion is needed.  If the
>> parallel_safe flag on the SubPlan is being set wrong, doesn't that
>> imply that the Path from which the subplan was created had the wrong
>> value of this flag, too?
>
>> ...
>
>> Oh, I see.  The testexpr is separate from the Path.  Oops.
>
> Right.  Although you're nudging up against an alternate idea that
> I just had: we could define the parallel_safe flag on the SubPlan as
> encapsulating not only whether the child plan is safe, but whether
> the contained testexpr (and args) are safe.  If we were to make
> an is_parallel_safe() check on the testexpr before injecting
> PARAM_EXEC Params into it, and record the results in the SubPlan,
> maybe that would lead to a simpler answer.
>
> OTOH, that might not work out nicely; and I have a feeling that
> we'd end up needing a whitelist anyway later, to handle the
> case of correlated subplans.
>

The problem with supporting correlated subplans is that it is not easy
to decide about params that belong to whitelist because we need to
ensure that such params are available below the gather node.   It is
quite clear how whitelist of params can be useful for the case in hand
but it is not immediately clear to me how we will use it for
correlated sublans.   However, if you have ideas on how to make it
work, I can try to draft a patch as well on those lines as it can
certainly help some TPC-H queries.

>> Sounds reasonable.  Do you want to have a go at that?
>
> Yeah.  I'm knocking off for the day a bit early today, but I'll have
> a look at it tomorrow, unless anyone in the Far East beats me to it.
>

Thanks for looking into it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Noah Misch
Дата:
On Wed, Apr 12, 2017 at 02:41:09PM -0400, Tom Lane wrote:
> While poking at the question of parallel_safe marking for Plans,
> I noticed that max_parallel_hazard_walker() does this:
> 
>     /* We can push the subplans only if they are parallel-safe. */
>     else if (IsA(node, SubPlan))
>         return !((SubPlan *) node)->parallel_safe;
> 
> This is 100% wrong.  It's failing to recurse into the subexpressions of
> the SubPlan, which could very easily include parallel-unsafe function
> calls.  Example:
> 
> regression=# explain select * from tenk1 where int4(unique1 + random()) not in (select ten from tenk2);
>                                  QUERY PLAN                                  
> -----------------------------------------------------------------------------
>  Gather  (cost=470.00..884.25 rows=5000 width=244)
>    Workers Planned: 4
>    ->  Parallel Seq Scan on tenk1  (cost=470.00..884.25 rows=1250 width=244)
>          Filter: (NOT (hashed SubPlan 1))
>          SubPlan 1
>            ->  Seq Scan on tenk2  (cost=0.00..445.00 rows=10000 width=4)
> 
> random() is parallel-restricted so it's not cool that the SubPlan was
> allowed to be passed down to workers.
> 
> I tried to fix it like this:
> 
>     else if (IsA(node, SubPlan))
>     {
>         if (!((SubPlan *) node)->parallel_safe &&
>             max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context))
>             return true;
>     }
> 
> but got some failures in the regression tests due to things not getting
> parallelized when expected.  Poking into that, I remembered that for some
> SubPlans, the testexpr contains Params representing the output columns
> of the sub-select.  So the recursive call of max_parallel_hazard_walker
> visits those and deems the expression parallel-restricted.
> 
> Basically, therefore, this logic is totally inadequate.  I think what
> we need to do is improve matters so that while looking at the testexpr
> of a SubPlan, we pass down a whitelist saying that the Params having
> the numbers indicated by the SubLink's paramIds list are OK.

The above-described topic is currently a PostgreSQL 10 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, I will look for an update
according to your current blanket status update[1]:
 I will begin investigating this no later than April 24th, my first day back from vacation, and will provide a further
updateby that same day.
 

Thanks.

[1] https://www.postgresql.org/message-id/CA+Tgmoa1-529KFwdB-+A1eG2MFc3j9eqJenBUFU=MsT-H9Q8BQ@mail.gmail.com



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Tom Lane
Дата:
Noah Misch <noah@leadboat.com> writes:
> On Wed, Apr 12, 2017 at 02:41:09PM -0400, Tom Lane wrote:
>> This is 100% wrong.  It's failing to recurse into the subexpressions of
>> the SubPlan, which could very easily include parallel-unsafe function
>> calls.  Example:

> The above-described topic is currently a PostgreSQL 10 open item.  Robert,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.

FYI, I have this on my to-look-at list, and expect to fix it before Robert
returns from vacation.
        regards, tom lane



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Amit Kapila
Дата:
On Sun, Apr 16, 2017 at 9:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Noah Misch <noah@leadboat.com> writes:
>> On Wed, Apr 12, 2017 at 02:41:09PM -0400, Tom Lane wrote:
>>> This is 100% wrong.  It's failing to recurse into the subexpressions of
>>> the SubPlan, which could very easily include parallel-unsafe function
>>> calls.  Example:
>
>> The above-described topic is currently a PostgreSQL 10 open item.  Robert,
>> since you committed the patch believed to have created it, you own this open
>> item.  If some other commit is more relevant or if this does not belong as a
>> v10 open item, please let us know.
>
> FYI, I have this on my to-look-at list, and expect to fix it before Robert
> returns from vacation.
>

Let me know if an initial patch by someone else can be helpful?  I was
not completely sure whether we need to pass any sort of whitelist of
params for parallel safety, but if you think that is the better way to
go, I can give it a try.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Tom Lane
Дата:
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Sun, Apr 16, 2017 at 9:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> FYI, I have this on my to-look-at list, and expect to fix it before Robert
>> returns from vacation.

> Let me know if an initial patch by someone else can be helpful?

Sure, have a go at it.  I won't get to this for a day or so ...

> I was not completely sure whether we need to pass any sort of whitelist
> of params for parallel safety, but if you think that is the better way
> to go, I can give it a try.

There's certainly more than one way to do it.
        regards, tom lane



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Robert Haas
Дата:
On Mon, Apr 17, 2017 at 1:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
>> On Sun, Apr 16, 2017 at 9:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> FYI, I have this on my to-look-at list, and expect to fix it before Robert
>>> returns from vacation.
>
>> Let me know if an initial patch by someone else can be helpful?
>
> Sure, have a go at it.  I won't get to this for a day or so ...

Thanks to both of you.  I appreciate the help.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Amit Kapila
Дата:
On Mon, Apr 17, 2017 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
>> On Sun, Apr 16, 2017 at 9:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> FYI, I have this on my to-look-at list, and expect to fix it before Robert
>>> returns from vacation.
>
>> Let me know if an initial patch by someone else can be helpful?
>
> Sure, have a go at it.  I won't get to this for a day or so ...
>

I have ended up doing something along the lines suggested by you (or
at least what I have understood from your e-mail).  Basically, pass
the safe-param-ids list to parallel safety function and decide based
on that if Param node reference in input expression is safe.  Note
that I have added a new parameter paramids list to build_subplan to
make AlternativeSubPlan case consistent with SubPlan case even though
we should be able to do without that as it seems to be exercised only
for the correlated EXISTS case.  Also, I think we don't need to check
parallel-safety for splan->args as that also is related to correlated
queries for which parallelism is not supported as of now.

I have also explored it to do it by checking parallel-safety of
testexpr before PARAM_EXEC params are replaced in testexpr, however
that won't work because we add PARAM_SUBLINK params at the time of
parse-analysis in transformSubLink().  I am not sure if it is a good
idea to test parallel-safety of testexpr at the time of parse-analysis
phase, so didn't pursue that path.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Tom Lane
Дата:
Amit Kapila <amit.kapila16@gmail.com> writes:
> I have ended up doing something along the lines suggested by you (or
> at least what I have understood from your e-mail).  Basically, pass
> the safe-param-ids list to parallel safety function and decide based
> on that if Param node reference in input expression is safe.

I did not like changing the signature of is_parallel_safe() for this:
that's getting a lot of call sites involved in something they don't care
about, and I'm not even sure that it's formally correct.  The key thing
here is that a Param could only be considered parallel-safe in context.
That is, if you looked at the testexpr alone and asked if it could be
pushed down to a worker, the answer would have to be "no".  Only when
you look at the whole SubPlan tree and ask if it can be pushed down
should the answer be "yes".  Therefore, I feel pretty strongly that
what we want is for the safe_param_ids whitelist to be mutated
internally in is_parallel_safe() as it descends the tree.  That way
it can give the right answer when considering a fragment of a plan.
I've committed a patch that does it like that.

> ... Also, I think we don't need to check
> parallel-safety for splan->args as that also is related to correlated
> queries for which parallelism is not supported as of now.

I think leaving that sort of thing out is just creating a latent bug
that is certain to bite you later.  It's true that as long as the args
list contains only Vars, it would never be parallel-unsafe --- but
primnodes.h is pretty clear that one shouldn't assume that it will
stay that way.  So it's better to recurse over the whole tree rather
than ignoring parts of it, especially if you're not going to document
the assumption you're making anywhere.

> I have also explored it to do it by checking parallel-safety of
> testexpr before PARAM_EXEC params are replaced in testexpr, however
> that won't work because we add PARAM_SUBLINK params at the time of
> parse-analysis in transformSubLink().

AFAICS we don't do parallel-safety checks early enough to need to bother
with that, so the existing handling of SubLink and PARAM_SUBLINK params
should be fine.
        regards, tom lane



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Amit Kapila
Дата:
On Wed, Apr 19, 2017 at 1:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
>> I have ended up doing something along the lines suggested by you (or
>> at least what I have understood from your e-mail).  Basically, pass
>> the safe-param-ids list to parallel safety function and decide based
>> on that if Param node reference in input expression is safe.
>
> I did not like changing the signature of is_parallel_safe() for this:
> that's getting a lot of call sites involved in something they don't care
> about, and I'm not even sure that it's formally correct.  The key thing
> here is that a Param could only be considered parallel-safe in context.
> That is, if you looked at the testexpr alone and asked if it could be
> pushed down to a worker, the answer would have to be "no".  Only when
> you look at the whole SubPlan tree and ask if it can be pushed down
> should the answer be "yes".  Therefore, I feel pretty strongly that
> what we want is for the safe_param_ids whitelist to be mutated
> internally in is_parallel_safe() as it descends the tree.  That way
> it can give the right answer when considering a fragment of a plan.
> I've committed a patch that does it like that.
>

Thanks.

>> ... Also, I think we don't need to check
>> parallel-safety for splan->args as that also is related to correlated
>> queries for which parallelism is not supported as of now.
>
> I think leaving that sort of thing out is just creating a latent bug
> that is certain to bite you later.  It's true that as long as the args
> list contains only Vars, it would never be parallel-unsafe --- but
> primnodes.h is pretty clear that one shouldn't assume that it will
> stay that way.

Sure, but the point I was trying to make was whenever subplan has
args, I think it won't be parallel-safe as those args are used to pass
params.

>  So it's better to recurse over the whole tree rather
> than ignoring parts of it, especially if you're not going to document
> the assumption you're making anywhere.
>

No problem.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Inadequate parallel-safety check for SubPlans

От
Tom Lane
Дата:
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Wed, Apr 19, 2017 at 1:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think leaving that sort of thing out is just creating a latent bug
>> that is certain to bite you later.  It's true that as long as the args
>> list contains only Vars, it would never be parallel-unsafe --- but
>> primnodes.h is pretty clear that one shouldn't assume that it will
>> stay that way.

> Sure, but the point I was trying to make was whenever subplan has
> args, I think it won't be parallel-safe as those args are used to pass
> params.

Right now, yes, but surely we're going to be trying to relax that sometime
soon.  And at that point it would be a latent bug for this function to
not recurse into the args list.  Whatever the restrictions are on the
tree as a whole, they'll apply to that subtree too.
        regards, tom lane