Обсуждение: inconsistent results querying table partitioned by date

Поиск
Список
Период
Сортировка

inconsistent results querying table partitioned by date

От
Alan Jackson
Дата:
Hi

Im having a problem with querying a table partitioned by date.

Depending on the sequence of operations on a date parameter used in a where clause, I either get the expected results,
orno results. 

This suggests a bug in the handling of date parameters and partition range handling.

I’ve replicated this down to a single sequence of create table, insert data, query.

This issue occurs for me on postgresql 11.2 on a mac, installed via brew.

In this case the table is partitioned by an id and then by the date, if it is partitioned by only the date everything
worksas expected. 

However, I am attempting to add partitioning to a fairly large sofware-as-a-service platform, so making changes to the
tabledefinitions or global code changes is not really practical. 

The sql in question is below.

I hope there is something simple I can change in the partition definitions to work around this.

Many Thanks,
Alan Jackson
Data Architect
TVSquared


--SQL STARTS HERE

--drop table dataid;
CREATE TABLE dataid
(
id integer not null,
datadatetime timestamp without time zone NOT NULL,
CONSTRAINT dataid_pkey PRIMARY KEY (id, datadatetime)
) PARTITION BY RANGE (id, datadatetime)
;

CREATE TABLE dataid_201902 PARTITION OF dataid FOR VALUES FROM (1, '2019-02-01 00:00:00') TO (1, '2019-03-01
00:00:00');

CREATE TABLE dataid_default PARTITION OF dataid DEFAULT;

insert into dataid values (1,'2019-02-24T00:00:00');

--- returns 1 row as expected
select * from dataid where id=1 and datadatetime <  (('2019-02-26T00:00:00'::timestamp::timestamp at time zone
'America/New_York') at time zone 'UTC' + '2 days'::interval); 

--- returns no rows
select * from dataid where id=1 and datadatetime <  (('2019-02-26T00:00:00'::timestamp::timestamp at time zone
'America/New_York'+ '2 days'::interval) at time zone 'UTC'); 

-- both date expressions evaluate to the same date.
select
(('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' ) at time zone 'UTC' + '2
days'::interval)as workingdate, 
(('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC')
asnotworkingdate; 

--SQL ENDS HERE





--
TV Squared Limited is a company registered in Scotland.  Registered number:
SC421072.  Registered office: CodeBase, Argyle House, 3 Lady Lawson Street,
Edinburgh, EH3 9DR.
 
TV Squared Inc (File No. 5600204) is an Incorporated
company registered in Delaware. Principal office: 1412 Broadway, 22 Fl, New
York, New York, 10018

TV Squared GmbH is a company registered in Munich.
Registered number: HRB 236077. Registered office: Oskar-von-Miller-Ring 20,
c/o wework, 80333 Munchen

This message is private and confidential.  If
you have received this message in error, please notify us and remove it
from your system.



Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
Alan Jackson <ajax@tvsquared.com> writes:
> Im having a problem with querying a table partitioned by date.
> Depending on the sequence of operations on a date parameter used in a where clause, I either get the expected
results,or no results. 

Yeah, this is pretty clearly broken.  It looks to me like the partition
pruning code is making insupportable assumptions about a comparison to
a stable expression.  Using your example table:

regression=# explain select * from dataid where id=1 and datadatetime < localtimestamp;
     
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Bitmap Heap Scan on dataid_default  (cost=4.19..11.31 rows=3 width=12)
   Recheck Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP))
   ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.19 rows=3 width=0)
         Index Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP))
(4 rows)

It should absolutely not have pruned away the dataid_201902 partition,
but it did.  It's okay with an immutable expression:

regression=# explain select * from dataid where id=1 and datadatetime < '2019-05-09'::timestamp;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Append  (cost=4.18..22.63 rows=6 width=12)
   ->  Bitmap Heap Scan on dataid_201902  (cost=4.18..11.30 rows=3 width=12)
         Recheck Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone))
         ->  Bitmap Index Scan on dataid_201902_pkey  (cost=0.00..4.18 rows=3 width=0)
               Index Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone))
   ->  Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
         Recheck Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone))
         ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18 rows=3 width=0)
               Index Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone))
(9 rows)

or a volatile one:

regression=# explain select * from dataid where id=1 and datadatetime < '2019-05-09'::timestamp + random()*'1
day'::interval;
                                                       QUERY PLAN


------------------------------------------------------------------------------------------------------------------------
 Append  (cost=4.23..29.80 rows=6 width=12)
   ->  Bitmap Heap Scan on dataid_201902  (cost=4.23..14.88 rows=3 width=12)
         Recheck Cond: (id = 1)
         Filter: (datadatetime < ('2019-05-09 00:00:00'::timestamp without time zone + (random() * '1 day'::interval)))
         ->  Bitmap Index Scan on dataid_201902_pkey  (cost=0.00..4.23 rows=10 width=0)
               Index Cond: (id = 1)
   ->  Bitmap Heap Scan on dataid_default  (cost=4.23..14.88 rows=3 width=12)
         Recheck Cond: (id = 1)
         Filter: (datadatetime < ('2019-05-09 00:00:00'::timestamp without time zone + (random() * '1 day'::interval)))
         ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.23 rows=10 width=0)
               Index Cond: (id = 1)
(11 rows)

but somebody's confused about what can be done with stable expressions.

While I'm on about it, this behavior is also insupportable:

regression=# explain select * from dataid where id=1 and datadatetime < '2018-05-09'::timestamptz ;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
   Recheck Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone))
   ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18 rows=3 width=0)
         Index Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone))
(4 rows)

because timestamp-against-timestamptz comparison is inherently only
stable; the pruning code is way exceeding its authority by supposing
that a comparison that holds at plan time will hold at runtime,
even with a constant comparison value.

The reason for the difference in your results is that one expression
is immutable and the other is only stable:

regression=# explain verbose select
(('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' ) at time zone 'UTC' + '2
days'::interval)as workingdate, 
(('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC')
asnotworkingdate; 
                                                                           QUERY PLAN
                                        

----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=16)
   Output: '2019-02-28 05:00:00'::timestamp without time zone, timezone('UTC'::text, ('2019-02-26
00:00:00-05'::timestampwith time zone + '2 days'::interval)) 
(2 rows)

the reason being that timestamptz + interval depends on the timezone
setting (for some intervals) but timestamp + interval never does.

Seems to be equally broken in v11 and HEAD.  I didn't try v10.

> I hope there is something simple I can change in the partition definitions to work around this.

Until we fix the bug, I think the best you can do is not use stable
expressions in this context.

            regards, tom lane



Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/10 8:22, Tom Lane wrote:
> Alan Jackson <ajax@tvsquared.com> writes:
>> Im having a problem with querying a table partitioned by date.
>> Depending on the sequence of operations on a date parameter used in a where clause, I either get the expected
results,or no results.
 
> 
> Yeah, this is pretty clearly broken.

Indeed it is.

> It looks to me like the partition
> pruning code is making insupportable assumptions about a comparison to
> a stable expression.  Using your example table:
> 
> regression=# explain select * from dataid where id=1 and datadatetime < localtimestamp;
      
 
>                                     QUERY PLAN                                    
> ----------------------------------------------------------------------------------
>  Bitmap Heap Scan on dataid_default  (cost=4.19..11.31 rows=3 width=12)
>    Recheck Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP))
>    ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.19 rows=3 width=0)
>          Index Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP))
> (4 rows)
> 
> It should absolutely not have pruned away the dataid_201902 partition,
> but it did.
> 
> While I'm on about it, this behavior is also insupportable:
> 
> regression=# explain select * from dataid where id=1 and datadatetime < '2018-05-09'::timestamptz ;

(likely a typo in the condition: s/2018/2019)

>                                                QUERY PLAN                                               
> --------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
>    Recheck Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone))
>    ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18 rows=3 width=0)
>          Index Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone))
> (4 rows)
> 
> because timestamp-against-timestamptz comparison is inherently only
> stable; the pruning code is way exceeding its authority by supposing
> that a comparison that holds at plan time will hold at runtime,
> even with a constant comparison value.

I looked into it and the problem is not really that plan-time pruning is
comparing stable expressions against partition bounds.  If it had, it
wouldn't have pruned dataid_201902 anyway, because its bounding range for
datadatetime is '2019-02-01' to '2019-03-01', which is clearly less than
'2019-05-09' (Tom's localtimestamp).

The real problem seems to be with the way range partition pruning assumes
an operator strategy to perform pruning with.  Quoted query's WHERE clause
looks something like this: (firstkey = CONSTANT AND secondkey <
STABLE_EXPRESSION).  From this set of clauses, a (CONSTANT,
STABLE_EXPRESSION) tuple is formed to be compared against partition bounds
using row-comparison-like semantics.  As things stand today, the actual
comparison function (partprune.c: get_matching_range_bounds()) receives
only the strategy of the last expression, which in this case is that of a
LESS operator.  When the tuple (CONSTANT, STABLE_EXPRESSION) passes
through the last step to extract Datum values to be passed to
get_matching_range_bounds, it's correctly determined that
STABLE_EXPRESSION cannot be computed during planning and so the tuple is
truncated to just (CONSTANT-Datum), but the strategy to assume during
pruning is still that of the LESS operator, whereas now it should really
be EQUAL.  With LESS semantics, get_matching_range_bounds() concludes that
no partition bounds are smaller than 1 (extracted from id=1 in the above
query), except the default partition, so it prunes 'dataid_201902'.

I've attached a patch to fix that.  Actually, I've attached two patches --
the 1st one adds a test for the misbehaving case with *wrong* output
wherein a partition is incorrectly pruned, and the 2nd actually fixes the
bug and updates the output of the test added by the 1st patch.  Divided
the patch this way just to show the bug clearly.

> Seems to be equally broken in v11 and HEAD.  I didn't try v10.

v10 is fine, as it uses constraint exclusion.

Attached patches apply to both v11 and HEAD.

Thanks,
Amit

Вложения

Re: inconsistent results querying table partitioned by date

От
Kyotaro HORIGUCHI
Дата:
At Fri, 10 May 2019 14:37:34 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<49cd5235-cbe6-e686-8014-85c1e45efe56@lab.ntt.co.jp>
> > because timestamp-against-timestamptz comparison is inherently only
> > stable; the pruning code is way exceeding its authority by supposing
> > that a comparison that holds at plan time will hold at runtime,
> > even with a constant comparison value.
> 
> I looked into it and the problem is not really that plan-time pruning is
> comparing stable expressions against partition bounds.  If it had, it
> wouldn't have pruned dataid_201902 anyway, because its bounding range for
> datadatetime is '2019-02-01' to '2019-03-01', which is clearly less than
> '2019-05-09' (Tom's localtimestamp).
> 
> The real problem seems to be with the way range partition pruning assumes
> an operator strategy to perform pruning with.  Quoted query's WHERE clause
> looks something like this: (firstkey = CONSTANT AND secondkey <
> STABLE_EXPRESSION).  From this set of clauses, a (CONSTANT,
> STABLE_EXPRESSION) tuple is formed to be compared against partition bounds
> using row-comparison-like semantics.  As things stand today, the actual
> comparison function (partprune.c: get_matching_range_bounds()) receives
> only the strategy of the last expression, which in this case is that of a
> LESS operator.  When the tuple (CONSTANT, STABLE_EXPRESSION) passes
> through the last step to extract Datum values to be passed to
> get_matching_range_bounds, it's correctly determined that
> STABLE_EXPRESSION cannot be computed during planning and so the tuple is
> truncated to just (CONSTANT-Datum), but the strategy to assume during
> pruning is still that of the LESS operator, whereas now it should really
> be EQUAL.  With LESS semantics, get_matching_range_bounds() concludes that
> no partition bounds are smaller than 1 (extracted from id=1 in the above
> query), except the default partition, so it prunes 'dataid_201902'.

I concluded the same.

> I've attached a patch to fix that.  Actually, I've attached two patches --
> the 1st one adds a test for the misbehaving case with *wrong* output
> wherein a partition is incorrectly pruned, and the 2nd actually fixes the
> bug and updates the output of the test added by the 1st patch.  Divided
> the patch this way just to show the bug clearly.

But this seems a bit wrong.

If the two partition keys were in reverse order, pruning still
fails.

CREATE TABLE dataid2 (
       datadatetime timestamp without time zone NOT NULL,
       id integer not null,
       CONSTRAINT dataid2_pkey PRIMARY KEY (datadatetime, id)
) PARTITION BY RANGE (datadatetime, id);

CREATE TABLE dataid2_201902 PARTITION OF dataid2 FOR VALUES FROM ('2019-02-01 00:00:00', 1) TO ('2019-03-01 00:00:00',
1);

CREATE TABLE dataid2_default PARTITION OF dataid2 DEFAULT;

insert into dataid2 values ('2019-02-24T00:00:00', 1);

select * from dataid2 where id = 1 and datadatetime <  (('2019-02-26T00:00:00'::timestamp::timestamp at time zone
'America/New_York'+ '2 days'::interval) at time zone 'UTC');
 
 datadatetime | id 
--------------+----
(0 rows)

This is wrong.

The condition is divided into two part (id = 1) and (datadatetime
< ..) and the latter reduces to nothing and the former remains
unchanged. Pruning continues using id = 1 and (I suppose) but
that is not partition_range_datum_bsearch()'s assumption. As the
result all partitions (other than default) are gone.

In passing I found a typo while looking this issue.

|   case BTLessStrategyNumber:
|
|     /*
|      * Look for the greatest bound that is < or <= lookup value and
|      * set minoff to its offset.

I think the "minoff" is typo of "maxoff".


> > Seems to be equally broken in v11 and HEAD.  I didn't try v10.
> 
> v10 is fine, as it uses constraint exclusion.
> 
> Attached patches apply to both v11 and HEAD.

Mmm. This doesn't apply on head on my environment.

> patching file src/test/regress/expected/partition_prune.out
> Hunk #1 FAILED at 951.

git rev-parse --short HEAD
d0bbf871ca

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
Horiguchi-san,

Thanks for checking.

On 2019/05/10 15:33, Kyotaro HORIGUCHI wrote:
> At Fri, 10 May 2019 14:37:34 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> I've attached a patch to fix that.  Actually, I've attached two patches --
>> the 1st one adds a test for the misbehaving case with *wrong* output
>> wherein a partition is incorrectly pruned, and the 2nd actually fixes the
>> bug and updates the output of the test added by the 1st patch.  Divided
>> the patch this way just to show the bug clearly.
> 
> But this seems a bit wrong.
> 
> If the two partition keys were in reverse order, pruning still
> fails.
> 
> CREATE TABLE dataid2 (
>        datadatetime timestamp without time zone NOT NULL,
>        id integer not null,
>        CONSTRAINT dataid2_pkey PRIMARY KEY (datadatetime, id)
> ) PARTITION BY RANGE (datadatetime, id);
> 
> CREATE TABLE dataid2_201902 PARTITION OF dataid2 FOR VALUES FROM ('2019-02-01 00:00:00', 1) TO ('2019-03-01
00:00:00',1);
 
> 
> CREATE TABLE dataid2_default PARTITION OF dataid2 DEFAULT;
> 
> insert into dataid2 values ('2019-02-24T00:00:00', 1);
> 
> select * from dataid2 where id = 1 and datadatetime <  (('2019-02-26T00:00:00'::timestamp::timestamp at time zone
'America/New_York'+ '2 days'::interval) at time zone 'UTC');
 
>  datadatetime | id 
> --------------+----
> (0 rows)
> 
> This is wrong.

Ah, indeed.

> The condition is divided into two part (id = 1) and (datadatetime
> < ..) and the latter reduces to nothing and the former remains
> unchanged. Pruning continues using id = 1 and (I suppose) but
> that is not partition_range_datum_bsearch()'s assumption. As the
> result all partitions (other than default) are gone.

I'd have expected this to result in no values being passed to
get_matching_range_bounds (nvalues = 0), because the value of the
expression compared against the 1st key is unavailable at planning time,
meaning the values for subsequent keys should be ignored.  That would have
meant there's nothing to prune with and hence no pruning should have
occurred.  The problem however does not seem to be with the new logic I
proposed but what turned out to be another bug in
gen_prune_steps_from_opexps().

Given that datadatetime is the first key in your example table and it's
being compared using a non-inclusive operator, clauses for subsequent key
columns (in this case id = 1) should have been ignored by pruning, which
fails to happen due to a bug in gen_prune_steps_from_opexps().  I've fixed
that in the attached updated patch.  With the patch, queries like yours
return the correct result as shown below:

truncate dataid2;

insert into dataid2 (datadatetime, id) select
(('2019-02-22T00:00:00'::timestamp::timestamp at time zone
'America/New_York' + '2 days'::interval) at time zone 'UTC'), 1;

insert into dataid2 (datadatetime, id) select
(('2019-02-22T00:00:00'::timestamp::timestamp at time zone
'America/New_York' + '1 days'::interval) at time zone 'UTC'), 1;

-- pruning steps generated for only 1st column (key1 < expr1)
-- but no pruning occurs during planning because the value for
-- datadatetime's clause is unavailable

explain select * from dataid2 where id = 1 and datadatetime <
(('2019-02-22T00:00:00'::timestamp::timestamp at time zone
'America/New_York' + '2 days'::interval) at time zone 'UTC');

QUERY PLAN


────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Append  (cost=10.96..36.21 rows=6 width=12)
   ->  Bitmap Heap Scan on dataid2_201902  (cost=10.96..18.09 rows=3 width=12)
         Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22
14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1))
         ->  Bitmap Index Scan on dataid2_201902_pkey  (cost=0.00..10.96
rows=3 width=0)
               Index Cond: ((datadatetime < timezone('UTC'::text,
('2019-02-22 14:00:00+09'::timestamp with time zone + '2
days'::interval))) AND (id = 1))
   ->  Bitmap Heap Scan on dataid2_default  (cost=10.96..18.09 rows=3
width=12)
         Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22
14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1))
         ->  Bitmap Index Scan on dataid2_default_pkey  (cost=0.00..10.96
rows=3 width=0)
               Index Cond: ((datadatetime < timezone('UTC'::text,
('2019-02-22 14:00:00+09'::timestamp with time zone + '2
days'::interval))) AND (id = 1))
(9 rows)

select * from dataid2 where id = 1 and datadatetime <
(('2019-02-22T00:00:00'::timestamp::timestamp at time zone
'America/New_York' + '2 days'::interval) at time zone 'UTC');
    datadatetime     │ id
─────────────────────┼────
 2019-02-23 05:00:00 │  1
(1 row)

-- pruning steps generated for both columns (key1 = expr1, key2 = expr2),
-- but no pruning occurs during planning because the value for
-- datadatetime's clause is unavailable

explain select * from dataid2 where id = 1 and datadatetime =
(('2019-02-22T00:00:00'::timestamp::timestamp at time zone
'America/New_York' + '2 days'::interval) at time zone 'UTC');

QUERY PLAN


──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Append  (cost=0.16..16.37 rows=2 width=12)
   Subplans Removed: 1
   ->  Index Only Scan using dataid2_201902_pkey on dataid2_201902
(cost=0.16..8.18 rows=1 width=12)
         Index Cond: ((datadatetime = timezone('UTC'::text, ('2019-02-22
14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1))
(4 rows)


select * from dataid2 where id = 1 and datadatetime =
(('2019-02-22T00:00:00'::timestamp::timestamp at time zone
'America/New_York' + '2 days'::interval) at time zone 'UTC');
    datadatetime     │ id
─────────────────────┼────
 2019-02-24 05:00:00 │  1
(1 row)

> In passing I found a typo while looking this issue.
> 
> |   case BTLessStrategyNumber:
> |
> |     /*
> |      * Look for the greatest bound that is < or <= lookup value and
> |      * set minoff to its offset.
> 
> I think the "minoff" is typo of "maxoff".

Ah, fixed in the updated patch.

>>> Seems to be equally broken in v11 and HEAD.  I didn't try v10.
>>
>> v10 is fine, as it uses constraint exclusion.
>>
>> Attached patches apply to both v11 and HEAD.
> 
> Mmm. This doesn't apply on head on my environment.
> 
>> patching file src/test/regress/expected/partition_prune.out
>> Hunk #1 FAILED at 951.
> 
> git rev-parse --short HEAD
> d0bbf871ca

Hmm, I rebased the patch's branch over master and didn't get any
conflicts.  Please check with the new patch.

Thanks,
Amit

Вложения

Re: inconsistent results querying table partitioned by date

От
Alan Jackson
Дата:
Hi

Many thanks for the quick and patch-filled response to this issue. It’s great to see such an able and active response!

I’m glad my bug report highlighted the problem sufficiently clearly.

Is this likely to turn into an official patch, and if so, where can I track which release etc it will be included in?

Thanks again,
Alan Jackson
Data Architect
TVSquared



> On 10 May 2019, at 10:18, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> Horiguchi-san,
>
> Thanks for checking.
>
>> On 2019/05/10 15:33, Kyotaro HORIGUCHI wrote:
>> At Fri, 10 May 2019 14:37:34 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>> I've attached a patch to fix that.  Actually, I've attached two patches --
>>> the 1st one adds a test for the misbehaving case with *wrong* output
>>> wherein a partition is incorrectly pruned, and the 2nd actually fixes the
>>> bug and updates the output of the test added by the 1st patch.  Divided
>>> the patch this way just to show the bug clearly.
>>
>> But this seems a bit wrong.
>>
>> If the two partition keys were in reverse order, pruning still
>> fails.
>>
>> CREATE TABLE dataid2 (
>>       datadatetime timestamp without time zone NOT NULL,
>>       id integer not null,
>>       CONSTRAINT dataid2_pkey PRIMARY KEY (datadatetime, id)
>> ) PARTITION BY RANGE (datadatetime, id);
>>
>> CREATE TABLE dataid2_201902 PARTITION OF dataid2 FOR VALUES FROM ('2019-02-01 00:00:00', 1) TO ('2019-03-01
00:00:00',1); 
>>
>> CREATE TABLE dataid2_default PARTITION OF dataid2 DEFAULT;
>>
>> insert into dataid2 values ('2019-02-24T00:00:00', 1);
>>
>> select * from dataid2 where id = 1 and datadatetime <  (('2019-02-26T00:00:00'::timestamp::timestamp at time zone
'America/New_York'+ '2 days'::interval) at time zone 'UTC'); 
>> datadatetime | id
>> --------------+----
>> (0 rows)
>>
>> This is wrong.
>
> Ah, indeed.
>
>> The condition is divided into two part (id = 1) and (datadatetime
>> < ..) and the latter reduces to nothing and the former remains
>> unchanged. Pruning continues using id = 1 and (I suppose) but
>> that is not partition_range_datum_bsearch()'s assumption. As the
>> result all partitions (other than default) are gone.
>
> I'd have expected this to result in no values being passed to
> get_matching_range_bounds (nvalues = 0), because the value of the
> expression compared against the 1st key is unavailable at planning time,
> meaning the values for subsequent keys should be ignored.  That would have
> meant there's nothing to prune with and hence no pruning should have
> occurred.  The problem however does not seem to be with the new logic I
> proposed but what turned out to be another bug in
> gen_prune_steps_from_opexps().
>
> Given that datadatetime is the first key in your example table and it's
> being compared using a non-inclusive operator, clauses for subsequent key
> columns (in this case id = 1) should have been ignored by pruning, which
> fails to happen due to a bug in gen_prune_steps_from_opexps().  I've fixed
> that in the attached updated patch.  With the patch, queries like yours
> return the correct result as shown below:
>
> truncate dataid2;
>
> insert into dataid2 (datadatetime, id) select
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC'), 1;
>
> insert into dataid2 (datadatetime, id) select
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '1 days'::interval) at time zone 'UTC'), 1;
>
> -- pruning steps generated for only 1st column (key1 < expr1)
> -- but no pruning occurs during planning because the value for
> -- datadatetime's clause is unavailable
>
> explain select * from dataid2 where id = 1 and datadatetime <
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC');
>
> QUERY PLAN
>
>
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> Append  (cost=10.96..36.21 rows=6 width=12)
>   ->  Bitmap Heap Scan on dataid2_201902  (cost=10.96..18.09 rows=3 width=12)
>         Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22
> 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1))
>         ->  Bitmap Index Scan on dataid2_201902_pkey  (cost=0.00..10.96
> rows=3 width=0)
>               Index Cond: ((datadatetime < timezone('UTC'::text,
> ('2019-02-22 14:00:00+09'::timestamp with time zone + '2
> days'::interval))) AND (id = 1))
>   ->  Bitmap Heap Scan on dataid2_default  (cost=10.96..18.09 rows=3
> width=12)
>         Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22
> 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1))
>         ->  Bitmap Index Scan on dataid2_default_pkey  (cost=0.00..10.96
> rows=3 width=0)
>               Index Cond: ((datadatetime < timezone('UTC'::text,
> ('2019-02-22 14:00:00+09'::timestamp with time zone + '2
> days'::interval))) AND (id = 1))
> (9 rows)
>
> select * from dataid2 where id = 1 and datadatetime <
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC');
>    datadatetime     │ id
> ─────────────────────┼────
> 2019-02-23 05:00:00 │  1
> (1 row)
>
> -- pruning steps generated for both columns (key1 = expr1, key2 = expr2),
> -- but no pruning occurs during planning because the value for
> -- datadatetime's clause is unavailable
>
> explain select * from dataid2 where id = 1 and datadatetime =
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC');
>
> QUERY PLAN
>
>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> Append  (cost=0.16..16.37 rows=2 width=12)
>   Subplans Removed: 1
>   ->  Index Only Scan using dataid2_201902_pkey on dataid2_201902
> (cost=0.16..8.18 rows=1 width=12)
>         Index Cond: ((datadatetime = timezone('UTC'::text, ('2019-02-22
> 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1))
> (4 rows)
>
>
> select * from dataid2 where id = 1 and datadatetime =
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC');
>    datadatetime     │ id
> ─────────────────────┼────
> 2019-02-24 05:00:00 │  1
> (1 row)
>
>> In passing I found a typo while looking this issue.
>>
>> |   case BTLessStrategyNumber:
>> |
>> |     /*
>> |      * Look for the greatest bound that is < or <= lookup value and
>> |      * set minoff to its offset.
>>
>> I think the "minoff" is typo of "maxoff".
>
> Ah, fixed in the updated patch.
>
>>>> Seems to be equally broken in v11 and HEAD.  I didn't try v10.
>>>
>>> v10 is fine, as it uses constraint exclusion.
>>>
>>> Attached patches apply to both v11 and HEAD.
>>
>> Mmm. This doesn't apply on head on my environment.
>>
>>> patching file src/test/regress/expected/partition_prune.out
>>> Hunk #1 FAILED at 951.
>>
>> git rev-parse --short HEAD
>> d0bbf871ca
>
> Hmm, I rebased the patch's branch over master and didn't get any
> conflicts.  Please check with the new patch.
>
> Thanks,
> Amit
> <v2-0001-Add-test.patch>
> <v2-0002-Bug-fix.patch>

--
TV Squared Limited is a company registered in Scotland.  Registered number:
SC421072.  Registered office: CodeBase, Argyle House, 3 Lady Lawson Street,
Edinburgh, EH3 9DR.
 
TV Squared Inc (File No. 5600204) is an Incorporated
company registered in Delaware. Principal office: 1412 Broadway, 22 Fl, New
York, New York, 10018

TV Squared GmbH is a company registered in Munich.
Registered number: HRB 236077. Registered office: Oskar-von-Miller-Ring 20,
c/o wework, 80333 Munchen

This message is private and confidential.  If
you have received this message in error, please notify us and remove it
from your system.



Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
Alan Jackson <ajax@tvsquared.com> writes:
> Is this likely to turn into an official patch, and if so, where can I track which release etc it will be included in?

We're definitely going to do *something*, though whether the current
patches are the last word is too early to say.

Unfortunately, we just missed the regular quarterly minor releases [1].
Unless a bug turns up that's sufficiently urgent to justify an
off-schedule release (no, this one isn't), the next patch release
for v11 will be 11.4 in August.  I'd say the odds are near 100%
that some fix will be committed before then.

Actually, the odds are high that we'll have committed a fix within
a few days; we don't normally let this sort of discussion drag on.
If this problem is enough of a show-stopper for you, you could
consider applying the patch locally once it's committed.

            regards, tom lane

[1] https://www.postgresql.org/developer/roadmap/



Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> [ v2 patches ]

While this does seem to be fixing real bugs, it's failing to address my
point about stable vs. immutable operators.

Taking Alan's test case again (and using the v2 patch), consider:

regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamp;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
   Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00'::timestamp without time zone))
   ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18 rows=3 width=0)
         Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00'::timestamp without time zone))
(4 rows)

That's fine.  The given date is older than anything in the dataid_201902
partition, so we can prune.  But change it to this:

regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
   Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone))
   ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18 rows=3 width=0)
         Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone))
(4 rows)

That's not fine.  What we have here is a "timestamp < timestamptz"
operator, which is only stable, therefore it might give different
results at runtime than at plan time.  You can't make plan-time
pruning decisions with that.  What we should have gotten here was
an Append node that could do run-time pruning.

            regards, tom lane



Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/11 6:05, Tom Lane wrote:
> regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz;
>                                                QUERY PLAN                                               
> --------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
>    Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone))
>    ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18 rows=3 width=0)
>          Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone))
> (4 rows)
> 
> That's not fine.  What we have here is a "timestamp < timestamptz"
> operator, which is only stable, therefore it might give different
> results at runtime than at plan time.  You can't make plan-time
> pruning decisions with that.  What we should have gotten here was
> an Append node that could do run-time pruning.

You're right.  It seems that prune_append_rel_partitions() is forgetting
to filter mutable clauses from rel->baserestrictinfo, like
relation_excluded_by_constraints() does.  I fixed that in the attached
0003 patch, which also adds a test for this scenario.  I needed to also
tweak run-time pruning support code a bit so that it considers the cases
involving mutable functions as requiring (startup) run-time pruning, in
addition to the cases with mutable expressions.  Adding David if he wants
to comment.

0003 patch cannot be applied as-is to both REL_11_STABLE and HEAD
branches, so I've attached two files, one for each branch.

BTW, while looking at this, I came across this comment in parse_coerce.c:
coerce_type():

       * XXX if the typinput function is not immutable, we really ought to
       * postpone evaluation of the function call until runtime. But there
       * is no way to represent a typinput function call as an expression
       * tree, because C-string values are not Datums. (XXX This *is*
       * possible as of 7.3, do we want to do it?)

Should something be done about that?

Thanks,
Amit

Вложения

Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Mon, 13 May 2019 at 17:40, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> On 2019/05/11 6:05, Tom Lane wrote:
> > regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz;
> >                                                QUERY PLAN
> > --------------------------------------------------------------------------------------------------------
> >  Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
> >    Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone))
> >    ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18 rows=3 width=0)
> >          Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone))
> > (4 rows)
> >
> > That's not fine.  What we have here is a "timestamp < timestamptz"
> > operator, which is only stable, therefore it might give different
> > results at runtime than at plan time.  You can't make plan-time
> > pruning decisions with that.  What we should have gotten here was
> > an Append node that could do run-time pruning.
>
> You're right.  It seems that prune_append_rel_partitions() is forgetting
> to filter mutable clauses from rel->baserestrictinfo, like
> relation_excluded_by_constraints() does.  I fixed that in the attached
> 0003 patch, which also adds a test for this scenario.  I needed to also
> tweak run-time pruning support code a bit so that it considers the cases
> involving mutable functions as requiring (startup) run-time pruning, in
> addition to the cases with mutable expressions.  Adding David if he wants
> to comment.

Yeah. I don't think you're going about this the right way.  I don't
really see why we need to make any changes to the run-time pruning
code here, that part seems fine to me.  The problem seems to be that
match_clause_to_partition_key() thinks it can use a non-const
expression to compare to the partition key.  All immutable function
calls will already be folded to constants by this time, so what's
wrong with just insisting that the value being compared to the
partition key is a constant when generating steps during planning?
When we generate steps for run-time pruning we can use stable function
return values, but obviously still not volatile functions.   So isn't
the fix just to provide gen_partprune_steps with some sort of context
as to the environment that it's generating steps for?  Fixing it by
insisting the planner pruning only uses consts saves us some cycles
where we check for Vars and volatile expressions too. That's a pretty
good saving when you consider the fact that the expression sometimes
could be fairly complex.

I did it that way in the attached then tried running with your tests
and got failures.  After looking at those your tests look like they're
expecting the wrong results.

E.g

+explain (analyze, costs off, summary off, timing off) select * from
mc3p where a = 1 and abs(b) < (select 2);
+                       QUERY PLAN
+--------------------------------------------------------
+ Append (actual rows=0 loops=1)
+   InitPlan 1 (returns $0)
+     ->  Result (actual rows=1 loops=1)
+   ->  Seq Scan on mc3p0 (actual rows=0 loops=1)
+         Filter: ((a = 1) AND (abs(b) < $0))
+   ->  Seq Scan on mc3p_default (actual rows=0 loops=1)
+         Filter: ((a = 1) AND (abs(b) < $0))
+(7 rows)

to which, it does not seem very good that you pruned "mc3p1", since
(with my patch)

postgres=# insert into mc3p (a,b,c) values(1,1,1);
INSERT 0 1
postgres=# select tableoid::regclass,* from mc3p where a = 1 and
abs(b) < (select 2);
 tableoid | a | b | c
----------+---+---+---
 mc3p1    | 1 | 1 | 1
(1 row)

If those tests were passing in your code then you'd have missed that
row... Oops.

I've attached how I think the fix should look and included some tests
too.  I was a bit unsure if it was worth adding the new enum or not.
Probably could be just a bool, but I thought enum as it makes looking
at function calls easier on the human eye.  I also included a test
that ensures an immutable function's value is used for pruning during
planning...  Half of me thinks that test is pretty useless since the
return value is converted to a const well before pruning, but I left
it there anyway...

I've not done the PG11 version of the patch but I can do it if the
PG12 one looks okay.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/14 10:50, David Rowley wrote:
> On Mon, 13 May 2019 at 17:40, Amit Langote wrote:
>> On 2019/05/11 6:05, Tom Lane wrote:
>>> regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz;
>>>                                                QUERY PLAN
>>> --------------------------------------------------------------------------------------------------------
>>>  Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
>>>    Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone))
>>>    ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18 rows=3 width=0)
>>>          Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone))
>>> (4 rows)
>>>
>>> That's not fine.  What we have here is a "timestamp < timestamptz"
>>> operator, which is only stable, therefore it might give different
>>> results at runtime than at plan time.  You can't make plan-time
>>> pruning decisions with that.  What we should have gotten here was
>>> an Append node that could do run-time pruning.
>>
>> You're right.  It seems that prune_append_rel_partitions() is forgetting
>> to filter mutable clauses from rel->baserestrictinfo, like
>> relation_excluded_by_constraints() does.  I fixed that in the attached
>> 0003 patch, which also adds a test for this scenario.  I needed to also
>> tweak run-time pruning support code a bit so that it considers the cases
>> involving mutable functions as requiring (startup) run-time pruning, in
>> addition to the cases with mutable expressions.  Adding David if he wants
>> to comment.
> 
> Yeah. I don't think you're going about this the right way.  I don't
> really see why we need to make any changes to the run-time pruning
> code here, that part seems fine to me.  The problem seems to be that
> match_clause_to_partition_key() thinks it can use a non-const
> expression to compare to the partition key.  All immutable function
> calls will already be folded to constants by this time, so what's
> wrong with just insisting that the value being compared to the
> partition key is a constant when generating steps during planning?

The problem is different.  '2018-01-01'::timestamptz' in the condition
datadatetime < '2018-01-01'::timestamptz as presented to
match_clause_to_partition_key() is indeed a Const node, making it think
that it's OK to prune using it, that is, with or without your patch.
Here's the result for Tom's query quoted above, with your patch applied:

explain analyze select * from dataid where id=1 and datadatetime <
'2018-01-01'::timestamptz;
                                                         QUERY PLAN


────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
(actual time=0.050..0.058 rows=0 loops=1)
   Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01
00:00:00+09'::timestamp with time zone))
   ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18 rows=3
width=0) (actual time=0.027..0.035 rows=0 loops=1)
         Index Cond: ((id = 1) AND (datadatetime < '2018-01-01
00:00:00+09'::timestamp with time zone))
 Planning Time: 33660.807 ms
 Execution Time: 0.236 ms
(6 rows)

which is same as without the patch and is wrong as Tom complains.  His
complaint is that planning-time pruning should not have considered this
clause, because its result is only stable, not immutable.  That is, the
operator '<' (function timestamp_lt_timestamptz() in this case) is only
stable, not immutable.  The expected plan in this case is Append node with
run-time pruning set up and initial pruning will do the pruning, which you
get with my patch:

explain select * from dataid where id=1 and datadatetime <
'2018-01-01'::timestamptz;
                                                  QUERY PLAN

──────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Append  (cost=4.18..22.63 rows=6 width=12)
   Subplans Removed: 1
   ->  Bitmap Heap Scan on dataid_default  (cost=4.18..11.30 rows=3 width=12)
         Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01
00:00:00+09'::timestamp with time zone))
         ->  Bitmap Index Scan on dataid_default_pkey  (cost=0.00..4.18
rows=3 width=0)
               Index Cond: ((id = 1) AND (datadatetime < '2018-01-01
00:00:00+09'::timestamp with time zone))
(6 rows)

The case you seem to be thinking of is where the condition is of shape
"partkey op stable-valued-function", but that case works fine today, so
needs no fixing.

Thanks,
Amit




Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Tue, 14 May 2019 at 16:07, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> The problem is different.  '2018-01-01'::timestamptz' in the condition
> datadatetime < '2018-01-01'::timestamptz as presented to
> match_clause_to_partition_key() is indeed a Const node, making it think
> that it's OK to prune using it, that is, with or without your patch.

Looks like I misunderstood the problem.

Is it not still better to ignore the unsupported quals in
match_clause_to_partition_key() rather than crafting a new List of
just the valid ones like you've done in your patch?

I admit that at some point during our parallel development of pruning
and run-time pruning, I thought that the pruning steps generated for
plan-time pruning could have been reused for run-time pruning, but we
can't do that since we must also use parameterised quals, which are
not in the baserestrictinfo therefore not there when we generate the
steps for the plan-time pruning. I feel that what you've got spreads
the logic out a bit too much. match_clause_to_partition_key() has
traditionally been in charge of deciding what quals can be used and
which ones can't. You've gone and added logic in
prune_append_rel_partitions() that makes part of this decision now and
that feels a bit wrong to me.

I've attached patch of how I think it should work.  This includes the
changes you made to analyze_partkey_exprs() and your tests from
v1-0003-Fix-planner-to-not-prune-partitions-with-non-immu.patch

Do you think it's a bad idea to do it this way?

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/14 15:09, David Rowley wrote:
> On Tue, 14 May 2019 at 16:07, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>
>> The problem is different.  '2018-01-01'::timestamptz' in the condition
>> datadatetime < '2018-01-01'::timestamptz as presented to
>> match_clause_to_partition_key() is indeed a Const node, making it think
>> that it's OK to prune using it, that is, with or without your patch.
> 
> Looks like I misunderstood the problem.
> 
> Is it not still better to ignore the unsupported quals in
> match_clause_to_partition_key() rather than crafting a new List of
> just the valid ones like you've done in your patch?

That's another way...

> I feel that what you've got spreads
> the logic out a bit too much. match_clause_to_partition_key() has
> traditionally been in charge of deciding what quals can be used and
> which ones can't.  You've gone and added logic in> prune_append_rel_partitions() that makes part of this decision now
and
> that feels a bit wrong to me.

Actually, I had considered a solution like yours of distinguishing
prune_append_rel_partitions()'s invocations of gen_partprune_steps() from
make_partition_pruneinfo()'s, but thought it would be too much churn.

Also, another thing that pushed me toward the approach I took is what I
saw in predtest.c.  I mentioned upthread that constraint exclusion doesn't
have this problem, that is, it knows to skip stable-valued clauses, so I
went into predicate_refuted_by() and underlings to see how they do that,
but the logic wasn't down there.  It was in the following code in
relation_excluded_by_constraints():

    /*
     * ... We dare not make
     * deductions with non-immutable functions
     * ...
     */
    safe_restrictions = NIL;
    foreach(lc, rel->baserestrictinfo)
    {
        RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);

        if (!contain_mutable_functions((Node *) rinfo->clause))
            safe_restrictions = lappend(safe_restrictions, rinfo->clause);
    }

I think prune_append_rel_partitions() is playing a similar role as
relation_excluded_by_constraints(), that is, of dispatching the
appropriate clauses to the main pruning logic.  So, I thought enforcing
this policy in prune_append_rel_partitions() wouldn't be such a bad idea.

> I've attached patch of how I think it should work.  This includes the
> changes you made to analyze_partkey_exprs() and your tests from
> v1-0003-Fix-planner-to-not-prune-partitions-with-non-immu.patch

Thanks for updating the patch.  It works now.

> Do you think it's a bad idea to do it this way?

As I mentioned above, I think this may be quite a bit of code churn for
enforcing a policy that's elsewhere enforced in a much simpler manner.
How about deferring this to committer?

Thanks,
Amit




Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> On 2019/05/14 15:09, David Rowley wrote:
>> Do you think it's a bad idea to do it this way?

> As I mentioned above, I think this may be quite a bit of code churn for
> enforcing a policy that's elsewhere enforced in a much simpler manner.
> How about deferring this to committer?

Not sure I'll be the one to commit this, but ...

I don't much like the approach in
v1-0003-Fix-planner-to-not-prune-partitions-with-non-immu.patch
because (a) it appears to me to add duplicative expression mutability
testing, which is kind of expensive since that adds syscache lookups,
plus it adds such tests even for clauses that aren't going to match any
partition keys; and (b) it's extremely nonobvious that this restricts
plan-time pruning and not run-time pruning.  I do see that
prune_append_rel_partitions is used only for the former; but you sure
as heck can't deduce that from any of its comments, so somebody might
try to use it for run-time pruning.

So I think David's got the right idea that match_clause_to_partition_key
is where to handle this, and I like the fact that his patch explicitly
represents whether we're trying to do run-time or plan-time pruning.
I agree it's kind of invasive, and I wonder why.  Shouldn't the
additional flag just be a field in the "context" struct, instead of
being a separate parameter that has to be laboriously passed through
many call levels?

(For myself, I'd have made it just a bool not an enum, given that
I don't foresee a need for additional values.  But that's definitely
a matter of preference.)

Also, I'm a bit confused by the fact that we seem to have a bunch
of spare-parts patches in this thread.  What combination is actually
being proposed for commit?

            regards, tom lane



Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Wed, 15 May 2019 at 08:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So I think David's got the right idea that match_clause_to_partition_key
> is where to handle this, and I like the fact that his patch explicitly
> represents whether we're trying to do run-time or plan-time pruning.
> I agree it's kind of invasive, and I wonder why.  Shouldn't the
> additional flag just be a field in the "context" struct, instead of
> being a separate parameter that has to be laboriously passed through
> many call levels?

Thanks for having a look.  I originally stayed clear of "context"
since it looks more like how we output the pruning steps, rather than
a context as to how they should be generated. But it's true, using it
saves having to pass the extra argument around, so I made that change.

> (For myself, I'd have made it just a bool not an enum, given that
> I don't foresee a need for additional values.  But that's definitely
> a matter of preference.)

I've changed to bool, mostly because it makes the patch a bit smaller.

> Also, I'm a bit confused by the fact that we seem to have a bunch
> of spare-parts patches in this thread.  What combination is actually
> being proposed for commit?

I was also confused at the extra two patches Amit posted. The extra
one for the tests didn't look very valid to me, as I mentioned
yesterday.

I propose the attached for master, and I'll post something for v11 shortly.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Wed, 15 May 2019 at 11:52, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I propose the attached for master, and I'll post something for v11 shortly.

Here's the v11 version, as promised.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/15 5:58, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
>> On 2019/05/14 15:09, David Rowley wrote:
>>> Do you think it's a bad idea to do it this way?
> 
>> As I mentioned above, I think this may be quite a bit of code churn for
>> enforcing a policy that's elsewhere enforced in a much simpler manner.
>> How about deferring this to committer?
> 
> Not sure I'll be the one to commit this, but ...

Thanks for the feedback.

> I don't much like the approach in
> v1-0003-Fix-planner-to-not-prune-partitions-with-non-immu.patch
> because (a) it appears to me to add duplicative expression mutability
> testing, which is kind of expensive since that adds syscache lookups,
> plus it adds such tests even for clauses that aren't going to match any
> partition keys; and (b) it's extremely nonobvious that this restricts
> plan-time pruning and not run-time pruning.  I do see that
> prune_append_rel_partitions is used only for the former; but you sure
> as heck can't deduce that from any of its comments, so somebody might
> try to use it for run-time pruning.

OK, both a and b are good arguments.

> So I think David's got the right idea that match_clause_to_partition_key
> is where to handle this, and I like the fact that his patch explicitly
> represents whether we're trying to do run-time or plan-time pruning.

Agreed that David's latest patch is good for this.

> Also, I'm a bit confused by the fact that we seem to have a bunch
> of spare-parts patches in this thread.  What combination is actually
> being proposed for commit?

Sorry for the confusion.

0001 adds tests for the other bugs but with the wrong expected output

0002 fixes those bugs and shows the corrected output

0003 was for the bug that stable operators are being used for plan-time
pruning which has its own test (only 0003 needed separate versions for
HEAD and v11)

Parts of 0003 are now superseded by David's patch, of which he has posted
both for-HEAD and for-v11 versions.

0001+0002 is obviously meant to be squashed, which I'll do in a bit.

Thanks,
Amit




Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
Thanks for the updated patches.

On 2019/05/15 8:52, David Rowley wrote:
> On Wed, 15 May 2019 at 08:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So I think David's got the right idea that match_clause_to_partition_key
>> is where to handle this, and I like the fact that his patch explicitly
>> represents whether we're trying to do run-time or plan-time pruning.
>> I agree it's kind of invasive, and I wonder why.  Shouldn't the
>> additional flag just be a field in the "context" struct, instead of
>> being a separate parameter that has to be laboriously passed through
>> many call levels?
> 
> Thanks for having a look.  I originally stayed clear of "context"
> since it looks more like how we output the pruning steps, rather than
> a context as to how they should be generated. But it's true, using it
> saves having to pass the extra argument around, so I made that change.
> 
>> (For myself, I'd have made it just a bool not an enum, given that
>> I don't foresee a need for additional values.  But that's definitely
>> a matter of preference.)
> 
> I've changed to bool, mostly because it makes the patch a bit smaller.

Looks good.  Thanks for tweaking many comments for clarity, except this
one could be improved a bit more:

+ * 'forplanner' must be true when being called from the query planner.

This description of 'forplanner' might make it sound like it's redundant
(always true), because gen_partprune_steps() is only called from the
planner today.  How about making it say the same thing as the comment next
to its definition in GeneratePruningStepsContext struct):

'forplanner' must be true if generating steps to be used during query
planning.

>> Also, I'm a bit confused by the fact that we seem to have a bunch
>> of spare-parts patches in this thread.  What combination is actually
>> being proposed for commit?
> 
> I was also confused at the extra two patches Amit posted. The extra
> one for the tests didn't look very valid to me, as I mentioned
> yesterday.

As I said in my previous reply, 0001 contains wrong expected output
because of the other bugs discussed earlier.  I guess that's not very helpful.

Anyway, I'm attaching a squashed version of those two patches as
0001-Fix-bugs-in-pruning-with-composite-range-partition-k.patch.  It
applies as-is to both master and v11 branches.

> I propose the attached for master, and I'll post something for v11 shortly.

Thanks.  I'm re-attaching both of your patches here just to have all the
patches in one place.

Thanks,
Amit

Вложения

Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Wed, 15 May 2019 at 14:10, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> Anyway, I'm attaching a squashed version of those two patches as
> 0001-Fix-bugs-in-pruning-with-composite-range-partition-k.patch.  It
> applies as-is to both master and v11 branches.

I had a look at
0001-Fix-bugs-in-pruning-with-composite-range-partition-k.patch, and
due to my patch no longer generating pruning steps during planning for
non-consts, I don't think the test you have there is doing a very good
job of highlighting the bug. Maybe you could exploit this bug in the
tests by having both initplan run-time pruning and exec-time run-time
pruning run. Something like:

set plan_cache_mode = 'force_generic_plan';
prepare q1 (int) as select * from mc3p where a = $1 and abs(b) < (select 2);
explain (costs off) execute q1(1);

Additionally, I'm wondering if we should also apply the attached as
part of my dont_prune_with_nonimmutable_opfuncs_during_planning_v2.patch
patch. We should never get a non-Const in the pruning steps for
planning, so there's likely no point in doing a run-time check to
ensure we have a planstate.  This code can execute quite often during
run-time pruning, once each time a parameter changes, which could be
each row.   I think I'll go do that now, and also fix up that
forplanner comment you mentioned.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Wed, 15 May 2019 at 14:20, David Rowley <david.rowley@2ndquadrant.com> wrote:
> Additionally, I'm wondering if we should also apply the attached as
> part of my dont_prune_with_nonimmutable_opfuncs_during_planning_v2.patch
> patch. We should never get a non-Const in the pruning steps for
> planning, so there's likely no point in doing a run-time check to
> ensure we have a planstate.  This code can execute quite often during
> run-time pruning, once each time a parameter changes, which could be
> each row.   I think I'll go do that now, and also fix up that
> forplanner comment you mentioned.

I've made the change to partkey_datum_from_expr() in master's version
only. I did this because I had to document that the context argument
in get_matching_partitions() must have a valid planstate set when the
given pruning_steps were generated for the executor. It's probably
unlikely that anything else is using these functions, but still
thought it was a bad idea to change that in v11. At this stage, I
don't see any big issues changing that in master.

The only change in
pg11_dont_prune_with_nonimmutable_opfuncs_during_planning_v3.patch
since v2 is the inaccurate comment mentioned by Amit.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
You are right that tests I'd added in 0001 no longer test for the bugs it
fixes, that is, after applying your patch, because we no longer end up in
a state during *plan-time* pruning where those bugs manifest themselves.
Thanks for pointing that out.  I have rewritten the tests.

Also, I'm attaching the other patch as *0002* this time to signal that it
is to be applied after applying yours, that is, if not together.

On 2019/05/15 11:47, David Rowley wrote:
> On Wed, 15 May 2019 at 14:20, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> Additionally, I'm wondering if we should also apply the attached as
>> part of my dont_prune_with_nonimmutable_opfuncs_during_planning_v2.patch
>> patch. We should never get a non-Const in the pruning steps for
>> planning, so there's likely no point in doing a run-time check to
>> ensure we have a planstate.  This code can execute quite often during
>> run-time pruning, once each time a parameter changes, which could be
>> each row.   I think I'll go do that now, and also fix up that
>> forplanner comment you mentioned.
> 
> I've made the change to partkey_datum_from_expr() in master's version
> only. I did this because I had to document that the context argument
> in get_matching_partitions() must have a valid planstate set when the
> given pruning_steps were generated for the executor. It's probably
> unlikely that anything else is using these functions, but still
> thought it was a bad idea to change that in v11. At this stage, I
> don't see any big issues changing that in master.

Agreed about changing that "if" to an "Assert" and doing it only in the
master.  FWIW, I also ended up adding an Assert in the 0002 patch, because
one of the bugs being fixed can only occur in a run-time pruning scenario.
 That means I had to create a version of the patch for v11 without the
Assert; that's the only difference between the attached pg11-0002- and
0002- patches.

Thanks,
Amit

Вложения

Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
David Rowley <david.rowley@2ndquadrant.com> writes:
> [ dont_prune_with_nonimmutable_opfuncs_during_planning_v3.patch ]

I started working through this, and changed some comments I thought could
be improved, and noticed that you'd missed out making the same changes
for ScalarArrayOp so I did that.  However, when I got to the changes in
analyze_partkey_exprs, my bogometer went off.  Why do we need to recheck
this, in fact why does that function exist at all?  ISTM if we have used
a pruning qual at plan time there is no need to check it again at runtime;
therefore, it shouldn't even be in the list that analyze_partkey_exprs
is looking at.

I am wondering therefore whether we shouldn't do one of these two
things:

* Teach match_clause_to_partition_key to not emit plan-time-usable
quals at all if it is called with forplanner = false, or

* Add a bool field to PartitionPruneStep indicating whether the step
can be used at plan time, and use that to skip redundant checks at run
time.  This would make more sense if we can then fix things so that
we only run match_clause_to_partition_key once per qual clause ---
I gather that it's now being run twice, which seems pretty inefficient.

Perhaps this implies too much code churn/restructuring to be reasonable
to consider right now, but I thought I'd ask.  It sure seems like this
work is being done in a pretty inefficient and duplicative way.

            regards, tom lane

diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 59f34f5..a628858 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -39,6 +39,7 @@
 #include "access/nbtree.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_opfamily.h"
+#include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
@@ -93,8 +94,12 @@ typedef enum PartClauseMatchStatus
  */
 typedef struct GeneratePruningStepsContext
 {
+    /* Input data: */
+    bool        forplanner;        /* true when generating steps to be used
+                                 * during query planning */
+    /* Working state and result data: */
     int            next_step_id;
-    List       *steps;
+    List       *steps;            /* output, list of PartitionPruneSteps */
 } GeneratePruningStepsContext;

 /* The result of performing one PartitionPruneStep */
@@ -117,7 +122,7 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
                               List *partitioned_rels, List *prunequal,
                               Bitmapset **matchedsubplans);
 static List *gen_partprune_steps(RelOptInfo *rel, List *clauses,
-                    bool *contradictory);
+                    bool forplanner, bool *contradictory);
 static List *gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                              RelOptInfo *rel, List *clauses,
                              bool *contradictory);
@@ -409,8 +414,13 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
                                                   targetpart->relids);
         }

-        /* Convert pruning qual to pruning steps. */
-        pruning_steps = gen_partprune_steps(subpart, partprunequal,
+        /*
+         * Convert pruning qual to pruning steps.  Since these steps will be
+         * used in the executor, we can pass 'forplanner' as false to allow
+         * steps to be generated that are unsafe for evaluation during
+         * planning, e.g. evaluation of stable functions.
+         */
+        pruning_steps = gen_partprune_steps(subpart, partprunequal, false,
                                             &contradictory);

         if (contradictory)
@@ -523,16 +533,21 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 /*
  * gen_partprune_steps
  *        Process 'clauses' (a rel's baserestrictinfo list of clauses) and return
- *        a list of "partition pruning steps"
+ *        a list of "partition pruning steps".
+ *
+ * 'forplanner' must be true when generating steps to be evaluated during
+ * query planning, false when generating steps to be used at run-time.
  *
  * If the clauses in the input list are contradictory or there is a
  * pseudo-constant "false", *contradictory is set to true upon return.
  */
 static List *
-gen_partprune_steps(RelOptInfo *rel, List *clauses, bool *contradictory)
+gen_partprune_steps(RelOptInfo *rel, List *clauses, bool forplanner,
+                    bool *contradictory)
 {
     GeneratePruningStepsContext context;

+    context.forplanner = forplanner;
     context.next_step_id = 0;
     context.steps = NIL;

@@ -574,9 +589,11 @@ gen_partprune_steps(RelOptInfo *rel, List *clauses, bool *contradictory)

 /*
  * prune_append_rel_partitions
- *        Returns indexes into rel->part_rels of the minimum set of child
- *        partitions which must be scanned to satisfy rel's baserestrictinfo
- *        quals.
+ *        Process rel's baserestrictinfo and make use of quals which can be
+ *        evaluated during query planning in order to determine the minimum set
+ *        of partitions which must be scanned to satisfy these quals.  Returns
+ *        the matching partitions in the form of a Bitmapset containing the
+ *        partitions' indexes in the rel's part_rels array.
  *
  * Callers must ensure that 'rel' is a partitioned table.
  */
@@ -603,9 +620,12 @@ prune_append_rel_partitions(RelOptInfo *rel)

     /*
      * Process clauses.  If the clauses are found to be contradictory, we can
-     * return the empty set.
+     * return the empty set.  Pass 'forplanner' as true to indicate to the
+     * pruning code that we only want pruning steps that can be evaluated
+     * during planning.
      */
-    pruning_steps = gen_partprune_steps(rel, clauses, &contradictory);
+    pruning_steps = gen_partprune_steps(rel, clauses, true,
+                                        &contradictory);
     if (contradictory)
         return NULL;

@@ -635,6 +655,9 @@ prune_append_rel_partitions(RelOptInfo *rel)
  * get_matching_partitions
  *        Determine partitions that survive partition pruning
  *
+ * Note: context->planstate must be set to a valid PlanState when the
+ * pruning_steps were generated with 'forplanner' = false.
+ *
  * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
  * partitions.
  */
@@ -1572,8 +1595,7 @@ match_clause_to_partition_key(RelOptInfo *rel,
     Expr       *expr;

     /*
-     * Recognize specially shaped clauses that match with the Boolean
-     * partition key.
+     * Recognize specially shaped clauses that match a Boolean partition key.
      */
     if (match_boolean_partition_clause(partopfamily, clause, partkey, &expr))
     {
@@ -1648,16 +1670,47 @@ match_clause_to_partition_key(RelOptInfo *rel,
          * Matched with this key.  Now check various properties of the clause
          * to see if it's sane to use it for pruning.  In most of these cases,
          * we can return UNSUPPORTED because the same failure would occur no
-         * matter which partkey it's matched to.
+         * matter which partkey it's matched to.  (In particular, now that
+         * we've successfully matched one side of the operator to a partkey,
+         * there is no chance that matching the other side to another partkey
+         * will produce a usable result, since that'd mean there are Vars on
+         * both sides.)
          */
+        if (context->forplanner)
+        {
+            /*
+             * When pruning in the planner, we only support pruning using
+             * comparisons to constants.  Immutable subexpressions will have
+             * been folded to constants already, and we cannot prune on the
+             * basis of anything that's not immutable.
+             */
+            if (!IsA(expr, Const))
+                return PARTCLAUSE_UNSUPPORTED;

-        /*
-         * We can't prune using an expression with Vars.  (Report failure as
-         * UNSUPPORTED, not NOMATCH: as in the no-commutator case above, we
-         * now know there are Vars on both sides, so it's no good.)
-         */
-        if (contain_var_clause((Node *) expr))
-            return PARTCLAUSE_UNSUPPORTED;
+            /*
+             * Also, the comparison operator itself must be immutable.
+             */
+            if (op_volatile(opno) != PROVOLATILE_IMMUTABLE)
+                return PARTCLAUSE_UNSUPPORTED;
+        }
+        else
+        {
+            /*
+             * Otherwise, non-consts are allowed, but we can't prune using an
+             * expression that contains Vars.
+             */
+            if (contain_var_clause((Node *) expr))
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
+             * And we must reject anything containing a volatile function.
+             * Stable functions are OK though.  (We need not check this for
+             * the comparison operator itself: anything that belongs to a
+             * partitioning operator family must be at least stable.)
+             */
+            if (contain_volatile_functions((Node *) expr))
+                return PARTCLAUSE_UNSUPPORTED;
+        }

         /*
          * Only allow strict operators.  This will guarantee nulls are
@@ -1666,10 +1719,6 @@ match_clause_to_partition_key(RelOptInfo *rel,
         if (!op_strict(opno))
             return PARTCLAUSE_UNSUPPORTED;

-        /* We can't use any volatile expressions to prune partitions. */
-        if (contain_volatile_functions((Node *) expr))
-            return PARTCLAUSE_UNSUPPORTED;
-
         /*
          * See if the operator is relevant to the partitioning opfamily.
          *
@@ -1798,20 +1847,51 @@ match_clause_to_partition_key(RelOptInfo *rel,
         if (IsA(leftop, RelabelType))
             leftop = ((RelabelType *) leftop)->arg;

-        /* Check it matches this partition key */
+        /* check if the LHS matches this partition key */
         if (!equal(leftop, partkey) ||
             !PartCollMatchesExprColl(partcoll, saop->inputcollid))
             return PARTCLAUSE_NOMATCH;

         /*
          * Matched with this key.  Check various properties of the clause to
-         * see if it can sanely be used for partition pruning (this is mostly
-         * the same as for a plain OpExpr).
+         * see if it can sanely be used for partition pruning (this is
+         * identical to the logic for a plain OpExpr).
          */
+        if (context->forplanner)
+        {
+            /*
+             * When pruning in the planner, we only support pruning using
+             * comparisons to constants.  Immutable subexpressions will have
+             * been folded to constants already, and we cannot prune on the
+             * basis of anything that's not immutable.
+             */
+            if (!IsA(rightop, Const))
+                return PARTCLAUSE_UNSUPPORTED;

-        /* We can't prune using an expression with Vars. */
-        if (contain_var_clause((Node *) rightop))
-            return PARTCLAUSE_UNSUPPORTED;
+            /*
+             * Also, the comparison operator itself must be immutable.
+             */
+            if (op_volatile(saop_op) != PROVOLATILE_IMMUTABLE)
+                return PARTCLAUSE_UNSUPPORTED;
+        }
+        else
+        {
+            /*
+             * Otherwise, non-consts are allowed, but we can't prune using an
+             * expression that contains Vars.
+             */
+            if (contain_var_clause((Node *) rightop))
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
+             * And we must reject anything containing a volatile function.
+             * Stable functions are OK though.  (We need not check this for
+             * the comparison operator itself: anything that belongs to a
+             * partitioning operator family must be at least stable.)
+             */
+            if (contain_volatile_functions((Node *) rightop))
+                return PARTCLAUSE_UNSUPPORTED;
+        }

         /*
          * Only allow strict operators.  This will guarantee nulls are
@@ -1820,10 +1900,6 @@ match_clause_to_partition_key(RelOptInfo *rel,
         if (!op_strict(saop_op))
             return PARTCLAUSE_UNSUPPORTED;

-        /* We can't use any volatile expressions to prune partitions. */
-        if (contain_volatile_functions((Node *) rightop))
-            return PARTCLAUSE_UNSUPPORTED;
-
         /*
          * In case of NOT IN (..), we get a '<>', which we handle if list
          * partitioning is in use and we're able to confirm that it's negator
@@ -2961,15 +3037,17 @@ analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps,
     {
         PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc);
         ListCell   *lc2;
+        ListCell   *lc3;
         int            keyno;

         if (!IsA(step, PartitionPruneStepOp))
             continue;

         keyno = 0;
-        foreach(lc2, step->exprs)
+        forboth(lc2, step->exprs, lc3, step->cmpfns)
         {
             Expr       *expr = lfirst(lc2);
+            Oid            fnoid = lfirst_oid(lc3);

             if (!IsA(expr, Const))
             {
@@ -2992,6 +3070,18 @@ analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps,

                 doruntimeprune = true;
             }
+
+            /*
+             * Pruning done during planning won't have pruned using quals with
+             * a non-immutable comparison functions, so we'll switch on
+             * run-time pruning for these.
+             */
+            else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE)
+            {
+                /* No need to check for exec params inside a const */
+                doruntimeprune = true;
+                pinfo->do_initial_prune = true;
+            }
             keyno++;
         }
     }
@@ -3347,6 +3437,12 @@ partkey_datum_from_expr(PartitionPruneContext *context,
     else
     {
         /*
+         * We should never see a non-Const in a step unless we're running in
+         * the executor.
+         */
+        Assert(context->planstate != NULL);
+
+        /*
          * When called from the executor we'll have a valid planstate so we
          * may be able to evaluate an expression which could not be folded to
          * a Const during planning.  Since run-time pruning can occur both
@@ -3354,9 +3450,7 @@ partkey_datum_from_expr(PartitionPruneContext *context,
          * must be careful here to evaluate expressions containing PARAM_EXEC
          * Params only when told it's OK.
          */
-        if (context->planstate &&
-            (context->evalexecparams ||
-             !context->exprhasexecparam[stateidx]))
+        if (context->evalexecparams || !context->exprhasexecparam[stateidx])
         {
             ExprState  *exprstate;
             ExprContext *ectx;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index bd64bed..cad5967 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3036,6 +3036,41 @@ select * from listp where a = (select null::int);
 (7 rows)

 drop table listp;
+--
+-- check that stable query clauses are only used in run-time pruning
+--
+create table stable_qual_pruning (a timestamp) partition by range (a);
+create table stable_qual_pruning1 partition of stable_qual_pruning
+  for values from ('2000-01-01') to ('2000-02-01');
+create table stable_qual_pruning2 partition of stable_qual_pruning
+  for values from ('2000-02-01') to ('2000-03-01');
+create table stable_qual_pruning3 partition of stable_qual_pruning
+  for values from ('3000-02-01') to ('3000-03-01');
+-- comparison against a stable value requires run-time pruning
+explain (analyze, costs off, summary off, timing off)
+select * from stable_qual_pruning where a < localtimestamp;
+                           QUERY PLAN
+----------------------------------------------------------------
+ Append (actual rows=0 loops=1)
+   Subplans Removed: 1
+   ->  Seq Scan on stable_qual_pruning1 (actual rows=0 loops=1)
+         Filter: (a < LOCALTIMESTAMP)
+   ->  Seq Scan on stable_qual_pruning2 (actual rows=0 loops=1)
+         Filter: (a < LOCALTIMESTAMP)
+(6 rows)
+
+-- timestamp < timestamptz comparison is only stable, not immutable
+explain (analyze, costs off, summary off, timing off)
+select * from stable_qual_pruning where a < '2000-02-01'::timestamptz;
+                                   QUERY PLAN
+--------------------------------------------------------------------------------
+ Append (actual rows=0 loops=1)
+   Subplans Removed: 2
+   ->  Seq Scan on stable_qual_pruning1 (actual rows=0 loops=1)
+         Filter: (a < 'Tue Feb 01 00:00:00 2000 PST'::timestamp with time zone)
+(4 rows)
+
+drop table stable_qual_pruning;
 -- Ensure runtime pruning works with initplans params with boolean types
 create table boolvalues (value bool not null);
 insert into boolvalues values('t'),('f');
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 246c627..94d9e1e 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -754,6 +754,27 @@ select * from listp where a = (select null::int);

 drop table listp;

+--
+-- check that stable query clauses are only used in run-time pruning
+--
+create table stable_qual_pruning (a timestamp) partition by range (a);
+create table stable_qual_pruning1 partition of stable_qual_pruning
+  for values from ('2000-01-01') to ('2000-02-01');
+create table stable_qual_pruning2 partition of stable_qual_pruning
+  for values from ('2000-02-01') to ('2000-03-01');
+create table stable_qual_pruning3 partition of stable_qual_pruning
+  for values from ('3000-02-01') to ('3000-03-01');
+
+-- comparison against a stable value requires run-time pruning
+explain (analyze, costs off, summary off, timing off)
+select * from stable_qual_pruning where a < localtimestamp;
+
+-- timestamp < timestamptz comparison is only stable, not immutable
+explain (analyze, costs off, summary off, timing off)
+select * from stable_qual_pruning where a < '2000-02-01'::timestamptz;
+
+drop table stable_qual_pruning;
+
 -- Ensure runtime pruning works with initplans params with boolean types
 create table boolvalues (value bool not null);
 insert into boolvalues values('t'),('f');

Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Thu, 16 May 2019 at 05:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > [ dont_prune_with_nonimmutable_opfuncs_during_planning_v3.patch ]
>
> I started working through this, and changed some comments I thought could
> be improved, and noticed that you'd missed out making the same changes
> for ScalarArrayOp so I did that.  However, when I got to the changes in
> analyze_partkey_exprs, my bogometer went off.  Why do we need to recheck
> this, in fact why does that function exist at all?  ISTM if we have used
> a pruning qual at plan time there is no need to check it again at runtime;
> therefore, it shouldn't even be in the list that analyze_partkey_exprs
> is looking at.

Thinking about this more, if we're now making it so a forplanner =
true set of steps requires all values being compared to the partition
key to be Const, then anything that's not Const must require run-time
pruning. That means the:

else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE)

in analyze_partkey_exprs() can simply become "else".

> I am wondering therefore whether we shouldn't do one of these two
> things:
>
> * Teach match_clause_to_partition_key to not emit plan-time-usable
> quals at all if it is called with forplanner = false, or

That's not really possible.  Imagine a table partitioned by HASH(a,b)
and WHERE a = $1 and b = 10;  we can't do any pruning during planning
here, and if we only generate steps containing "a = $1" for run-time,
then we can't do anything there either.

> * Add a bool field to PartitionPruneStep indicating whether the step
> can be used at plan time, and use that to skip redundant checks at run
> time.  This would make more sense if we can then fix things so that
> we only run match_clause_to_partition_key once per qual clause ---
> I gather that it's now being run twice, which seems pretty inefficient.

I think it would be nicer if we made match_clause_to_partition_key()
do everything that analyze_partkey_exprs() currently does, but if we
change that "else if" to just "else" then the only advantage is not
having to make another pass over the pruning steps.  We'd have to make
it so match_clause_to_partition_key() did all the pull_exec_paramids()
stuff and add some new fields to PartitionPruneStepOp to store that
info.  We'd also need to store hasexecparam in the
PartitionPruneStepOp and change the signature of
partkey_datum_from_expr() so we could pass that down as we'd no longer
have context->exprhasexecparam... Well, we could keep that, but if
we're trying to not loop over the steps again then we'll need to store
that value in the step when it's created rather than put it in
PartitionPruneContext.

> Perhaps this implies too much code churn/restructuring to be reasonable
> to consider right now, but I thought I'd ask.  It sure seems like this
> work is being done in a pretty inefficient and duplicative way.

I certainly think the above would be cleaner and I don't mind working
on it, but if it's too much churn for this time of year then I'd
rather not waste time writing a patch that's going to get rejected.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
David Rowley <david.rowley@2ndquadrant.com> writes:
> Thinking about this more, if we're now making it so a forplanner =
> true set of steps requires all values being compared to the partition
> key to be Const, then anything that's not Const must require run-time
> pruning. That means the:

> else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE)

> in analyze_partkey_exprs() can simply become "else".

Um ... what about a merely-stable comparison operator with a Const?

> I think it would be nicer if we made match_clause_to_partition_key()
> do everything that analyze_partkey_exprs() currently does, but if we
> change that "else if" to just "else" then the only advantage is not
> having to make another pass over the pruning steps.  We'd have to make
> it so match_clause_to_partition_key() did all the pull_exec_paramids()
> stuff and add some new fields to PartitionPruneStepOp to store that
> info.  We'd also need to store hasexecparam in the
> PartitionPruneStepOp and change the signature of
> partkey_datum_from_expr() so we could pass that down as we'd no longer
> have context->exprhasexecparam... Well, we could keep that, but if
> we're trying to not loop over the steps again then we'll need to store
> that value in the step when it's created rather than put it in
> PartitionPruneContext.

The loop over steps, per se, isn't that expensive --- but extra syscache
lookups are.  Or at least that's my gut feeling about it.  If we just had
match_clause_to_partition_key mark the steps as being plan-time executable
or not, we could avoid the repeat lookup.

>> * Teach match_clause_to_partition_key to not emit plan-time-usable
>> quals at all if it is called with forplanner = false, or

> That's not really possible.  Imagine a table partitioned by HASH(a,b)
> and WHERE a = $1 and b = 10;  we can't do any pruning during planning
> here, and if we only generate steps containing "a = $1" for run-time,
> then we can't do anything there either.

That's an interesting example, but how does it work now?  I did not
observe any code in there that seems to be aware of cross-dependencies
between steps.  Presumably somewhere we are combining those hash quals,
so couldn't we mark the combined step properly?

>> Perhaps this implies too much code churn/restructuring to be reasonable
>> to consider right now, but I thought I'd ask.  It sure seems like this
>> work is being done in a pretty inefficient and duplicative way.

> I certainly think the above would be cleaner and I don't mind working
> on it, but if it's too much churn for this time of year then I'd
> rather not waste time writing a patch that's going to get rejected.

Yeah, I'm leery of adding much new complexity right now, but maybe
adding a flag field to save a syscache lookup wouldn't be out of line.

            regards, tom lane



Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Thu, 16 May 2019 at 12:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > Thinking about this more, if we're now making it so a forplanner =
> > true set of steps requires all values being compared to the partition
> > key to be Const, then anything that's not Const must require run-time
> > pruning. That means the:
>
> > else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE)
>
> > in analyze_partkey_exprs() can simply become "else".
>
> Um ... what about a merely-stable comparison operator with a Const?

Oops. It seems I got my wires crossed there. Wasn't quite fully awake
when I started thinking about this earlier.

> The loop over steps, per se, isn't that expensive --- but extra syscache
> lookups are.  Or at least that's my gut feeling about it.  If we just had
> match_clause_to_partition_key mark the steps as being plan-time executable
> or not, we could avoid the repeat lookup.

okay, on re-think. I'm a little unsure of if you're mixing up "extra"
and "repeat", we do need an "extra" lookup because we've discovered
that strict ops are not safe to use during planning. I can't see
around having this extra call.  If you mean it's a repeat call, then
it's really not, as we only do the op_volatile() check with
forplanner=true.  With forplanner = false we only call
contain_var_clause() and contain_volatile_functions().

> >> * Teach match_clause_to_partition_key to not emit plan-time-usable
> >> quals at all if it is called with forplanner = false, or
>
> > That's not really possible.  Imagine a table partitioned by HASH(a,b)
> > and WHERE a = $1 and b = 10;  we can't do any pruning during planning
> > here, and if we only generate steps containing "a = $1" for run-time,
> > then we can't do anything there either.
>
> That's an interesting example, but how does it work now?  I did not
> observe any code in there that seems to be aware of cross-dependencies
> between steps.  Presumably somewhere we are combining those hash quals,
> so couldn't we mark the combined step properly?

In current HEAD we always generate steps for both quals when
generating steps for planning and execution.
partkey_datum_from_expr() just can't eval the qual when we're
executing the steps during planning.  My patch skips the step
generation when the Expr is non-Const, so we save a few cycles and
some memory allocation there.  Remember that we don't reuse steps
created for plan-time pruning in run-time pruning. A fresh set is
created.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/16 10:00, David Rowley wrote:
> On Thu, 16 May 2019 at 12:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The loop over steps, per se, isn't that expensive --- but extra syscache
>> lookups are.  Or at least that's my gut feeling about it.  If we just had
>> match_clause_to_partition_key mark the steps as being plan-time executable
>> or not, we could avoid the repeat lookup.
> 
> okay, on re-think. I'm a little unsure of if you're mixing up "extra"
> and "repeat", we do need an "extra" lookup because we've discovered
> that strict ops are not safe to use during planning. I can't see
> around having this extra call.  If you mean it's a repeat call, then
> it's really not, as we only do the op_volatile() check with
> forplanner=true.  With forplanner = false we only call
> contain_var_clause() and contain_volatile_functions().

How about we add one more bool, say, runtime_pruning_needed to
GeneratePruningStepsContext and set it when we discover in
match_clause_to_partition_key() that runtime pruning will be needed?  Then
return it to make_partitionedrel_pruneinfo() using a new output parameter
of gen_partprune_steps()?

Thanks,
Amit




Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Thu, 16 May 2019 at 13:51, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> How about we add one more bool, say, runtime_pruning_needed to
> GeneratePruningStepsContext and set it when we discover in
> match_clause_to_partition_key() that runtime pruning will be needed?  Then
> return it to make_partitionedrel_pruneinfo() using a new output parameter
> of gen_partprune_steps()?

I don't think this can be done the way you think. If we just have one
flag to say any run-time pruning is needed then we'd still need to
check op_volatile(opno) != PROVOLATILE_IMMUTABLE in
analyze_partkey_exprs() to see if an initial run-time prune is needed.
That means we might do two calls to op_volatile()!  The other one is
required to set the runtime_pruning_needed in
match_clause_to_partition_key().  We can't have an additional flag to
mark what do_exec_prune should get set to either since we only know
that once we check what Param types are in the Expr and if we do that
then that's the full-blown patch that we thought might be too much
churn.

I don't really see where this duplicate call is.  The patch calls
op_volatile() at the very most just once per pruning step.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/16 11:52, David Rowley wrote:
> I don't really see where this duplicate call is.  The patch calls
> op_volatile() at the very most just once per pruning step.

Ah, there's no op_volatile() on the matched clause's operator in
match_clause_to_partition_key() when it's called during a forplanner=false
run.  So, while there is a duplication of efforts because
match_clause_to_partition_key() runs twice for any given clause which is
due to original design design, there is no duplication of efforts between
match_clause_to_partition_key() and analyze_partkey_exprs().

Thanks,
Amit




Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/16 10:00, David Rowley wrote:
> On Thu, 16 May 2019 at 12:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That's an interesting example, but how does it work now?  I did not
>> observe any code in there that seems to be aware of cross-dependencies
>> between steps.  Presumably somewhere we are combining those hash quals,
>> so couldn't we mark the combined step properly?
> 
> In current HEAD we always generate steps for both quals when
> generating steps for planning and execution.
> partkey_datum_from_expr() just can't eval the qual when we're
> executing the steps during planning.  My patch skips the step
> generation when the Expr is non-Const, so we save a few cycles and
> some memory allocation there.  Remember that we don't reuse steps
> created for plan-time pruning in run-time pruning. A fresh set is
> created.

Given the way gen_partprune_steps() is designed [1], while it may be
possible to discard steps generated by run-time pruning that need not be
applied during execution because they would've already been fully applied
during plan-time pruning, it seems hard to keep around *partial* steps
generated during plan-time pruning and use them for run-time pruning after
filling missing expressions from run-time quals.  Filling the missing
expressions into existing steps part seems hard.

Thanks,
Amit

[1] Any given PartitionPruneStepOp pruning step is built from a *set of*
clauses which have matched the required set of partitions keys (all keys
in the hash case and a prefix of keys in the range case).  If the set of
clauses doesn't cover the required set of partition keys, no step will be
generated at all, which may happen if some of the clauses were run-time
clauses.  The same set of clauses + parameterized path quals (if any)
might cover the required set and if so a step will be generated for
run-time pruning.




Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> Given the way gen_partprune_steps() is designed [1], while it may be
> possible to discard steps generated by run-time pruning that need not be
> applied during execution because they would've already been fully applied
> during plan-time pruning, it seems hard to keep around *partial* steps
> generated during plan-time pruning and use them for run-time pruning after
> filling missing expressions from run-time quals.  Filling the missing
> expressions into existing steps part seems hard.

Yeah.  It seems like the way to make some progress on this would be
to annotate the steps when they're generated as to whether they involve
any run-time component or not.  However, we might end up making multiple
calls anyway because of the possibility of parameterized paths having
additional qual clauses to work with, so I'm not clear on exactly how
we could reduce the number of times we do gen_partprune_steps.  It's
not something we could fix now in any case.

However, I'm still not real happy with analyze_partkey_exprs();
its analysis seems well short of ideal.  I notice that in the sort
of case suggested upthread, with more than one opclause contributing
to a step, it's entirely possible for it to report that both initial
and runtime pruning are needed even though only the latter is really
possible.  Perhaps this could be improved a bit by making the analysis
operate per-step rather than being totally per-expression.

Anyway, I've pushed David's patch now, on to look at Amit's.

            regards, tom lane



Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
I wrote:
> Anyway, I've pushed David's patch now, on to look at Amit's.

So, the changes in gen_prune_steps_from_opexps seem okay, but
the change in perform_pruning_base_step is just wrong, as can
be seen from this extended form of the test case:

create table mc3p (a int, b int, c int) partition by range (a, abs(b), c);
create table mc3p1 partition of mc3p
  for values from (1, 1, 1) to (1, 1, 10);
create table mc3p2 partition of mc3p
  for values from (1, 1, 10) to (2, minvalue, minvalue);
create table mc3p3 partition of mc3p
 for values from (2, minvalue, minvalue) to (3, maxvalue, maxvalue);
insert into mc3p values (1, 1, 1), (1,1,10), (2, 1, 1);

set plan_cache_mode = force_generic_plan;
prepare init_exec_prune_q1 as
  select * from mc3p where a = $1 and abs(b) <= $2 and c < (select 13);
explain (analyze, costs off, summary off, timing off)
execute init_exec_prune_q1 (1,2);
deallocate init_exec_prune_q1;
reset plan_cache_mode;

drop table mc3p;

This should of course find the (1,1,1) row, but it does not because
it converts the initial pruning condition to be "a = 1 and abs(b) = 2",
whereas the correct thing would be "a = 1 and abs(b) <= 2".

It seems to me that the correct fix, rather than forcing equality
strategy when you've missed out some of the quals, is to back up
and use the strategy of the last clause you did use.  But unless
I'm missing something, that information is just not available from
the PartitionPruneStepOp data structure.

We could extend the PartitionPruneStepOp struct to have an array
or list of StrategyNumbers (but I'm unsure about how safe that'd
be to backpatch into v11).

Another idea is that we ought to generate separate partitioning
steps lists for the initial-pruning and runtime-pruning cases,
ie invoke gen_partprune_steps yet a third time.  Ugh (and that
still means a data structure change, just in a different place).

The bottom line here is that we've got an inflexible data structure
that was designed on the assumption that gen_partprune_steps has all
the intelligence, and trying to adjust things later is just trouble.

I think this is really quite closely tied to my previous complaints
about the design for plan-time versus run-time pruning, just mutatis
mutandis.  The lack of any holistic, principled approach to
which-quals-apply-when is what's biting us here.

Moreover, now that I understand what's going on here a bit better,
I am deathly afraid that there are yet other bugs hiding in this
same area.  What we've basically got is that analyze_partkey_exprs
and perform_pruning_base_step are trying to reverse-engineer what
gen_partprune_steps would have done differently if it used only the
quals not containing exec params.  They don't really have enough
information available to do that correctly, and even if they did,
the fact that the logic looks nothing like the same is going to
be a fertile source of bugs and maintenance gotchas.

In other words, as I compose this I'm talking myself into the
idea that invoking gen_partprune_steps three times (for planner,
executor startup, and executor runtime, with corresponding
filters on which clauses can be used) is the only safe near-term
fix.  In the future we can look at cutting down the repetitive
costs that entails.

Thoughts?

            regards, tom lane



Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
I wrote:
> So, the changes in gen_prune_steps_from_opexps seem okay, but
> the change in perform_pruning_base_step is just wrong,

I went ahead and pushed the non-controversial part of that patch,
just to avoid carrying it around longer.

I propose that we use the attached test cases once we have a fix
for the remaining problem.  The results shown are the wrong answers
that HEAD produces.  Amit's patch fixes the first case but not the
second, since substituting equality semantics is the right thing
to do in only the first case.

            regards, tom lane

diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 723fc70..88701d8 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3149,6 +3149,43 @@ select * from mc3p where a < 3 and abs(b) = 1;
          Filter: ((a < 3) AND (abs(b) = 1))
 (7 rows)

+--
+-- Check that pruning with composite range partitioning works correctly when
+-- a combination of runtime parameters is specified, not all of whose values
+-- are available at the same time
+--
+set plan_cache_mode = force_generic_plan;
+prepare ps1 as
+  select * from mc3p where a = $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps1(1);
+                  QUERY PLAN
+----------------------------------------------
+ Append (actual rows=0 loops=1)
+   InitPlan 1 (returns $0)
+     ->  Result (never executed)
+   Subplans Removed: 2
+   ->  Seq Scan on mc3p0 (never executed)
+         Filter: ((a = $1) AND (abs(b) < $0))
+(6 rows)
+
+deallocate ps1;
+prepare ps2 as
+  select * from mc3p where a <= $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps2(1);
+                   QUERY PLAN
+-------------------------------------------------
+ Append (actual rows=1 loops=1)
+   InitPlan 1 (returns $0)
+     ->  Result (actual rows=1 loops=1)
+   Subplans Removed: 2
+   ->  Seq Scan on mc3p0 (actual rows=1 loops=1)
+         Filter: ((a <= $1) AND (abs(b) < $0))
+(6 rows)
+
+deallocate ps2;
+reset plan_cache_mode;
 drop table mc3p;
 -- Ensure runtime pruning works with initplans params with boolean types
 create table boolvalues (value bool not null);
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 2373bd8..071e28d 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -809,6 +809,24 @@ insert into mc3p values (0, 1, 1), (1, 1, 1), (2, 1, 1);
 explain (analyze, costs off, summary off, timing off)
 select * from mc3p where a < 3 and abs(b) = 1;

+--
+-- Check that pruning with composite range partitioning works correctly when
+-- a combination of runtime parameters is specified, not all of whose values
+-- are available at the same time
+--
+set plan_cache_mode = force_generic_plan;
+prepare ps1 as
+  select * from mc3p where a = $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps1(1);
+deallocate ps1;
+prepare ps2 as
+  select * from mc3p where a <= $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps2(1);
+deallocate ps2;
+reset plan_cache_mode;
+
 drop table mc3p;

 -- Ensure runtime pruning works with initplans params with boolean types

Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
I wrote:
> In other words, as I compose this I'm talking myself into the
> idea that invoking gen_partprune_steps three times (for planner,
> executor startup, and executor runtime, with corresponding
> filters on which clauses can be used) is the only safe near-term
> fix.  In the future we can look at cutting down the repetitive
> costs that entails.

Here's a really quick and dirty POC along that line.  It's unfinished
but I'm out of time for today.  I think the executor side of it is OK,
but the planner side is crufty: it lacks logic to decide whether
either the initial pruning or exec pruning steps are unnecessary.
And I didn't update most of the comments either.

I think the rule for "exec pruning is unnecessary" could probably
just be "we didn't find any exec params in the prune steps", since
that would imply that the initial steps include everything.
I don't see an equally cheap way to decide that initial pruning
is useless, though.  Thoughts?

Also, it seems like we could save at least some of the planning expense by
having the PARTCLAUSE_INITIAL pass report back whether it had discarded
any Param-bearing clauses or not.  If not, there's definitely no need to
run the PARTCLAUSE_EXEC pass.

            regards, tom lane

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 70709e5..d663118 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -183,6 +183,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
                                      bool *isnull,
                                      int maxfieldlen);
 static List *adjust_partition_tlist(List *tlist, TupleConversionMap *map);
+static void ExecInitPruningContext(PartitionPruneContext *context,
+                       List *pruning_steps,
+                       PartitionDesc partdesc,
+                       PartitionKey partkey,
+                       PlanState *planstate);
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
                                PartitionedRelPruningData *pprune,
                                bool initial_prune,
@@ -1614,13 +1619,9 @@ ExecCreatePartitionPruneState(PlanState *planstate,
         {
             PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
             PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-            PartitionPruneContext *context = &pprune->context;
             Relation    partrel;
             PartitionDesc partdesc;
             PartitionKey partkey;
-            int            partnatts;
-            int            n_steps;
-            ListCell   *lc3;

             /* present_parts is also subject to later modification */
             pprune->present_parts = bms_copy(pinfo->present_parts);
@@ -1700,66 +1701,36 @@ ExecCreatePartitionPruneState(PlanState *planstate,
                 Assert(pd_idx == pinfo->nparts);
             }

-            n_steps = list_length(pinfo->pruning_steps);
-
-            context->strategy = partkey->strategy;
-            context->partnatts = partnatts = partkey->partnatts;
-            context->nparts = pinfo->nparts;
-            context->boundinfo = partdesc->boundinfo;
-            context->partcollation = partkey->partcollation;
-            context->partsupfunc = partkey->partsupfunc;
-
-            /* We'll look up type-specific support functions as needed */
-            context->stepcmpfuncs = (FmgrInfo *)
-                palloc0(sizeof(FmgrInfo) * n_steps * partnatts);
-
-            context->ppccontext = CurrentMemoryContext;
-            context->planstate = planstate;
-
-            /* Initialize expression state for each expression we need */
-            context->exprstates = (ExprState **)
-                palloc0(sizeof(ExprState *) * n_steps * partnatts);
-            foreach(lc3, pinfo->pruning_steps)
+            /*
+             * Initialize pruning contexts as needed.
+             */
+            pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
+            if (pinfo->initial_pruning_steps)
             {
-                PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc3);
-                ListCell   *lc4;
-                int            keyno;
-
-                /* not needed for other step kinds */
-                if (!IsA(step, PartitionPruneStepOp))
-                    continue;
-
-                Assert(list_length(step->exprs) <= partnatts);
-
-                keyno = 0;
-                foreach(lc4, step->exprs)
-                {
-                    Expr       *expr = (Expr *) lfirst(lc4);
-
-                    /* not needed for Consts */
-                    if (!IsA(expr, Const))
-                    {
-                        int            stateidx = PruneCxtStateIdx(partnatts,
-                                                                step->step.step_id,
-                                                                keyno);
-
-                        context->exprstates[stateidx] =
-                            ExecInitExpr(expr, context->planstate);
-                    }
-                    keyno++;
-                }
+                ExecInitPruningContext(&pprune->initial_context,
+                                       pinfo->initial_pruning_steps,
+                                       partdesc, partkey, planstate);
+                /* Record whether initial pruning is needed at any level */
+                prunestate->do_initial_prune = true;
+            }
+            else
+            {
+                /*
+                 * Hack: ExecFindInitialMatchingSubPlans requires this field
+                 * to be valid whether pruning or not.
+                 */
+                pprune->initial_context.nparts = partdesc->nparts;
             }

-            /* Array is not modified at runtime, so just point to plan's copy */
-            context->exprhasexecparam = pinfo->hasexecparam;
-
-            pprune->pruning_steps = pinfo->pruning_steps;
-            pprune->do_initial_prune = pinfo->do_initial_prune;
-            pprune->do_exec_prune = pinfo->do_exec_prune;
-
-            /* Record if pruning would be useful at any level */
-            prunestate->do_initial_prune |= pinfo->do_initial_prune;
-            prunestate->do_exec_prune |= pinfo->do_exec_prune;
+            pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+            if (pinfo->exec_pruning_steps)
+            {
+                ExecInitPruningContext(&pprune->exec_context,
+                                       pinfo->exec_pruning_steps,
+                                       partdesc, partkey, planstate);
+                /* Record whether exec pruning is needed at any level */
+                prunestate->do_exec_prune = true;
+            }

             /*
              * Accumulate the IDs of all PARAM_EXEC Params affecting the
@@ -1777,6 +1748,71 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 }

 /*
+ * Initialize a PartitionPruneContext for the given list of pruning steps.
+ */
+static void
+ExecInitPruningContext(PartitionPruneContext *context,
+                       List *pruning_steps,
+                       PartitionDesc partdesc,
+                       PartitionKey partkey,
+                       PlanState *planstate)
+{
+    int            n_steps;
+    int            partnatts;
+    ListCell   *lc;
+
+    n_steps = list_length(pruning_steps);
+
+    context->strategy = partkey->strategy;
+    context->partnatts = partnatts = partkey->partnatts;
+    context->nparts = partdesc->nparts;
+    context->boundinfo = partdesc->boundinfo;
+    context->partcollation = partkey->partcollation;
+    context->partsupfunc = partkey->partsupfunc;
+
+    /* We'll look up type-specific support functions as needed */
+    context->stepcmpfuncs = (FmgrInfo *)
+        palloc0(sizeof(FmgrInfo) * n_steps * partnatts);
+
+    context->ppccontext = CurrentMemoryContext;
+    context->planstate = planstate;
+
+    /* Initialize expression state for each expression we need */
+    context->exprstates = (ExprState **)
+        palloc0(sizeof(ExprState *) * n_steps * partnatts);
+    foreach(lc, pruning_steps)
+    {
+        PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc);
+        ListCell   *lc2;
+        int            keyno;
+
+        /* not needed for other step kinds */
+        if (!IsA(step, PartitionPruneStepOp))
+            continue;
+
+        Assert(list_length(step->exprs) <= partnatts);
+
+        keyno = 0;
+        foreach(lc2, step->exprs)
+        {
+            Expr       *expr = (Expr *) lfirst(lc2);
+
+            /* not needed for Consts */
+            if (!IsA(expr, Const))
+            {
+                int            stateidx = PruneCxtStateIdx(partnatts,
+                                                        step->step.step_id,
+                                                        keyno);
+
+                context->exprstates[stateidx] =
+                    ExecInitExpr(expr, context->planstate);
+            }
+            keyno++;
+        }
+    }
+}
+
+/*
  * ExecFindInitialMatchingSubPlans
  *        Identify the set of subplans that cannot be eliminated by initial
  *        pruning, disregarding any pruning constraints involving PARAM_EXEC
@@ -1824,7 +1860,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
         find_matching_subplans_recurse(prunedata, pprune, true, &result);

         /* Expression eval may have used space in node's ps_ExprContext too */
-        ResetExprContext(pprune->context.planstate->ps_ExprContext);
+        if (pprune->initial_pruning_steps)
+            ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
     }

     /* Add in any subplans that partition pruning didn't account for */
@@ -1888,7 +1925,7 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
             for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
             {
                 PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-                int            nparts = pprune->context.nparts;
+                int            nparts = pprune->initial_context.nparts;
                 int            k;

                 /* We just rebuild present_parts from scratch */
@@ -1993,7 +2030,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
         find_matching_subplans_recurse(prunedata, pprune, false, &result);

         /* Expression eval may have used space in node's ps_ExprContext too */
-        ResetExprContext(pprune->context.planstate->ps_ExprContext);
+        if (pprune->exec_pruning_steps)
+            ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
     }

     /* Add in any subplans that partition pruning didn't account for */
@@ -2029,15 +2067,15 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
     check_stack_depth();

     /* Only prune if pruning would be useful at this level. */
-    if (initial_prune ? pprune->do_initial_prune : pprune->do_exec_prune)
+    if (initial_prune && pprune->initial_pruning_steps)
     {
-        PartitionPruneContext *context = &pprune->context;
-
-        /* Set whether we can evaluate PARAM_EXEC Params or not */
-        context->evalexecparams = !initial_prune;
-
-        partset = get_matching_partitions(context,
-                                          pprune->pruning_steps);
+        partset = get_matching_partitions(&pprune->initial_context,
+                                          pprune->initial_pruning_steps);
+    }
+    else if (!initial_prune && pprune->exec_pruning_steps)
+    {
+        partset = get_matching_partitions(&pprune->exec_context,
+                                          pprune->exec_pruning_steps);
     }
     else
     {
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 780d7ab..78deade 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1198,16 +1198,13 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
     PartitionedRelPruneInfo *newnode = makeNode(PartitionedRelPruneInfo);

     COPY_SCALAR_FIELD(rtindex);
-    COPY_NODE_FIELD(pruning_steps);
     COPY_BITMAPSET_FIELD(present_parts);
     COPY_SCALAR_FIELD(nparts);
-    COPY_SCALAR_FIELD(nexprs);
     COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
     COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
     COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
-    COPY_POINTER_FIELD(hasexecparam, from->nexprs * sizeof(bool));
-    COPY_SCALAR_FIELD(do_initial_prune);
-    COPY_SCALAR_FIELD(do_exec_prune);
+    COPY_NODE_FIELD(initial_pruning_steps);
+    COPY_NODE_FIELD(exec_pruning_steps);
     COPY_BITMAPSET_FIELD(execparamids);

     return newnode;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 387e4b9..237598e 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -948,16 +948,13 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
     WRITE_NODE_TYPE("PARTITIONEDRELPRUNEINFO");

     WRITE_UINT_FIELD(rtindex);
-    WRITE_NODE_FIELD(pruning_steps);
     WRITE_BITMAPSET_FIELD(present_parts);
     WRITE_INT_FIELD(nparts);
-    WRITE_INT_FIELD(nexprs);
     WRITE_INT_ARRAY(subplan_map, node->nparts);
     WRITE_INT_ARRAY(subpart_map, node->nparts);
     WRITE_OID_ARRAY(relid_map, node->nparts);
-    WRITE_BOOL_ARRAY(hasexecparam, node->nexprs);
-    WRITE_BOOL_FIELD(do_initial_prune);
-    WRITE_BOOL_FIELD(do_exec_prune);
+    WRITE_NODE_FIELD(initial_pruning_steps);
+    WRITE_NODE_FIELD(exec_pruning_steps);
     WRITE_BITMAPSET_FIELD(execparamids);
 }

diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3b96492..6c2626e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -2388,16 +2388,13 @@ _readPartitionedRelPruneInfo(void)
     READ_LOCALS(PartitionedRelPruneInfo);

     READ_UINT_FIELD(rtindex);
-    READ_NODE_FIELD(pruning_steps);
     READ_BITMAPSET_FIELD(present_parts);
     READ_INT_FIELD(nparts);
-    READ_INT_FIELD(nexprs);
     READ_INT_ARRAY(subplan_map, local_node->nparts);
     READ_INT_ARRAY(subpart_map, local_node->nparts);
     READ_OID_ARRAY(relid_map, local_node->nparts);
-    READ_BOOL_ARRAY(hasexecparam, local_node->nexprs);
-    READ_BOOL_FIELD(do_initial_prune);
-    READ_BOOL_FIELD(do_exec_prune);
+    READ_NODE_FIELD(initial_pruning_steps);
+    READ_NODE_FIELD(exec_pruning_steps);
     READ_BITMAPSET_FIELD(execparamids);

     READ_DONE();
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index ff3caf1..a892ca1 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -85,6 +85,17 @@ typedef enum PartClauseMatchStatus
 } PartClauseMatchStatus;

 /*
+ * PartClauseTarget
+ *        Identifies which qual clauses we can use for generating pruning steps
+ */
+typedef enum PartClauseTarget
+{
+    PARTCLAUSE_PLANNER,            /* want to prune during planning */
+    PARTCLAUSE_INITIAL,            /* want to prune during executor startup */
+    PARTCLAUSE_EXEC                /* want to prune for each plan node scan */
+} PartClauseTarget;
+
+/*
  * GeneratePruningStepsContext
  *        Information about the current state of generation of "pruning steps"
  *        for a given set of clauses
@@ -95,8 +106,7 @@ typedef enum PartClauseMatchStatus
 typedef struct GeneratePruningStepsContext
 {
     /* Input data: */
-    bool        forplanner;        /* true when generating steps to be used
-                                 * during query planning */
+    PartClauseTarget target;    /* context we're generating steps for */
     /* Working state and result data: */
     int            next_step_id;
     List       *steps;            /* output, list of PartitionPruneSteps */
@@ -122,7 +132,7 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
                               List *partitioned_rels, List *prunequal,
                               Bitmapset **matchedsubplans);
 static List *gen_partprune_steps(RelOptInfo *rel, List *clauses,
-                    bool forplanner, bool *contradictory);
+                    PartClauseTarget target, bool *contradictory);
 static List *gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                              RelOptInfo *rel, List *clauses,
                              bool *contradictory);
@@ -169,8 +179,7 @@ static PruneStepResult *get_matching_range_bounds(PartitionPruneContext *context
                           FmgrInfo *partsupfunc, Bitmapset *nullkeys);
 static Bitmapset *pull_exec_paramids(Expr *expr);
 static bool pull_exec_paramids_walker(Node *node, Bitmapset **context);
-static bool analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps,
-                      int partnatts);
+static Bitmapset *get_partkey_exec_paramids(List *steps);
 static PruneStepResult *perform_pruning_base_step(PartitionPruneContext *context,
                           PartitionPruneStepOp *opstep);
 static PruneStepResult *perform_pruning_combine_step(PartitionPruneContext *context,
@@ -347,9 +356,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
         Index        rti = lfirst_int(lc);
         RelOptInfo *subpart = find_base_rel(root, rti);
         PartitionedRelPruneInfo *pinfo;
-        int            partnatts = subpart->part_scheme->partnatts;
         List       *partprunequal;
-        List       *pruning_steps;
+        List       *initial_pruning_steps;
+        List       *exec_pruning_steps;
         bool        contradictory;

         /*
@@ -415,13 +424,13 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
         }

         /*
-         * Convert pruning qual to pruning steps.  Since these steps will be
-         * used in the executor, we can pass 'forplanner' as false to allow
-         * steps to be generated that are unsafe for evaluation during
-         * planning, e.g. evaluation of stable functions.
+         * Convert pruning qual to pruning steps.  We need to do this twice,
+         * once to obtain executor startup pruning steps, and once for
+         * executor per-scan pruning steps.
          */
-        pruning_steps = gen_partprune_steps(subpart, partprunequal, false,
-                                            &contradictory);
+        initial_pruning_steps = gen_partprune_steps(subpart, partprunequal,
+                                                    PARTCLAUSE_INITIAL,
+                                                    &contradictory);

         if (contradictory)
         {
@@ -437,20 +446,28 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
             return NIL;
         }

+        exec_pruning_steps = gen_partprune_steps(subpart, partprunequal,
+                                                 PARTCLAUSE_EXEC,
+                                                 &contradictory);
+
+        if (contradictory)
+        {
+            /* As above, abort run-time pruning if anything fishy happens */
+            return NIL;
+        }
+
+        if (initial_pruning_steps || exec_pruning_steps)
+            doruntimeprune = true;
+
         /* Begin constructing the PartitionedRelPruneInfo for this rel */
         pinfo = makeNode(PartitionedRelPruneInfo);
         pinfo->rtindex = rti;
-        pinfo->pruning_steps = pruning_steps;
+        pinfo->initial_pruning_steps = initial_pruning_steps;
+        pinfo->exec_pruning_steps = exec_pruning_steps;
+        pinfo->execparamids = get_partkey_exec_paramids(exec_pruning_steps);
         /* Remaining fields will be filled in the next loop */

         pinfolist = lappend(pinfolist, pinfo);
-
-        /*
-         * Determine which pruning types should be enabled at this level. This
-         * also records paramids relevant to pruning steps in 'pinfo'.
-         */
-        doruntimeprune |= analyze_partkey_exprs(pinfo, pruning_steps,
-                                                partnatts);
     }

     if (!doruntimeprune)
@@ -535,6 +552,8 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
  *        Process 'clauses' (a rel's baserestrictinfo list of clauses) and return
  *        a list of "partition pruning steps".
  *
+ * XXX: FIX COMMENTS
+ *
  * 'forplanner' must be true when generating steps to be evaluated during
  * query planning, false when generating steps to be used at run-time.
  *
@@ -555,12 +574,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
  * else it's set false.
  */
 static List *
-gen_partprune_steps(RelOptInfo *rel, List *clauses, bool forplanner,
+gen_partprune_steps(RelOptInfo *rel, List *clauses, PartClauseTarget target,
                     bool *contradictory)
 {
     GeneratePruningStepsContext context;

-    context.forplanner = forplanner;
+    context.target = target;
     context.next_step_id = 0;
     context.steps = NIL;

@@ -635,7 +654,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
      * pruning code that we only want pruning steps that can be evaluated
      * during planning.
      */
-    pruning_steps = gen_partprune_steps(rel, clauses, true,
+    pruning_steps = gen_partprune_steps(rel, clauses,
+                                        PARTCLAUSE_PLANNER,
                                         &contradictory);
     if (contradictory)
         return NULL;
@@ -655,8 +675,6 @@ prune_append_rel_partitions(RelOptInfo *rel)
     /* These are not valid when being called from the planner */
     context.planstate = NULL;
     context.exprstates = NULL;
-    context.exprhasexecparam = NULL;
-    context.evalexecparams = false;

     /* Actual pruning happens here. */
     return get_matching_partitions(&context, pruning_steps);
@@ -1656,7 +1674,7 @@ match_clause_to_partition_key(RelOptInfo *rel,
          * will produce a usable result, since that'd mean there are Vars on
          * both sides.)
          */
-        if (context->forplanner)
+        if (context->target == PARTCLAUSE_PLANNER)
         {
             /*
              * When pruning in the planner, we only support pruning using
@@ -1675,6 +1693,8 @@ match_clause_to_partition_key(RelOptInfo *rel,
         }
         else
         {
+            Bitmapset  *paramids;
+
             /*
              * Otherwise, non-consts are allowed, but we can't prune using an
              * expression that contains Vars.
@@ -1683,6 +1703,14 @@ match_clause_to_partition_key(RelOptInfo *rel,
                 return PARTCLAUSE_UNSUPPORTED;

             /*
+             * See if there are exec Params, too.  If so, we can only use this
+             * expression during per-scan pruning.
+             */
+            paramids = pull_exec_paramids(expr);
+            if (!bms_is_empty(paramids) && context->target != PARTCLAUSE_EXEC)
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
              * And we must reject anything containing a volatile function.
              * Stable functions are OK though.  (We need not check this for
              * the comparison operator itself: anything that belongs to a
@@ -1837,7 +1865,7 @@ match_clause_to_partition_key(RelOptInfo *rel,
          * see if it can sanely be used for partition pruning (this is
          * identical to the logic for a plain OpExpr).
          */
-        if (context->forplanner)
+        if (context->target == PARTCLAUSE_PLANNER)
         {
             /*
              * When pruning in the planner, we only support pruning using
@@ -1856,6 +1884,8 @@ match_clause_to_partition_key(RelOptInfo *rel,
         }
         else
         {
+            Bitmapset  *paramids;
+
             /*
              * Otherwise, non-consts are allowed, but we can't prune using an
              * expression that contains Vars.
@@ -1864,6 +1894,14 @@ match_clause_to_partition_key(RelOptInfo *rel,
                 return PARTCLAUSE_UNSUPPORTED;

             /*
+             * See if there are exec Params, too.  If so, we can only use this
+             * expression during per-scan pruning.
+             */
+            paramids = pull_exec_paramids(expr);
+            if (!bms_is_empty(paramids) && context->target != PARTCLAUSE_EXEC)
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
              * And we must reject anything containing a volatile function.
              * Stable functions are OK though.  (We need not check this for
              * the comparison operator itself: anything that belongs to a
@@ -2985,89 +3023,38 @@ pull_exec_paramids_walker(Node *node, Bitmapset **context)
 }

 /*
- * analyze_partkey_exprs
- *        Loop through all pruning steps and identify which ones require
- *        executor startup-time or executor run-time pruning.
- *
- * Returns true if any executor partition pruning should be attempted at this
- * level.  Also fills fields of *pinfo to record how to process each step.
+ * get_partkey_exec_paramids
+ *        Loop through given pruning steps and find out which exec Params
+ *        are used.
  *
- * Note: when this is called, not much of *pinfo is valid; but that's OK
- * since we only use it as an output area.
+ * Returns a Bitmapset of Param IDs.
  */
-static bool
-analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps,
-                      int partnatts)
+static Bitmapset *
+get_partkey_exec_paramids(List *steps)
 {
-    bool        doruntimeprune = false;
+    Bitmapset  *execparamids = NULL;
     ListCell   *lc;

-    /*
-     * Steps require run-time pruning if they contain EXEC_PARAM Params.
-     * Otherwise, if their expressions aren't simple Consts or they involve
-     * non-immutable comparison operators, they require startup-time pruning.
-     * (Otherwise, the pruning would have been done at plan time.)
-     *
-     * Notice that what we actually check for mutability is the comparison
-     * functions, not the original operators.  This relies on the support
-     * functions of the btree or hash opfamily being marked consistently with
-     * the operators.
-     */
-    pinfo->nexprs = list_length(steps) * partnatts;
-    pinfo->hasexecparam = (bool *) palloc0(sizeof(bool) * pinfo->nexprs);
-    pinfo->do_initial_prune = false;
-    pinfo->do_exec_prune = false;
-    pinfo->execparamids = NULL;
-
     foreach(lc, steps)
     {
         PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc);
         ListCell   *lc2;
-        ListCell   *lc3;
-        int            keyno;

         if (!IsA(step, PartitionPruneStepOp))
             continue;

-        keyno = 0;
-        Assert(list_length(step->exprs) == list_length(step->cmpfns));
-        forboth(lc2, step->exprs, lc3, step->cmpfns)
+        foreach(lc2, step->exprs)
         {
             Expr       *expr = lfirst(lc2);
-            Oid            fnoid = lfirst_oid(lc3);

+            /* We can be quick for plain Consts */
             if (!IsA(expr, Const))
-            {
-                Bitmapset  *execparamids = pull_exec_paramids(expr);
-                bool        hasexecparams;
-                int            stateidx = PruneCxtStateIdx(partnatts,
-                                                        step->step.step_id,
-                                                        keyno);
-
-                Assert(stateidx < pinfo->nexprs);
-                hasexecparams = !bms_is_empty(execparamids);
-                pinfo->hasexecparam[stateidx] = hasexecparams;
-                pinfo->execparamids = bms_join(pinfo->execparamids,
-                                               execparamids);
-
-                if (hasexecparams)
-                    pinfo->do_exec_prune = true;
-                else
-                    pinfo->do_initial_prune = true;
-
-                doruntimeprune = true;
-            }
-            else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE)
-            {
-                /* No exec params here, but must do initial pruning */
-                pinfo->do_initial_prune = true;
-                doruntimeprune = true;
-            }
-            keyno++;
+                execparamids = bms_join(execparamids,
+                                        pull_exec_paramids(expr));
         }
     }

-    return doruntimeprune;
+    return execparamids;
 }

 /*
@@ -3392,6 +3379,8 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
  * partkey_datum_from_expr
  *        Evaluate expression for potential partition pruning
  *
+ * XXX this can't fail anymore, so change to return void
+ *
  * Evaluate 'expr', whose ExprState is stateidx of the context exprstate
  * array; set *value and *isnull to the resulting Datum and nullflag.
  * Return true if evaluation was possible, otherwise false.
@@ -3417,30 +3406,19 @@ partkey_datum_from_expr(PartitionPruneContext *context,
     }
     else
     {
+        ExprState  *exprstate;
+        ExprContext *ectx;
+
         /*
          * We should never see a non-Const in a step unless we're running in
          * the executor.
          */
         Assert(context->planstate != NULL);

-        /*
-         * When called from the executor we'll have a valid planstate so we
-         * may be able to evaluate an expression which could not be folded to
-         * a Const during planning.  Since run-time pruning can occur both
-         * during initialization of the executor or while it's running, we
-         * must be careful here to evaluate expressions containing PARAM_EXEC
-         * Params only when told it's OK.
-         */
-        if (context->evalexecparams || !context->exprhasexecparam[stateidx])
-        {
-            ExprState  *exprstate;
-            ExprContext *ectx;
-
-            exprstate = context->exprstates[stateidx];
-            ectx = context->planstate->ps_ExprContext;
-            *value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
-            return true;
-        }
+        exprstate = context->exprstates[stateidx];
+        ectx = context->planstate->ps_ExprContext;
+        *value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
+        return true;
     }

     return false;
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index b363aba..465a975 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -62,24 +62,24 @@ typedef struct PartitionRoutingInfo
  * subpart_map                    Subpart index by partition index, or -1.
  * present_parts                A Bitmapset of the partition indexes that we
  *                                have subplans or subparts for.
- * context                        Contains the context details required to call
- *                                the partition pruning code.
- * pruning_steps                List of PartitionPruneSteps used to
- *                                perform the actual pruning.
- * do_initial_prune                true if pruning should be performed during
- *                                executor startup (for this partitioning level).
- * do_exec_prune                true if pruning should be performed during
- *                                executor run (for this partitioning level).
+ * initial_pruning_steps        List of PartitionPruneSteps used to
+ *                                perform executor startup pruning.
+ * exec_pruning_steps            List of PartitionPruneSteps used to
+ *                                perform per-scan pruning.
+ * initial_context                If initial_pruning_steps isn't NIL, contains
+ *                                the details needed to execute those steps.
+ * exec_context                    If exec_pruning_steps isn't NIL, contains
+ *                                the details needed to execute those steps.
  */
 typedef struct PartitionedRelPruningData
 {
     int           *subplan_map;
     int           *subpart_map;
     Bitmapset  *present_parts;
-    PartitionPruneContext context;
-    List       *pruning_steps;
-    bool        do_initial_prune;
-    bool        do_exec_prune;
+    List       *initial_pruning_steps;
+    List       *exec_pruning_steps;
+    PartitionPruneContext initial_context;
+    PartitionPruneContext exec_context;
 } PartitionedRelPruningData;

 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 1cce762..1241245 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1109,21 +1109,23 @@ typedef struct PartitionedRelPruneInfo
 {
     NodeTag        type;
     Index        rtindex;        /* RT index of partition rel for this level */
-    List       *pruning_steps;    /* List of PartitionPruneStep, see below */
     Bitmapset  *present_parts;    /* Indexes of all partitions which subplans or
-                                 * subparts are present for. */
-    int            nparts;            /* Length of subplan_map[] and subpart_map[] */
-    int            nexprs;            /* Length of hasexecparam[] */
+                                 * subparts are present for */
+    int            nparts;            /* Length of the following arrays: */
     int           *subplan_map;    /* subplan index by partition index, or -1 */
     int           *subpart_map;    /* subpart index by partition index, or -1 */
     Oid           *relid_map;        /* relation OID by partition index, or 0 */
-    bool       *hasexecparam;    /* true if corresponding pruning_step contains
-                                 * any PARAM_EXEC Params. */
-    bool        do_initial_prune;    /* true if pruning should be performed
-                                     * during executor startup. */
-    bool        do_exec_prune;    /* true if pruning should be performed during
-                                 * executor run. */
-    Bitmapset  *execparamids;    /* All PARAM_EXEC Param IDs in pruning_steps */
+
+    /*
+     * initial_pruning_steps shows how to prune during executor startup (i.e.,
+     * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
+     * is required.  exec_pruning_steps shows how to prune with PARAM_EXEC
+     * Params; it is NIL if no per-scan pruning is required.
+     */
+    List       *initial_pruning_steps;    /* List of PartitionPruneStep */
+    List       *exec_pruning_steps; /* List of PartitionPruneStep */
+    Bitmapset  *execparamids;    /* All PARAM_EXEC Param IDs in
+                                 * exec_pruning_steps */
 } PartitionedRelPruneInfo;

 /*
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 2f75717..b906ae1 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -44,10 +44,6 @@ struct RelOptInfo;
  * exprstates        Array of ExprStates, indexed as per PruneCtxStateIdx; one
  *                    for each partition key in each pruning step.  Allocated if
  *                    planstate is non-NULL, otherwise NULL.
- * exprhasexecparam    Array of bools, each true if corresponding 'exprstate'
- *                    expression contains any PARAM_EXEC Params.  (Can be NULL
- *                    if planstate is NULL.)
- * evalexecparams    True if it's safe to evaluate PARAM_EXEC Params.
  */
 typedef struct PartitionPruneContext
 {
@@ -61,8 +57,6 @@ typedef struct PartitionPruneContext
     MemoryContext ppccontext;
     PlanState  *planstate;
     ExprState **exprstates;
-    bool       *exprhasexecparam;
-    bool        evalexecparams;
 } PartitionPruneContext;

 /*
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 723fc70..841bd8b 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3149,6 +3149,45 @@ select * from mc3p where a < 3 and abs(b) = 1;
          Filter: ((a < 3) AND (abs(b) = 1))
 (7 rows)

+--
+-- Check that pruning with composite range partitioning works correctly when
+-- a combination of runtime parameters is specified, not all of whose values
+-- are available at the same time
+--
+set plan_cache_mode = force_generic_plan;
+prepare ps1 as
+  select * from mc3p where a = $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps1(1);
+                   QUERY PLAN
+-------------------------------------------------
+ Append (actual rows=1 loops=1)
+   InitPlan 1 (returns $0)
+     ->  Result (actual rows=1 loops=1)
+   Subplans Removed: 2
+   ->  Seq Scan on mc3p1 (actual rows=1 loops=1)
+         Filter: ((a = $1) AND (abs(b) < $0))
+(6 rows)
+
+deallocate ps1;
+prepare ps2 as
+  select * from mc3p where a <= $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps2(1);
+                   QUERY PLAN
+-------------------------------------------------
+ Append (actual rows=2 loops=1)
+   InitPlan 1 (returns $0)
+     ->  Result (actual rows=1 loops=1)
+   Subplans Removed: 1
+   ->  Seq Scan on mc3p0 (actual rows=1 loops=1)
+         Filter: ((a <= $1) AND (abs(b) < $0))
+   ->  Seq Scan on mc3p1 (actual rows=1 loops=1)
+         Filter: ((a <= $1) AND (abs(b) < $0))
+(8 rows)
+
+deallocate ps2;
+reset plan_cache_mode;
 drop table mc3p;
 -- Ensure runtime pruning works with initplans params with boolean types
 create table boolvalues (value bool not null);
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 2373bd8..071e28d 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -809,6 +809,24 @@ insert into mc3p values (0, 1, 1), (1, 1, 1), (2, 1, 1);
 explain (analyze, costs off, summary off, timing off)
 select * from mc3p where a < 3 and abs(b) = 1;

+--
+-- Check that pruning with composite range partitioning works correctly when
+-- a combination of runtime parameters is specified, not all of whose values
+-- are available at the same time
+--
+set plan_cache_mode = force_generic_plan;
+prepare ps1 as
+  select * from mc3p where a = $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps1(1);
+deallocate ps1;
+prepare ps2 as
+  select * from mc3p where a <= $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps2(1);
+deallocate ps2;
+reset plan_cache_mode;
+
 drop table mc3p;

 -- Ensure runtime pruning works with initplans params with boolean types

Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/17 5:37, Tom Lane wrote:
> I wrote:
>> So, the changes in gen_prune_steps_from_opexps seem okay, but
>> the change in perform_pruning_base_step is just wrong,

Thank you.  Will look at your patch for the other bit.

Regards,
Amit




Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/17 4:25, Tom Lane wrote:
> I wrote:
>> Anyway, I've pushed David's patch now, on to look at Amit's.
> 
> So, the changes in gen_prune_steps_from_opexps seem okay, but
> the change in perform_pruning_base_step is just wrong, as can
> be seen from this extended form of the test case:
> 
> create table mc3p (a int, b int, c int) partition by range (a, abs(b), c);
> create table mc3p1 partition of mc3p
>   for values from (1, 1, 1) to (1, 1, 10);
> create table mc3p2 partition of mc3p
>   for values from (1, 1, 10) to (2, minvalue, minvalue);
> create table mc3p3 partition of mc3p
>  for values from (2, minvalue, minvalue) to (3, maxvalue, maxvalue);
> insert into mc3p values (1, 1, 1), (1,1,10), (2, 1, 1);
> 
> set plan_cache_mode = force_generic_plan;
> prepare init_exec_prune_q1 as
>   select * from mc3p where a = $1 and abs(b) <= $2 and c < (select 13);
> explain (analyze, costs off, summary off, timing off)
> execute init_exec_prune_q1 (1,2);
> deallocate init_exec_prune_q1;
> reset plan_cache_mode;
> 
> drop table mc3p;
>
> This should of course find the (1,1,1) row, but it does not because
> it converts the initial pruning condition to be "a = 1 and abs(b) = 2",
> whereas the correct thing would be "a = 1 and abs(b) <= 2".

Oops, you're right.

> It seems to me that the correct fix, rather than forcing equality
> strategy when you've missed out some of the quals, is to back up
> and use the strategy of the last clause you did use.  But unless
> I'm missing something, that information is just not available from
> the PartitionPruneStepOp data structure.

It's not.  PartitionPruneStepOp's opstrategy refers to the strategy of the
operator for the last matched key.

> We could extend the PartitionPruneStepOp struct to have an array
> or list of StrategyNumbers (but I'm unsure about how safe that'd
> be to backpatch into v11).

I did consider that, but I too feared that it won't be back-patchable.

> Another idea is that we ought to generate separate partitioning
> steps lists for the initial-pruning and runtime-pruning cases,
> ie invoke gen_partprune_steps yet a third time.  Ugh (and that
> still means a data structure change, just in a different place).
>
> The bottom line here is that we've got an inflexible data structure
> that was designed on the assumption that gen_partprune_steps has all
> the intelligence, and trying to adjust things later is just trouble.
> 
> I think this is really quite closely tied to my previous complaints
> about the design for plan-time versus run-time pruning, just mutatis
> mutandis.  The lack of any holistic, principled approach to
> which-quals-apply-when is what's biting us here.
> 
> Moreover, now that I understand what's going on here a bit better,
> I am deathly afraid that there are yet other bugs hiding in this
> same area.  What we've basically got is that analyze_partkey_exprs
> and perform_pruning_base_step are trying to reverse-engineer what
> gen_partprune_steps would have done differently if it used only the
> quals not containing exec params.  They don't really have enough
> information available to do that correctly, and even if they did,
> the fact that the logic looks nothing like the same is going to
> be a fertile source of bugs and maintenance gotchas.
> 
> In other words, as I compose this I'm talking myself into the
> idea that invoking gen_partprune_steps three times (for planner,
> executor startup, and executor runtime, with corresponding
> filters on which clauses can be used) is the only safe near-term
> fix.  In the future we can look at cutting down the repetitive
> costs that entails.

I have to agree that running gen_partprune_steps separately for each
instance of performing get_matching_partitions() would result in cleaner
code overall.  Cleaner in the sense that the logic of which set of clauses
are applicable to which instance of pruning is now centralized and doesn't
require code to reverse-engineering that in multiple places, as you say.

Maybe, we *can* avoid doing full-blown gen_partprune_steps() a 2nd and a
3rd time, an idea for which I'll post in reply to your other email.

Thanks,
Amit




Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/17 7:50, Tom Lane wrote:
> I wrote:
>> In other words, as I compose this I'm talking myself into the
>> idea that invoking gen_partprune_steps three times (for planner,
>> executor startup, and executor runtime, with corresponding
>> filters on which clauses can be used) is the only safe near-term
>> fix.  In the future we can look at cutting down the repetitive
>> costs that entails.
> 
> Here's a really quick and dirty POC along that line.

Thanks for the patch.

> It's unfinished
> but I'm out of time for today.  I think the executor side of it is OK,
> but the planner side is crufty: it lacks logic to decide whether
> either the initial pruning or exec pruning steps are unnecessary.
> And I didn't update most of the comments either.
> 
> I think the rule for "exec pruning is unnecessary" could probably
> just be "we didn't find any exec params in the prune steps", since
> that would imply that the initial steps include everything.
> I don't see an equally cheap way to decide that initial pruning
> is useless, though.  Thoughts?

As promised, here is the idea I was thinking of, although I'm no longer
confident you will like it based on your previous emails on this thread.
I was thinking of having a walker function that traverses the list of
*clauses* that were passed to make_partition_pruneinfo(), which puts the
paramids into an output context struct, with separate Bitmapsets for
PARAM_EXTERN and PARAM_EXEC parameters.  In addition to running this
walker, also do a contain_mutable_functions() pass over the clauses.  (I
can already hear a big no at this point!)  As you might've guessed, the
result of those two steps would decide whether we need to do the the extra
gen_partprune_steps() steps.  Fwiw, I prototyped it in the attached patch
which applies on top of yours.

Maybe as the only defense, I'd say that that would be cheaper in the
fairly common case where no run-time pruning is needed than doing two
full-blown gen_partprune_steps().

> Also, it seems like we could save at least some of the planning expense by
> having the PARTCLAUSE_INITIAL pass report back whether it had discarded
> any Param-bearing clauses or not.  If not, there's definitely no need to
> run the PARTCLAUSE_EXEC pass.

That sounds like a good idea.

Thanks,
Amit

Вложения

Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> As promised, here is the idea I was thinking of, although I'm no longer
> confident you will like it based on your previous emails on this thread.
> I was thinking of having a walker function that traverses the list of
> *clauses* that were passed to make_partition_pruneinfo(), which puts the
> paramids into an output context struct, with separate Bitmapsets for
> PARAM_EXTERN and PARAM_EXEC parameters.  In addition to running this
> walker, also do a contain_mutable_functions() pass over the clauses.  (I
> can already hear a big no at this point!)  As you might've guessed, the
> result of those two steps would decide whether we need to do the the extra
> gen_partprune_steps() steps.  Fwiw, I prototyped it in the attached patch
> which applies on top of yours.

Actually, yeah, my thoughts about future development were to split up
the gen_partprune_steps processing to avoid the duplicated work.
I've not read your patch yet but this sounds like the same idea.
What I'm worried about is if it's too much code churn for a
back-patchable bug fix.

            regards, tom lane



Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> On 2019/05/17 7:50, Tom Lane wrote:
>> Also, it seems like we could save at least some of the planning expense by
>> having the PARTCLAUSE_INITIAL pass report back whether it had discarded
>> any Param-bearing clauses or not.  If not, there's definitely no need to
>> run the PARTCLAUSE_EXEC pass.

> That sounds like a good idea.

Here's a patch that does it like that.  Since this requires additional
output values from gen_partprune_steps, I decided to go "all in" on
having GeneratePruningStepsContext contain everything of interest,
including all these result flags.  That makes the patch a bit more
invasive than it could perhaps be otherwise, but I think it's cleaner.
And these are just internal APIs in partprune.c so there's no
compatibility problems to worry about.

It's kind of hard to tell what this code is actually doing, due to
the lack of visibility in EXPLAIN output (something I think Robert
has kvetched about before).  To try to check that, I made the very
quick-and-dirty explain.c patches below, which I'm *not* proposing
to commit.  They showed that this code is making the same decisions
as HEAD for all the currently-existing test cases, which is probably
what we want it to do.  The new test cases added by the main patch
show up as having different initial and exec step lists, which is
also expected.  So I feel pretty good that this is acting as we wish.

The fly in the ointment is what about back-patching?  After looking
around, I think that we can probably get away with changing the run-time
data structures (PartitionPruneContext, PartitionedRelPruningData,
PartitionPruningData) in v11; it seems unlikely that anything outside
partprune.c and execPartition.c would have occasion to touch those.
But it's harder to argue that for the node type PartitionedRelPruneInfo,
since that's part of plan trees.  What I suggest we do for v11 is to
leave the existing fields in that node type that're removed by this
patch present, but unused, and add initial_pruning_steps and
exec_pruning_steps at the end.  This would confuse any code that's
studying the contents of PartitionedRelPruneInfo closely, but
probably as long as we set do_initial_prune and do_exec_prune properly,
that would satisfy most code that might be inspecting it.

Since time is growing short before beta1 wrap, I'm going to go ahead
and push this in hopes of getting buildfarm cycles on it.  But feel
free to review and look for further improvements.  There's plenty
of time to rethink the data-structure details for the v11 backpatch,
also.

            regards, tom lane

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 70709e5..d663118 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -183,6 +183,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
                                      bool *isnull,
                                      int maxfieldlen);
 static List *adjust_partition_tlist(List *tlist, TupleConversionMap *map);
+static void ExecInitPruningContext(PartitionPruneContext *context,
+                       List *pruning_steps,
+                       PartitionDesc partdesc,
+                       PartitionKey partkey,
+                       PlanState *planstate);
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
                                PartitionedRelPruningData *pprune,
                                bool initial_prune,
@@ -1614,13 +1619,9 @@ ExecCreatePartitionPruneState(PlanState *planstate,
         {
             PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
             PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-            PartitionPruneContext *context = &pprune->context;
             Relation    partrel;
             PartitionDesc partdesc;
             PartitionKey partkey;
-            int            partnatts;
-            int            n_steps;
-            ListCell   *lc3;

             /* present_parts is also subject to later modification */
             pprune->present_parts = bms_copy(pinfo->present_parts);
@@ -1700,66 +1701,36 @@ ExecCreatePartitionPruneState(PlanState *planstate,
                 Assert(pd_idx == pinfo->nparts);
             }

-            n_steps = list_length(pinfo->pruning_steps);
-
-            context->strategy = partkey->strategy;
-            context->partnatts = partnatts = partkey->partnatts;
-            context->nparts = pinfo->nparts;
-            context->boundinfo = partdesc->boundinfo;
-            context->partcollation = partkey->partcollation;
-            context->partsupfunc = partkey->partsupfunc;
-
-            /* We'll look up type-specific support functions as needed */
-            context->stepcmpfuncs = (FmgrInfo *)
-                palloc0(sizeof(FmgrInfo) * n_steps * partnatts);
-
-            context->ppccontext = CurrentMemoryContext;
-            context->planstate = planstate;
-
-            /* Initialize expression state for each expression we need */
-            context->exprstates = (ExprState **)
-                palloc0(sizeof(ExprState *) * n_steps * partnatts);
-            foreach(lc3, pinfo->pruning_steps)
+            /*
+             * Initialize pruning contexts as needed.
+             */
+            pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
+            if (pinfo->initial_pruning_steps)
             {
-                PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc3);
-                ListCell   *lc4;
-                int            keyno;
-
-                /* not needed for other step kinds */
-                if (!IsA(step, PartitionPruneStepOp))
-                    continue;
-
-                Assert(list_length(step->exprs) <= partnatts);
-
-                keyno = 0;
-                foreach(lc4, step->exprs)
-                {
-                    Expr       *expr = (Expr *) lfirst(lc4);
-
-                    /* not needed for Consts */
-                    if (!IsA(expr, Const))
-                    {
-                        int            stateidx = PruneCxtStateIdx(partnatts,
-                                                                step->step.step_id,
-                                                                keyno);
-
-                        context->exprstates[stateidx] =
-                            ExecInitExpr(expr, context->planstate);
-                    }
-                    keyno++;
-                }
+                ExecInitPruningContext(&pprune->initial_context,
+                                       pinfo->initial_pruning_steps,
+                                       partdesc, partkey, planstate);
+                /* Record whether initial pruning is needed at any level */
+                prunestate->do_initial_prune = true;
+            }
+            else
+            {
+                /*
+                 * Hack: ExecFindInitialMatchingSubPlans requires this field
+                 * to be valid whether pruning or not.
+                 */
+                pprune->initial_context.nparts = partdesc->nparts;
             }

-            /* Array is not modified at runtime, so just point to plan's copy */
-            context->exprhasexecparam = pinfo->hasexecparam;
-
-            pprune->pruning_steps = pinfo->pruning_steps;
-            pprune->do_initial_prune = pinfo->do_initial_prune;
-            pprune->do_exec_prune = pinfo->do_exec_prune;
-
-            /* Record if pruning would be useful at any level */
-            prunestate->do_initial_prune |= pinfo->do_initial_prune;
-            prunestate->do_exec_prune |= pinfo->do_exec_prune;
+            pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+            if (pinfo->exec_pruning_steps)
+            {
+                ExecInitPruningContext(&pprune->exec_context,
+                                       pinfo->exec_pruning_steps,
+                                       partdesc, partkey, planstate);
+                /* Record whether exec pruning is needed at any level */
+                prunestate->do_exec_prune = true;
+            }

             /*
              * Accumulate the IDs of all PARAM_EXEC Params affecting the
@@ -1777,6 +1748,71 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 }

 /*
+ * Initialize a PartitionPruneContext for the given list of pruning steps.
+ */
+static void
+ExecInitPruningContext(PartitionPruneContext *context,
+                       List *pruning_steps,
+                       PartitionDesc partdesc,
+                       PartitionKey partkey,
+                       PlanState *planstate)
+{
+    int            n_steps;
+    int            partnatts;
+    ListCell   *lc;
+
+    n_steps = list_length(pruning_steps);
+
+    context->strategy = partkey->strategy;
+    context->partnatts = partnatts = partkey->partnatts;
+    context->nparts = partdesc->nparts;
+    context->boundinfo = partdesc->boundinfo;
+    context->partcollation = partkey->partcollation;
+    context->partsupfunc = partkey->partsupfunc;
+
+    /* We'll look up type-specific support functions as needed */
+    context->stepcmpfuncs = (FmgrInfo *)
+        palloc0(sizeof(FmgrInfo) * n_steps * partnatts);
+
+    context->ppccontext = CurrentMemoryContext;
+    context->planstate = planstate;
+
+    /* Initialize expression state for each expression we need */
+    context->exprstates = (ExprState **)
+        palloc0(sizeof(ExprState *) * n_steps * partnatts);
+    foreach(lc, pruning_steps)
+    {
+        PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc);
+        ListCell   *lc2;
+        int            keyno;
+
+        /* not needed for other step kinds */
+        if (!IsA(step, PartitionPruneStepOp))
+            continue;
+
+        Assert(list_length(step->exprs) <= partnatts);
+
+        keyno = 0;
+        foreach(lc2, step->exprs)
+        {
+            Expr       *expr = (Expr *) lfirst(lc2);
+
+            /* not needed for Consts */
+            if (!IsA(expr, Const))
+            {
+                int            stateidx = PruneCxtStateIdx(partnatts,
+                                                        step->step.step_id,
+                                                        keyno);
+
+                context->exprstates[stateidx] =
+                    ExecInitExpr(expr, context->planstate);
+            }
+            keyno++;
+        }
+    }
+}
+
+/*
  * ExecFindInitialMatchingSubPlans
  *        Identify the set of subplans that cannot be eliminated by initial
  *        pruning, disregarding any pruning constraints involving PARAM_EXEC
@@ -1824,7 +1860,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
         find_matching_subplans_recurse(prunedata, pprune, true, &result);

         /* Expression eval may have used space in node's ps_ExprContext too */
-        ResetExprContext(pprune->context.planstate->ps_ExprContext);
+        if (pprune->initial_pruning_steps)
+            ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
     }

     /* Add in any subplans that partition pruning didn't account for */
@@ -1888,7 +1925,7 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
             for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
             {
                 PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-                int            nparts = pprune->context.nparts;
+                int            nparts = pprune->initial_context.nparts;
                 int            k;

                 /* We just rebuild present_parts from scratch */
@@ -1993,7 +2030,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
         find_matching_subplans_recurse(prunedata, pprune, false, &result);

         /* Expression eval may have used space in node's ps_ExprContext too */
-        ResetExprContext(pprune->context.planstate->ps_ExprContext);
+        if (pprune->exec_pruning_steps)
+            ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
     }

     /* Add in any subplans that partition pruning didn't account for */
@@ -2029,15 +2067,15 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
     check_stack_depth();

     /* Only prune if pruning would be useful at this level. */
-    if (initial_prune ? pprune->do_initial_prune : pprune->do_exec_prune)
+    if (initial_prune && pprune->initial_pruning_steps)
     {
-        PartitionPruneContext *context = &pprune->context;
-
-        /* Set whether we can evaluate PARAM_EXEC Params or not */
-        context->evalexecparams = !initial_prune;
-
-        partset = get_matching_partitions(context,
-                                          pprune->pruning_steps);
+        partset = get_matching_partitions(&pprune->initial_context,
+                                          pprune->initial_pruning_steps);
+    }
+    else if (!initial_prune && pprune->exec_pruning_steps)
+    {
+        partset = get_matching_partitions(&pprune->exec_context,
+                                          pprune->exec_pruning_steps);
     }
     else
     {
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 780d7ab..78deade 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1198,16 +1198,13 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
     PartitionedRelPruneInfo *newnode = makeNode(PartitionedRelPruneInfo);

     COPY_SCALAR_FIELD(rtindex);
-    COPY_NODE_FIELD(pruning_steps);
     COPY_BITMAPSET_FIELD(present_parts);
     COPY_SCALAR_FIELD(nparts);
-    COPY_SCALAR_FIELD(nexprs);
     COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
     COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
     COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
-    COPY_POINTER_FIELD(hasexecparam, from->nexprs * sizeof(bool));
-    COPY_SCALAR_FIELD(do_initial_prune);
-    COPY_SCALAR_FIELD(do_exec_prune);
+    COPY_NODE_FIELD(initial_pruning_steps);
+    COPY_NODE_FIELD(exec_pruning_steps);
     COPY_BITMAPSET_FIELD(execparamids);

     return newnode;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 387e4b9..237598e 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -948,16 +948,13 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
     WRITE_NODE_TYPE("PARTITIONEDRELPRUNEINFO");

     WRITE_UINT_FIELD(rtindex);
-    WRITE_NODE_FIELD(pruning_steps);
     WRITE_BITMAPSET_FIELD(present_parts);
     WRITE_INT_FIELD(nparts);
-    WRITE_INT_FIELD(nexprs);
     WRITE_INT_ARRAY(subplan_map, node->nparts);
     WRITE_INT_ARRAY(subpart_map, node->nparts);
     WRITE_OID_ARRAY(relid_map, node->nparts);
-    WRITE_BOOL_ARRAY(hasexecparam, node->nexprs);
-    WRITE_BOOL_FIELD(do_initial_prune);
-    WRITE_BOOL_FIELD(do_exec_prune);
+    WRITE_NODE_FIELD(initial_pruning_steps);
+    WRITE_NODE_FIELD(exec_pruning_steps);
     WRITE_BITMAPSET_FIELD(execparamids);
 }

diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3b96492..6c2626e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -2388,16 +2388,13 @@ _readPartitionedRelPruneInfo(void)
     READ_LOCALS(PartitionedRelPruneInfo);

     READ_UINT_FIELD(rtindex);
-    READ_NODE_FIELD(pruning_steps);
     READ_BITMAPSET_FIELD(present_parts);
     READ_INT_FIELD(nparts);
-    READ_INT_FIELD(nexprs);
     READ_INT_ARRAY(subplan_map, local_node->nparts);
     READ_INT_ARRAY(subpart_map, local_node->nparts);
     READ_OID_ARRAY(relid_map, local_node->nparts);
-    READ_BOOL_ARRAY(hasexecparam, local_node->nexprs);
-    READ_BOOL_FIELD(do_initial_prune);
-    READ_BOOL_FIELD(do_exec_prune);
+    READ_NODE_FIELD(initial_pruning_steps);
+    READ_NODE_FIELD(exec_pruning_steps);
     READ_BITMAPSET_FIELD(execparamids);

     READ_DONE();
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index ff3caf1..53f814f 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -85,21 +85,42 @@ typedef enum PartClauseMatchStatus
 } PartClauseMatchStatus;

 /*
+ * PartClauseTarget
+ *        Identifies which qual clauses we can use for generating pruning steps
+ */
+typedef enum PartClauseTarget
+{
+    PARTTARGET_PLANNER,            /* want to prune during planning */
+    PARTTARGET_INITIAL,            /* want to prune during executor startup */
+    PARTTARGET_EXEC                /* want to prune during each plan node scan */
+} PartClauseTarget;
+
+/*
  * GeneratePruningStepsContext
  *        Information about the current state of generation of "pruning steps"
  *        for a given set of clauses
  *
- * gen_partprune_steps() initializes an instance of this struct, which is used
- * throughout the step generation process.
+ * gen_partprune_steps() initializes and returns an instance of this struct.
+ *
+ * Note that has_mutable_op, has_mutable_arg, and has_exec_param are set if
+ * we found any potentially-useful-for-pruning clause having those properties,
+ * whether or not we actually used the clause in the steps list.  This
+ * definition allows us to skip the PARTTARGET_EXEC pass in some cases.
  */
 typedef struct GeneratePruningStepsContext
 {
-    /* Input data: */
-    bool        forplanner;        /* true when generating steps to be used
-                                 * during query planning */
-    /* Working state and result data: */
+    /* Copies of input arguments for gen_partprune_steps: */
+    RelOptInfo *rel;            /* the partitioned relation */
+    PartClauseTarget target;    /* use-case we're generating steps for */
+    /* Result data: */
+    List       *steps;            /* list of PartitionPruneSteps */
+    bool        has_mutable_op; /* clauses include any stable operators */
+    bool        has_mutable_arg;    /* clauses include any mutable comparison
+                                     * values, *other than* exec params */
+    bool        has_exec_param; /* clauses include any PARAM_EXEC params */
+    bool        contradictory;    /* clauses were proven self-contradictory */
+    /* Working state: */
     int            next_step_id;
-    List       *steps;            /* output, list of PartitionPruneSteps */
 } GeneratePruningStepsContext;

 /* The result of performing one PartitionPruneStep */
@@ -121,22 +142,20 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
                               int *relid_subplan_map,
                               List *partitioned_rels, List *prunequal,
                               Bitmapset **matchedsubplans);
-static List *gen_partprune_steps(RelOptInfo *rel, List *clauses,
-                    bool forplanner, bool *contradictory);
+static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
+                    PartClauseTarget target,
+                    GeneratePruningStepsContext *context);
 static List *gen_partprune_steps_internal(GeneratePruningStepsContext *context,
-                             RelOptInfo *rel, List *clauses,
-                             bool *contradictory);
+                             List *clauses);
 static PartitionPruneStep *gen_prune_step_op(GeneratePruningStepsContext *context,
                   StrategyNumber opstrategy, bool op_is_ne,
                   List *exprs, List *cmpfns, Bitmapset *nullkeys);
 static PartitionPruneStep *gen_prune_step_combine(GeneratePruningStepsContext *context,
                        List *source_stepids,
                        PartitionPruneCombineOp combineOp);
-static PartitionPruneStep *gen_prune_steps_from_opexps(PartitionScheme part_scheme,
-                            GeneratePruningStepsContext *context,
+static PartitionPruneStep *gen_prune_steps_from_opexps(GeneratePruningStepsContext *context,
                             List **keyclauses, Bitmapset *nullkeys);
-static PartClauseMatchStatus match_clause_to_partition_key(RelOptInfo *rel,
-                              GeneratePruningStepsContext *context,
+static PartClauseMatchStatus match_clause_to_partition_key(GeneratePruningStepsContext *context,
                               Expr *clause, Expr *partkey, int partkeyidx,
                               bool *clause_is_not_null,
                               PartClauseInfo **pc, List **clause_steps);
@@ -169,8 +188,7 @@ static PruneStepResult *get_matching_range_bounds(PartitionPruneContext *context
                           FmgrInfo *partsupfunc, Bitmapset *nullkeys);
 static Bitmapset *pull_exec_paramids(Expr *expr);
 static bool pull_exec_paramids_walker(Node *node, Bitmapset **context);
-static bool analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps,
-                      int partnatts);
+static Bitmapset *get_partkey_exec_paramids(List *steps);
 static PruneStepResult *perform_pruning_base_step(PartitionPruneContext *context,
                           PartitionPruneStepOp *opstep);
 static PruneStepResult *perform_pruning_combine_step(PartitionPruneContext *context,
@@ -178,7 +196,7 @@ static PruneStepResult *perform_pruning_combine_step(PartitionPruneContext *cont
                              PruneStepResult **step_results);
 static bool match_boolean_partition_clause(Oid partopfamily, Expr *clause,
                                Expr *partkey, Expr **outconst);
-static bool partkey_datum_from_expr(PartitionPruneContext *context,
+static void partkey_datum_from_expr(PartitionPruneContext *context,
                         Expr *expr, int stateidx,
                         Datum *value, bool *isnull);

@@ -347,10 +365,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
         Index        rti = lfirst_int(lc);
         RelOptInfo *subpart = find_base_rel(root, rti);
         PartitionedRelPruneInfo *pinfo;
-        int            partnatts = subpart->part_scheme->partnatts;
         List       *partprunequal;
-        List       *pruning_steps;
-        bool        contradictory;
+        List       *initial_pruning_steps;
+        List       *exec_pruning_steps;
+        Bitmapset  *execparamids;
+        GeneratePruningStepsContext context;

         /*
          * Fill the mapping array.
@@ -415,15 +434,16 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
         }

         /*
-         * Convert pruning qual to pruning steps.  Since these steps will be
-         * used in the executor, we can pass 'forplanner' as false to allow
-         * steps to be generated that are unsafe for evaluation during
-         * planning, e.g. evaluation of stable functions.
+         * Convert pruning qual to pruning steps.  We may need to do this
+         * twice, once to obtain executor startup pruning steps, and once for
+         * executor per-scan pruning steps.  This first pass creates startup
+         * pruning steps and detects whether there's any possibly-useful quals
+         * that would require per-scan pruning.
          */
-        pruning_steps = gen_partprune_steps(subpart, partprunequal, false,
-                                            &contradictory);
+        gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
+                            &context);

-        if (contradictory)
+        if (context.contradictory)
         {
             /*
              * This shouldn't happen as the planner should have detected this
@@ -437,20 +457,63 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
             return NIL;
         }

+        /*
+         * If no mutable operators or expressions appear in usable pruning
+         * clauses, then there's no point in running startup pruning, because
+         * plan-time pruning should have pruned everything prunable.
+         */
+        if (context.has_mutable_op || context.has_mutable_arg)
+            initial_pruning_steps = context.steps;
+        else
+            initial_pruning_steps = NIL;
+
+        /*
+         * If no exec Params appear in potentially-usable pruning clauses,
+         * then there's no point in even thinking about per-scan pruning.
+         */
+        if (context.has_exec_param)
+        {
+            /* ... OK, we'd better think about it */
+            gen_partprune_steps(subpart, partprunequal, PARTTARGET_EXEC,
+                                &context);
+
+            if (context.contradictory)
+            {
+                /* As above, skip run-time pruning if anything fishy happens */
+                return NIL;
+            }
+
+            exec_pruning_steps = context.steps;
+
+            /*
+             * Detect which exec Params actually got used; the fact that some
+             * were in available clauses doesn't mean we actually used them.
+             * Skip per-scan pruning if there are none.
+             */
+            execparamids = get_partkey_exec_paramids(exec_pruning_steps);
+
+            if (bms_is_empty(execparamids))
+                exec_pruning_steps = NIL;
+        }
+        else
+        {
+            /* No exec Params anywhere, so forget about scan-time pruning */
+            exec_pruning_steps = NIL;
+            execparamids = NULL;
+        }
+
+        if (initial_pruning_steps || exec_pruning_steps)
+            doruntimeprune = true;
+
         /* Begin constructing the PartitionedRelPruneInfo for this rel */
         pinfo = makeNode(PartitionedRelPruneInfo);
         pinfo->rtindex = rti;
-        pinfo->pruning_steps = pruning_steps;
+        pinfo->initial_pruning_steps = initial_pruning_steps;
+        pinfo->exec_pruning_steps = exec_pruning_steps;
+        pinfo->execparamids = execparamids;
         /* Remaining fields will be filled in the next loop */

         pinfolist = lappend(pinfolist, pinfo);
-
-        /*
-         * Determine which pruning types should be enabled at this level. This
-         * also records paramids relevant to pruning steps in 'pinfo'.
-         */
-        doruntimeprune |= analyze_partkey_exprs(pinfo, pruning_steps,
-                                                partnatts);
     }

     if (!doruntimeprune)
@@ -532,37 +595,25 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,

 /*
  * gen_partprune_steps
- *        Process 'clauses' (a rel's baserestrictinfo list of clauses) and return
- *        a list of "partition pruning steps".
+ *        Process 'clauses' (typically a rel's baserestrictinfo list of clauses)
+ *        and create a list of "partition pruning steps".
  *
- * 'forplanner' must be true when generating steps to be evaluated during
- * query planning, false when generating steps to be used at run-time.
+ * 'target' tells whether to generate pruning steps for planning (use
+ * immutable clauses only), or for executor startup (use any allowable
+ * clause except ones containing PARAM_EXEC Params), or for executor
+ * per-scan pruning (use any allowable clause).
  *
- * The result generated with forplanner=false includes all clauses that
- * are selected with forplanner=true, because in some cases we need a
- * combination of clauses to prune successfully.  For example, if we
- * are partitioning on a hash of columns A and B, and we have clauses
- * "WHERE A=constant AND B=nonconstant", we can't do anything at plan
- * time even though the first clause would be evaluable then.  And we
- * must include the first clause when called with forplanner=false,
- * or we'll fail to prune at run-time either.  This does mean that when
- * called with forplanner=false, we may return steps that don't actually
- * need to be executed at runtime; it's left to analyze_partkey_exprs()
- * to (re)discover that.
- *
- * If the clauses in the input list are contradictory or there is a
- * pseudo-constant "false", *contradictory is set to true upon return,
- * else it's set false.
+ * 'context' is an output argument that receives the steps list as well as
+ * some subsidiary flags; see the GeneratePruningStepsContext typedef.
  */
-static List *
-gen_partprune_steps(RelOptInfo *rel, List *clauses, bool forplanner,
-                    bool *contradictory)
+static void
+gen_partprune_steps(RelOptInfo *rel, List *clauses, PartClauseTarget target,
+                    GeneratePruningStepsContext *context)
 {
-    GeneratePruningStepsContext context;
-
-    context.forplanner = forplanner;
-    context.next_step_id = 0;
-    context.steps = NIL;
+    /* Initialize all output values to zero/false/NULL */
+    memset(context, 0, sizeof(GeneratePruningStepsContext));
+    context->rel = rel;
+    context->target = target;

     /*
      * For sub-partitioned tables there's a corner case where if the
@@ -593,9 +644,7 @@ gen_partprune_steps(RelOptInfo *rel, List *clauses, bool forplanner,
     }

     /* Down into the rabbit-hole. */
-    (void) gen_partprune_steps_internal(&context, rel, clauses, contradictory);
-
-    return context.steps;
+    (void) gen_partprune_steps_internal(context, clauses);
 }

 /*
@@ -613,7 +662,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
 {
     List       *clauses = rel->baserestrictinfo;
     List       *pruning_steps;
-    bool        contradictory;
+    GeneratePruningStepsContext gcontext;
     PartitionPruneContext context;

     Assert(rel->part_scheme != NULL);
@@ -630,15 +679,19 @@ prune_append_rel_partitions(RelOptInfo *rel)
         return bms_add_range(NULL, 0, rel->nparts - 1);

     /*
-     * Process clauses.  If the clauses are found to be contradictory, we can
-     * return the empty set.  Pass 'forplanner' as true to indicate to the
-     * pruning code that we only want pruning steps that can be evaluated
-     * during planning.
+     * Process clauses to extract pruning steps that are usable at plan time.
+     * If the clauses are found to be contradictory, we can return the empty
+     * set.
      */
-    pruning_steps = gen_partprune_steps(rel, clauses, true,
-                                        &contradictory);
-    if (contradictory)
+    gen_partprune_steps(rel, clauses, PARTTARGET_PLANNER,
+                        &gcontext);
+    if (gcontext.contradictory)
         return NULL;
+    pruning_steps = gcontext.steps;
+
+    /* If there's nothing usable, return all partitions */
+    if (pruning_steps == NIL)
+        return bms_add_range(NULL, 0, rel->nparts - 1);

     /* Set up PartitionPruneContext */
     context.strategy = rel->part_scheme->strategy;
@@ -655,8 +708,6 @@ prune_append_rel_partitions(RelOptInfo *rel)
     /* These are not valid when being called from the planner */
     context.planstate = NULL;
     context.exprstates = NULL;
-    context.exprhasexecparam = NULL;
-    context.evalexecparams = false;

     /* Actual pruning happens here. */
     return get_matching_partitions(&context, pruning_steps);
@@ -667,7 +718,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
  *        Determine partitions that survive partition pruning
  *
  * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with 'forplanner' = false.
+ * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
  *
  * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
  * partitions.
@@ -786,16 +837,15 @@ get_matching_partitions(PartitionPruneContext *context, List *pruning_steps)
  * even across recursive calls.
  *
  * If we find clauses that are mutually contradictory, or a pseudoconstant
- * clause that contains false, we set *contradictory to true and return NIL
- * (that is, no pruning steps).  Caller should consider all partitions as
- * pruned in that case.  Otherwise, *contradictory is set to false.
+ * clause that contains false, we set context->contradictory to true and
+ * return NIL (that is, no pruning steps).  Caller should consider all
+ * partitions as pruned in that case.
  */
 static List *
 gen_partprune_steps_internal(GeneratePruningStepsContext *context,
-                             RelOptInfo *rel, List *clauses,
-                             bool *contradictory)
+                             List *clauses)
 {
-    PartitionScheme part_scheme = rel->part_scheme;
+    PartitionScheme part_scheme = context->rel->part_scheme;
     List       *keyclauses[PARTITION_MAX_KEYS];
     Bitmapset  *nullkeys = NULL,
                *notnullkeys = NULL;
@@ -803,8 +853,6 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
     List       *result = NIL;
     ListCell   *lc;

-    *contradictory = false;
-
     memset(keyclauses, 0, sizeof(keyclauses));
     foreach(lc, clauses)
     {
@@ -820,7 +868,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
             (((Const *) clause)->constisnull ||
              !DatumGetBool(((Const *) clause)->constvalue)))
         {
-            *contradictory = true;
+            context->contradictory = true;
             return NIL;
         }

@@ -842,6 +890,12 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                 ListCell   *lc1;

                 /*
+                 * We can share the outer context area with the recursive
+                 * call, but contradictory had better not be true yet.
+                 */
+                Assert(!context->contradictory);
+
+                /*
                  * Get pruning step for each arg.  If we get contradictory for
                  * all args, it means the OR expression is false as a whole.
                  */
@@ -851,11 +905,18 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                     bool        arg_contradictory;
                     List       *argsteps;

-                    argsteps =
-                        gen_partprune_steps_internal(context, rel,
-                                                     list_make1(arg),
-                                                     &arg_contradictory);
-                    if (!arg_contradictory)
+                    argsteps = gen_partprune_steps_internal(context,
+                                                            list_make1(arg));
+                    arg_contradictory = context->contradictory;
+                    /* Keep context->contradictory clear till we're done */
+                    context->contradictory = false;
+
+                    if (arg_contradictory)
+                    {
+                        /* Just ignore self-contradictory arguments. */
+                        continue;
+                    }
+                    else
                         all_args_contradictory = false;

                     if (argsteps != NIL)
@@ -869,34 +930,28 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                     else
                     {
                         /*
-                         * No steps either means that arg_contradictory is
-                         * true or the arg didn't contain a clause matching
-                         * this partition key.
+                         * The arg didn't contain a clause matching this
+                         * partition key.  We cannot prune using such an arg.
+                         * To indicate that to the pruning code, we must
+                         * construct a dummy PartitionPruneStepCombine whose
+                         * source_stepids is set to an empty List.
                          *
-                         * In case of the latter, we cannot prune using such
-                         * an arg.  To indicate that to the pruning code, we
-                         * must construct a dummy PartitionPruneStepCombine
-                         * whose source_stepids is set to an empty List.
                          * However, if we can prove using constraint exclusion
                          * that the clause refutes the table's partition
                          * constraint (if it's sub-partitioned), we need not
                          * bother with that.  That is, we effectively ignore
                          * this OR arm.
                          */
-                        List       *partconstr = rel->partition_qual;
+                        List       *partconstr = context->rel->partition_qual;
                         PartitionPruneStep *orstep;

-                        /* Just ignore this argument. */
-                        if (arg_contradictory)
-                            continue;
-
                         if (partconstr)
                         {
                             partconstr = (List *)
                                 expression_planner((Expr *) partconstr);
-                            if (rel->relid != 1)
+                            if (context->rel->relid != 1)
                                 ChangeVarNodes((Node *) partconstr, 1,
-                                               rel->relid, 0);
+                                               context->rel->relid, 0);
                             if (predicate_refuted_by(partconstr,
                                                      list_make1(arg),
                                                      false))
@@ -909,11 +964,12 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                     }
                 }

-                *contradictory = all_args_contradictory;
-
-                /* Check if any contradicting clauses were found */
-                if (*contradictory)
+                /* If all the OR arms are contradictory, we can stop */
+                if (all_args_contradictory)
+                {
+                    context->contradictory = true;
                     return NIL;
+                }

                 if (arg_stepids != NIL)
                 {
@@ -937,9 +993,10 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                  * recurse and later combine the component partitions sets
                  * using a combine step.
                  */
-                argsteps = gen_partprune_steps_internal(context, rel, args,
-                                                        contradictory);
-                if (*contradictory)
+                argsteps = gen_partprune_steps_internal(context, args);
+
+                /* If any AND arm is contradictory, we can stop immediately */
+                if (context->contradictory)
                     return NIL;

                 foreach(lc1, argsteps)
@@ -969,17 +1026,16 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
         }

         /*
-         * Must be a clause for which we can check if one of its args matches
-         * the partition key.
+         * See if we can match this clause to any of the partition keys.
          */
         for (i = 0; i < part_scheme->partnatts; i++)
         {
-            Expr       *partkey = linitial(rel->partexprs[i]);
+            Expr       *partkey = linitial(context->rel->partexprs[i]);
             bool        clause_is_not_null = false;
             PartClauseInfo *pc = NULL;
             List       *clause_steps = NIL;

-            switch (match_clause_to_partition_key(rel, context,
+            switch (match_clause_to_partition_key(context,
                                                   clause, partkey, i,
                                                   &clause_is_not_null,
                                                   &pc, &clause_steps))
@@ -993,7 +1049,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                      */
                     if (bms_is_member(i, nullkeys))
                     {
-                        *contradictory = true;
+                        context->contradictory = true;
                         return NIL;
                     }
                     generate_opsteps = true;
@@ -1006,7 +1062,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                         /* check for conflicting IS NOT NULL */
                         if (bms_is_member(i, notnullkeys))
                         {
-                            *contradictory = true;
+                            context->contradictory = true;
                             return NIL;
                         }
                         nullkeys = bms_add_member(nullkeys, i);
@@ -1016,7 +1072,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                         /* check for conflicting IS NULL */
                         if (bms_is_member(i, nullkeys))
                         {
-                            *contradictory = true;
+                            context->contradictory = true;
                             return NIL;
                         }
                         notnullkeys = bms_add_member(notnullkeys, i);
@@ -1030,7 +1086,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,

                 case PARTCLAUSE_MATCH_CONTRADICT:
                     /* We've nothing more to do if a contradiction was found. */
-                    *contradictory = true;
+                    context->contradictory = true;
                     return NIL;

                 case PARTCLAUSE_NOMATCH:
@@ -1091,8 +1147,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
         PartitionPruneStep *step;

         /* Strategy 2 */
-        step = gen_prune_steps_from_opexps(part_scheme, context,
-                                           keyclauses, nullkeys);
+        step = gen_prune_steps_from_opexps(context, keyclauses, nullkeys);
         if (step != NULL)
             result = lappend(result, step);
     }
@@ -1201,15 +1256,15 @@ gen_prune_step_combine(GeneratePruningStepsContext *context,
  * found for any subsequent keys; see specific notes below.
  */
 static PartitionPruneStep *
-gen_prune_steps_from_opexps(PartitionScheme part_scheme,
-                            GeneratePruningStepsContext *context,
+gen_prune_steps_from_opexps(GeneratePruningStepsContext *context,
                             List **keyclauses, Bitmapset *nullkeys)
 {
-    ListCell   *lc;
+    PartitionScheme part_scheme = context->rel->part_scheme;
     List       *opsteps = NIL;
     List       *btree_clauses[BTMaxStrategyNumber + 1],
                *hash_clauses[HTMaxStrategyNumber + 1];
     int            i;
+    ListCell   *lc;

     memset(btree_clauses, 0, sizeof(btree_clauses));
     memset(hash_clauses, 0, sizeof(hash_clauses));
@@ -1563,13 +1618,12 @@ gen_prune_steps_from_opexps(PartitionScheme part_scheme,
  *   Output arguments: none set.
  */
 static PartClauseMatchStatus
-match_clause_to_partition_key(RelOptInfo *rel,
-                              GeneratePruningStepsContext *context,
+match_clause_to_partition_key(GeneratePruningStepsContext *context,
                               Expr *clause, Expr *partkey, int partkeyidx,
                               bool *clause_is_not_null, PartClauseInfo **pc,
                               List **clause_steps)
 {
-    PartitionScheme part_scheme = rel->part_scheme;
+    PartitionScheme part_scheme = context->rel->part_scheme;
     Oid            partopfamily = part_scheme->partopfamily[partkeyidx],
                 partcoll = part_scheme->partcollation[partkeyidx];
     Expr       *expr;
@@ -1588,7 +1642,7 @@ match_clause_to_partition_key(RelOptInfo *rel,
         partclause->op_is_ne = false;
         partclause->expr = expr;
         /* We know that expr is of Boolean type. */
-        partclause->cmpfn = rel->part_scheme->partsupfunc[partkeyidx].fn_oid;
+        partclause->cmpfn = part_scheme->partsupfunc[partkeyidx].fn_oid;
         partclause->op_strategy = InvalidStrategy;

         *pc = partclause;
@@ -1647,59 +1701,6 @@ match_clause_to_partition_key(RelOptInfo *rel,
             return PARTCLAUSE_NOMATCH;

         /*
-         * Matched with this key.  Now check various properties of the clause
-         * to see if it's sane to use it for pruning.  In most of these cases,
-         * we can return UNSUPPORTED because the same failure would occur no
-         * matter which partkey it's matched to.  (In particular, now that
-         * we've successfully matched one side of the opclause to a partkey,
-         * there is no chance that matching the other side to another partkey
-         * will produce a usable result, since that'd mean there are Vars on
-         * both sides.)
-         */
-        if (context->forplanner)
-        {
-            /*
-             * When pruning in the planner, we only support pruning using
-             * comparisons to constants.  Immutable subexpressions will have
-             * been folded to constants already, and we cannot prune on the
-             * basis of anything that's not immutable.
-             */
-            if (!IsA(expr, Const))
-                return PARTCLAUSE_UNSUPPORTED;
-
-            /*
-             * Also, the comparison operator itself must be immutable.
-             */
-            if (op_volatile(opno) != PROVOLATILE_IMMUTABLE)
-                return PARTCLAUSE_UNSUPPORTED;
-        }
-        else
-        {
-            /*
-             * Otherwise, non-consts are allowed, but we can't prune using an
-             * expression that contains Vars.
-             */
-            if (contain_var_clause((Node *) expr))
-                return PARTCLAUSE_UNSUPPORTED;
-
-            /*
-             * And we must reject anything containing a volatile function.
-             * Stable functions are OK though.  (We need not check this for
-             * the comparison operator itself: anything that belongs to a
-             * partitioning operator family must be at least stable.)
-             */
-            if (contain_volatile_functions((Node *) expr))
-                return PARTCLAUSE_UNSUPPORTED;
-        }
-
-        /*
-         * Only allow strict operators.  This will guarantee nulls are
-         * filtered.
-         */
-        if (!op_strict(opno))
-            return PARTCLAUSE_UNSUPPORTED;
-
-        /*
          * See if the operator is relevant to the partitioning opfamily.
          *
          * Normally we only care about operators that are listed as being part
@@ -1740,6 +1741,95 @@ match_clause_to_partition_key(RelOptInfo *rel,
         }

         /*
+         * Only allow strict operators.  This will guarantee nulls are
+         * filtered.  (This test is likely useless, since btree and hash
+         * comparison operators are generally strict.)
+         */
+        if (!op_strict(opno))
+            return PARTCLAUSE_UNSUPPORTED;
+
+        /*
+         * OK, we have a match to the partition key and a suitable operator.
+         * Examine the other argument to see if it's usable for pruning.
+         *
+         * In most of these cases, we can return UNSUPPORTED because the same
+         * failure would occur no matter which partkey it's matched to.  (In
+         * particular, now that we've successfully matched one side of the
+         * opclause to a partkey, there is no chance that matching the other
+         * side to another partkey will produce a usable result, since that'd
+         * mean there are Vars on both sides.)
+         *
+         * Also, if we reject an argument for a target-dependent reason, set
+         * appropriate fields of *context to report that.  We postpone these
+         * tests until after matching the partkey and the operator, so as to
+         * reduce the odds of setting the context fields for clauses that do
+         * not end up contributing to pruning steps.
+         *
+         * First, check for non-Const argument.  (We assume that any immutable
+         * subexpression will have been folded to a Const already.)
+         */
+        if (!IsA(expr, Const))
+        {
+            Bitmapset  *paramids;
+
+            /*
+             * When pruning in the planner, we only support pruning using
+             * comparisons to constants.  We cannot prune on the basis of
+             * anything that's not immutable.  (Note that has_mutable_arg and
+             * has_exec_param do not get set for this target value.)
+             */
+            if (context->target == PARTTARGET_PLANNER)
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
+             * We can never prune using an expression that contains Vars.
+             */
+            if (contain_var_clause((Node *) expr))
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
+             * And we must reject anything containing a volatile function.
+             * Stable functions are OK though.
+             */
+            if (contain_volatile_functions((Node *) expr))
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
+             * See if there are any exec Params.  If so, we can only use this
+             * expression during per-scan pruning.
+             */
+            paramids = pull_exec_paramids(expr);
+            if (!bms_is_empty(paramids))
+            {
+                context->has_exec_param = true;
+                if (context->target != PARTTARGET_EXEC)
+                    return PARTCLAUSE_UNSUPPORTED;
+            }
+            else
+            {
+                /* It's potentially usable, but mutable */
+                context->has_mutable_arg = true;
+            }
+        }
+
+        /*
+         * Check whether the comparison operator itself is immutable.  (We
+         * assume anything that's in a btree or hash opclass is at least
+         * stable, but we need to check for immutability.)
+         */
+        if (op_volatile(opno) != PROVOLATILE_IMMUTABLE)
+        {
+            context->has_mutable_op = true;
+
+            /*
+             * When pruning in the planner, we cannot prune with mutable
+             * operators.
+             */
+            if (context->target == PARTTARGET_PLANNER)
+                return PARTCLAUSE_UNSUPPORTED;
+        }
+
+        /*
          * Now find the procedure to use, based on the types.  If the clause's
          * other argument is of the same type as the partitioning opclass's
          * declared input type, we can use the procedure cached in
@@ -1822,7 +1912,6 @@ match_clause_to_partition_key(RelOptInfo *rel,
         List       *elem_exprs,
                    *elem_clauses;
         ListCell   *lc1;
-        bool        contradictory;

         if (IsA(leftop, RelabelType))
             leftop = ((RelabelType *) leftop)->arg;
@@ -1833,54 +1922,8 @@ match_clause_to_partition_key(RelOptInfo *rel,
             return PARTCLAUSE_NOMATCH;

         /*
-         * Matched with this key.  Check various properties of the clause to
-         * see if it can sanely be used for partition pruning (this is
-         * identical to the logic for a plain OpExpr).
-         */
-        if (context->forplanner)
-        {
-            /*
-             * When pruning in the planner, we only support pruning using
-             * comparisons to constants.  Immutable subexpressions will have
-             * been folded to constants already, and we cannot prune on the
-             * basis of anything that's not immutable.
-             */
-            if (!IsA(rightop, Const))
-                return PARTCLAUSE_UNSUPPORTED;
-
-            /*
-             * Also, the comparison operator itself must be immutable.
-             */
-            if (op_volatile(saop_op) != PROVOLATILE_IMMUTABLE)
-                return PARTCLAUSE_UNSUPPORTED;
-        }
-        else
-        {
-            /*
-             * Otherwise, non-consts are allowed, but we can't prune using an
-             * expression that contains Vars.
-             */
-            if (contain_var_clause((Node *) rightop))
-                return PARTCLAUSE_UNSUPPORTED;
-
-            /*
-             * And we must reject anything containing a volatile function.
-             * Stable functions are OK though.  (We need not check this for
-             * the comparison operator itself: anything that belongs to a
-             * partitioning operator family must be at least stable.)
-             */
-            if (contain_volatile_functions((Node *) rightop))
-                return PARTCLAUSE_UNSUPPORTED;
-        }
-
-        /*
-         * Only allow strict operators.  This will guarantee nulls are
-         * filtered.
-         */
-        if (!op_strict(saop_op))
-            return PARTCLAUSE_UNSUPPORTED;
-
-        /*
+         * See if the operator is relevant to the partitioning opfamily.
+         *
          * In case of NOT IN (..), we get a '<>', which we handle if list
          * partitioning is in use and we're able to confirm that it's negator
          * is a btree equality operator belonging to the partitioning operator
@@ -1911,12 +1954,89 @@ match_clause_to_partition_key(RelOptInfo *rel,
         }

         /*
-         * First generate a list of Const nodes, one for each array element
-         * (excepting nulls).
+         * Only allow strict operators.  This will guarantee nulls are
+         * filtered.  (This test is likely useless, since btree and hash
+         * comparison operators are generally strict.)
+         */
+        if (!op_strict(saop_op))
+            return PARTCLAUSE_UNSUPPORTED;
+
+        /*
+         * OK, we have a match to the partition key and a suitable operator.
+         * Examine the array argument to see if it's usable for pruning.  This
+         * is identical to the logic for a plain OpExpr.
+         */
+        if (!IsA(rightop, Const))
+        {
+            Bitmapset  *paramids;
+
+            /*
+             * When pruning in the planner, we only support pruning using
+             * comparisons to constants.  We cannot prune on the basis of
+             * anything that's not immutable.  (Note that has_mutable_arg and
+             * has_exec_param do not get set for this target value.)
+             */
+            if (context->target == PARTTARGET_PLANNER)
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
+             * We can never prune using an expression that contains Vars.
+             */
+            if (contain_var_clause((Node *) rightop))
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
+             * And we must reject anything containing a volatile function.
+             * Stable functions are OK though.
+             */
+            if (contain_volatile_functions((Node *) rightop))
+                return PARTCLAUSE_UNSUPPORTED;
+
+            /*
+             * See if there are any exec Params.  If so, we can only use this
+             * expression during per-scan pruning.
+             */
+            paramids = pull_exec_paramids(rightop);
+            if (!bms_is_empty(paramids))
+            {
+                context->has_exec_param = true;
+                if (context->target != PARTTARGET_EXEC)
+                    return PARTCLAUSE_UNSUPPORTED;
+            }
+            else
+            {
+                /* It's potentially usable, but mutable */
+                context->has_mutable_arg = true;
+            }
+        }
+
+        /*
+         * Check whether the comparison operator itself is immutable.  (We
+         * assume anything that's in a btree or hash opclass is at least
+         * stable, but we need to check for immutability.)
+         */
+        if (op_volatile(saop_op) != PROVOLATILE_IMMUTABLE)
+        {
+            context->has_mutable_op = true;
+
+            /*
+             * When pruning in the planner, we cannot prune with mutable
+             * operators.
+             */
+            if (context->target == PARTTARGET_PLANNER)
+                return PARTCLAUSE_UNSUPPORTED;
+        }
+
+        /*
+         * Examine the contents of the array argument.
          */
         elem_exprs = NIL;
         if (IsA(rightop, Const))
         {
+            /*
+             * For a constant array, convert the elements to a list of Const
+             * nodes, one for each array element (excepting nulls).
+             */
             Const       *arr = (Const *) rightop;
             ArrayType  *arrval = DatumGetArrayTypeP(arr->constvalue);
             int16        elemlen;
@@ -1968,6 +2088,9 @@ match_clause_to_partition_key(RelOptInfo *rel,
             if (arrexpr->multidims)
                 return PARTCLAUSE_UNSUPPORTED;

+            /*
+             * Otherwise, we can just use the list of element values.
+             */
             elem_exprs = arrexpr->elements;
         }
         else
@@ -2000,10 +2123,8 @@ match_clause_to_partition_key(RelOptInfo *rel,
             elem_clauses = list_make1(makeBoolExpr(OR_EXPR, elem_clauses, -1));

         /* Finally, generate steps */
-        *clause_steps =
-            gen_partprune_steps_internal(context, rel, elem_clauses,
-                                         &contradictory);
-        if (contradictory)
+        *clause_steps = gen_partprune_steps_internal(context, elem_clauses);
+        if (context->contradictory)
             return PARTCLAUSE_MATCH_CONTRADICT;
         else if (*clause_steps == NIL)
             return PARTCLAUSE_UNSUPPORTED;    /* step generation failed */
@@ -2985,89 +3106,38 @@ pull_exec_paramids_walker(Node *node, Bitmapset **context)
 }

 /*
- * analyze_partkey_exprs
- *        Loop through all pruning steps and identify which ones require
- *        executor startup-time or executor run-time pruning.
- *
- * Returns true if any executor partition pruning should be attempted at this
- * level.  Also fills fields of *pinfo to record how to process each step.
+ * get_partkey_exec_paramids
+ *        Loop through given pruning steps and find out which exec Params
+ *        are used.
  *
- * Note: when this is called, not much of *pinfo is valid; but that's OK
- * since we only use it as an output area.
+ * Returns a Bitmapset of Param IDs.
  */
-static bool
-analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps,
-                      int partnatts)
+static Bitmapset *
+get_partkey_exec_paramids(List *steps)
 {
-    bool        doruntimeprune = false;
+    Bitmapset  *execparamids = NULL;
     ListCell   *lc;

-    /*
-     * Steps require run-time pruning if they contain EXEC_PARAM Params.
-     * Otherwise, if their expressions aren't simple Consts or they involve
-     * non-immutable comparison operators, they require startup-time pruning.
-     * (Otherwise, the pruning would have been done at plan time.)
-     *
-     * Notice that what we actually check for mutability is the comparison
-     * functions, not the original operators.  This relies on the support
-     * functions of the btree or hash opfamily being marked consistently with
-     * the operators.
-     */
-    pinfo->nexprs = list_length(steps) * partnatts;
-    pinfo->hasexecparam = (bool *) palloc0(sizeof(bool) * pinfo->nexprs);
-    pinfo->do_initial_prune = false;
-    pinfo->do_exec_prune = false;
-    pinfo->execparamids = NULL;
-
     foreach(lc, steps)
     {
         PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc);
         ListCell   *lc2;
-        ListCell   *lc3;
-        int            keyno;

         if (!IsA(step, PartitionPruneStepOp))
             continue;

-        keyno = 0;
-        Assert(list_length(step->exprs) == list_length(step->cmpfns));
-        forboth(lc2, step->exprs, lc3, step->cmpfns)
+        foreach(lc2, step->exprs)
         {
             Expr       *expr = lfirst(lc2);
-            Oid            fnoid = lfirst_oid(lc3);

+            /* We can be quick for plain Consts */
             if (!IsA(expr, Const))
-            {
-                Bitmapset  *execparamids = pull_exec_paramids(expr);
-                bool        hasexecparams;
-                int            stateidx = PruneCxtStateIdx(partnatts,
-                                                        step->step.step_id,
-                                                        keyno);
-
-                Assert(stateidx < pinfo->nexprs);
-                hasexecparams = !bms_is_empty(execparamids);
-                pinfo->hasexecparam[stateidx] = hasexecparams;
-                pinfo->execparamids = bms_join(pinfo->execparamids,
-                                               execparamids);
-
-                if (hasexecparams)
-                    pinfo->do_exec_prune = true;
-                else
-                    pinfo->do_initial_prune = true;
-
-                doruntimeprune = true;
-            }
-            else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE)
-            {
-                /* No exec params here, but must do initial pruning */
-                pinfo->do_initial_prune = true;
-                doruntimeprune = true;
-            }
-            keyno++;
+                execparamids = bms_join(execparamids,
+                                        pull_exec_paramids(expr));
         }
     }

-    return doruntimeprune;
+    return execparamids;
 }

 /*
@@ -3125,56 +3195,54 @@ perform_pruning_base_step(PartitionPruneContext *context,
             Expr       *expr;
             Datum        datum;
             bool        isnull;
+            Oid            cmpfn;

             expr = lfirst(lc1);
             stateidx = PruneCxtStateIdx(context->partnatts,
                                         opstep->step.step_id, keyno);
-            if (partkey_datum_from_expr(context, expr, stateidx,
-                                        &datum, &isnull))
-            {
-                Oid            cmpfn;
+            partkey_datum_from_expr(context, expr, stateidx,
+                                    &datum, &isnull);

-                /*
-                 * Since we only allow strict operators in pruning steps, any
-                 * null-valued comparison value must cause the comparison to
-                 * fail, so that no partitions could match.
-                 */
-                if (isnull)
-                {
-                    PruneStepResult *result;
-
-                    result = (PruneStepResult *) palloc(sizeof(PruneStepResult));
-                    result->bound_offsets = NULL;
-                    result->scan_default = false;
-                    result->scan_null = false;
+            /*
+             * Since we only allow strict operators in pruning steps, any
+             * null-valued comparison value must cause the comparison to fail,
+             * so that no partitions could match.
+             */
+            if (isnull)
+            {
+                PruneStepResult *result;

-                    return result;
-                }
+                result = (PruneStepResult *) palloc(sizeof(PruneStepResult));
+                result->bound_offsets = NULL;
+                result->scan_default = false;
+                result->scan_null = false;

-                /* Set up the stepcmpfuncs entry, unless we already did */
-                cmpfn = lfirst_oid(lc2);
-                Assert(OidIsValid(cmpfn));
-                if (cmpfn != context->stepcmpfuncs[stateidx].fn_oid)
-                {
-                    /*
-                     * If the needed support function is the same one cached
-                     * in the relation's partition key, copy the cached
-                     * FmgrInfo.  Otherwise (i.e., when we have a cross-type
-                     * comparison), an actual lookup is required.
-                     */
-                    if (cmpfn == context->partsupfunc[keyno].fn_oid)
-                        fmgr_info_copy(&context->stepcmpfuncs[stateidx],
-                                       &context->partsupfunc[keyno],
-                                       context->ppccontext);
-                    else
-                        fmgr_info_cxt(cmpfn, &context->stepcmpfuncs[stateidx],
-                                      context->ppccontext);
-                }
+                return result;
+            }

-                values[keyno] = datum;
-                nvalues++;
+            /* Set up the stepcmpfuncs entry, unless we already did */
+            cmpfn = lfirst_oid(lc2);
+            Assert(OidIsValid(cmpfn));
+            if (cmpfn != context->stepcmpfuncs[stateidx].fn_oid)
+            {
+                /*
+                 * If the needed support function is the same one cached in
+                 * the relation's partition key, copy the cached FmgrInfo.
+                 * Otherwise (i.e., when we have a cross-type comparison), an
+                 * actual lookup is required.
+                 */
+                if (cmpfn == context->partsupfunc[keyno].fn_oid)
+                    fmgr_info_copy(&context->stepcmpfuncs[stateidx],
+                                   &context->partsupfunc[keyno],
+                                   context->ppccontext);
+                else
+                    fmgr_info_cxt(cmpfn, &context->stepcmpfuncs[stateidx],
+                                  context->ppccontext);
             }

+            values[keyno] = datum;
+            nvalues++;
+
             lc1 = lnext(lc1);
             lc2 = lnext(lc2);
         }
@@ -3392,16 +3460,17 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
  * partkey_datum_from_expr
  *        Evaluate expression for potential partition pruning
  *
- * Evaluate 'expr', whose ExprState is stateidx of the context exprstate
- * array; set *value and *isnull to the resulting Datum and nullflag.
- * Return true if evaluation was possible, otherwise false.
+ * Evaluate 'expr'; set *value and *isnull to the resulting Datum and nullflag.
+ *
+ * If expr isn't a Const, its ExprState is in stateidx of the context
+ * exprstate array.
  *
  * Note that the evaluated result may be in the per-tuple memory context of
  * context->planstate->ps_ExprContext, and we may have leaked other memory
  * there too.  This memory must be recovered by resetting that ExprContext
  * after we're done with the pruning operation (see execPartition.c).
  */
-static bool
+static void
 partkey_datum_from_expr(PartitionPruneContext *context,
                         Expr *expr, int stateidx,
                         Datum *value, bool *isnull)
@@ -3413,35 +3482,20 @@ partkey_datum_from_expr(PartitionPruneContext *context,

         *value = con->constvalue;
         *isnull = con->constisnull;
-        return true;
     }
     else
     {
+        ExprState  *exprstate;
+        ExprContext *ectx;
+
         /*
          * We should never see a non-Const in a step unless we're running in
          * the executor.
          */
         Assert(context->planstate != NULL);

-        /*
-         * When called from the executor we'll have a valid planstate so we
-         * may be able to evaluate an expression which could not be folded to
-         * a Const during planning.  Since run-time pruning can occur both
-         * during initialization of the executor or while it's running, we
-         * must be careful here to evaluate expressions containing PARAM_EXEC
-         * Params only when told it's OK.
-         */
-        if (context->evalexecparams || !context->exprhasexecparam[stateidx])
-        {
-            ExprState  *exprstate;
-            ExprContext *ectx;
-
-            exprstate = context->exprstates[stateidx];
-            ectx = context->planstate->ps_ExprContext;
-            *value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
-            return true;
-        }
+        exprstate = context->exprstates[stateidx];
+        ectx = context->planstate->ps_ExprContext;
+        *value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
     }
-
-    return false;
 }
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index b363aba..465a975 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -62,24 +62,24 @@ typedef struct PartitionRoutingInfo
  * subpart_map                    Subpart index by partition index, or -1.
  * present_parts                A Bitmapset of the partition indexes that we
  *                                have subplans or subparts for.
- * context                        Contains the context details required to call
- *                                the partition pruning code.
- * pruning_steps                List of PartitionPruneSteps used to
- *                                perform the actual pruning.
- * do_initial_prune                true if pruning should be performed during
- *                                executor startup (for this partitioning level).
- * do_exec_prune                true if pruning should be performed during
- *                                executor run (for this partitioning level).
+ * initial_pruning_steps        List of PartitionPruneSteps used to
+ *                                perform executor startup pruning.
+ * exec_pruning_steps            List of PartitionPruneSteps used to
+ *                                perform per-scan pruning.
+ * initial_context                If initial_pruning_steps isn't NIL, contains
+ *                                the details needed to execute those steps.
+ * exec_context                    If exec_pruning_steps isn't NIL, contains
+ *                                the details needed to execute those steps.
  */
 typedef struct PartitionedRelPruningData
 {
     int           *subplan_map;
     int           *subpart_map;
     Bitmapset  *present_parts;
-    PartitionPruneContext context;
-    List       *pruning_steps;
-    bool        do_initial_prune;
-    bool        do_exec_prune;
+    List       *initial_pruning_steps;
+    List       *exec_pruning_steps;
+    PartitionPruneContext initial_context;
+    PartitionPruneContext exec_context;
 } PartitionedRelPruningData;

 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 1cce762..1241245 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1109,21 +1109,23 @@ typedef struct PartitionedRelPruneInfo
 {
     NodeTag        type;
     Index        rtindex;        /* RT index of partition rel for this level */
-    List       *pruning_steps;    /* List of PartitionPruneStep, see below */
     Bitmapset  *present_parts;    /* Indexes of all partitions which subplans or
-                                 * subparts are present for. */
-    int            nparts;            /* Length of subplan_map[] and subpart_map[] */
-    int            nexprs;            /* Length of hasexecparam[] */
+                                 * subparts are present for */
+    int            nparts;            /* Length of the following arrays: */
     int           *subplan_map;    /* subplan index by partition index, or -1 */
     int           *subpart_map;    /* subpart index by partition index, or -1 */
     Oid           *relid_map;        /* relation OID by partition index, or 0 */
-    bool       *hasexecparam;    /* true if corresponding pruning_step contains
-                                 * any PARAM_EXEC Params. */
-    bool        do_initial_prune;    /* true if pruning should be performed
-                                     * during executor startup. */
-    bool        do_exec_prune;    /* true if pruning should be performed during
-                                 * executor run. */
-    Bitmapset  *execparamids;    /* All PARAM_EXEC Param IDs in pruning_steps */
+
+    /*
+     * initial_pruning_steps shows how to prune during executor startup (i.e.,
+     * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
+     * is required.  exec_pruning_steps shows how to prune with PARAM_EXEC
+     * Params; it is NIL if no per-scan pruning is required.
+     */
+    List       *initial_pruning_steps;    /* List of PartitionPruneStep */
+    List       *exec_pruning_steps; /* List of PartitionPruneStep */
+    Bitmapset  *execparamids;    /* All PARAM_EXEC Param IDs in
+                                 * exec_pruning_steps */
 } PartitionedRelPruneInfo;

 /*
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 2f75717..b906ae1 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -44,10 +44,6 @@ struct RelOptInfo;
  * exprstates        Array of ExprStates, indexed as per PruneCtxStateIdx; one
  *                    for each partition key in each pruning step.  Allocated if
  *                    planstate is non-NULL, otherwise NULL.
- * exprhasexecparam    Array of bools, each true if corresponding 'exprstate'
- *                    expression contains any PARAM_EXEC Params.  (Can be NULL
- *                    if planstate is NULL.)
- * evalexecparams    True if it's safe to evaluate PARAM_EXEC Params.
  */
 typedef struct PartitionPruneContext
 {
@@ -61,8 +57,6 @@ typedef struct PartitionPruneContext
     MemoryContext ppccontext;
     PlanState  *planstate;
     ExprState **exprstates;
-    bool       *exprhasexecparam;
-    bool        evalexecparams;
 } PartitionPruneContext;

 /*
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 723fc70..841bd8b 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3149,6 +3149,45 @@ select * from mc3p where a < 3 and abs(b) = 1;
          Filter: ((a < 3) AND (abs(b) = 1))
 (7 rows)

+--
+-- Check that pruning with composite range partitioning works correctly when
+-- a combination of runtime parameters is specified, not all of whose values
+-- are available at the same time
+--
+set plan_cache_mode = force_generic_plan;
+prepare ps1 as
+  select * from mc3p where a = $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps1(1);
+                   QUERY PLAN
+-------------------------------------------------
+ Append (actual rows=1 loops=1)
+   InitPlan 1 (returns $0)
+     ->  Result (actual rows=1 loops=1)
+   Subplans Removed: 2
+   ->  Seq Scan on mc3p1 (actual rows=1 loops=1)
+         Filter: ((a = $1) AND (abs(b) < $0))
+(6 rows)
+
+deallocate ps1;
+prepare ps2 as
+  select * from mc3p where a <= $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps2(1);
+                   QUERY PLAN
+-------------------------------------------------
+ Append (actual rows=2 loops=1)
+   InitPlan 1 (returns $0)
+     ->  Result (actual rows=1 loops=1)
+   Subplans Removed: 1
+   ->  Seq Scan on mc3p0 (actual rows=1 loops=1)
+         Filter: ((a <= $1) AND (abs(b) < $0))
+   ->  Seq Scan on mc3p1 (actual rows=1 loops=1)
+         Filter: ((a <= $1) AND (abs(b) < $0))
+(8 rows)
+
+deallocate ps2;
+reset plan_cache_mode;
 drop table mc3p;
 -- Ensure runtime pruning works with initplans params with boolean types
 create table boolvalues (value bool not null);
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 2373bd8..071e28d 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -809,6 +809,24 @@ insert into mc3p values (0, 1, 1), (1, 1, 1), (2, 1, 1);
 explain (analyze, costs off, summary off, timing off)
 select * from mc3p where a < 3 and abs(b) = 1;

+--
+-- Check that pruning with composite range partitioning works correctly when
+-- a combination of runtime parameters is specified, not all of whose values
+-- are available at the same time
+--
+set plan_cache_mode = force_generic_plan;
+prepare ps1 as
+  select * from mc3p where a = $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps1(1);
+deallocate ps1;
+prepare ps2 as
+  select * from mc3p where a <= $1 and abs(b) < (select 3);
+explain (analyze, costs off, summary off, timing off)
+execute ps2(1);
+deallocate ps2;
+reset plan_cache_mode;
+
 drop table mc3p;

 -- Ensure runtime pruning works with initplans params with boolean types
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a6c6de7..a06f28a 100644
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
*************** static void show_sort_keys(SortState *so
*** 82,87 ****
--- 82,89 ----
                 ExplainState *es);
  static void show_merge_append_keys(MergeAppendState *mstate, List *ancestors,
                         ExplainState *es);
+ static void show_pruning_info(PartitionPruneInfo *pinfo, List *ancestors,
+                        ExplainState *es);
  static void show_agg_keys(AggState *astate, List *ancestors,
                ExplainState *es);
  static void show_grouping_sets(PlanState *planstate, Agg *agg,
*************** ExplainNode(PlanState *planstate, List *
*** 1841,1849 ****
--- 1843,1857 ----
              show_sort_keys(castNode(SortState, planstate), ancestors, es);
              show_sort_info(castNode(SortState, planstate), es);
              break;
+         case T_Append:
+             show_pruning_info(((Append *) plan)->part_prune_info,
+                               ancestors, es);
+             break;
          case T_MergeAppend:
              show_merge_append_keys(castNode(MergeAppendState, planstate),
                                     ancestors, es);
+             show_pruning_info(((MergeAppend *) plan)->part_prune_info,
+                               ancestors, es);
              break;
          case T_Result:
              show_upper_qual((List *) ((Result *) plan)->resconstantqual,
*************** show_merge_append_keys(MergeAppendState
*** 2198,2203 ****
--- 2206,2258 ----
  }

  /*
+  * Show runtime pruning info, if any
+  */
+ static void
+ show_pruning_info(PartitionPruneInfo *ppinfo, List *ancestors,
+                   ExplainState *es)
+ {
+     ListCell   *lc;
+
+     if (ppinfo == NULL)
+         return;
+     Assert(IsA(ppinfo, PartitionPruneInfo));
+     foreach(lc, ppinfo->prune_infos)
+     {
+         List       *prune_infos = lfirst(lc);
+         ListCell   *l2;
+
+         foreach(l2, prune_infos)
+         {
+             PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+             if (pinfo->initial_pruning_steps ||
+                 pinfo->exec_pruning_steps)
+             {
+                 Index rti = pinfo->rtindex;
+                 RangeTblEntry *rte;
+                 char       *refname;
+
+                 rte = rt_fetch(rti, es->rtable);
+                 refname = (char *) list_nth(es->rtable_names, rti - 1);
+                 if (refname == NULL)
+                     refname = rte->eref->aliasname;
+
+                 if (es->format == EXPLAIN_FORMAT_TEXT)
+                 {
+                     appendStringInfoSpaces(es->str, es->indent * 2);
+                     appendStringInfo(es->str,
+                                      "Runtime Pruning: %s has %d initial steps, %d exec steps\n",
+                                      refname,
+                                      list_length(pinfo->initial_pruning_steps),
+                                      list_length(pinfo->exec_pruning_steps));
+                 }
+             }
+         }
+     }
+ }
+
+ /*
   * Show the grouping keys for an Agg node.
   */
  static void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a6c6de7..19708ab 100644
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
*************** static void show_sort_keys(SortState *so
*** 82,87 ****
--- 82,89 ----
                 ExplainState *es);
  static void show_merge_append_keys(MergeAppendState *mstate, List *ancestors,
                         ExplainState *es);
+ static void show_pruning_info(PartitionPruneInfo *pinfo, List *ancestors,
+                        ExplainState *es);
  static void show_agg_keys(AggState *astate, List *ancestors,
                ExplainState *es);
  static void show_grouping_sets(PlanState *planstate, Agg *agg,
*************** ExplainNode(PlanState *planstate, List *
*** 1841,1849 ****
--- 1843,1857 ----
              show_sort_keys(castNode(SortState, planstate), ancestors, es);
              show_sort_info(castNode(SortState, planstate), es);
              break;
+         case T_Append:
+             show_pruning_info(((Append *) plan)->part_prune_info,
+                               ancestors, es);
+             break;
          case T_MergeAppend:
              show_merge_append_keys(castNode(MergeAppendState, planstate),
                                     ancestors, es);
+             show_pruning_info(((MergeAppend *) plan)->part_prune_info,
+                               ancestors, es);
              break;
          case T_Result:
              show_upper_qual((List *) ((Result *) plan)->resconstantqual,
*************** show_merge_append_keys(MergeAppendState
*** 2198,2203 ****
--- 2206,2258 ----
  }

  /*
+  * Show runtime pruning info, if any
+  */
+ static void
+ show_pruning_info(PartitionPruneInfo *ppinfo, List *ancestors,
+                   ExplainState *es)
+ {
+     ListCell   *lc;
+
+     if (ppinfo == NULL)
+         return;
+     Assert(IsA(ppinfo, PartitionPruneInfo));
+     foreach(lc, ppinfo->prune_infos)
+     {
+         List       *prune_infos = lfirst(lc);
+         ListCell   *l2;
+
+         foreach(l2, prune_infos)
+         {
+             PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+             if (pinfo->do_initial_prune ||
+                 pinfo->do_exec_prune)
+             {
+                 Index rti = pinfo->rtindex;
+                 RangeTblEntry *rte;
+                 char       *refname;
+
+                 rte = rt_fetch(rti, es->rtable);
+                 refname = (char *) list_nth(es->rtable_names, rti - 1);
+                 if (refname == NULL)
+                     refname = rte->eref->aliasname;
+
+                 if (es->format == EXPLAIN_FORMAT_TEXT)
+                 {
+                     appendStringInfoSpaces(es->str, es->indent * 2);
+                     appendStringInfo(es->str,
+                                      "Runtime Pruning: %s has %d initial steps, %d exec steps\n",
+                                      refname,
+                                      pinfo->do_initial_prune ? list_length(pinfo->pruning_steps) : 0,
+                                      pinfo->do_exec_prune ? list_length(pinfo->pruning_steps) : 0);
+                 }
+             }
+         }
+     }
+ }
+
+ /*
   * Show the grouping keys for an Agg node.
   */
  static void

Re: inconsistent results querying table partitioned by date

От
David Rowley
Дата:
On Sat, 18 May 2019 at 09:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Since time is growing short before beta1 wrap, I'm going to go ahead
> and push this in hopes of getting buildfarm cycles on it.  But feel
> free to review and look for further improvements.  There's plenty
> of time to rethink the data-structure details for the v11 backpatch,
> also.

Thanks for working on this.  I did have a look and was a bit
disappointed to see the code will end up calling pull_exec_paramids()
3 times for any expression with exec param IDs. I can knock that down
to just 2 by caching the exec Param IDs of the used quals in
GeneratePruningStepsContext. I imagine you didn't do this since its
possible we reject the qual even after has_exec_param is set, but if
we just delay recording the param IDs until after we're certain we're
using the qual then it should be safe to do this.  This also gets rid
of the entire get_partkey_exec_paramids() function, so has the
additional saving of not looping through all the steps once more.  I
had thought reducing duplicate work like this was one of the driving
factors for this change, although I understand the main one for this
was fixing the bug Amit was working on.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: inconsistent results querying table partitioned by date

От
Tom Lane
Дата:
David Rowley <david.rowley@2ndquadrant.com> writes:
> Thanks for working on this.  I did have a look and was a bit
> disappointed to see the code will end up calling pull_exec_paramids()
> 3 times for any expression with exec param IDs. I can knock that down
> to just 2 by caching the exec Param IDs of the used quals in
> GeneratePruningStepsContext. I imagine you didn't do this since its
> possible we reject the qual even after has_exec_param is set,

Exactly.

> but if
> we just delay recording the param IDs until after we're certain we're
> using the qual then it should be safe to do this.

I thought about that too, but it seems awfully messy/fragile; it's not
clear to me at what point this code is really guaranteed to have accepted
a clause.  The recursion for AND/OR, in particular, seems to make that
kind of questionable.

If we could be sure about that I'd be inclined to apply the principle
to the has_mutable_arg and has_exec_param flags as well as exec_paramids,
and change the order of tests in match_clause_to_partition_key back to
what it was, relying on the outer control layer to not propagate that
data into the overall state until we knew whether the clause was really
going to contribute to the steps.

            regards, tom lane



Re: inconsistent results querying table partitioned by date

От
Amit Langote
Дата:
On 2019/05/18 23:58, Tom Lane wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> Thanks for working on this.

+1.  Thanks a lot Tom for the cleanup.

>> I did have a look and was a bit
>> disappointed to see the code will end up calling pull_exec_paramids()
>> 3 times for any expression with exec param IDs. I can knock that down
>> to just 2 by caching the exec Param IDs of the used quals in
>> GeneratePruningStepsContext. I imagine you didn't do this since its
>> possible we reject the qual even after has_exec_param is set,
> 
> Exactly.

Yeah, the point at which we can be sure that any given expression (Params,
etc.) will be in a pruning step is when we actually build the step
containing that expression.  So, going over the resulting pruning steps to
collect execparamids is as good as collecting them in the context struct
(a Bitmapset field of it) as the steps are built.   For example,
gen_prune_step_op() itself may add pull_exec_paramids(exprs) to
context->execparamids when building the individual steps, instead of
get_partkey_exec_paramids() going over steps and collecting them from the
individual expressions of PartitionPruneStepOp steps.  However, doing that
only changes the place where that work is done, not reduce the amount of
work that needs to be done.

>> but if
>> we just delay recording the param IDs until after we're certain we're
>> using the qual then it should be safe to do this.
> 
> I thought about that too, but it seems awfully messy/fragile; it's not
> clear to me at what point this code is really guaranteed to have accepted
> a clause.  The recursion for AND/OR, in particular, seems to make that
> kind of questionable.
> 
> If we could be sure about that I'd be inclined to apply the principle
> to the has_mutable_arg and has_exec_param flags as well as exec_paramids,
> and change the order of tests in match_clause_to_partition_key back to
> what it was, relying on the outer control layer to not propagate that
> data into the overall state until we knew whether the clause was really
> going to contribute to the steps.

Per what I mentioned above, this idea means we'll want to update all of
has_mutable_arg, has_exec_param and exec_paramids in gen_prune_step_op(),
by which point it's clear that the clause(s) having those properties
really do contribute to steps.

However, make_partitionedrel_pruneinfo() throws these steps away if it can
determine from those properties that run-time pruning will be useless.
So, I was thinking maybe there is no point in building the steps in the
first place.  IOW, setting those properties in
match_clause_to_partition_key() as is done now and skipping the part of
gen_partprune_steps_internal() where clauses are turned into steps, if the
combination of context->target and the properties suggests that those step
will be useless anyway.  However, inventing a new meaning for
gen_partprune_steps_internal() returning NIL isn't without problems,
because it's interpreted in a certain way when it's invoked recursively
for an individual arm of an OR clause.  I couldn't find an easy way to get
around that, so I had to scrap the idea.

Thanks,
Amit