Обсуждение: inconsistent results querying table partitioned by date
Hi Im having a problem with querying a table partitioned by date. Depending on the sequence of operations on a date parameter used in a where clause, I either get the expected results, orno results. This suggests a bug in the handling of date parameters and partition range handling. I’ve replicated this down to a single sequence of create table, insert data, query. This issue occurs for me on postgresql 11.2 on a mac, installed via brew. In this case the table is partitioned by an id and then by the date, if it is partitioned by only the date everything worksas expected. However, I am attempting to add partitioning to a fairly large sofware-as-a-service platform, so making changes to the tabledefinitions or global code changes is not really practical. The sql in question is below. I hope there is something simple I can change in the partition definitions to work around this. Many Thanks, Alan Jackson Data Architect TVSquared --SQL STARTS HERE --drop table dataid; CREATE TABLE dataid ( id integer not null, datadatetime timestamp without time zone NOT NULL, CONSTRAINT dataid_pkey PRIMARY KEY (id, datadatetime) ) PARTITION BY RANGE (id, datadatetime) ; CREATE TABLE dataid_201902 PARTITION OF dataid FOR VALUES FROM (1, '2019-02-01 00:00:00') TO (1, '2019-03-01 00:00:00'); CREATE TABLE dataid_default PARTITION OF dataid DEFAULT; insert into dataid values (1,'2019-02-24T00:00:00'); --- returns 1 row as expected select * from dataid where id=1 and datadatetime < (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York') at time zone 'UTC' + '2 days'::interval); --- returns no rows select * from dataid where id=1 and datadatetime < (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York'+ '2 days'::interval) at time zone 'UTC'); -- both date expressions evaluate to the same date. select (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' ) at time zone 'UTC' + '2 days'::interval)as workingdate, (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC') asnotworkingdate; --SQL ENDS HERE -- TV Squared Limited is a company registered in Scotland. Registered number: SC421072. Registered office: CodeBase, Argyle House, 3 Lady Lawson Street, Edinburgh, EH3 9DR. TV Squared Inc (File No. 5600204) is an Incorporated company registered in Delaware. Principal office: 1412 Broadway, 22 Fl, New York, New York, 10018 TV Squared GmbH is a company registered in Munich. Registered number: HRB 236077. Registered office: Oskar-von-Miller-Ring 20, c/o wework, 80333 Munchen This message is private and confidential. If you have received this message in error, please notify us and remove it from your system.
Alan Jackson <ajax@tvsquared.com> writes: > Im having a problem with querying a table partitioned by date. > Depending on the sequence of operations on a date parameter used in a where clause, I either get the expected results,or no results. Yeah, this is pretty clearly broken. It looks to me like the partition pruning code is making insupportable assumptions about a comparison to a stable expression. Using your example table: regression=# explain select * from dataid where id=1 and datadatetime < localtimestamp; QUERY PLAN ---------------------------------------------------------------------------------- Bitmap Heap Scan on dataid_default (cost=4.19..11.31 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.19 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP)) (4 rows) It should absolutely not have pruned away the dataid_201902 partition, but it did. It's okay with an immutable expression: regression=# explain select * from dataid where id=1 and datadatetime < '2019-05-09'::timestamp; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Append (cost=4.18..22.63 rows=6 width=12) -> Bitmap Heap Scan on dataid_201902 (cost=4.18..11.30 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on dataid_201902_pkey (cost=0.00..4.18 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone)) -> Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone)) (9 rows) or a volatile one: regression=# explain select * from dataid where id=1 and datadatetime < '2019-05-09'::timestamp + random()*'1 day'::interval; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Append (cost=4.23..29.80 rows=6 width=12) -> Bitmap Heap Scan on dataid_201902 (cost=4.23..14.88 rows=3 width=12) Recheck Cond: (id = 1) Filter: (datadatetime < ('2019-05-09 00:00:00'::timestamp without time zone + (random() * '1 day'::interval))) -> Bitmap Index Scan on dataid_201902_pkey (cost=0.00..4.23 rows=10 width=0) Index Cond: (id = 1) -> Bitmap Heap Scan on dataid_default (cost=4.23..14.88 rows=3 width=12) Recheck Cond: (id = 1) Filter: (datadatetime < ('2019-05-09 00:00:00'::timestamp without time zone + (random() * '1 day'::interval))) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.23 rows=10 width=0) Index Cond: (id = 1) (11 rows) but somebody's confused about what can be done with stable expressions. While I'm on about it, this behavior is also insupportable: regression=# explain select * from dataid where id=1 and datadatetime < '2018-05-09'::timestamptz ; QUERY PLAN -------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone)) (4 rows) because timestamp-against-timestamptz comparison is inherently only stable; the pruning code is way exceeding its authority by supposing that a comparison that holds at plan time will hold at runtime, even with a constant comparison value. The reason for the difference in your results is that one expression is immutable and the other is only stable: regression=# explain verbose select (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' ) at time zone 'UTC' + '2 days'::interval)as workingdate, (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC') asnotworkingdate; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=0.00..0.01 rows=1 width=16) Output: '2019-02-28 05:00:00'::timestamp without time zone, timezone('UTC'::text, ('2019-02-26 00:00:00-05'::timestampwith time zone + '2 days'::interval)) (2 rows) the reason being that timestamptz + interval depends on the timezone setting (for some intervals) but timestamp + interval never does. Seems to be equally broken in v11 and HEAD. I didn't try v10. > I hope there is something simple I can change in the partition definitions to work around this. Until we fix the bug, I think the best you can do is not use stable expressions in this context. regards, tom lane
On 2019/05/10 8:22, Tom Lane wrote: > Alan Jackson <ajax@tvsquared.com> writes: >> Im having a problem with querying a table partitioned by date. >> Depending on the sequence of operations on a date parameter used in a where clause, I either get the expected results,or no results. > > Yeah, this is pretty clearly broken. Indeed it is. > It looks to me like the partition > pruning code is making insupportable assumptions about a comparison to > a stable expression. Using your example table: > > regression=# explain select * from dataid where id=1 and datadatetime < localtimestamp; > QUERY PLAN > ---------------------------------------------------------------------------------- > Bitmap Heap Scan on dataid_default (cost=4.19..11.31 rows=3 width=12) > Recheck Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP)) > -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.19 rows=3 width=0) > Index Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP)) > (4 rows) > > It should absolutely not have pruned away the dataid_201902 partition, > but it did. > > While I'm on about it, this behavior is also insupportable: > > regression=# explain select * from dataid where id=1 and datadatetime < '2018-05-09'::timestamptz ; (likely a typo in the condition: s/2018/2019) > QUERY PLAN > -------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) > Recheck Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone)) > -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) > Index Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone)) > (4 rows) > > because timestamp-against-timestamptz comparison is inherently only > stable; the pruning code is way exceeding its authority by supposing > that a comparison that holds at plan time will hold at runtime, > even with a constant comparison value. I looked into it and the problem is not really that plan-time pruning is comparing stable expressions against partition bounds. If it had, it wouldn't have pruned dataid_201902 anyway, because its bounding range for datadatetime is '2019-02-01' to '2019-03-01', which is clearly less than '2019-05-09' (Tom's localtimestamp). The real problem seems to be with the way range partition pruning assumes an operator strategy to perform pruning with. Quoted query's WHERE clause looks something like this: (firstkey = CONSTANT AND secondkey < STABLE_EXPRESSION). From this set of clauses, a (CONSTANT, STABLE_EXPRESSION) tuple is formed to be compared against partition bounds using row-comparison-like semantics. As things stand today, the actual comparison function (partprune.c: get_matching_range_bounds()) receives only the strategy of the last expression, which in this case is that of a LESS operator. When the tuple (CONSTANT, STABLE_EXPRESSION) passes through the last step to extract Datum values to be passed to get_matching_range_bounds, it's correctly determined that STABLE_EXPRESSION cannot be computed during planning and so the tuple is truncated to just (CONSTANT-Datum), but the strategy to assume during pruning is still that of the LESS operator, whereas now it should really be EQUAL. With LESS semantics, get_matching_range_bounds() concludes that no partition bounds are smaller than 1 (extracted from id=1 in the above query), except the default partition, so it prunes 'dataid_201902'. I've attached a patch to fix that. Actually, I've attached two patches -- the 1st one adds a test for the misbehaving case with *wrong* output wherein a partition is incorrectly pruned, and the 2nd actually fixes the bug and updates the output of the test added by the 1st patch. Divided the patch this way just to show the bug clearly. > Seems to be equally broken in v11 and HEAD. I didn't try v10. v10 is fine, as it uses constraint exclusion. Attached patches apply to both v11 and HEAD. Thanks, Amit
Вложения
At Fri, 10 May 2019 14:37:34 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in <49cd5235-cbe6-e686-8014-85c1e45efe56@lab.ntt.co.jp> > > because timestamp-against-timestamptz comparison is inherently only > > stable; the pruning code is way exceeding its authority by supposing > > that a comparison that holds at plan time will hold at runtime, > > even with a constant comparison value. > > I looked into it and the problem is not really that plan-time pruning is > comparing stable expressions against partition bounds. If it had, it > wouldn't have pruned dataid_201902 anyway, because its bounding range for > datadatetime is '2019-02-01' to '2019-03-01', which is clearly less than > '2019-05-09' (Tom's localtimestamp). > > The real problem seems to be with the way range partition pruning assumes > an operator strategy to perform pruning with. Quoted query's WHERE clause > looks something like this: (firstkey = CONSTANT AND secondkey < > STABLE_EXPRESSION). From this set of clauses, a (CONSTANT, > STABLE_EXPRESSION) tuple is formed to be compared against partition bounds > using row-comparison-like semantics. As things stand today, the actual > comparison function (partprune.c: get_matching_range_bounds()) receives > only the strategy of the last expression, which in this case is that of a > LESS operator. When the tuple (CONSTANT, STABLE_EXPRESSION) passes > through the last step to extract Datum values to be passed to > get_matching_range_bounds, it's correctly determined that > STABLE_EXPRESSION cannot be computed during planning and so the tuple is > truncated to just (CONSTANT-Datum), but the strategy to assume during > pruning is still that of the LESS operator, whereas now it should really > be EQUAL. With LESS semantics, get_matching_range_bounds() concludes that > no partition bounds are smaller than 1 (extracted from id=1 in the above > query), except the default partition, so it prunes 'dataid_201902'. I concluded the same. > I've attached a patch to fix that. Actually, I've attached two patches -- > the 1st one adds a test for the misbehaving case with *wrong* output > wherein a partition is incorrectly pruned, and the 2nd actually fixes the > bug and updates the output of the test added by the 1st patch. Divided > the patch this way just to show the bug clearly. But this seems a bit wrong. If the two partition keys were in reverse order, pruning still fails. CREATE TABLE dataid2 ( datadatetime timestamp without time zone NOT NULL, id integer not null, CONSTRAINT dataid2_pkey PRIMARY KEY (datadatetime, id) ) PARTITION BY RANGE (datadatetime, id); CREATE TABLE dataid2_201902 PARTITION OF dataid2 FOR VALUES FROM ('2019-02-01 00:00:00', 1) TO ('2019-03-01 00:00:00', 1); CREATE TABLE dataid2_default PARTITION OF dataid2 DEFAULT; insert into dataid2 values ('2019-02-24T00:00:00', 1); select * from dataid2 where id = 1 and datadatetime < (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York'+ '2 days'::interval) at time zone 'UTC'); datadatetime | id --------------+---- (0 rows) This is wrong. The condition is divided into two part (id = 1) and (datadatetime < ..) and the latter reduces to nothing and the former remains unchanged. Pruning continues using id = 1 and (I suppose) but that is not partition_range_datum_bsearch()'s assumption. As the result all partitions (other than default) are gone. In passing I found a typo while looking this issue. | case BTLessStrategyNumber: | | /* | * Look for the greatest bound that is < or <= lookup value and | * set minoff to its offset. I think the "minoff" is typo of "maxoff". > > Seems to be equally broken in v11 and HEAD. I didn't try v10. > > v10 is fine, as it uses constraint exclusion. > > Attached patches apply to both v11 and HEAD. Mmm. This doesn't apply on head on my environment. > patching file src/test/regress/expected/partition_prune.out > Hunk #1 FAILED at 951. git rev-parse --short HEAD d0bbf871ca regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Horiguchi-san, Thanks for checking. On 2019/05/10 15:33, Kyotaro HORIGUCHI wrote: > At Fri, 10 May 2019 14:37:34 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> I've attached a patch to fix that. Actually, I've attached two patches -- >> the 1st one adds a test for the misbehaving case with *wrong* output >> wherein a partition is incorrectly pruned, and the 2nd actually fixes the >> bug and updates the output of the test added by the 1st patch. Divided >> the patch this way just to show the bug clearly. > > But this seems a bit wrong. > > If the two partition keys were in reverse order, pruning still > fails. > > CREATE TABLE dataid2 ( > datadatetime timestamp without time zone NOT NULL, > id integer not null, > CONSTRAINT dataid2_pkey PRIMARY KEY (datadatetime, id) > ) PARTITION BY RANGE (datadatetime, id); > > CREATE TABLE dataid2_201902 PARTITION OF dataid2 FOR VALUES FROM ('2019-02-01 00:00:00', 1) TO ('2019-03-01 00:00:00',1); > > CREATE TABLE dataid2_default PARTITION OF dataid2 DEFAULT; > > insert into dataid2 values ('2019-02-24T00:00:00', 1); > > select * from dataid2 where id = 1 and datadatetime < (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York'+ '2 days'::interval) at time zone 'UTC'); > datadatetime | id > --------------+---- > (0 rows) > > This is wrong. Ah, indeed. > The condition is divided into two part (id = 1) and (datadatetime > < ..) and the latter reduces to nothing and the former remains > unchanged. Pruning continues using id = 1 and (I suppose) but > that is not partition_range_datum_bsearch()'s assumption. As the > result all partitions (other than default) are gone. I'd have expected this to result in no values being passed to get_matching_range_bounds (nvalues = 0), because the value of the expression compared against the 1st key is unavailable at planning time, meaning the values for subsequent keys should be ignored. That would have meant there's nothing to prune with and hence no pruning should have occurred. The problem however does not seem to be with the new logic I proposed but what turned out to be another bug in gen_prune_steps_from_opexps(). Given that datadatetime is the first key in your example table and it's being compared using a non-inclusive operator, clauses for subsequent key columns (in this case id = 1) should have been ignored by pruning, which fails to happen due to a bug in gen_prune_steps_from_opexps(). I've fixed that in the attached updated patch. With the patch, queries like yours return the correct result as shown below: truncate dataid2; insert into dataid2 (datadatetime, id) select (('2019-02-22T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC'), 1; insert into dataid2 (datadatetime, id) select (('2019-02-22T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '1 days'::interval) at time zone 'UTC'), 1; -- pruning steps generated for only 1st column (key1 < expr1) -- but no pruning occurs during planning because the value for -- datadatetime's clause is unavailable explain select * from dataid2 where id = 1 and datadatetime < (('2019-02-22T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC'); QUERY PLAN ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Append (cost=10.96..36.21 rows=6 width=12) -> Bitmap Heap Scan on dataid2_201902 (cost=10.96..18.09 rows=3 width=12) Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1)) -> Bitmap Index Scan on dataid2_201902_pkey (cost=0.00..10.96 rows=3 width=0) Index Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1)) -> Bitmap Heap Scan on dataid2_default (cost=10.96..18.09 rows=3 width=12) Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1)) -> Bitmap Index Scan on dataid2_default_pkey (cost=0.00..10.96 rows=3 width=0) Index Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1)) (9 rows) select * from dataid2 where id = 1 and datadatetime < (('2019-02-22T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC'); datadatetime │ id ─────────────────────┼──── 2019-02-23 05:00:00 │ 1 (1 row) -- pruning steps generated for both columns (key1 = expr1, key2 = expr2), -- but no pruning occurs during planning because the value for -- datadatetime's clause is unavailable explain select * from dataid2 where id = 1 and datadatetime = (('2019-02-22T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC'); QUERY PLAN ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Append (cost=0.16..16.37 rows=2 width=12) Subplans Removed: 1 -> Index Only Scan using dataid2_201902_pkey on dataid2_201902 (cost=0.16..8.18 rows=1 width=12) Index Cond: ((datadatetime = timezone('UTC'::text, ('2019-02-22 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1)) (4 rows) select * from dataid2 where id = 1 and datadatetime = (('2019-02-22T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC'); datadatetime │ id ─────────────────────┼──── 2019-02-24 05:00:00 │ 1 (1 row) > In passing I found a typo while looking this issue. > > | case BTLessStrategyNumber: > | > | /* > | * Look for the greatest bound that is < or <= lookup value and > | * set minoff to its offset. > > I think the "minoff" is typo of "maxoff". Ah, fixed in the updated patch. >>> Seems to be equally broken in v11 and HEAD. I didn't try v10. >> >> v10 is fine, as it uses constraint exclusion. >> >> Attached patches apply to both v11 and HEAD. > > Mmm. This doesn't apply on head on my environment. > >> patching file src/test/regress/expected/partition_prune.out >> Hunk #1 FAILED at 951. > > git rev-parse --short HEAD > d0bbf871ca Hmm, I rebased the patch's branch over master and didn't get any conflicts. Please check with the new patch. Thanks, Amit
Вложения
Hi Many thanks for the quick and patch-filled response to this issue. It’s great to see such an able and active response! I’m glad my bug report highlighted the problem sufficiently clearly. Is this likely to turn into an official patch, and if so, where can I track which release etc it will be included in? Thanks again, Alan Jackson Data Architect TVSquared > On 10 May 2019, at 10:18, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > > Horiguchi-san, > > Thanks for checking. > >> On 2019/05/10 15:33, Kyotaro HORIGUCHI wrote: >> At Fri, 10 May 2019 14:37:34 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>> I've attached a patch to fix that. Actually, I've attached two patches -- >>> the 1st one adds a test for the misbehaving case with *wrong* output >>> wherein a partition is incorrectly pruned, and the 2nd actually fixes the >>> bug and updates the output of the test added by the 1st patch. Divided >>> the patch this way just to show the bug clearly. >> >> But this seems a bit wrong. >> >> If the two partition keys were in reverse order, pruning still >> fails. >> >> CREATE TABLE dataid2 ( >> datadatetime timestamp without time zone NOT NULL, >> id integer not null, >> CONSTRAINT dataid2_pkey PRIMARY KEY (datadatetime, id) >> ) PARTITION BY RANGE (datadatetime, id); >> >> CREATE TABLE dataid2_201902 PARTITION OF dataid2 FOR VALUES FROM ('2019-02-01 00:00:00', 1) TO ('2019-03-01 00:00:00',1); >> >> CREATE TABLE dataid2_default PARTITION OF dataid2 DEFAULT; >> >> insert into dataid2 values ('2019-02-24T00:00:00', 1); >> >> select * from dataid2 where id = 1 and datadatetime < (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York'+ '2 days'::interval) at time zone 'UTC'); >> datadatetime | id >> --------------+---- >> (0 rows) >> >> This is wrong. > > Ah, indeed. > >> The condition is divided into two part (id = 1) and (datadatetime >> < ..) and the latter reduces to nothing and the former remains >> unchanged. Pruning continues using id = 1 and (I suppose) but >> that is not partition_range_datum_bsearch()'s assumption. As the >> result all partitions (other than default) are gone. > > I'd have expected this to result in no values being passed to > get_matching_range_bounds (nvalues = 0), because the value of the > expression compared against the 1st key is unavailable at planning time, > meaning the values for subsequent keys should be ignored. That would have > meant there's nothing to prune with and hence no pruning should have > occurred. The problem however does not seem to be with the new logic I > proposed but what turned out to be another bug in > gen_prune_steps_from_opexps(). > > Given that datadatetime is the first key in your example table and it's > being compared using a non-inclusive operator, clauses for subsequent key > columns (in this case id = 1) should have been ignored by pruning, which > fails to happen due to a bug in gen_prune_steps_from_opexps(). I've fixed > that in the attached updated patch. With the patch, queries like yours > return the correct result as shown below: > > truncate dataid2; > > insert into dataid2 (datadatetime, id) select > (('2019-02-22T00:00:00'::timestamp::timestamp at time zone > 'America/New_York' + '2 days'::interval) at time zone 'UTC'), 1; > > insert into dataid2 (datadatetime, id) select > (('2019-02-22T00:00:00'::timestamp::timestamp at time zone > 'America/New_York' + '1 days'::interval) at time zone 'UTC'), 1; > > -- pruning steps generated for only 1st column (key1 < expr1) > -- but no pruning occurs during planning because the value for > -- datadatetime's clause is unavailable > > explain select * from dataid2 where id = 1 and datadatetime < > (('2019-02-22T00:00:00'::timestamp::timestamp at time zone > 'America/New_York' + '2 days'::interval) at time zone 'UTC'); > > QUERY PLAN > > ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── > Append (cost=10.96..36.21 rows=6 width=12) > -> Bitmap Heap Scan on dataid2_201902 (cost=10.96..18.09 rows=3 width=12) > Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22 > 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1)) > -> Bitmap Index Scan on dataid2_201902_pkey (cost=0.00..10.96 > rows=3 width=0) > Index Cond: ((datadatetime < timezone('UTC'::text, > ('2019-02-22 14:00:00+09'::timestamp with time zone + '2 > days'::interval))) AND (id = 1)) > -> Bitmap Heap Scan on dataid2_default (cost=10.96..18.09 rows=3 > width=12) > Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22 > 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1)) > -> Bitmap Index Scan on dataid2_default_pkey (cost=0.00..10.96 > rows=3 width=0) > Index Cond: ((datadatetime < timezone('UTC'::text, > ('2019-02-22 14:00:00+09'::timestamp with time zone + '2 > days'::interval))) AND (id = 1)) > (9 rows) > > select * from dataid2 where id = 1 and datadatetime < > (('2019-02-22T00:00:00'::timestamp::timestamp at time zone > 'America/New_York' + '2 days'::interval) at time zone 'UTC'); > datadatetime │ id > ─────────────────────┼──── > 2019-02-23 05:00:00 │ 1 > (1 row) > > -- pruning steps generated for both columns (key1 = expr1, key2 = expr2), > -- but no pruning occurs during planning because the value for > -- datadatetime's clause is unavailable > > explain select * from dataid2 where id = 1 and datadatetime = > (('2019-02-22T00:00:00'::timestamp::timestamp at time zone > 'America/New_York' + '2 days'::interval) at time zone 'UTC'); > > QUERY PLAN > > ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── > Append (cost=0.16..16.37 rows=2 width=12) > Subplans Removed: 1 > -> Index Only Scan using dataid2_201902_pkey on dataid2_201902 > (cost=0.16..8.18 rows=1 width=12) > Index Cond: ((datadatetime = timezone('UTC'::text, ('2019-02-22 > 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1)) > (4 rows) > > > select * from dataid2 where id = 1 and datadatetime = > (('2019-02-22T00:00:00'::timestamp::timestamp at time zone > 'America/New_York' + '2 days'::interval) at time zone 'UTC'); > datadatetime │ id > ─────────────────────┼──── > 2019-02-24 05:00:00 │ 1 > (1 row) > >> In passing I found a typo while looking this issue. >> >> | case BTLessStrategyNumber: >> | >> | /* >> | * Look for the greatest bound that is < or <= lookup value and >> | * set minoff to its offset. >> >> I think the "minoff" is typo of "maxoff". > > Ah, fixed in the updated patch. > >>>> Seems to be equally broken in v11 and HEAD. I didn't try v10. >>> >>> v10 is fine, as it uses constraint exclusion. >>> >>> Attached patches apply to both v11 and HEAD. >> >> Mmm. This doesn't apply on head on my environment. >> >>> patching file src/test/regress/expected/partition_prune.out >>> Hunk #1 FAILED at 951. >> >> git rev-parse --short HEAD >> d0bbf871ca > > Hmm, I rebased the patch's branch over master and didn't get any > conflicts. Please check with the new patch. > > Thanks, > Amit > <v2-0001-Add-test.patch> > <v2-0002-Bug-fix.patch> -- TV Squared Limited is a company registered in Scotland. Registered number: SC421072. Registered office: CodeBase, Argyle House, 3 Lady Lawson Street, Edinburgh, EH3 9DR. TV Squared Inc (File No. 5600204) is an Incorporated company registered in Delaware. Principal office: 1412 Broadway, 22 Fl, New York, New York, 10018 TV Squared GmbH is a company registered in Munich. Registered number: HRB 236077. Registered office: Oskar-von-Miller-Ring 20, c/o wework, 80333 Munchen This message is private and confidential. If you have received this message in error, please notify us and remove it from your system.
Alan Jackson <ajax@tvsquared.com> writes: > Is this likely to turn into an official patch, and if so, where can I track which release etc it will be included in? We're definitely going to do *something*, though whether the current patches are the last word is too early to say. Unfortunately, we just missed the regular quarterly minor releases [1]. Unless a bug turns up that's sufficiently urgent to justify an off-schedule release (no, this one isn't), the next patch release for v11 will be 11.4 in August. I'd say the odds are near 100% that some fix will be committed before then. Actually, the odds are high that we'll have committed a fix within a few days; we don't normally let this sort of discussion drag on. If this problem is enough of a show-stopper for you, you could consider applying the patch locally once it's committed. regards, tom lane [1] https://www.postgresql.org/developer/roadmap/
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > [ v2 patches ] While this does seem to be fixing real bugs, it's failing to address my point about stable vs. immutable operators. Taking Alan's test case again (and using the v2 patch), consider: regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamp; QUERY PLAN -------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00'::timestamp without time zone)) (4 rows) That's fine. The given date is older than anything in the dataid_201902 partition, so we can prune. But change it to this: regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz; QUERY PLAN -------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone)) (4 rows) That's not fine. What we have here is a "timestamp < timestamptz" operator, which is only stable, therefore it might give different results at runtime than at plan time. You can't make plan-time pruning decisions with that. What we should have gotten here was an Append node that could do run-time pruning. regards, tom lane
On 2019/05/11 6:05, Tom Lane wrote: > regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz; > QUERY PLAN > -------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) > Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone)) > -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) > Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone)) > (4 rows) > > That's not fine. What we have here is a "timestamp < timestamptz" > operator, which is only stable, therefore it might give different > results at runtime than at plan time. You can't make plan-time > pruning decisions with that. What we should have gotten here was > an Append node that could do run-time pruning. You're right. It seems that prune_append_rel_partitions() is forgetting to filter mutable clauses from rel->baserestrictinfo, like relation_excluded_by_constraints() does. I fixed that in the attached 0003 patch, which also adds a test for this scenario. I needed to also tweak run-time pruning support code a bit so that it considers the cases involving mutable functions as requiring (startup) run-time pruning, in addition to the cases with mutable expressions. Adding David if he wants to comment. 0003 patch cannot be applied as-is to both REL_11_STABLE and HEAD branches, so I've attached two files, one for each branch. BTW, while looking at this, I came across this comment in parse_coerce.c: coerce_type(): * XXX if the typinput function is not immutable, we really ought to * postpone evaluation of the function call until runtime. But there * is no way to represent a typinput function call as an expression * tree, because C-string values are not Datums. (XXX This *is* * possible as of 7.3, do we want to do it?) Should something be done about that? Thanks, Amit
Вложения
On Mon, 13 May 2019 at 17:40, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > > On 2019/05/11 6:05, Tom Lane wrote: > > regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz; > > QUERY PLAN > > -------------------------------------------------------------------------------------------------------- > > Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) > > Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone)) > > -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) > > Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone)) > > (4 rows) > > > > That's not fine. What we have here is a "timestamp < timestamptz" > > operator, which is only stable, therefore it might give different > > results at runtime than at plan time. You can't make plan-time > > pruning decisions with that. What we should have gotten here was > > an Append node that could do run-time pruning. > > You're right. It seems that prune_append_rel_partitions() is forgetting > to filter mutable clauses from rel->baserestrictinfo, like > relation_excluded_by_constraints() does. I fixed that in the attached > 0003 patch, which also adds a test for this scenario. I needed to also > tweak run-time pruning support code a bit so that it considers the cases > involving mutable functions as requiring (startup) run-time pruning, in > addition to the cases with mutable expressions. Adding David if he wants > to comment. Yeah. I don't think you're going about this the right way. I don't really see why we need to make any changes to the run-time pruning code here, that part seems fine to me. The problem seems to be that match_clause_to_partition_key() thinks it can use a non-const expression to compare to the partition key. All immutable function calls will already be folded to constants by this time, so what's wrong with just insisting that the value being compared to the partition key is a constant when generating steps during planning? When we generate steps for run-time pruning we can use stable function return values, but obviously still not volatile functions. So isn't the fix just to provide gen_partprune_steps with some sort of context as to the environment that it's generating steps for? Fixing it by insisting the planner pruning only uses consts saves us some cycles where we check for Vars and volatile expressions too. That's a pretty good saving when you consider the fact that the expression sometimes could be fairly complex. I did it that way in the attached then tried running with your tests and got failures. After looking at those your tests look like they're expecting the wrong results. E.g +explain (analyze, costs off, summary off, timing off) select * from mc3p where a = 1 and abs(b) < (select 2); + QUERY PLAN +-------------------------------------------------------- + Append (actual rows=0 loops=1) + InitPlan 1 (returns $0) + -> Result (actual rows=1 loops=1) + -> Seq Scan on mc3p0 (actual rows=0 loops=1) + Filter: ((a = 1) AND (abs(b) < $0)) + -> Seq Scan on mc3p_default (actual rows=0 loops=1) + Filter: ((a = 1) AND (abs(b) < $0)) +(7 rows) to which, it does not seem very good that you pruned "mc3p1", since (with my patch) postgres=# insert into mc3p (a,b,c) values(1,1,1); INSERT 0 1 postgres=# select tableoid::regclass,* from mc3p where a = 1 and abs(b) < (select 2); tableoid | a | b | c ----------+---+---+--- mc3p1 | 1 | 1 | 1 (1 row) If those tests were passing in your code then you'd have missed that row... Oops. I've attached how I think the fix should look and included some tests too. I was a bit unsure if it was worth adding the new enum or not. Probably could be just a bool, but I thought enum as it makes looking at function calls easier on the human eye. I also included a test that ensures an immutable function's value is used for pruning during planning... Half of me thinks that test is pretty useless since the return value is converted to a const well before pruning, but I left it there anyway... I've not done the PG11 version of the patch but I can do it if the PG12 one looks okay. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
On 2019/05/14 10:50, David Rowley wrote: > On Mon, 13 May 2019 at 17:40, Amit Langote wrote: >> On 2019/05/11 6:05, Tom Lane wrote: >>> regression=# explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz; >>> QUERY PLAN >>> -------------------------------------------------------------------------------------------------------- >>> Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) >>> Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone)) >>> -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) >>> Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00-05'::timestamp with time zone)) >>> (4 rows) >>> >>> That's not fine. What we have here is a "timestamp < timestamptz" >>> operator, which is only stable, therefore it might give different >>> results at runtime than at plan time. You can't make plan-time >>> pruning decisions with that. What we should have gotten here was >>> an Append node that could do run-time pruning. >> >> You're right. It seems that prune_append_rel_partitions() is forgetting >> to filter mutable clauses from rel->baserestrictinfo, like >> relation_excluded_by_constraints() does. I fixed that in the attached >> 0003 patch, which also adds a test for this scenario. I needed to also >> tweak run-time pruning support code a bit so that it considers the cases >> involving mutable functions as requiring (startup) run-time pruning, in >> addition to the cases with mutable expressions. Adding David if he wants >> to comment. > > Yeah. I don't think you're going about this the right way. I don't > really see why we need to make any changes to the run-time pruning > code here, that part seems fine to me. The problem seems to be that > match_clause_to_partition_key() thinks it can use a non-const > expression to compare to the partition key. All immutable function > calls will already be folded to constants by this time, so what's > wrong with just insisting that the value being compared to the > partition key is a constant when generating steps during planning? The problem is different. '2018-01-01'::timestamptz' in the condition datadatetime < '2018-01-01'::timestamptz as presented to match_clause_to_partition_key() is indeed a Const node, making it think that it's OK to prune using it, that is, with or without your patch. Here's the result for Tom's query quoted above, with your patch applied: explain analyze select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz; QUERY PLAN ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) (actual time=0.050..0.058 rows=0 loops=1) Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00+09'::timestamp with time zone)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) (actual time=0.027..0.035 rows=0 loops=1) Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00+09'::timestamp with time zone)) Planning Time: 33660.807 ms Execution Time: 0.236 ms (6 rows) which is same as without the patch and is wrong as Tom complains. His complaint is that planning-time pruning should not have considered this clause, because its result is only stable, not immutable. That is, the operator '<' (function timestamp_lt_timestamptz() in this case) is only stable, not immutable. The expected plan in this case is Append node with run-time pruning set up and initial pruning will do the pruning, which you get with my patch: explain select * from dataid where id=1 and datadatetime < '2018-01-01'::timestamptz; QUERY PLAN ────────────────────────────────────────────────────────────────────────────────────────────────────────────── Append (cost=4.18..22.63 rows=6 width=12) Subplans Removed: 1 -> Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00+09'::timestamp with time zone)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < '2018-01-01 00:00:00+09'::timestamp with time zone)) (6 rows) The case you seem to be thinking of is where the condition is of shape "partkey op stable-valued-function", but that case works fine today, so needs no fixing. Thanks, Amit
On Tue, 14 May 2019 at 16:07, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > > The problem is different. '2018-01-01'::timestamptz' in the condition > datadatetime < '2018-01-01'::timestamptz as presented to > match_clause_to_partition_key() is indeed a Const node, making it think > that it's OK to prune using it, that is, with or without your patch. Looks like I misunderstood the problem. Is it not still better to ignore the unsupported quals in match_clause_to_partition_key() rather than crafting a new List of just the valid ones like you've done in your patch? I admit that at some point during our parallel development of pruning and run-time pruning, I thought that the pruning steps generated for plan-time pruning could have been reused for run-time pruning, but we can't do that since we must also use parameterised quals, which are not in the baserestrictinfo therefore not there when we generate the steps for the plan-time pruning. I feel that what you've got spreads the logic out a bit too much. match_clause_to_partition_key() has traditionally been in charge of deciding what quals can be used and which ones can't. You've gone and added logic in prune_append_rel_partitions() that makes part of this decision now and that feels a bit wrong to me. I've attached patch of how I think it should work. This includes the changes you made to analyze_partkey_exprs() and your tests from v1-0003-Fix-planner-to-not-prune-partitions-with-non-immu.patch Do you think it's a bad idea to do it this way? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
On 2019/05/14 15:09, David Rowley wrote: > On Tue, 14 May 2019 at 16:07, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> >> The problem is different. '2018-01-01'::timestamptz' in the condition >> datadatetime < '2018-01-01'::timestamptz as presented to >> match_clause_to_partition_key() is indeed a Const node, making it think >> that it's OK to prune using it, that is, with or without your patch. > > Looks like I misunderstood the problem. > > Is it not still better to ignore the unsupported quals in > match_clause_to_partition_key() rather than crafting a new List of > just the valid ones like you've done in your patch? That's another way... > I feel that what you've got spreads > the logic out a bit too much. match_clause_to_partition_key() has > traditionally been in charge of deciding what quals can be used and > which ones can't. You've gone and added logic in> prune_append_rel_partitions() that makes part of this decision now and > that feels a bit wrong to me. Actually, I had considered a solution like yours of distinguishing prune_append_rel_partitions()'s invocations of gen_partprune_steps() from make_partition_pruneinfo()'s, but thought it would be too much churn. Also, another thing that pushed me toward the approach I took is what I saw in predtest.c. I mentioned upthread that constraint exclusion doesn't have this problem, that is, it knows to skip stable-valued clauses, so I went into predicate_refuted_by() and underlings to see how they do that, but the logic wasn't down there. It was in the following code in relation_excluded_by_constraints(): /* * ... We dare not make * deductions with non-immutable functions * ... */ safe_restrictions = NIL; foreach(lc, rel->baserestrictinfo) { RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc); if (!contain_mutable_functions((Node *) rinfo->clause)) safe_restrictions = lappend(safe_restrictions, rinfo->clause); } I think prune_append_rel_partitions() is playing a similar role as relation_excluded_by_constraints(), that is, of dispatching the appropriate clauses to the main pruning logic. So, I thought enforcing this policy in prune_append_rel_partitions() wouldn't be such a bad idea. > I've attached patch of how I think it should work. This includes the > changes you made to analyze_partkey_exprs() and your tests from > v1-0003-Fix-planner-to-not-prune-partitions-with-non-immu.patch Thanks for updating the patch. It works now. > Do you think it's a bad idea to do it this way? As I mentioned above, I think this may be quite a bit of code churn for enforcing a policy that's elsewhere enforced in a much simpler manner. How about deferring this to committer? Thanks, Amit
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > On 2019/05/14 15:09, David Rowley wrote: >> Do you think it's a bad idea to do it this way? > As I mentioned above, I think this may be quite a bit of code churn for > enforcing a policy that's elsewhere enforced in a much simpler manner. > How about deferring this to committer? Not sure I'll be the one to commit this, but ... I don't much like the approach in v1-0003-Fix-planner-to-not-prune-partitions-with-non-immu.patch because (a) it appears to me to add duplicative expression mutability testing, which is kind of expensive since that adds syscache lookups, plus it adds such tests even for clauses that aren't going to match any partition keys; and (b) it's extremely nonobvious that this restricts plan-time pruning and not run-time pruning. I do see that prune_append_rel_partitions is used only for the former; but you sure as heck can't deduce that from any of its comments, so somebody might try to use it for run-time pruning. So I think David's got the right idea that match_clause_to_partition_key is where to handle this, and I like the fact that his patch explicitly represents whether we're trying to do run-time or plan-time pruning. I agree it's kind of invasive, and I wonder why. Shouldn't the additional flag just be a field in the "context" struct, instead of being a separate parameter that has to be laboriously passed through many call levels? (For myself, I'd have made it just a bool not an enum, given that I don't foresee a need for additional values. But that's definitely a matter of preference.) Also, I'm a bit confused by the fact that we seem to have a bunch of spare-parts patches in this thread. What combination is actually being proposed for commit? regards, tom lane
On Wed, 15 May 2019 at 08:58, Tom Lane <tgl@sss.pgh.pa.us> wrote: > So I think David's got the right idea that match_clause_to_partition_key > is where to handle this, and I like the fact that his patch explicitly > represents whether we're trying to do run-time or plan-time pruning. > I agree it's kind of invasive, and I wonder why. Shouldn't the > additional flag just be a field in the "context" struct, instead of > being a separate parameter that has to be laboriously passed through > many call levels? Thanks for having a look. I originally stayed clear of "context" since it looks more like how we output the pruning steps, rather than a context as to how they should be generated. But it's true, using it saves having to pass the extra argument around, so I made that change. > (For myself, I'd have made it just a bool not an enum, given that > I don't foresee a need for additional values. But that's definitely > a matter of preference.) I've changed to bool, mostly because it makes the patch a bit smaller. > Also, I'm a bit confused by the fact that we seem to have a bunch > of spare-parts patches in this thread. What combination is actually > being proposed for commit? I was also confused at the extra two patches Amit posted. The extra one for the tests didn't look very valid to me, as I mentioned yesterday. I propose the attached for master, and I'll post something for v11 shortly. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
On Wed, 15 May 2019 at 11:52, David Rowley <david.rowley@2ndquadrant.com> wrote: > I propose the attached for master, and I'll post something for v11 shortly. Here's the v11 version, as promised. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
On 2019/05/15 5:58, Tom Lane wrote: > Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: >> On 2019/05/14 15:09, David Rowley wrote: >>> Do you think it's a bad idea to do it this way? > >> As I mentioned above, I think this may be quite a bit of code churn for >> enforcing a policy that's elsewhere enforced in a much simpler manner. >> How about deferring this to committer? > > Not sure I'll be the one to commit this, but ... Thanks for the feedback. > I don't much like the approach in > v1-0003-Fix-planner-to-not-prune-partitions-with-non-immu.patch > because (a) it appears to me to add duplicative expression mutability > testing, which is kind of expensive since that adds syscache lookups, > plus it adds such tests even for clauses that aren't going to match any > partition keys; and (b) it's extremely nonobvious that this restricts > plan-time pruning and not run-time pruning. I do see that > prune_append_rel_partitions is used only for the former; but you sure > as heck can't deduce that from any of its comments, so somebody might > try to use it for run-time pruning. OK, both a and b are good arguments. > So I think David's got the right idea that match_clause_to_partition_key > is where to handle this, and I like the fact that his patch explicitly > represents whether we're trying to do run-time or plan-time pruning. Agreed that David's latest patch is good for this. > Also, I'm a bit confused by the fact that we seem to have a bunch > of spare-parts patches in this thread. What combination is actually > being proposed for commit? Sorry for the confusion. 0001 adds tests for the other bugs but with the wrong expected output 0002 fixes those bugs and shows the corrected output 0003 was for the bug that stable operators are being used for plan-time pruning which has its own test (only 0003 needed separate versions for HEAD and v11) Parts of 0003 are now superseded by David's patch, of which he has posted both for-HEAD and for-v11 versions. 0001+0002 is obviously meant to be squashed, which I'll do in a bit. Thanks, Amit
Thanks for the updated patches. On 2019/05/15 8:52, David Rowley wrote: > On Wed, 15 May 2019 at 08:58, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So I think David's got the right idea that match_clause_to_partition_key >> is where to handle this, and I like the fact that his patch explicitly >> represents whether we're trying to do run-time or plan-time pruning. >> I agree it's kind of invasive, and I wonder why. Shouldn't the >> additional flag just be a field in the "context" struct, instead of >> being a separate parameter that has to be laboriously passed through >> many call levels? > > Thanks for having a look. I originally stayed clear of "context" > since it looks more like how we output the pruning steps, rather than > a context as to how they should be generated. But it's true, using it > saves having to pass the extra argument around, so I made that change. > >> (For myself, I'd have made it just a bool not an enum, given that >> I don't foresee a need for additional values. But that's definitely >> a matter of preference.) > > I've changed to bool, mostly because it makes the patch a bit smaller. Looks good. Thanks for tweaking many comments for clarity, except this one could be improved a bit more: + * 'forplanner' must be true when being called from the query planner. This description of 'forplanner' might make it sound like it's redundant (always true), because gen_partprune_steps() is only called from the planner today. How about making it say the same thing as the comment next to its definition in GeneratePruningStepsContext struct): 'forplanner' must be true if generating steps to be used during query planning. >> Also, I'm a bit confused by the fact that we seem to have a bunch >> of spare-parts patches in this thread. What combination is actually >> being proposed for commit? > > I was also confused at the extra two patches Amit posted. The extra > one for the tests didn't look very valid to me, as I mentioned > yesterday. As I said in my previous reply, 0001 contains wrong expected output because of the other bugs discussed earlier. I guess that's not very helpful. Anyway, I'm attaching a squashed version of those two patches as 0001-Fix-bugs-in-pruning-with-composite-range-partition-k.patch. It applies as-is to both master and v11 branches. > I propose the attached for master, and I'll post something for v11 shortly. Thanks. I'm re-attaching both of your patches here just to have all the patches in one place. Thanks, Amit
Вложения
On Wed, 15 May 2019 at 14:10, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Anyway, I'm attaching a squashed version of those two patches as > 0001-Fix-bugs-in-pruning-with-composite-range-partition-k.patch. It > applies as-is to both master and v11 branches. I had a look at 0001-Fix-bugs-in-pruning-with-composite-range-partition-k.patch, and due to my patch no longer generating pruning steps during planning for non-consts, I don't think the test you have there is doing a very good job of highlighting the bug. Maybe you could exploit this bug in the tests by having both initplan run-time pruning and exec-time run-time pruning run. Something like: set plan_cache_mode = 'force_generic_plan'; prepare q1 (int) as select * from mc3p where a = $1 and abs(b) < (select 2); explain (costs off) execute q1(1); Additionally, I'm wondering if we should also apply the attached as part of my dont_prune_with_nonimmutable_opfuncs_during_planning_v2.patch patch. We should never get a non-Const in the pruning steps for planning, so there's likely no point in doing a run-time check to ensure we have a planstate. This code can execute quite often during run-time pruning, once each time a parameter changes, which could be each row. I think I'll go do that now, and also fix up that forplanner comment you mentioned. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
On Wed, 15 May 2019 at 14:20, David Rowley <david.rowley@2ndquadrant.com> wrote: > Additionally, I'm wondering if we should also apply the attached as > part of my dont_prune_with_nonimmutable_opfuncs_during_planning_v2.patch > patch. We should never get a non-Const in the pruning steps for > planning, so there's likely no point in doing a run-time check to > ensure we have a planstate. This code can execute quite often during > run-time pruning, once each time a parameter changes, which could be > each row. I think I'll go do that now, and also fix up that > forplanner comment you mentioned. I've made the change to partkey_datum_from_expr() in master's version only. I did this because I had to document that the context argument in get_matching_partitions() must have a valid planstate set when the given pruning_steps were generated for the executor. It's probably unlikely that anything else is using these functions, but still thought it was a bad idea to change that in v11. At this stage, I don't see any big issues changing that in master. The only change in pg11_dont_prune_with_nonimmutable_opfuncs_during_planning_v3.patch since v2 is the inaccurate comment mentioned by Amit. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
You are right that tests I'd added in 0001 no longer test for the bugs it fixes, that is, after applying your patch, because we no longer end up in a state during *plan-time* pruning where those bugs manifest themselves. Thanks for pointing that out. I have rewritten the tests. Also, I'm attaching the other patch as *0002* this time to signal that it is to be applied after applying yours, that is, if not together. On 2019/05/15 11:47, David Rowley wrote: > On Wed, 15 May 2019 at 14:20, David Rowley <david.rowley@2ndquadrant.com> wrote: >> Additionally, I'm wondering if we should also apply the attached as >> part of my dont_prune_with_nonimmutable_opfuncs_during_planning_v2.patch >> patch. We should never get a non-Const in the pruning steps for >> planning, so there's likely no point in doing a run-time check to >> ensure we have a planstate. This code can execute quite often during >> run-time pruning, once each time a parameter changes, which could be >> each row. I think I'll go do that now, and also fix up that >> forplanner comment you mentioned. > > I've made the change to partkey_datum_from_expr() in master's version > only. I did this because I had to document that the context argument > in get_matching_partitions() must have a valid planstate set when the > given pruning_steps were generated for the executor. It's probably > unlikely that anything else is using these functions, but still > thought it was a bad idea to change that in v11. At this stage, I > don't see any big issues changing that in master. Agreed about changing that "if" to an "Assert" and doing it only in the master. FWIW, I also ended up adding an Assert in the 0002 patch, because one of the bugs being fixed can only occur in a run-time pruning scenario. That means I had to create a version of the patch for v11 without the Assert; that's the only difference between the attached pg11-0002- and 0002- patches. Thanks, Amit
Вложения
David Rowley <david.rowley@2ndquadrant.com> writes: > [ dont_prune_with_nonimmutable_opfuncs_during_planning_v3.patch ] I started working through this, and changed some comments I thought could be improved, and noticed that you'd missed out making the same changes for ScalarArrayOp so I did that. However, when I got to the changes in analyze_partkey_exprs, my bogometer went off. Why do we need to recheck this, in fact why does that function exist at all? ISTM if we have used a pruning qual at plan time there is no need to check it again at runtime; therefore, it shouldn't even be in the list that analyze_partkey_exprs is looking at. I am wondering therefore whether we shouldn't do one of these two things: * Teach match_clause_to_partition_key to not emit plan-time-usable quals at all if it is called with forplanner = false, or * Add a bool field to PartitionPruneStep indicating whether the step can be used at plan time, and use that to skip redundant checks at run time. This would make more sense if we can then fix things so that we only run match_clause_to_partition_key once per qual clause --- I gather that it's now being run twice, which seems pretty inefficient. Perhaps this implies too much code churn/restructuring to be reasonable to consider right now, but I thought I'd ask. It sure seems like this work is being done in a pretty inefficient and duplicative way. regards, tom lane diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c index 59f34f5..a628858 100644 --- a/src/backend/partitioning/partprune.c +++ b/src/backend/partitioning/partprune.c @@ -39,6 +39,7 @@ #include "access/nbtree.h" #include "catalog/pg_operator.h" #include "catalog/pg_opfamily.h" +#include "catalog/pg_proc.h" #include "catalog/pg_type.h" #include "executor/executor.h" #include "miscadmin.h" @@ -93,8 +94,12 @@ typedef enum PartClauseMatchStatus */ typedef struct GeneratePruningStepsContext { + /* Input data: */ + bool forplanner; /* true when generating steps to be used + * during query planning */ + /* Working state and result data: */ int next_step_id; - List *steps; + List *steps; /* output, list of PartitionPruneSteps */ } GeneratePruningStepsContext; /* The result of performing one PartitionPruneStep */ @@ -117,7 +122,7 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root, List *partitioned_rels, List *prunequal, Bitmapset **matchedsubplans); static List *gen_partprune_steps(RelOptInfo *rel, List *clauses, - bool *contradictory); + bool forplanner, bool *contradictory); static List *gen_partprune_steps_internal(GeneratePruningStepsContext *context, RelOptInfo *rel, List *clauses, bool *contradictory); @@ -409,8 +414,13 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, targetpart->relids); } - /* Convert pruning qual to pruning steps. */ - pruning_steps = gen_partprune_steps(subpart, partprunequal, + /* + * Convert pruning qual to pruning steps. Since these steps will be + * used in the executor, we can pass 'forplanner' as false to allow + * steps to be generated that are unsafe for evaluation during + * planning, e.g. evaluation of stable functions. + */ + pruning_steps = gen_partprune_steps(subpart, partprunequal, false, &contradictory); if (contradictory) @@ -523,16 +533,21 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, /* * gen_partprune_steps * Process 'clauses' (a rel's baserestrictinfo list of clauses) and return - * a list of "partition pruning steps" + * a list of "partition pruning steps". + * + * 'forplanner' must be true when generating steps to be evaluated during + * query planning, false when generating steps to be used at run-time. * * If the clauses in the input list are contradictory or there is a * pseudo-constant "false", *contradictory is set to true upon return. */ static List * -gen_partprune_steps(RelOptInfo *rel, List *clauses, bool *contradictory) +gen_partprune_steps(RelOptInfo *rel, List *clauses, bool forplanner, + bool *contradictory) { GeneratePruningStepsContext context; + context.forplanner = forplanner; context.next_step_id = 0; context.steps = NIL; @@ -574,9 +589,11 @@ gen_partprune_steps(RelOptInfo *rel, List *clauses, bool *contradictory) /* * prune_append_rel_partitions - * Returns indexes into rel->part_rels of the minimum set of child - * partitions which must be scanned to satisfy rel's baserestrictinfo - * quals. + * Process rel's baserestrictinfo and make use of quals which can be + * evaluated during query planning in order to determine the minimum set + * of partitions which must be scanned to satisfy these quals. Returns + * the matching partitions in the form of a Bitmapset containing the + * partitions' indexes in the rel's part_rels array. * * Callers must ensure that 'rel' is a partitioned table. */ @@ -603,9 +620,12 @@ prune_append_rel_partitions(RelOptInfo *rel) /* * Process clauses. If the clauses are found to be contradictory, we can - * return the empty set. + * return the empty set. Pass 'forplanner' as true to indicate to the + * pruning code that we only want pruning steps that can be evaluated + * during planning. */ - pruning_steps = gen_partprune_steps(rel, clauses, &contradictory); + pruning_steps = gen_partprune_steps(rel, clauses, true, + &contradictory); if (contradictory) return NULL; @@ -635,6 +655,9 @@ prune_append_rel_partitions(RelOptInfo *rel) * get_matching_partitions * Determine partitions that survive partition pruning * + * Note: context->planstate must be set to a valid PlanState when the + * pruning_steps were generated with 'forplanner' = false. + * * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving * partitions. */ @@ -1572,8 +1595,7 @@ match_clause_to_partition_key(RelOptInfo *rel, Expr *expr; /* - * Recognize specially shaped clauses that match with the Boolean - * partition key. + * Recognize specially shaped clauses that match a Boolean partition key. */ if (match_boolean_partition_clause(partopfamily, clause, partkey, &expr)) { @@ -1648,16 +1670,47 @@ match_clause_to_partition_key(RelOptInfo *rel, * Matched with this key. Now check various properties of the clause * to see if it's sane to use it for pruning. In most of these cases, * we can return UNSUPPORTED because the same failure would occur no - * matter which partkey it's matched to. + * matter which partkey it's matched to. (In particular, now that + * we've successfully matched one side of the operator to a partkey, + * there is no chance that matching the other side to another partkey + * will produce a usable result, since that'd mean there are Vars on + * both sides.) */ + if (context->forplanner) + { + /* + * When pruning in the planner, we only support pruning using + * comparisons to constants. Immutable subexpressions will have + * been folded to constants already, and we cannot prune on the + * basis of anything that's not immutable. + */ + if (!IsA(expr, Const)) + return PARTCLAUSE_UNSUPPORTED; - /* - * We can't prune using an expression with Vars. (Report failure as - * UNSUPPORTED, not NOMATCH: as in the no-commutator case above, we - * now know there are Vars on both sides, so it's no good.) - */ - if (contain_var_clause((Node *) expr)) - return PARTCLAUSE_UNSUPPORTED; + /* + * Also, the comparison operator itself must be immutable. + */ + if (op_volatile(opno) != PROVOLATILE_IMMUTABLE) + return PARTCLAUSE_UNSUPPORTED; + } + else + { + /* + * Otherwise, non-consts are allowed, but we can't prune using an + * expression that contains Vars. + */ + if (contain_var_clause((Node *) expr)) + return PARTCLAUSE_UNSUPPORTED; + + /* + * And we must reject anything containing a volatile function. + * Stable functions are OK though. (We need not check this for + * the comparison operator itself: anything that belongs to a + * partitioning operator family must be at least stable.) + */ + if (contain_volatile_functions((Node *) expr)) + return PARTCLAUSE_UNSUPPORTED; + } /* * Only allow strict operators. This will guarantee nulls are @@ -1666,10 +1719,6 @@ match_clause_to_partition_key(RelOptInfo *rel, if (!op_strict(opno)) return PARTCLAUSE_UNSUPPORTED; - /* We can't use any volatile expressions to prune partitions. */ - if (contain_volatile_functions((Node *) expr)) - return PARTCLAUSE_UNSUPPORTED; - /* * See if the operator is relevant to the partitioning opfamily. * @@ -1798,20 +1847,51 @@ match_clause_to_partition_key(RelOptInfo *rel, if (IsA(leftop, RelabelType)) leftop = ((RelabelType *) leftop)->arg; - /* Check it matches this partition key */ + /* check if the LHS matches this partition key */ if (!equal(leftop, partkey) || !PartCollMatchesExprColl(partcoll, saop->inputcollid)) return PARTCLAUSE_NOMATCH; /* * Matched with this key. Check various properties of the clause to - * see if it can sanely be used for partition pruning (this is mostly - * the same as for a plain OpExpr). + * see if it can sanely be used for partition pruning (this is + * identical to the logic for a plain OpExpr). */ + if (context->forplanner) + { + /* + * When pruning in the planner, we only support pruning using + * comparisons to constants. Immutable subexpressions will have + * been folded to constants already, and we cannot prune on the + * basis of anything that's not immutable. + */ + if (!IsA(rightop, Const)) + return PARTCLAUSE_UNSUPPORTED; - /* We can't prune using an expression with Vars. */ - if (contain_var_clause((Node *) rightop)) - return PARTCLAUSE_UNSUPPORTED; + /* + * Also, the comparison operator itself must be immutable. + */ + if (op_volatile(saop_op) != PROVOLATILE_IMMUTABLE) + return PARTCLAUSE_UNSUPPORTED; + } + else + { + /* + * Otherwise, non-consts are allowed, but we can't prune using an + * expression that contains Vars. + */ + if (contain_var_clause((Node *) rightop)) + return PARTCLAUSE_UNSUPPORTED; + + /* + * And we must reject anything containing a volatile function. + * Stable functions are OK though. (We need not check this for + * the comparison operator itself: anything that belongs to a + * partitioning operator family must be at least stable.) + */ + if (contain_volatile_functions((Node *) rightop)) + return PARTCLAUSE_UNSUPPORTED; + } /* * Only allow strict operators. This will guarantee nulls are @@ -1820,10 +1900,6 @@ match_clause_to_partition_key(RelOptInfo *rel, if (!op_strict(saop_op)) return PARTCLAUSE_UNSUPPORTED; - /* We can't use any volatile expressions to prune partitions. */ - if (contain_volatile_functions((Node *) rightop)) - return PARTCLAUSE_UNSUPPORTED; - /* * In case of NOT IN (..), we get a '<>', which we handle if list * partitioning is in use and we're able to confirm that it's negator @@ -2961,15 +3037,17 @@ analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps, { PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc); ListCell *lc2; + ListCell *lc3; int keyno; if (!IsA(step, PartitionPruneStepOp)) continue; keyno = 0; - foreach(lc2, step->exprs) + forboth(lc2, step->exprs, lc3, step->cmpfns) { Expr *expr = lfirst(lc2); + Oid fnoid = lfirst_oid(lc3); if (!IsA(expr, Const)) { @@ -2992,6 +3070,18 @@ analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps, doruntimeprune = true; } + + /* + * Pruning done during planning won't have pruned using quals with + * a non-immutable comparison functions, so we'll switch on + * run-time pruning for these. + */ + else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE) + { + /* No need to check for exec params inside a const */ + doruntimeprune = true; + pinfo->do_initial_prune = true; + } keyno++; } } @@ -3347,6 +3437,12 @@ partkey_datum_from_expr(PartitionPruneContext *context, else { /* + * We should never see a non-Const in a step unless we're running in + * the executor. + */ + Assert(context->planstate != NULL); + + /* * When called from the executor we'll have a valid planstate so we * may be able to evaluate an expression which could not be folded to * a Const during planning. Since run-time pruning can occur both @@ -3354,9 +3450,7 @@ partkey_datum_from_expr(PartitionPruneContext *context, * must be careful here to evaluate expressions containing PARAM_EXEC * Params only when told it's OK. */ - if (context->planstate && - (context->evalexecparams || - !context->exprhasexecparam[stateidx])) + if (context->evalexecparams || !context->exprhasexecparam[stateidx]) { ExprState *exprstate; ExprContext *ectx; diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out index bd64bed..cad5967 100644 --- a/src/test/regress/expected/partition_prune.out +++ b/src/test/regress/expected/partition_prune.out @@ -3036,6 +3036,41 @@ select * from listp where a = (select null::int); (7 rows) drop table listp; +-- +-- check that stable query clauses are only used in run-time pruning +-- +create table stable_qual_pruning (a timestamp) partition by range (a); +create table stable_qual_pruning1 partition of stable_qual_pruning + for values from ('2000-01-01') to ('2000-02-01'); +create table stable_qual_pruning2 partition of stable_qual_pruning + for values from ('2000-02-01') to ('2000-03-01'); +create table stable_qual_pruning3 partition of stable_qual_pruning + for values from ('3000-02-01') to ('3000-03-01'); +-- comparison against a stable value requires run-time pruning +explain (analyze, costs off, summary off, timing off) +select * from stable_qual_pruning where a < localtimestamp; + QUERY PLAN +---------------------------------------------------------------- + Append (actual rows=0 loops=1) + Subplans Removed: 1 + -> Seq Scan on stable_qual_pruning1 (actual rows=0 loops=1) + Filter: (a < LOCALTIMESTAMP) + -> Seq Scan on stable_qual_pruning2 (actual rows=0 loops=1) + Filter: (a < LOCALTIMESTAMP) +(6 rows) + +-- timestamp < timestamptz comparison is only stable, not immutable +explain (analyze, costs off, summary off, timing off) +select * from stable_qual_pruning where a < '2000-02-01'::timestamptz; + QUERY PLAN +-------------------------------------------------------------------------------- + Append (actual rows=0 loops=1) + Subplans Removed: 2 + -> Seq Scan on stable_qual_pruning1 (actual rows=0 loops=1) + Filter: (a < 'Tue Feb 01 00:00:00 2000 PST'::timestamp with time zone) +(4 rows) + +drop table stable_qual_pruning; -- Ensure runtime pruning works with initplans params with boolean types create table boolvalues (value bool not null); insert into boolvalues values('t'),('f'); diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql index 246c627..94d9e1e 100644 --- a/src/test/regress/sql/partition_prune.sql +++ b/src/test/regress/sql/partition_prune.sql @@ -754,6 +754,27 @@ select * from listp where a = (select null::int); drop table listp; +-- +-- check that stable query clauses are only used in run-time pruning +-- +create table stable_qual_pruning (a timestamp) partition by range (a); +create table stable_qual_pruning1 partition of stable_qual_pruning + for values from ('2000-01-01') to ('2000-02-01'); +create table stable_qual_pruning2 partition of stable_qual_pruning + for values from ('2000-02-01') to ('2000-03-01'); +create table stable_qual_pruning3 partition of stable_qual_pruning + for values from ('3000-02-01') to ('3000-03-01'); + +-- comparison against a stable value requires run-time pruning +explain (analyze, costs off, summary off, timing off) +select * from stable_qual_pruning where a < localtimestamp; + +-- timestamp < timestamptz comparison is only stable, not immutable +explain (analyze, costs off, summary off, timing off) +select * from stable_qual_pruning where a < '2000-02-01'::timestamptz; + +drop table stable_qual_pruning; + -- Ensure runtime pruning works with initplans params with boolean types create table boolvalues (value bool not null); insert into boolvalues values('t'),('f');
On Thu, 16 May 2019 at 05:38, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <david.rowley@2ndquadrant.com> writes: > > [ dont_prune_with_nonimmutable_opfuncs_during_planning_v3.patch ] > > I started working through this, and changed some comments I thought could > be improved, and noticed that you'd missed out making the same changes > for ScalarArrayOp so I did that. However, when I got to the changes in > analyze_partkey_exprs, my bogometer went off. Why do we need to recheck > this, in fact why does that function exist at all? ISTM if we have used > a pruning qual at plan time there is no need to check it again at runtime; > therefore, it shouldn't even be in the list that analyze_partkey_exprs > is looking at. Thinking about this more, if we're now making it so a forplanner = true set of steps requires all values being compared to the partition key to be Const, then anything that's not Const must require run-time pruning. That means the: else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE) in analyze_partkey_exprs() can simply become "else". > I am wondering therefore whether we shouldn't do one of these two > things: > > * Teach match_clause_to_partition_key to not emit plan-time-usable > quals at all if it is called with forplanner = false, or That's not really possible. Imagine a table partitioned by HASH(a,b) and WHERE a = $1 and b = 10; we can't do any pruning during planning here, and if we only generate steps containing "a = $1" for run-time, then we can't do anything there either. > * Add a bool field to PartitionPruneStep indicating whether the step > can be used at plan time, and use that to skip redundant checks at run > time. This would make more sense if we can then fix things so that > we only run match_clause_to_partition_key once per qual clause --- > I gather that it's now being run twice, which seems pretty inefficient. I think it would be nicer if we made match_clause_to_partition_key() do everything that analyze_partkey_exprs() currently does, but if we change that "else if" to just "else" then the only advantage is not having to make another pass over the pruning steps. We'd have to make it so match_clause_to_partition_key() did all the pull_exec_paramids() stuff and add some new fields to PartitionPruneStepOp to store that info. We'd also need to store hasexecparam in the PartitionPruneStepOp and change the signature of partkey_datum_from_expr() so we could pass that down as we'd no longer have context->exprhasexecparam... Well, we could keep that, but if we're trying to not loop over the steps again then we'll need to store that value in the step when it's created rather than put it in PartitionPruneContext. > Perhaps this implies too much code churn/restructuring to be reasonable > to consider right now, but I thought I'd ask. It sure seems like this > work is being done in a pretty inefficient and duplicative way. I certainly think the above would be cleaner and I don't mind working on it, but if it's too much churn for this time of year then I'd rather not waste time writing a patch that's going to get rejected. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
David Rowley <david.rowley@2ndquadrant.com> writes: > Thinking about this more, if we're now making it so a forplanner = > true set of steps requires all values being compared to the partition > key to be Const, then anything that's not Const must require run-time > pruning. That means the: > else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE) > in analyze_partkey_exprs() can simply become "else". Um ... what about a merely-stable comparison operator with a Const? > I think it would be nicer if we made match_clause_to_partition_key() > do everything that analyze_partkey_exprs() currently does, but if we > change that "else if" to just "else" then the only advantage is not > having to make another pass over the pruning steps. We'd have to make > it so match_clause_to_partition_key() did all the pull_exec_paramids() > stuff and add some new fields to PartitionPruneStepOp to store that > info. We'd also need to store hasexecparam in the > PartitionPruneStepOp and change the signature of > partkey_datum_from_expr() so we could pass that down as we'd no longer > have context->exprhasexecparam... Well, we could keep that, but if > we're trying to not loop over the steps again then we'll need to store > that value in the step when it's created rather than put it in > PartitionPruneContext. The loop over steps, per se, isn't that expensive --- but extra syscache lookups are. Or at least that's my gut feeling about it. If we just had match_clause_to_partition_key mark the steps as being plan-time executable or not, we could avoid the repeat lookup. >> * Teach match_clause_to_partition_key to not emit plan-time-usable >> quals at all if it is called with forplanner = false, or > That's not really possible. Imagine a table partitioned by HASH(a,b) > and WHERE a = $1 and b = 10; we can't do any pruning during planning > here, and if we only generate steps containing "a = $1" for run-time, > then we can't do anything there either. That's an interesting example, but how does it work now? I did not observe any code in there that seems to be aware of cross-dependencies between steps. Presumably somewhere we are combining those hash quals, so couldn't we mark the combined step properly? >> Perhaps this implies too much code churn/restructuring to be reasonable >> to consider right now, but I thought I'd ask. It sure seems like this >> work is being done in a pretty inefficient and duplicative way. > I certainly think the above would be cleaner and I don't mind working > on it, but if it's too much churn for this time of year then I'd > rather not waste time writing a patch that's going to get rejected. Yeah, I'm leery of adding much new complexity right now, but maybe adding a flag field to save a syscache lookup wouldn't be out of line. regards, tom lane
On Thu, 16 May 2019 at 12:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <david.rowley@2ndquadrant.com> writes: > > Thinking about this more, if we're now making it so a forplanner = > > true set of steps requires all values being compared to the partition > > key to be Const, then anything that's not Const must require run-time > > pruning. That means the: > > > else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE) > > > in analyze_partkey_exprs() can simply become "else". > > Um ... what about a merely-stable comparison operator with a Const? Oops. It seems I got my wires crossed there. Wasn't quite fully awake when I started thinking about this earlier. > The loop over steps, per se, isn't that expensive --- but extra syscache > lookups are. Or at least that's my gut feeling about it. If we just had > match_clause_to_partition_key mark the steps as being plan-time executable > or not, we could avoid the repeat lookup. okay, on re-think. I'm a little unsure of if you're mixing up "extra" and "repeat", we do need an "extra" lookup because we've discovered that strict ops are not safe to use during planning. I can't see around having this extra call. If you mean it's a repeat call, then it's really not, as we only do the op_volatile() check with forplanner=true. With forplanner = false we only call contain_var_clause() and contain_volatile_functions(). > >> * Teach match_clause_to_partition_key to not emit plan-time-usable > >> quals at all if it is called with forplanner = false, or > > > That's not really possible. Imagine a table partitioned by HASH(a,b) > > and WHERE a = $1 and b = 10; we can't do any pruning during planning > > here, and if we only generate steps containing "a = $1" for run-time, > > then we can't do anything there either. > > That's an interesting example, but how does it work now? I did not > observe any code in there that seems to be aware of cross-dependencies > between steps. Presumably somewhere we are combining those hash quals, > so couldn't we mark the combined step properly? In current HEAD we always generate steps for both quals when generating steps for planning and execution. partkey_datum_from_expr() just can't eval the qual when we're executing the steps during planning. My patch skips the step generation when the Expr is non-Const, so we save a few cycles and some memory allocation there. Remember that we don't reuse steps created for plan-time pruning in run-time pruning. A fresh set is created. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2019/05/16 10:00, David Rowley wrote: > On Thu, 16 May 2019 at 12:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The loop over steps, per se, isn't that expensive --- but extra syscache >> lookups are. Or at least that's my gut feeling about it. If we just had >> match_clause_to_partition_key mark the steps as being plan-time executable >> or not, we could avoid the repeat lookup. > > okay, on re-think. I'm a little unsure of if you're mixing up "extra" > and "repeat", we do need an "extra" lookup because we've discovered > that strict ops are not safe to use during planning. I can't see > around having this extra call. If you mean it's a repeat call, then > it's really not, as we only do the op_volatile() check with > forplanner=true. With forplanner = false we only call > contain_var_clause() and contain_volatile_functions(). How about we add one more bool, say, runtime_pruning_needed to GeneratePruningStepsContext and set it when we discover in match_clause_to_partition_key() that runtime pruning will be needed? Then return it to make_partitionedrel_pruneinfo() using a new output parameter of gen_partprune_steps()? Thanks, Amit
On Thu, 16 May 2019 at 13:51, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > How about we add one more bool, say, runtime_pruning_needed to > GeneratePruningStepsContext and set it when we discover in > match_clause_to_partition_key() that runtime pruning will be needed? Then > return it to make_partitionedrel_pruneinfo() using a new output parameter > of gen_partprune_steps()? I don't think this can be done the way you think. If we just have one flag to say any run-time pruning is needed then we'd still need to check op_volatile(opno) != PROVOLATILE_IMMUTABLE in analyze_partkey_exprs() to see if an initial run-time prune is needed. That means we might do two calls to op_volatile()! The other one is required to set the runtime_pruning_needed in match_clause_to_partition_key(). We can't have an additional flag to mark what do_exec_prune should get set to either since we only know that once we check what Param types are in the Expr and if we do that then that's the full-blown patch that we thought might be too much churn. I don't really see where this duplicate call is. The patch calls op_volatile() at the very most just once per pruning step. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2019/05/16 11:52, David Rowley wrote: > I don't really see where this duplicate call is. The patch calls > op_volatile() at the very most just once per pruning step. Ah, there's no op_volatile() on the matched clause's operator in match_clause_to_partition_key() when it's called during a forplanner=false run. So, while there is a duplication of efforts because match_clause_to_partition_key() runs twice for any given clause which is due to original design design, there is no duplication of efforts between match_clause_to_partition_key() and analyze_partkey_exprs(). Thanks, Amit
On 2019/05/16 10:00, David Rowley wrote: > On Thu, 16 May 2019 at 12:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> That's an interesting example, but how does it work now? I did not >> observe any code in there that seems to be aware of cross-dependencies >> between steps. Presumably somewhere we are combining those hash quals, >> so couldn't we mark the combined step properly? > > In current HEAD we always generate steps for both quals when > generating steps for planning and execution. > partkey_datum_from_expr() just can't eval the qual when we're > executing the steps during planning. My patch skips the step > generation when the Expr is non-Const, so we save a few cycles and > some memory allocation there. Remember that we don't reuse steps > created for plan-time pruning in run-time pruning. A fresh set is > created. Given the way gen_partprune_steps() is designed [1], while it may be possible to discard steps generated by run-time pruning that need not be applied during execution because they would've already been fully applied during plan-time pruning, it seems hard to keep around *partial* steps generated during plan-time pruning and use them for run-time pruning after filling missing expressions from run-time quals. Filling the missing expressions into existing steps part seems hard. Thanks, Amit [1] Any given PartitionPruneStepOp pruning step is built from a *set of* clauses which have matched the required set of partitions keys (all keys in the hash case and a prefix of keys in the range case). If the set of clauses doesn't cover the required set of partition keys, no step will be generated at all, which may happen if some of the clauses were run-time clauses. The same set of clauses + parameterized path quals (if any) might cover the required set and if so a step will be generated for run-time pruning.
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > Given the way gen_partprune_steps() is designed [1], while it may be > possible to discard steps generated by run-time pruning that need not be > applied during execution because they would've already been fully applied > during plan-time pruning, it seems hard to keep around *partial* steps > generated during plan-time pruning and use them for run-time pruning after > filling missing expressions from run-time quals. Filling the missing > expressions into existing steps part seems hard. Yeah. It seems like the way to make some progress on this would be to annotate the steps when they're generated as to whether they involve any run-time component or not. However, we might end up making multiple calls anyway because of the possibility of parameterized paths having additional qual clauses to work with, so I'm not clear on exactly how we could reduce the number of times we do gen_partprune_steps. It's not something we could fix now in any case. However, I'm still not real happy with analyze_partkey_exprs(); its analysis seems well short of ideal. I notice that in the sort of case suggested upthread, with more than one opclause contributing to a step, it's entirely possible for it to report that both initial and runtime pruning are needed even though only the latter is really possible. Perhaps this could be improved a bit by making the analysis operate per-step rather than being totally per-expression. Anyway, I've pushed David's patch now, on to look at Amit's. regards, tom lane
I wrote: > Anyway, I've pushed David's patch now, on to look at Amit's. So, the changes in gen_prune_steps_from_opexps seem okay, but the change in perform_pruning_base_step is just wrong, as can be seen from this extended form of the test case: create table mc3p (a int, b int, c int) partition by range (a, abs(b), c); create table mc3p1 partition of mc3p for values from (1, 1, 1) to (1, 1, 10); create table mc3p2 partition of mc3p for values from (1, 1, 10) to (2, minvalue, minvalue); create table mc3p3 partition of mc3p for values from (2, minvalue, minvalue) to (3, maxvalue, maxvalue); insert into mc3p values (1, 1, 1), (1,1,10), (2, 1, 1); set plan_cache_mode = force_generic_plan; prepare init_exec_prune_q1 as select * from mc3p where a = $1 and abs(b) <= $2 and c < (select 13); explain (analyze, costs off, summary off, timing off) execute init_exec_prune_q1 (1,2); deallocate init_exec_prune_q1; reset plan_cache_mode; drop table mc3p; This should of course find the (1,1,1) row, but it does not because it converts the initial pruning condition to be "a = 1 and abs(b) = 2", whereas the correct thing would be "a = 1 and abs(b) <= 2". It seems to me that the correct fix, rather than forcing equality strategy when you've missed out some of the quals, is to back up and use the strategy of the last clause you did use. But unless I'm missing something, that information is just not available from the PartitionPruneStepOp data structure. We could extend the PartitionPruneStepOp struct to have an array or list of StrategyNumbers (but I'm unsure about how safe that'd be to backpatch into v11). Another idea is that we ought to generate separate partitioning steps lists for the initial-pruning and runtime-pruning cases, ie invoke gen_partprune_steps yet a third time. Ugh (and that still means a data structure change, just in a different place). The bottom line here is that we've got an inflexible data structure that was designed on the assumption that gen_partprune_steps has all the intelligence, and trying to adjust things later is just trouble. I think this is really quite closely tied to my previous complaints about the design for plan-time versus run-time pruning, just mutatis mutandis. The lack of any holistic, principled approach to which-quals-apply-when is what's biting us here. Moreover, now that I understand what's going on here a bit better, I am deathly afraid that there are yet other bugs hiding in this same area. What we've basically got is that analyze_partkey_exprs and perform_pruning_base_step are trying to reverse-engineer what gen_partprune_steps would have done differently if it used only the quals not containing exec params. They don't really have enough information available to do that correctly, and even if they did, the fact that the logic looks nothing like the same is going to be a fertile source of bugs and maintenance gotchas. In other words, as I compose this I'm talking myself into the idea that invoking gen_partprune_steps three times (for planner, executor startup, and executor runtime, with corresponding filters on which clauses can be used) is the only safe near-term fix. In the future we can look at cutting down the repetitive costs that entails. Thoughts? regards, tom lane
I wrote: > So, the changes in gen_prune_steps_from_opexps seem okay, but > the change in perform_pruning_base_step is just wrong, I went ahead and pushed the non-controversial part of that patch, just to avoid carrying it around longer. I propose that we use the attached test cases once we have a fix for the remaining problem. The results shown are the wrong answers that HEAD produces. Amit's patch fixes the first case but not the second, since substituting equality semantics is the right thing to do in only the first case. regards, tom lane diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out index 723fc70..88701d8 100644 --- a/src/test/regress/expected/partition_prune.out +++ b/src/test/regress/expected/partition_prune.out @@ -3149,6 +3149,43 @@ select * from mc3p where a < 3 and abs(b) = 1; Filter: ((a < 3) AND (abs(b) = 1)) (7 rows) +-- +-- Check that pruning with composite range partitioning works correctly when +-- a combination of runtime parameters is specified, not all of whose values +-- are available at the same time +-- +set plan_cache_mode = force_generic_plan; +prepare ps1 as + select * from mc3p where a = $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps1(1); + QUERY PLAN +---------------------------------------------- + Append (actual rows=0 loops=1) + InitPlan 1 (returns $0) + -> Result (never executed) + Subplans Removed: 2 + -> Seq Scan on mc3p0 (never executed) + Filter: ((a = $1) AND (abs(b) < $0)) +(6 rows) + +deallocate ps1; +prepare ps2 as + select * from mc3p where a <= $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps2(1); + QUERY PLAN +------------------------------------------------- + Append (actual rows=1 loops=1) + InitPlan 1 (returns $0) + -> Result (actual rows=1 loops=1) + Subplans Removed: 2 + -> Seq Scan on mc3p0 (actual rows=1 loops=1) + Filter: ((a <= $1) AND (abs(b) < $0)) +(6 rows) + +deallocate ps2; +reset plan_cache_mode; drop table mc3p; -- Ensure runtime pruning works with initplans params with boolean types create table boolvalues (value bool not null); diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql index 2373bd8..071e28d 100644 --- a/src/test/regress/sql/partition_prune.sql +++ b/src/test/regress/sql/partition_prune.sql @@ -809,6 +809,24 @@ insert into mc3p values (0, 1, 1), (1, 1, 1), (2, 1, 1); explain (analyze, costs off, summary off, timing off) select * from mc3p where a < 3 and abs(b) = 1; +-- +-- Check that pruning with composite range partitioning works correctly when +-- a combination of runtime parameters is specified, not all of whose values +-- are available at the same time +-- +set plan_cache_mode = force_generic_plan; +prepare ps1 as + select * from mc3p where a = $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps1(1); +deallocate ps1; +prepare ps2 as + select * from mc3p where a <= $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps2(1); +deallocate ps2; +reset plan_cache_mode; + drop table mc3p; -- Ensure runtime pruning works with initplans params with boolean types
I wrote: > In other words, as I compose this I'm talking myself into the > idea that invoking gen_partprune_steps three times (for planner, > executor startup, and executor runtime, with corresponding > filters on which clauses can be used) is the only safe near-term > fix. In the future we can look at cutting down the repetitive > costs that entails. Here's a really quick and dirty POC along that line. It's unfinished but I'm out of time for today. I think the executor side of it is OK, but the planner side is crufty: it lacks logic to decide whether either the initial pruning or exec pruning steps are unnecessary. And I didn't update most of the comments either. I think the rule for "exec pruning is unnecessary" could probably just be "we didn't find any exec params in the prune steps", since that would imply that the initial steps include everything. I don't see an equally cheap way to decide that initial pruning is useless, though. Thoughts? Also, it seems like we could save at least some of the planning expense by having the PARTCLAUSE_INITIAL pass report back whether it had discarded any Param-bearing clauses or not. If not, there's definitely no need to run the PARTCLAUSE_EXEC pass. regards, tom lane diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c index 70709e5..d663118 100644 --- a/src/backend/executor/execPartition.c +++ b/src/backend/executor/execPartition.c @@ -183,6 +183,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel, bool *isnull, int maxfieldlen); static List *adjust_partition_tlist(List *tlist, TupleConversionMap *map); +static void ExecInitPruningContext(PartitionPruneContext *context, + List *pruning_steps, + PartitionDesc partdesc, + PartitionKey partkey, + PlanState *planstate); static void find_matching_subplans_recurse(PartitionPruningData *prunedata, PartitionedRelPruningData *pprune, bool initial_prune, @@ -1614,13 +1619,9 @@ ExecCreatePartitionPruneState(PlanState *planstate, { PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2); PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j]; - PartitionPruneContext *context = &pprune->context; Relation partrel; PartitionDesc partdesc; PartitionKey partkey; - int partnatts; - int n_steps; - ListCell *lc3; /* present_parts is also subject to later modification */ pprune->present_parts = bms_copy(pinfo->present_parts); @@ -1700,66 +1701,36 @@ ExecCreatePartitionPruneState(PlanState *planstate, Assert(pd_idx == pinfo->nparts); } - n_steps = list_length(pinfo->pruning_steps); - - context->strategy = partkey->strategy; - context->partnatts = partnatts = partkey->partnatts; - context->nparts = pinfo->nparts; - context->boundinfo = partdesc->boundinfo; - context->partcollation = partkey->partcollation; - context->partsupfunc = partkey->partsupfunc; - - /* We'll look up type-specific support functions as needed */ - context->stepcmpfuncs = (FmgrInfo *) - palloc0(sizeof(FmgrInfo) * n_steps * partnatts); - - context->ppccontext = CurrentMemoryContext; - context->planstate = planstate; - - /* Initialize expression state for each expression we need */ - context->exprstates = (ExprState **) - palloc0(sizeof(ExprState *) * n_steps * partnatts); - foreach(lc3, pinfo->pruning_steps) + /* + * Initialize pruning contexts as needed. + */ + pprune->initial_pruning_steps = pinfo->initial_pruning_steps; + if (pinfo->initial_pruning_steps) { - PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc3); - ListCell *lc4; - int keyno; - - /* not needed for other step kinds */ - if (!IsA(step, PartitionPruneStepOp)) - continue; - - Assert(list_length(step->exprs) <= partnatts); - - keyno = 0; - foreach(lc4, step->exprs) - { - Expr *expr = (Expr *) lfirst(lc4); - - /* not needed for Consts */ - if (!IsA(expr, Const)) - { - int stateidx = PruneCxtStateIdx(partnatts, - step->step.step_id, - keyno); - - context->exprstates[stateidx] = - ExecInitExpr(expr, context->planstate); - } - keyno++; - } + ExecInitPruningContext(&pprune->initial_context, + pinfo->initial_pruning_steps, + partdesc, partkey, planstate); + /* Record whether initial pruning is needed at any level */ + prunestate->do_initial_prune = true; + } + else + { + /* + * Hack: ExecFindInitialMatchingSubPlans requires this field + * to be valid whether pruning or not. + */ + pprune->initial_context.nparts = partdesc->nparts; } - /* Array is not modified at runtime, so just point to plan's copy */ - context->exprhasexecparam = pinfo->hasexecparam; - - pprune->pruning_steps = pinfo->pruning_steps; - pprune->do_initial_prune = pinfo->do_initial_prune; - pprune->do_exec_prune = pinfo->do_exec_prune; - - /* Record if pruning would be useful at any level */ - prunestate->do_initial_prune |= pinfo->do_initial_prune; - prunestate->do_exec_prune |= pinfo->do_exec_prune; + pprune->exec_pruning_steps = pinfo->exec_pruning_steps; + if (pinfo->exec_pruning_steps) + { + ExecInitPruningContext(&pprune->exec_context, + pinfo->exec_pruning_steps, + partdesc, partkey, planstate); + /* Record whether exec pruning is needed at any level */ + prunestate->do_exec_prune = true; + } /* * Accumulate the IDs of all PARAM_EXEC Params affecting the @@ -1777,6 +1748,71 @@ ExecCreatePartitionPruneState(PlanState *planstate, } /* + * Initialize a PartitionPruneContext for the given list of pruning steps. + */ +static void +ExecInitPruningContext(PartitionPruneContext *context, + List *pruning_steps, + PartitionDesc partdesc, + PartitionKey partkey, + PlanState *planstate) +{ + int n_steps; + int partnatts; + ListCell *lc; + + n_steps = list_length(pruning_steps); + + context->strategy = partkey->strategy; + context->partnatts = partnatts = partkey->partnatts; + context->nparts = partdesc->nparts; + context->boundinfo = partdesc->boundinfo; + context->partcollation = partkey->partcollation; + context->partsupfunc = partkey->partsupfunc; + + /* We'll look up type-specific support functions as needed */ + context->stepcmpfuncs = (FmgrInfo *) + palloc0(sizeof(FmgrInfo) * n_steps * partnatts); + + context->ppccontext = CurrentMemoryContext; + context->planstate = planstate; + + /* Initialize expression state for each expression we need */ + context->exprstates = (ExprState **) + palloc0(sizeof(ExprState *) * n_steps * partnatts); + foreach(lc, pruning_steps) + { + PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc); + ListCell *lc2; + int keyno; + + /* not needed for other step kinds */ + if (!IsA(step, PartitionPruneStepOp)) + continue; + + Assert(list_length(step->exprs) <= partnatts); + + keyno = 0; + foreach(lc2, step->exprs) + { + Expr *expr = (Expr *) lfirst(lc2); + + /* not needed for Consts */ + if (!IsA(expr, Const)) + { + int stateidx = PruneCxtStateIdx(partnatts, + step->step.step_id, + keyno); + + context->exprstates[stateidx] = + ExecInitExpr(expr, context->planstate); + } + keyno++; + } + } +} + +/* * ExecFindInitialMatchingSubPlans * Identify the set of subplans that cannot be eliminated by initial * pruning, disregarding any pruning constraints involving PARAM_EXEC @@ -1824,7 +1860,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans) find_matching_subplans_recurse(prunedata, pprune, true, &result); /* Expression eval may have used space in node's ps_ExprContext too */ - ResetExprContext(pprune->context.planstate->ps_ExprContext); + if (pprune->initial_pruning_steps) + ResetExprContext(pprune->initial_context.planstate->ps_ExprContext); } /* Add in any subplans that partition pruning didn't account for */ @@ -1888,7 +1925,7 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans) for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--) { PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j]; - int nparts = pprune->context.nparts; + int nparts = pprune->initial_context.nparts; int k; /* We just rebuild present_parts from scratch */ @@ -1993,7 +2030,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate) find_matching_subplans_recurse(prunedata, pprune, false, &result); /* Expression eval may have used space in node's ps_ExprContext too */ - ResetExprContext(pprune->context.planstate->ps_ExprContext); + if (pprune->exec_pruning_steps) + ResetExprContext(pprune->exec_context.planstate->ps_ExprContext); } /* Add in any subplans that partition pruning didn't account for */ @@ -2029,15 +2067,15 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata, check_stack_depth(); /* Only prune if pruning would be useful at this level. */ - if (initial_prune ? pprune->do_initial_prune : pprune->do_exec_prune) + if (initial_prune && pprune->initial_pruning_steps) { - PartitionPruneContext *context = &pprune->context; - - /* Set whether we can evaluate PARAM_EXEC Params or not */ - context->evalexecparams = !initial_prune; - - partset = get_matching_partitions(context, - pprune->pruning_steps); + partset = get_matching_partitions(&pprune->initial_context, + pprune->initial_pruning_steps); + } + else if (!initial_prune && pprune->exec_pruning_steps) + { + partset = get_matching_partitions(&pprune->exec_context, + pprune->exec_pruning_steps); } else { diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 780d7ab..78deade 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -1198,16 +1198,13 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from) PartitionedRelPruneInfo *newnode = makeNode(PartitionedRelPruneInfo); COPY_SCALAR_FIELD(rtindex); - COPY_NODE_FIELD(pruning_steps); COPY_BITMAPSET_FIELD(present_parts); COPY_SCALAR_FIELD(nparts); - COPY_SCALAR_FIELD(nexprs); COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int)); COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int)); COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid)); - COPY_POINTER_FIELD(hasexecparam, from->nexprs * sizeof(bool)); - COPY_SCALAR_FIELD(do_initial_prune); - COPY_SCALAR_FIELD(do_exec_prune); + COPY_NODE_FIELD(initial_pruning_steps); + COPY_NODE_FIELD(exec_pruning_steps); COPY_BITMAPSET_FIELD(execparamids); return newnode; diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 387e4b9..237598e 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -948,16 +948,13 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node) WRITE_NODE_TYPE("PARTITIONEDRELPRUNEINFO"); WRITE_UINT_FIELD(rtindex); - WRITE_NODE_FIELD(pruning_steps); WRITE_BITMAPSET_FIELD(present_parts); WRITE_INT_FIELD(nparts); - WRITE_INT_FIELD(nexprs); WRITE_INT_ARRAY(subplan_map, node->nparts); WRITE_INT_ARRAY(subpart_map, node->nparts); WRITE_OID_ARRAY(relid_map, node->nparts); - WRITE_BOOL_ARRAY(hasexecparam, node->nexprs); - WRITE_BOOL_FIELD(do_initial_prune); - WRITE_BOOL_FIELD(do_exec_prune); + WRITE_NODE_FIELD(initial_pruning_steps); + WRITE_NODE_FIELD(exec_pruning_steps); WRITE_BITMAPSET_FIELD(execparamids); } diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index 3b96492..6c2626e 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -2388,16 +2388,13 @@ _readPartitionedRelPruneInfo(void) READ_LOCALS(PartitionedRelPruneInfo); READ_UINT_FIELD(rtindex); - READ_NODE_FIELD(pruning_steps); READ_BITMAPSET_FIELD(present_parts); READ_INT_FIELD(nparts); - READ_INT_FIELD(nexprs); READ_INT_ARRAY(subplan_map, local_node->nparts); READ_INT_ARRAY(subpart_map, local_node->nparts); READ_OID_ARRAY(relid_map, local_node->nparts); - READ_BOOL_ARRAY(hasexecparam, local_node->nexprs); - READ_BOOL_FIELD(do_initial_prune); - READ_BOOL_FIELD(do_exec_prune); + READ_NODE_FIELD(initial_pruning_steps); + READ_NODE_FIELD(exec_pruning_steps); READ_BITMAPSET_FIELD(execparamids); READ_DONE(); diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c index ff3caf1..a892ca1 100644 --- a/src/backend/partitioning/partprune.c +++ b/src/backend/partitioning/partprune.c @@ -85,6 +85,17 @@ typedef enum PartClauseMatchStatus } PartClauseMatchStatus; /* + * PartClauseTarget + * Identifies which qual clauses we can use for generating pruning steps + */ +typedef enum PartClauseTarget +{ + PARTCLAUSE_PLANNER, /* want to prune during planning */ + PARTCLAUSE_INITIAL, /* want to prune during executor startup */ + PARTCLAUSE_EXEC /* want to prune for each plan node scan */ +} PartClauseTarget; + +/* * GeneratePruningStepsContext * Information about the current state of generation of "pruning steps" * for a given set of clauses @@ -95,8 +106,7 @@ typedef enum PartClauseMatchStatus typedef struct GeneratePruningStepsContext { /* Input data: */ - bool forplanner; /* true when generating steps to be used - * during query planning */ + PartClauseTarget target; /* context we're generating steps for */ /* Working state and result data: */ int next_step_id; List *steps; /* output, list of PartitionPruneSteps */ @@ -122,7 +132,7 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root, List *partitioned_rels, List *prunequal, Bitmapset **matchedsubplans); static List *gen_partprune_steps(RelOptInfo *rel, List *clauses, - bool forplanner, bool *contradictory); + PartClauseTarget target, bool *contradictory); static List *gen_partprune_steps_internal(GeneratePruningStepsContext *context, RelOptInfo *rel, List *clauses, bool *contradictory); @@ -169,8 +179,7 @@ static PruneStepResult *get_matching_range_bounds(PartitionPruneContext *context FmgrInfo *partsupfunc, Bitmapset *nullkeys); static Bitmapset *pull_exec_paramids(Expr *expr); static bool pull_exec_paramids_walker(Node *node, Bitmapset **context); -static bool analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps, - int partnatts); +static Bitmapset *get_partkey_exec_paramids(List *steps); static PruneStepResult *perform_pruning_base_step(PartitionPruneContext *context, PartitionPruneStepOp *opstep); static PruneStepResult *perform_pruning_combine_step(PartitionPruneContext *context, @@ -347,9 +356,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, Index rti = lfirst_int(lc); RelOptInfo *subpart = find_base_rel(root, rti); PartitionedRelPruneInfo *pinfo; - int partnatts = subpart->part_scheme->partnatts; List *partprunequal; - List *pruning_steps; + List *initial_pruning_steps; + List *exec_pruning_steps; bool contradictory; /* @@ -415,13 +424,13 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, } /* - * Convert pruning qual to pruning steps. Since these steps will be - * used in the executor, we can pass 'forplanner' as false to allow - * steps to be generated that are unsafe for evaluation during - * planning, e.g. evaluation of stable functions. + * Convert pruning qual to pruning steps. We need to do this twice, + * once to obtain executor startup pruning steps, and once for + * executor per-scan pruning steps. */ - pruning_steps = gen_partprune_steps(subpart, partprunequal, false, - &contradictory); + initial_pruning_steps = gen_partprune_steps(subpart, partprunequal, + PARTCLAUSE_INITIAL, + &contradictory); if (contradictory) { @@ -437,20 +446,28 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, return NIL; } + exec_pruning_steps = gen_partprune_steps(subpart, partprunequal, + PARTCLAUSE_EXEC, + &contradictory); + + if (contradictory) + { + /* As above, abort run-time pruning if anything fishy happens */ + return NIL; + } + + if (initial_pruning_steps || exec_pruning_steps) + doruntimeprune = true; + /* Begin constructing the PartitionedRelPruneInfo for this rel */ pinfo = makeNode(PartitionedRelPruneInfo); pinfo->rtindex = rti; - pinfo->pruning_steps = pruning_steps; + pinfo->initial_pruning_steps = initial_pruning_steps; + pinfo->exec_pruning_steps = exec_pruning_steps; + pinfo->execparamids = get_partkey_exec_paramids(exec_pruning_steps); /* Remaining fields will be filled in the next loop */ pinfolist = lappend(pinfolist, pinfo); - - /* - * Determine which pruning types should be enabled at this level. This - * also records paramids relevant to pruning steps in 'pinfo'. - */ - doruntimeprune |= analyze_partkey_exprs(pinfo, pruning_steps, - partnatts); } if (!doruntimeprune) @@ -535,6 +552,8 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, * Process 'clauses' (a rel's baserestrictinfo list of clauses) and return * a list of "partition pruning steps". * + * XXX: FIX COMMENTS + * * 'forplanner' must be true when generating steps to be evaluated during * query planning, false when generating steps to be used at run-time. * @@ -555,12 +574,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, * else it's set false. */ static List * -gen_partprune_steps(RelOptInfo *rel, List *clauses, bool forplanner, +gen_partprune_steps(RelOptInfo *rel, List *clauses, PartClauseTarget target, bool *contradictory) { GeneratePruningStepsContext context; - context.forplanner = forplanner; + context.target = target; context.next_step_id = 0; context.steps = NIL; @@ -635,7 +654,8 @@ prune_append_rel_partitions(RelOptInfo *rel) * pruning code that we only want pruning steps that can be evaluated * during planning. */ - pruning_steps = gen_partprune_steps(rel, clauses, true, + pruning_steps = gen_partprune_steps(rel, clauses, + PARTCLAUSE_PLANNER, &contradictory); if (contradictory) return NULL; @@ -655,8 +675,6 @@ prune_append_rel_partitions(RelOptInfo *rel) /* These are not valid when being called from the planner */ context.planstate = NULL; context.exprstates = NULL; - context.exprhasexecparam = NULL; - context.evalexecparams = false; /* Actual pruning happens here. */ return get_matching_partitions(&context, pruning_steps); @@ -1656,7 +1674,7 @@ match_clause_to_partition_key(RelOptInfo *rel, * will produce a usable result, since that'd mean there are Vars on * both sides.) */ - if (context->forplanner) + if (context->target == PARTCLAUSE_PLANNER) { /* * When pruning in the planner, we only support pruning using @@ -1675,6 +1693,8 @@ match_clause_to_partition_key(RelOptInfo *rel, } else { + Bitmapset *paramids; + /* * Otherwise, non-consts are allowed, but we can't prune using an * expression that contains Vars. @@ -1683,6 +1703,14 @@ match_clause_to_partition_key(RelOptInfo *rel, return PARTCLAUSE_UNSUPPORTED; /* + * See if there are exec Params, too. If so, we can only use this + * expression during per-scan pruning. + */ + paramids = pull_exec_paramids(expr); + if (!bms_is_empty(paramids) && context->target != PARTCLAUSE_EXEC) + return PARTCLAUSE_UNSUPPORTED; + + /* * And we must reject anything containing a volatile function. * Stable functions are OK though. (We need not check this for * the comparison operator itself: anything that belongs to a @@ -1837,7 +1865,7 @@ match_clause_to_partition_key(RelOptInfo *rel, * see if it can sanely be used for partition pruning (this is * identical to the logic for a plain OpExpr). */ - if (context->forplanner) + if (context->target == PARTCLAUSE_PLANNER) { /* * When pruning in the planner, we only support pruning using @@ -1856,6 +1884,8 @@ match_clause_to_partition_key(RelOptInfo *rel, } else { + Bitmapset *paramids; + /* * Otherwise, non-consts are allowed, but we can't prune using an * expression that contains Vars. @@ -1864,6 +1894,14 @@ match_clause_to_partition_key(RelOptInfo *rel, return PARTCLAUSE_UNSUPPORTED; /* + * See if there are exec Params, too. If so, we can only use this + * expression during per-scan pruning. + */ + paramids = pull_exec_paramids(expr); + if (!bms_is_empty(paramids) && context->target != PARTCLAUSE_EXEC) + return PARTCLAUSE_UNSUPPORTED; + + /* * And we must reject anything containing a volatile function. * Stable functions are OK though. (We need not check this for * the comparison operator itself: anything that belongs to a @@ -2985,89 +3023,38 @@ pull_exec_paramids_walker(Node *node, Bitmapset **context) } /* - * analyze_partkey_exprs - * Loop through all pruning steps and identify which ones require - * executor startup-time or executor run-time pruning. - * - * Returns true if any executor partition pruning should be attempted at this - * level. Also fills fields of *pinfo to record how to process each step. + * get_partkey_exec_paramids + * Loop through given pruning steps and find out which exec Params + * are used. * - * Note: when this is called, not much of *pinfo is valid; but that's OK - * since we only use it as an output area. + * Returns a Bitmapset of Param IDs. */ -static bool -analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps, - int partnatts) +static Bitmapset * +get_partkey_exec_paramids(List *steps) { - bool doruntimeprune = false; + Bitmapset *execparamids = NULL; ListCell *lc; - /* - * Steps require run-time pruning if they contain EXEC_PARAM Params. - * Otherwise, if their expressions aren't simple Consts or they involve - * non-immutable comparison operators, they require startup-time pruning. - * (Otherwise, the pruning would have been done at plan time.) - * - * Notice that what we actually check for mutability is the comparison - * functions, not the original operators. This relies on the support - * functions of the btree or hash opfamily being marked consistently with - * the operators. - */ - pinfo->nexprs = list_length(steps) * partnatts; - pinfo->hasexecparam = (bool *) palloc0(sizeof(bool) * pinfo->nexprs); - pinfo->do_initial_prune = false; - pinfo->do_exec_prune = false; - pinfo->execparamids = NULL; - foreach(lc, steps) { PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc); ListCell *lc2; - ListCell *lc3; - int keyno; if (!IsA(step, PartitionPruneStepOp)) continue; - keyno = 0; - Assert(list_length(step->exprs) == list_length(step->cmpfns)); - forboth(lc2, step->exprs, lc3, step->cmpfns) + foreach(lc2, step->exprs) { Expr *expr = lfirst(lc2); - Oid fnoid = lfirst_oid(lc3); + /* We can be quick for plain Consts */ if (!IsA(expr, Const)) - { - Bitmapset *execparamids = pull_exec_paramids(expr); - bool hasexecparams; - int stateidx = PruneCxtStateIdx(partnatts, - step->step.step_id, - keyno); - - Assert(stateidx < pinfo->nexprs); - hasexecparams = !bms_is_empty(execparamids); - pinfo->hasexecparam[stateidx] = hasexecparams; - pinfo->execparamids = bms_join(pinfo->execparamids, - execparamids); - - if (hasexecparams) - pinfo->do_exec_prune = true; - else - pinfo->do_initial_prune = true; - - doruntimeprune = true; - } - else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE) - { - /* No exec params here, but must do initial pruning */ - pinfo->do_initial_prune = true; - doruntimeprune = true; - } - keyno++; + execparamids = bms_join(execparamids, + pull_exec_paramids(expr)); } } - return doruntimeprune; + return execparamids; } /* @@ -3392,6 +3379,8 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey, * partkey_datum_from_expr * Evaluate expression for potential partition pruning * + * XXX this can't fail anymore, so change to return void + * * Evaluate 'expr', whose ExprState is stateidx of the context exprstate * array; set *value and *isnull to the resulting Datum and nullflag. * Return true if evaluation was possible, otherwise false. @@ -3417,30 +3406,19 @@ partkey_datum_from_expr(PartitionPruneContext *context, } else { + ExprState *exprstate; + ExprContext *ectx; + /* * We should never see a non-Const in a step unless we're running in * the executor. */ Assert(context->planstate != NULL); - /* - * When called from the executor we'll have a valid planstate so we - * may be able to evaluate an expression which could not be folded to - * a Const during planning. Since run-time pruning can occur both - * during initialization of the executor or while it's running, we - * must be careful here to evaluate expressions containing PARAM_EXEC - * Params only when told it's OK. - */ - if (context->evalexecparams || !context->exprhasexecparam[stateidx]) - { - ExprState *exprstate; - ExprContext *ectx; - - exprstate = context->exprstates[stateidx]; - ectx = context->planstate->ps_ExprContext; - *value = ExecEvalExprSwitchContext(exprstate, ectx, isnull); - return true; - } + exprstate = context->exprstates[stateidx]; + ectx = context->planstate->ps_ExprContext; + *value = ExecEvalExprSwitchContext(exprstate, ectx, isnull); + return true; } return false; diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h index b363aba..465a975 100644 --- a/src/include/executor/execPartition.h +++ b/src/include/executor/execPartition.h @@ -62,24 +62,24 @@ typedef struct PartitionRoutingInfo * subpart_map Subpart index by partition index, or -1. * present_parts A Bitmapset of the partition indexes that we * have subplans or subparts for. - * context Contains the context details required to call - * the partition pruning code. - * pruning_steps List of PartitionPruneSteps used to - * perform the actual pruning. - * do_initial_prune true if pruning should be performed during - * executor startup (for this partitioning level). - * do_exec_prune true if pruning should be performed during - * executor run (for this partitioning level). + * initial_pruning_steps List of PartitionPruneSteps used to + * perform executor startup pruning. + * exec_pruning_steps List of PartitionPruneSteps used to + * perform per-scan pruning. + * initial_context If initial_pruning_steps isn't NIL, contains + * the details needed to execute those steps. + * exec_context If exec_pruning_steps isn't NIL, contains + * the details needed to execute those steps. */ typedef struct PartitionedRelPruningData { int *subplan_map; int *subpart_map; Bitmapset *present_parts; - PartitionPruneContext context; - List *pruning_steps; - bool do_initial_prune; - bool do_exec_prune; + List *initial_pruning_steps; + List *exec_pruning_steps; + PartitionPruneContext initial_context; + PartitionPruneContext exec_context; } PartitionedRelPruningData; /* diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h index 1cce762..1241245 100644 --- a/src/include/nodes/plannodes.h +++ b/src/include/nodes/plannodes.h @@ -1109,21 +1109,23 @@ typedef struct PartitionedRelPruneInfo { NodeTag type; Index rtindex; /* RT index of partition rel for this level */ - List *pruning_steps; /* List of PartitionPruneStep, see below */ Bitmapset *present_parts; /* Indexes of all partitions which subplans or - * subparts are present for. */ - int nparts; /* Length of subplan_map[] and subpart_map[] */ - int nexprs; /* Length of hasexecparam[] */ + * subparts are present for */ + int nparts; /* Length of the following arrays: */ int *subplan_map; /* subplan index by partition index, or -1 */ int *subpart_map; /* subpart index by partition index, or -1 */ Oid *relid_map; /* relation OID by partition index, or 0 */ - bool *hasexecparam; /* true if corresponding pruning_step contains - * any PARAM_EXEC Params. */ - bool do_initial_prune; /* true if pruning should be performed - * during executor startup. */ - bool do_exec_prune; /* true if pruning should be performed during - * executor run. */ - Bitmapset *execparamids; /* All PARAM_EXEC Param IDs in pruning_steps */ + + /* + * initial_pruning_steps shows how to prune during executor startup (i.e., + * without use of any PARAM_EXEC Params); it is NIL if no startup pruning + * is required. exec_pruning_steps shows how to prune with PARAM_EXEC + * Params; it is NIL if no per-scan pruning is required. + */ + List *initial_pruning_steps; /* List of PartitionPruneStep */ + List *exec_pruning_steps; /* List of PartitionPruneStep */ + Bitmapset *execparamids; /* All PARAM_EXEC Param IDs in + * exec_pruning_steps */ } PartitionedRelPruneInfo; /* diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h index 2f75717..b906ae1 100644 --- a/src/include/partitioning/partprune.h +++ b/src/include/partitioning/partprune.h @@ -44,10 +44,6 @@ struct RelOptInfo; * exprstates Array of ExprStates, indexed as per PruneCtxStateIdx; one * for each partition key in each pruning step. Allocated if * planstate is non-NULL, otherwise NULL. - * exprhasexecparam Array of bools, each true if corresponding 'exprstate' - * expression contains any PARAM_EXEC Params. (Can be NULL - * if planstate is NULL.) - * evalexecparams True if it's safe to evaluate PARAM_EXEC Params. */ typedef struct PartitionPruneContext { @@ -61,8 +57,6 @@ typedef struct PartitionPruneContext MemoryContext ppccontext; PlanState *planstate; ExprState **exprstates; - bool *exprhasexecparam; - bool evalexecparams; } PartitionPruneContext; /* diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out index 723fc70..841bd8b 100644 --- a/src/test/regress/expected/partition_prune.out +++ b/src/test/regress/expected/partition_prune.out @@ -3149,6 +3149,45 @@ select * from mc3p where a < 3 and abs(b) = 1; Filter: ((a < 3) AND (abs(b) = 1)) (7 rows) +-- +-- Check that pruning with composite range partitioning works correctly when +-- a combination of runtime parameters is specified, not all of whose values +-- are available at the same time +-- +set plan_cache_mode = force_generic_plan; +prepare ps1 as + select * from mc3p where a = $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps1(1); + QUERY PLAN +------------------------------------------------- + Append (actual rows=1 loops=1) + InitPlan 1 (returns $0) + -> Result (actual rows=1 loops=1) + Subplans Removed: 2 + -> Seq Scan on mc3p1 (actual rows=1 loops=1) + Filter: ((a = $1) AND (abs(b) < $0)) +(6 rows) + +deallocate ps1; +prepare ps2 as + select * from mc3p where a <= $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps2(1); + QUERY PLAN +------------------------------------------------- + Append (actual rows=2 loops=1) + InitPlan 1 (returns $0) + -> Result (actual rows=1 loops=1) + Subplans Removed: 1 + -> Seq Scan on mc3p0 (actual rows=1 loops=1) + Filter: ((a <= $1) AND (abs(b) < $0)) + -> Seq Scan on mc3p1 (actual rows=1 loops=1) + Filter: ((a <= $1) AND (abs(b) < $0)) +(8 rows) + +deallocate ps2; +reset plan_cache_mode; drop table mc3p; -- Ensure runtime pruning works with initplans params with boolean types create table boolvalues (value bool not null); diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql index 2373bd8..071e28d 100644 --- a/src/test/regress/sql/partition_prune.sql +++ b/src/test/regress/sql/partition_prune.sql @@ -809,6 +809,24 @@ insert into mc3p values (0, 1, 1), (1, 1, 1), (2, 1, 1); explain (analyze, costs off, summary off, timing off) select * from mc3p where a < 3 and abs(b) = 1; +-- +-- Check that pruning with composite range partitioning works correctly when +-- a combination of runtime parameters is specified, not all of whose values +-- are available at the same time +-- +set plan_cache_mode = force_generic_plan; +prepare ps1 as + select * from mc3p where a = $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps1(1); +deallocate ps1; +prepare ps2 as + select * from mc3p where a <= $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps2(1); +deallocate ps2; +reset plan_cache_mode; + drop table mc3p; -- Ensure runtime pruning works with initplans params with boolean types
On 2019/05/17 5:37, Tom Lane wrote: > I wrote: >> So, the changes in gen_prune_steps_from_opexps seem okay, but >> the change in perform_pruning_base_step is just wrong, Thank you. Will look at your patch for the other bit. Regards, Amit
On 2019/05/17 4:25, Tom Lane wrote: > I wrote: >> Anyway, I've pushed David's patch now, on to look at Amit's. > > So, the changes in gen_prune_steps_from_opexps seem okay, but > the change in perform_pruning_base_step is just wrong, as can > be seen from this extended form of the test case: > > create table mc3p (a int, b int, c int) partition by range (a, abs(b), c); > create table mc3p1 partition of mc3p > for values from (1, 1, 1) to (1, 1, 10); > create table mc3p2 partition of mc3p > for values from (1, 1, 10) to (2, minvalue, minvalue); > create table mc3p3 partition of mc3p > for values from (2, minvalue, minvalue) to (3, maxvalue, maxvalue); > insert into mc3p values (1, 1, 1), (1,1,10), (2, 1, 1); > > set plan_cache_mode = force_generic_plan; > prepare init_exec_prune_q1 as > select * from mc3p where a = $1 and abs(b) <= $2 and c < (select 13); > explain (analyze, costs off, summary off, timing off) > execute init_exec_prune_q1 (1,2); > deallocate init_exec_prune_q1; > reset plan_cache_mode; > > drop table mc3p; > > This should of course find the (1,1,1) row, but it does not because > it converts the initial pruning condition to be "a = 1 and abs(b) = 2", > whereas the correct thing would be "a = 1 and abs(b) <= 2". Oops, you're right. > It seems to me that the correct fix, rather than forcing equality > strategy when you've missed out some of the quals, is to back up > and use the strategy of the last clause you did use. But unless > I'm missing something, that information is just not available from > the PartitionPruneStepOp data structure. It's not. PartitionPruneStepOp's opstrategy refers to the strategy of the operator for the last matched key. > We could extend the PartitionPruneStepOp struct to have an array > or list of StrategyNumbers (but I'm unsure about how safe that'd > be to backpatch into v11). I did consider that, but I too feared that it won't be back-patchable. > Another idea is that we ought to generate separate partitioning > steps lists for the initial-pruning and runtime-pruning cases, > ie invoke gen_partprune_steps yet a third time. Ugh (and that > still means a data structure change, just in a different place). > > The bottom line here is that we've got an inflexible data structure > that was designed on the assumption that gen_partprune_steps has all > the intelligence, and trying to adjust things later is just trouble. > > I think this is really quite closely tied to my previous complaints > about the design for plan-time versus run-time pruning, just mutatis > mutandis. The lack of any holistic, principled approach to > which-quals-apply-when is what's biting us here. > > Moreover, now that I understand what's going on here a bit better, > I am deathly afraid that there are yet other bugs hiding in this > same area. What we've basically got is that analyze_partkey_exprs > and perform_pruning_base_step are trying to reverse-engineer what > gen_partprune_steps would have done differently if it used only the > quals not containing exec params. They don't really have enough > information available to do that correctly, and even if they did, > the fact that the logic looks nothing like the same is going to > be a fertile source of bugs and maintenance gotchas. > > In other words, as I compose this I'm talking myself into the > idea that invoking gen_partprune_steps three times (for planner, > executor startup, and executor runtime, with corresponding > filters on which clauses can be used) is the only safe near-term > fix. In the future we can look at cutting down the repetitive > costs that entails. I have to agree that running gen_partprune_steps separately for each instance of performing get_matching_partitions() would result in cleaner code overall. Cleaner in the sense that the logic of which set of clauses are applicable to which instance of pruning is now centralized and doesn't require code to reverse-engineering that in multiple places, as you say. Maybe, we *can* avoid doing full-blown gen_partprune_steps() a 2nd and a 3rd time, an idea for which I'll post in reply to your other email. Thanks, Amit
On 2019/05/17 7:50, Tom Lane wrote: > I wrote: >> In other words, as I compose this I'm talking myself into the >> idea that invoking gen_partprune_steps three times (for planner, >> executor startup, and executor runtime, with corresponding >> filters on which clauses can be used) is the only safe near-term >> fix. In the future we can look at cutting down the repetitive >> costs that entails. > > Here's a really quick and dirty POC along that line. Thanks for the patch. > It's unfinished > but I'm out of time for today. I think the executor side of it is OK, > but the planner side is crufty: it lacks logic to decide whether > either the initial pruning or exec pruning steps are unnecessary. > And I didn't update most of the comments either. > > I think the rule for "exec pruning is unnecessary" could probably > just be "we didn't find any exec params in the prune steps", since > that would imply that the initial steps include everything. > I don't see an equally cheap way to decide that initial pruning > is useless, though. Thoughts? As promised, here is the idea I was thinking of, although I'm no longer confident you will like it based on your previous emails on this thread. I was thinking of having a walker function that traverses the list of *clauses* that were passed to make_partition_pruneinfo(), which puts the paramids into an output context struct, with separate Bitmapsets for PARAM_EXTERN and PARAM_EXEC parameters. In addition to running this walker, also do a contain_mutable_functions() pass over the clauses. (I can already hear a big no at this point!) As you might've guessed, the result of those two steps would decide whether we need to do the the extra gen_partprune_steps() steps. Fwiw, I prototyped it in the attached patch which applies on top of yours. Maybe as the only defense, I'd say that that would be cheaper in the fairly common case where no run-time pruning is needed than doing two full-blown gen_partprune_steps(). > Also, it seems like we could save at least some of the planning expense by > having the PARTCLAUSE_INITIAL pass report back whether it had discarded > any Param-bearing clauses or not. If not, there's definitely no need to > run the PARTCLAUSE_EXEC pass. That sounds like a good idea. Thanks, Amit
Вложения
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > As promised, here is the idea I was thinking of, although I'm no longer > confident you will like it based on your previous emails on this thread. > I was thinking of having a walker function that traverses the list of > *clauses* that were passed to make_partition_pruneinfo(), which puts the > paramids into an output context struct, with separate Bitmapsets for > PARAM_EXTERN and PARAM_EXEC parameters. In addition to running this > walker, also do a contain_mutable_functions() pass over the clauses. (I > can already hear a big no at this point!) As you might've guessed, the > result of those two steps would decide whether we need to do the the extra > gen_partprune_steps() steps. Fwiw, I prototyped it in the attached patch > which applies on top of yours. Actually, yeah, my thoughts about future development were to split up the gen_partprune_steps processing to avoid the duplicated work. I've not read your patch yet but this sounds like the same idea. What I'm worried about is if it's too much code churn for a back-patchable bug fix. regards, tom lane
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > On 2019/05/17 7:50, Tom Lane wrote: >> Also, it seems like we could save at least some of the planning expense by >> having the PARTCLAUSE_INITIAL pass report back whether it had discarded >> any Param-bearing clauses or not. If not, there's definitely no need to >> run the PARTCLAUSE_EXEC pass. > That sounds like a good idea. Here's a patch that does it like that. Since this requires additional output values from gen_partprune_steps, I decided to go "all in" on having GeneratePruningStepsContext contain everything of interest, including all these result flags. That makes the patch a bit more invasive than it could perhaps be otherwise, but I think it's cleaner. And these are just internal APIs in partprune.c so there's no compatibility problems to worry about. It's kind of hard to tell what this code is actually doing, due to the lack of visibility in EXPLAIN output (something I think Robert has kvetched about before). To try to check that, I made the very quick-and-dirty explain.c patches below, which I'm *not* proposing to commit. They showed that this code is making the same decisions as HEAD for all the currently-existing test cases, which is probably what we want it to do. The new test cases added by the main patch show up as having different initial and exec step lists, which is also expected. So I feel pretty good that this is acting as we wish. The fly in the ointment is what about back-patching? After looking around, I think that we can probably get away with changing the run-time data structures (PartitionPruneContext, PartitionedRelPruningData, PartitionPruningData) in v11; it seems unlikely that anything outside partprune.c and execPartition.c would have occasion to touch those. But it's harder to argue that for the node type PartitionedRelPruneInfo, since that's part of plan trees. What I suggest we do for v11 is to leave the existing fields in that node type that're removed by this patch present, but unused, and add initial_pruning_steps and exec_pruning_steps at the end. This would confuse any code that's studying the contents of PartitionedRelPruneInfo closely, but probably as long as we set do_initial_prune and do_exec_prune properly, that would satisfy most code that might be inspecting it. Since time is growing short before beta1 wrap, I'm going to go ahead and push this in hopes of getting buildfarm cycles on it. But feel free to review and look for further improvements. There's plenty of time to rethink the data-structure details for the v11 backpatch, also. regards, tom lane diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c index 70709e5..d663118 100644 --- a/src/backend/executor/execPartition.c +++ b/src/backend/executor/execPartition.c @@ -183,6 +183,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel, bool *isnull, int maxfieldlen); static List *adjust_partition_tlist(List *tlist, TupleConversionMap *map); +static void ExecInitPruningContext(PartitionPruneContext *context, + List *pruning_steps, + PartitionDesc partdesc, + PartitionKey partkey, + PlanState *planstate); static void find_matching_subplans_recurse(PartitionPruningData *prunedata, PartitionedRelPruningData *pprune, bool initial_prune, @@ -1614,13 +1619,9 @@ ExecCreatePartitionPruneState(PlanState *planstate, { PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2); PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j]; - PartitionPruneContext *context = &pprune->context; Relation partrel; PartitionDesc partdesc; PartitionKey partkey; - int partnatts; - int n_steps; - ListCell *lc3; /* present_parts is also subject to later modification */ pprune->present_parts = bms_copy(pinfo->present_parts); @@ -1700,66 +1701,36 @@ ExecCreatePartitionPruneState(PlanState *planstate, Assert(pd_idx == pinfo->nparts); } - n_steps = list_length(pinfo->pruning_steps); - - context->strategy = partkey->strategy; - context->partnatts = partnatts = partkey->partnatts; - context->nparts = pinfo->nparts; - context->boundinfo = partdesc->boundinfo; - context->partcollation = partkey->partcollation; - context->partsupfunc = partkey->partsupfunc; - - /* We'll look up type-specific support functions as needed */ - context->stepcmpfuncs = (FmgrInfo *) - palloc0(sizeof(FmgrInfo) * n_steps * partnatts); - - context->ppccontext = CurrentMemoryContext; - context->planstate = planstate; - - /* Initialize expression state for each expression we need */ - context->exprstates = (ExprState **) - palloc0(sizeof(ExprState *) * n_steps * partnatts); - foreach(lc3, pinfo->pruning_steps) + /* + * Initialize pruning contexts as needed. + */ + pprune->initial_pruning_steps = pinfo->initial_pruning_steps; + if (pinfo->initial_pruning_steps) { - PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc3); - ListCell *lc4; - int keyno; - - /* not needed for other step kinds */ - if (!IsA(step, PartitionPruneStepOp)) - continue; - - Assert(list_length(step->exprs) <= partnatts); - - keyno = 0; - foreach(lc4, step->exprs) - { - Expr *expr = (Expr *) lfirst(lc4); - - /* not needed for Consts */ - if (!IsA(expr, Const)) - { - int stateidx = PruneCxtStateIdx(partnatts, - step->step.step_id, - keyno); - - context->exprstates[stateidx] = - ExecInitExpr(expr, context->planstate); - } - keyno++; - } + ExecInitPruningContext(&pprune->initial_context, + pinfo->initial_pruning_steps, + partdesc, partkey, planstate); + /* Record whether initial pruning is needed at any level */ + prunestate->do_initial_prune = true; + } + else + { + /* + * Hack: ExecFindInitialMatchingSubPlans requires this field + * to be valid whether pruning or not. + */ + pprune->initial_context.nparts = partdesc->nparts; } - /* Array is not modified at runtime, so just point to plan's copy */ - context->exprhasexecparam = pinfo->hasexecparam; - - pprune->pruning_steps = pinfo->pruning_steps; - pprune->do_initial_prune = pinfo->do_initial_prune; - pprune->do_exec_prune = pinfo->do_exec_prune; - - /* Record if pruning would be useful at any level */ - prunestate->do_initial_prune |= pinfo->do_initial_prune; - prunestate->do_exec_prune |= pinfo->do_exec_prune; + pprune->exec_pruning_steps = pinfo->exec_pruning_steps; + if (pinfo->exec_pruning_steps) + { + ExecInitPruningContext(&pprune->exec_context, + pinfo->exec_pruning_steps, + partdesc, partkey, planstate); + /* Record whether exec pruning is needed at any level */ + prunestate->do_exec_prune = true; + } /* * Accumulate the IDs of all PARAM_EXEC Params affecting the @@ -1777,6 +1748,71 @@ ExecCreatePartitionPruneState(PlanState *planstate, } /* + * Initialize a PartitionPruneContext for the given list of pruning steps. + */ +static void +ExecInitPruningContext(PartitionPruneContext *context, + List *pruning_steps, + PartitionDesc partdesc, + PartitionKey partkey, + PlanState *planstate) +{ + int n_steps; + int partnatts; + ListCell *lc; + + n_steps = list_length(pruning_steps); + + context->strategy = partkey->strategy; + context->partnatts = partnatts = partkey->partnatts; + context->nparts = partdesc->nparts; + context->boundinfo = partdesc->boundinfo; + context->partcollation = partkey->partcollation; + context->partsupfunc = partkey->partsupfunc; + + /* We'll look up type-specific support functions as needed */ + context->stepcmpfuncs = (FmgrInfo *) + palloc0(sizeof(FmgrInfo) * n_steps * partnatts); + + context->ppccontext = CurrentMemoryContext; + context->planstate = planstate; + + /* Initialize expression state for each expression we need */ + context->exprstates = (ExprState **) + palloc0(sizeof(ExprState *) * n_steps * partnatts); + foreach(lc, pruning_steps) + { + PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc); + ListCell *lc2; + int keyno; + + /* not needed for other step kinds */ + if (!IsA(step, PartitionPruneStepOp)) + continue; + + Assert(list_length(step->exprs) <= partnatts); + + keyno = 0; + foreach(lc2, step->exprs) + { + Expr *expr = (Expr *) lfirst(lc2); + + /* not needed for Consts */ + if (!IsA(expr, Const)) + { + int stateidx = PruneCxtStateIdx(partnatts, + step->step.step_id, + keyno); + + context->exprstates[stateidx] = + ExecInitExpr(expr, context->planstate); + } + keyno++; + } + } +} + +/* * ExecFindInitialMatchingSubPlans * Identify the set of subplans that cannot be eliminated by initial * pruning, disregarding any pruning constraints involving PARAM_EXEC @@ -1824,7 +1860,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans) find_matching_subplans_recurse(prunedata, pprune, true, &result); /* Expression eval may have used space in node's ps_ExprContext too */ - ResetExprContext(pprune->context.planstate->ps_ExprContext); + if (pprune->initial_pruning_steps) + ResetExprContext(pprune->initial_context.planstate->ps_ExprContext); } /* Add in any subplans that partition pruning didn't account for */ @@ -1888,7 +1925,7 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans) for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--) { PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j]; - int nparts = pprune->context.nparts; + int nparts = pprune->initial_context.nparts; int k; /* We just rebuild present_parts from scratch */ @@ -1993,7 +2030,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate) find_matching_subplans_recurse(prunedata, pprune, false, &result); /* Expression eval may have used space in node's ps_ExprContext too */ - ResetExprContext(pprune->context.planstate->ps_ExprContext); + if (pprune->exec_pruning_steps) + ResetExprContext(pprune->exec_context.planstate->ps_ExprContext); } /* Add in any subplans that partition pruning didn't account for */ @@ -2029,15 +2067,15 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata, check_stack_depth(); /* Only prune if pruning would be useful at this level. */ - if (initial_prune ? pprune->do_initial_prune : pprune->do_exec_prune) + if (initial_prune && pprune->initial_pruning_steps) { - PartitionPruneContext *context = &pprune->context; - - /* Set whether we can evaluate PARAM_EXEC Params or not */ - context->evalexecparams = !initial_prune; - - partset = get_matching_partitions(context, - pprune->pruning_steps); + partset = get_matching_partitions(&pprune->initial_context, + pprune->initial_pruning_steps); + } + else if (!initial_prune && pprune->exec_pruning_steps) + { + partset = get_matching_partitions(&pprune->exec_context, + pprune->exec_pruning_steps); } else { diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 780d7ab..78deade 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -1198,16 +1198,13 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from) PartitionedRelPruneInfo *newnode = makeNode(PartitionedRelPruneInfo); COPY_SCALAR_FIELD(rtindex); - COPY_NODE_FIELD(pruning_steps); COPY_BITMAPSET_FIELD(present_parts); COPY_SCALAR_FIELD(nparts); - COPY_SCALAR_FIELD(nexprs); COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int)); COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int)); COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid)); - COPY_POINTER_FIELD(hasexecparam, from->nexprs * sizeof(bool)); - COPY_SCALAR_FIELD(do_initial_prune); - COPY_SCALAR_FIELD(do_exec_prune); + COPY_NODE_FIELD(initial_pruning_steps); + COPY_NODE_FIELD(exec_pruning_steps); COPY_BITMAPSET_FIELD(execparamids); return newnode; diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 387e4b9..237598e 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -948,16 +948,13 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node) WRITE_NODE_TYPE("PARTITIONEDRELPRUNEINFO"); WRITE_UINT_FIELD(rtindex); - WRITE_NODE_FIELD(pruning_steps); WRITE_BITMAPSET_FIELD(present_parts); WRITE_INT_FIELD(nparts); - WRITE_INT_FIELD(nexprs); WRITE_INT_ARRAY(subplan_map, node->nparts); WRITE_INT_ARRAY(subpart_map, node->nparts); WRITE_OID_ARRAY(relid_map, node->nparts); - WRITE_BOOL_ARRAY(hasexecparam, node->nexprs); - WRITE_BOOL_FIELD(do_initial_prune); - WRITE_BOOL_FIELD(do_exec_prune); + WRITE_NODE_FIELD(initial_pruning_steps); + WRITE_NODE_FIELD(exec_pruning_steps); WRITE_BITMAPSET_FIELD(execparamids); } diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index 3b96492..6c2626e 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -2388,16 +2388,13 @@ _readPartitionedRelPruneInfo(void) READ_LOCALS(PartitionedRelPruneInfo); READ_UINT_FIELD(rtindex); - READ_NODE_FIELD(pruning_steps); READ_BITMAPSET_FIELD(present_parts); READ_INT_FIELD(nparts); - READ_INT_FIELD(nexprs); READ_INT_ARRAY(subplan_map, local_node->nparts); READ_INT_ARRAY(subpart_map, local_node->nparts); READ_OID_ARRAY(relid_map, local_node->nparts); - READ_BOOL_ARRAY(hasexecparam, local_node->nexprs); - READ_BOOL_FIELD(do_initial_prune); - READ_BOOL_FIELD(do_exec_prune); + READ_NODE_FIELD(initial_pruning_steps); + READ_NODE_FIELD(exec_pruning_steps); READ_BITMAPSET_FIELD(execparamids); READ_DONE(); diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c index ff3caf1..53f814f 100644 --- a/src/backend/partitioning/partprune.c +++ b/src/backend/partitioning/partprune.c @@ -85,21 +85,42 @@ typedef enum PartClauseMatchStatus } PartClauseMatchStatus; /* + * PartClauseTarget + * Identifies which qual clauses we can use for generating pruning steps + */ +typedef enum PartClauseTarget +{ + PARTTARGET_PLANNER, /* want to prune during planning */ + PARTTARGET_INITIAL, /* want to prune during executor startup */ + PARTTARGET_EXEC /* want to prune during each plan node scan */ +} PartClauseTarget; + +/* * GeneratePruningStepsContext * Information about the current state of generation of "pruning steps" * for a given set of clauses * - * gen_partprune_steps() initializes an instance of this struct, which is used - * throughout the step generation process. + * gen_partprune_steps() initializes and returns an instance of this struct. + * + * Note that has_mutable_op, has_mutable_arg, and has_exec_param are set if + * we found any potentially-useful-for-pruning clause having those properties, + * whether or not we actually used the clause in the steps list. This + * definition allows us to skip the PARTTARGET_EXEC pass in some cases. */ typedef struct GeneratePruningStepsContext { - /* Input data: */ - bool forplanner; /* true when generating steps to be used - * during query planning */ - /* Working state and result data: */ + /* Copies of input arguments for gen_partprune_steps: */ + RelOptInfo *rel; /* the partitioned relation */ + PartClauseTarget target; /* use-case we're generating steps for */ + /* Result data: */ + List *steps; /* list of PartitionPruneSteps */ + bool has_mutable_op; /* clauses include any stable operators */ + bool has_mutable_arg; /* clauses include any mutable comparison + * values, *other than* exec params */ + bool has_exec_param; /* clauses include any PARAM_EXEC params */ + bool contradictory; /* clauses were proven self-contradictory */ + /* Working state: */ int next_step_id; - List *steps; /* output, list of PartitionPruneSteps */ } GeneratePruningStepsContext; /* The result of performing one PartitionPruneStep */ @@ -121,22 +142,20 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root, int *relid_subplan_map, List *partitioned_rels, List *prunequal, Bitmapset **matchedsubplans); -static List *gen_partprune_steps(RelOptInfo *rel, List *clauses, - bool forplanner, bool *contradictory); +static void gen_partprune_steps(RelOptInfo *rel, List *clauses, + PartClauseTarget target, + GeneratePruningStepsContext *context); static List *gen_partprune_steps_internal(GeneratePruningStepsContext *context, - RelOptInfo *rel, List *clauses, - bool *contradictory); + List *clauses); static PartitionPruneStep *gen_prune_step_op(GeneratePruningStepsContext *context, StrategyNumber opstrategy, bool op_is_ne, List *exprs, List *cmpfns, Bitmapset *nullkeys); static PartitionPruneStep *gen_prune_step_combine(GeneratePruningStepsContext *context, List *source_stepids, PartitionPruneCombineOp combineOp); -static PartitionPruneStep *gen_prune_steps_from_opexps(PartitionScheme part_scheme, - GeneratePruningStepsContext *context, +static PartitionPruneStep *gen_prune_steps_from_opexps(GeneratePruningStepsContext *context, List **keyclauses, Bitmapset *nullkeys); -static PartClauseMatchStatus match_clause_to_partition_key(RelOptInfo *rel, - GeneratePruningStepsContext *context, +static PartClauseMatchStatus match_clause_to_partition_key(GeneratePruningStepsContext *context, Expr *clause, Expr *partkey, int partkeyidx, bool *clause_is_not_null, PartClauseInfo **pc, List **clause_steps); @@ -169,8 +188,7 @@ static PruneStepResult *get_matching_range_bounds(PartitionPruneContext *context FmgrInfo *partsupfunc, Bitmapset *nullkeys); static Bitmapset *pull_exec_paramids(Expr *expr); static bool pull_exec_paramids_walker(Node *node, Bitmapset **context); -static bool analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps, - int partnatts); +static Bitmapset *get_partkey_exec_paramids(List *steps); static PruneStepResult *perform_pruning_base_step(PartitionPruneContext *context, PartitionPruneStepOp *opstep); static PruneStepResult *perform_pruning_combine_step(PartitionPruneContext *context, @@ -178,7 +196,7 @@ static PruneStepResult *perform_pruning_combine_step(PartitionPruneContext *cont PruneStepResult **step_results); static bool match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey, Expr **outconst); -static bool partkey_datum_from_expr(PartitionPruneContext *context, +static void partkey_datum_from_expr(PartitionPruneContext *context, Expr *expr, int stateidx, Datum *value, bool *isnull); @@ -347,10 +365,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, Index rti = lfirst_int(lc); RelOptInfo *subpart = find_base_rel(root, rti); PartitionedRelPruneInfo *pinfo; - int partnatts = subpart->part_scheme->partnatts; List *partprunequal; - List *pruning_steps; - bool contradictory; + List *initial_pruning_steps; + List *exec_pruning_steps; + Bitmapset *execparamids; + GeneratePruningStepsContext context; /* * Fill the mapping array. @@ -415,15 +434,16 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, } /* - * Convert pruning qual to pruning steps. Since these steps will be - * used in the executor, we can pass 'forplanner' as false to allow - * steps to be generated that are unsafe for evaluation during - * planning, e.g. evaluation of stable functions. + * Convert pruning qual to pruning steps. We may need to do this + * twice, once to obtain executor startup pruning steps, and once for + * executor per-scan pruning steps. This first pass creates startup + * pruning steps and detects whether there's any possibly-useful quals + * that would require per-scan pruning. */ - pruning_steps = gen_partprune_steps(subpart, partprunequal, false, - &contradictory); + gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL, + &context); - if (contradictory) + if (context.contradictory) { /* * This shouldn't happen as the planner should have detected this @@ -437,20 +457,63 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, return NIL; } + /* + * If no mutable operators or expressions appear in usable pruning + * clauses, then there's no point in running startup pruning, because + * plan-time pruning should have pruned everything prunable. + */ + if (context.has_mutable_op || context.has_mutable_arg) + initial_pruning_steps = context.steps; + else + initial_pruning_steps = NIL; + + /* + * If no exec Params appear in potentially-usable pruning clauses, + * then there's no point in even thinking about per-scan pruning. + */ + if (context.has_exec_param) + { + /* ... OK, we'd better think about it */ + gen_partprune_steps(subpart, partprunequal, PARTTARGET_EXEC, + &context); + + if (context.contradictory) + { + /* As above, skip run-time pruning if anything fishy happens */ + return NIL; + } + + exec_pruning_steps = context.steps; + + /* + * Detect which exec Params actually got used; the fact that some + * were in available clauses doesn't mean we actually used them. + * Skip per-scan pruning if there are none. + */ + execparamids = get_partkey_exec_paramids(exec_pruning_steps); + + if (bms_is_empty(execparamids)) + exec_pruning_steps = NIL; + } + else + { + /* No exec Params anywhere, so forget about scan-time pruning */ + exec_pruning_steps = NIL; + execparamids = NULL; + } + + if (initial_pruning_steps || exec_pruning_steps) + doruntimeprune = true; + /* Begin constructing the PartitionedRelPruneInfo for this rel */ pinfo = makeNode(PartitionedRelPruneInfo); pinfo->rtindex = rti; - pinfo->pruning_steps = pruning_steps; + pinfo->initial_pruning_steps = initial_pruning_steps; + pinfo->exec_pruning_steps = exec_pruning_steps; + pinfo->execparamids = execparamids; /* Remaining fields will be filled in the next loop */ pinfolist = lappend(pinfolist, pinfo); - - /* - * Determine which pruning types should be enabled at this level. This - * also records paramids relevant to pruning steps in 'pinfo'. - */ - doruntimeprune |= analyze_partkey_exprs(pinfo, pruning_steps, - partnatts); } if (!doruntimeprune) @@ -532,37 +595,25 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel, /* * gen_partprune_steps - * Process 'clauses' (a rel's baserestrictinfo list of clauses) and return - * a list of "partition pruning steps". + * Process 'clauses' (typically a rel's baserestrictinfo list of clauses) + * and create a list of "partition pruning steps". * - * 'forplanner' must be true when generating steps to be evaluated during - * query planning, false when generating steps to be used at run-time. + * 'target' tells whether to generate pruning steps for planning (use + * immutable clauses only), or for executor startup (use any allowable + * clause except ones containing PARAM_EXEC Params), or for executor + * per-scan pruning (use any allowable clause). * - * The result generated with forplanner=false includes all clauses that - * are selected with forplanner=true, because in some cases we need a - * combination of clauses to prune successfully. For example, if we - * are partitioning on a hash of columns A and B, and we have clauses - * "WHERE A=constant AND B=nonconstant", we can't do anything at plan - * time even though the first clause would be evaluable then. And we - * must include the first clause when called with forplanner=false, - * or we'll fail to prune at run-time either. This does mean that when - * called with forplanner=false, we may return steps that don't actually - * need to be executed at runtime; it's left to analyze_partkey_exprs() - * to (re)discover that. - * - * If the clauses in the input list are contradictory or there is a - * pseudo-constant "false", *contradictory is set to true upon return, - * else it's set false. + * 'context' is an output argument that receives the steps list as well as + * some subsidiary flags; see the GeneratePruningStepsContext typedef. */ -static List * -gen_partprune_steps(RelOptInfo *rel, List *clauses, bool forplanner, - bool *contradictory) +static void +gen_partprune_steps(RelOptInfo *rel, List *clauses, PartClauseTarget target, + GeneratePruningStepsContext *context) { - GeneratePruningStepsContext context; - - context.forplanner = forplanner; - context.next_step_id = 0; - context.steps = NIL; + /* Initialize all output values to zero/false/NULL */ + memset(context, 0, sizeof(GeneratePruningStepsContext)); + context->rel = rel; + context->target = target; /* * For sub-partitioned tables there's a corner case where if the @@ -593,9 +644,7 @@ gen_partprune_steps(RelOptInfo *rel, List *clauses, bool forplanner, } /* Down into the rabbit-hole. */ - (void) gen_partprune_steps_internal(&context, rel, clauses, contradictory); - - return context.steps; + (void) gen_partprune_steps_internal(context, clauses); } /* @@ -613,7 +662,7 @@ prune_append_rel_partitions(RelOptInfo *rel) { List *clauses = rel->baserestrictinfo; List *pruning_steps; - bool contradictory; + GeneratePruningStepsContext gcontext; PartitionPruneContext context; Assert(rel->part_scheme != NULL); @@ -630,15 +679,19 @@ prune_append_rel_partitions(RelOptInfo *rel) return bms_add_range(NULL, 0, rel->nparts - 1); /* - * Process clauses. If the clauses are found to be contradictory, we can - * return the empty set. Pass 'forplanner' as true to indicate to the - * pruning code that we only want pruning steps that can be evaluated - * during planning. + * Process clauses to extract pruning steps that are usable at plan time. + * If the clauses are found to be contradictory, we can return the empty + * set. */ - pruning_steps = gen_partprune_steps(rel, clauses, true, - &contradictory); - if (contradictory) + gen_partprune_steps(rel, clauses, PARTTARGET_PLANNER, + &gcontext); + if (gcontext.contradictory) return NULL; + pruning_steps = gcontext.steps; + + /* If there's nothing usable, return all partitions */ + if (pruning_steps == NIL) + return bms_add_range(NULL, 0, rel->nparts - 1); /* Set up PartitionPruneContext */ context.strategy = rel->part_scheme->strategy; @@ -655,8 +708,6 @@ prune_append_rel_partitions(RelOptInfo *rel) /* These are not valid when being called from the planner */ context.planstate = NULL; context.exprstates = NULL; - context.exprhasexecparam = NULL; - context.evalexecparams = false; /* Actual pruning happens here. */ return get_matching_partitions(&context, pruning_steps); @@ -667,7 +718,7 @@ prune_append_rel_partitions(RelOptInfo *rel) * Determine partitions that survive partition pruning * * Note: context->planstate must be set to a valid PlanState when the - * pruning_steps were generated with 'forplanner' = false. + * pruning_steps were generated with a target other than PARTTARGET_PLANNER. * * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving * partitions. @@ -786,16 +837,15 @@ get_matching_partitions(PartitionPruneContext *context, List *pruning_steps) * even across recursive calls. * * If we find clauses that are mutually contradictory, or a pseudoconstant - * clause that contains false, we set *contradictory to true and return NIL - * (that is, no pruning steps). Caller should consider all partitions as - * pruned in that case. Otherwise, *contradictory is set to false. + * clause that contains false, we set context->contradictory to true and + * return NIL (that is, no pruning steps). Caller should consider all + * partitions as pruned in that case. */ static List * gen_partprune_steps_internal(GeneratePruningStepsContext *context, - RelOptInfo *rel, List *clauses, - bool *contradictory) + List *clauses) { - PartitionScheme part_scheme = rel->part_scheme; + PartitionScheme part_scheme = context->rel->part_scheme; List *keyclauses[PARTITION_MAX_KEYS]; Bitmapset *nullkeys = NULL, *notnullkeys = NULL; @@ -803,8 +853,6 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, List *result = NIL; ListCell *lc; - *contradictory = false; - memset(keyclauses, 0, sizeof(keyclauses)); foreach(lc, clauses) { @@ -820,7 +868,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, (((Const *) clause)->constisnull || !DatumGetBool(((Const *) clause)->constvalue))) { - *contradictory = true; + context->contradictory = true; return NIL; } @@ -842,6 +890,12 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, ListCell *lc1; /* + * We can share the outer context area with the recursive + * call, but contradictory had better not be true yet. + */ + Assert(!context->contradictory); + + /* * Get pruning step for each arg. If we get contradictory for * all args, it means the OR expression is false as a whole. */ @@ -851,11 +905,18 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, bool arg_contradictory; List *argsteps; - argsteps = - gen_partprune_steps_internal(context, rel, - list_make1(arg), - &arg_contradictory); - if (!arg_contradictory) + argsteps = gen_partprune_steps_internal(context, + list_make1(arg)); + arg_contradictory = context->contradictory; + /* Keep context->contradictory clear till we're done */ + context->contradictory = false; + + if (arg_contradictory) + { + /* Just ignore self-contradictory arguments. */ + continue; + } + else all_args_contradictory = false; if (argsteps != NIL) @@ -869,34 +930,28 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, else { /* - * No steps either means that arg_contradictory is - * true or the arg didn't contain a clause matching - * this partition key. + * The arg didn't contain a clause matching this + * partition key. We cannot prune using such an arg. + * To indicate that to the pruning code, we must + * construct a dummy PartitionPruneStepCombine whose + * source_stepids is set to an empty List. * - * In case of the latter, we cannot prune using such - * an arg. To indicate that to the pruning code, we - * must construct a dummy PartitionPruneStepCombine - * whose source_stepids is set to an empty List. * However, if we can prove using constraint exclusion * that the clause refutes the table's partition * constraint (if it's sub-partitioned), we need not * bother with that. That is, we effectively ignore * this OR arm. */ - List *partconstr = rel->partition_qual; + List *partconstr = context->rel->partition_qual; PartitionPruneStep *orstep; - /* Just ignore this argument. */ - if (arg_contradictory) - continue; - if (partconstr) { partconstr = (List *) expression_planner((Expr *) partconstr); - if (rel->relid != 1) + if (context->rel->relid != 1) ChangeVarNodes((Node *) partconstr, 1, - rel->relid, 0); + context->rel->relid, 0); if (predicate_refuted_by(partconstr, list_make1(arg), false)) @@ -909,11 +964,12 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, } } - *contradictory = all_args_contradictory; - - /* Check if any contradicting clauses were found */ - if (*contradictory) + /* If all the OR arms are contradictory, we can stop */ + if (all_args_contradictory) + { + context->contradictory = true; return NIL; + } if (arg_stepids != NIL) { @@ -937,9 +993,10 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, * recurse and later combine the component partitions sets * using a combine step. */ - argsteps = gen_partprune_steps_internal(context, rel, args, - contradictory); - if (*contradictory) + argsteps = gen_partprune_steps_internal(context, args); + + /* If any AND arm is contradictory, we can stop immediately */ + if (context->contradictory) return NIL; foreach(lc1, argsteps) @@ -969,17 +1026,16 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, } /* - * Must be a clause for which we can check if one of its args matches - * the partition key. + * See if we can match this clause to any of the partition keys. */ for (i = 0; i < part_scheme->partnatts; i++) { - Expr *partkey = linitial(rel->partexprs[i]); + Expr *partkey = linitial(context->rel->partexprs[i]); bool clause_is_not_null = false; PartClauseInfo *pc = NULL; List *clause_steps = NIL; - switch (match_clause_to_partition_key(rel, context, + switch (match_clause_to_partition_key(context, clause, partkey, i, &clause_is_not_null, &pc, &clause_steps)) @@ -993,7 +1049,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, */ if (bms_is_member(i, nullkeys)) { - *contradictory = true; + context->contradictory = true; return NIL; } generate_opsteps = true; @@ -1006,7 +1062,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, /* check for conflicting IS NOT NULL */ if (bms_is_member(i, notnullkeys)) { - *contradictory = true; + context->contradictory = true; return NIL; } nullkeys = bms_add_member(nullkeys, i); @@ -1016,7 +1072,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, /* check for conflicting IS NULL */ if (bms_is_member(i, nullkeys)) { - *contradictory = true; + context->contradictory = true; return NIL; } notnullkeys = bms_add_member(notnullkeys, i); @@ -1030,7 +1086,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, case PARTCLAUSE_MATCH_CONTRADICT: /* We've nothing more to do if a contradiction was found. */ - *contradictory = true; + context->contradictory = true; return NIL; case PARTCLAUSE_NOMATCH: @@ -1091,8 +1147,7 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context, PartitionPruneStep *step; /* Strategy 2 */ - step = gen_prune_steps_from_opexps(part_scheme, context, - keyclauses, nullkeys); + step = gen_prune_steps_from_opexps(context, keyclauses, nullkeys); if (step != NULL) result = lappend(result, step); } @@ -1201,15 +1256,15 @@ gen_prune_step_combine(GeneratePruningStepsContext *context, * found for any subsequent keys; see specific notes below. */ static PartitionPruneStep * -gen_prune_steps_from_opexps(PartitionScheme part_scheme, - GeneratePruningStepsContext *context, +gen_prune_steps_from_opexps(GeneratePruningStepsContext *context, List **keyclauses, Bitmapset *nullkeys) { - ListCell *lc; + PartitionScheme part_scheme = context->rel->part_scheme; List *opsteps = NIL; List *btree_clauses[BTMaxStrategyNumber + 1], *hash_clauses[HTMaxStrategyNumber + 1]; int i; + ListCell *lc; memset(btree_clauses, 0, sizeof(btree_clauses)); memset(hash_clauses, 0, sizeof(hash_clauses)); @@ -1563,13 +1618,12 @@ gen_prune_steps_from_opexps(PartitionScheme part_scheme, * Output arguments: none set. */ static PartClauseMatchStatus -match_clause_to_partition_key(RelOptInfo *rel, - GeneratePruningStepsContext *context, +match_clause_to_partition_key(GeneratePruningStepsContext *context, Expr *clause, Expr *partkey, int partkeyidx, bool *clause_is_not_null, PartClauseInfo **pc, List **clause_steps) { - PartitionScheme part_scheme = rel->part_scheme; + PartitionScheme part_scheme = context->rel->part_scheme; Oid partopfamily = part_scheme->partopfamily[partkeyidx], partcoll = part_scheme->partcollation[partkeyidx]; Expr *expr; @@ -1588,7 +1642,7 @@ match_clause_to_partition_key(RelOptInfo *rel, partclause->op_is_ne = false; partclause->expr = expr; /* We know that expr is of Boolean type. */ - partclause->cmpfn = rel->part_scheme->partsupfunc[partkeyidx].fn_oid; + partclause->cmpfn = part_scheme->partsupfunc[partkeyidx].fn_oid; partclause->op_strategy = InvalidStrategy; *pc = partclause; @@ -1647,59 +1701,6 @@ match_clause_to_partition_key(RelOptInfo *rel, return PARTCLAUSE_NOMATCH; /* - * Matched with this key. Now check various properties of the clause - * to see if it's sane to use it for pruning. In most of these cases, - * we can return UNSUPPORTED because the same failure would occur no - * matter which partkey it's matched to. (In particular, now that - * we've successfully matched one side of the opclause to a partkey, - * there is no chance that matching the other side to another partkey - * will produce a usable result, since that'd mean there are Vars on - * both sides.) - */ - if (context->forplanner) - { - /* - * When pruning in the planner, we only support pruning using - * comparisons to constants. Immutable subexpressions will have - * been folded to constants already, and we cannot prune on the - * basis of anything that's not immutable. - */ - if (!IsA(expr, Const)) - return PARTCLAUSE_UNSUPPORTED; - - /* - * Also, the comparison operator itself must be immutable. - */ - if (op_volatile(opno) != PROVOLATILE_IMMUTABLE) - return PARTCLAUSE_UNSUPPORTED; - } - else - { - /* - * Otherwise, non-consts are allowed, but we can't prune using an - * expression that contains Vars. - */ - if (contain_var_clause((Node *) expr)) - return PARTCLAUSE_UNSUPPORTED; - - /* - * And we must reject anything containing a volatile function. - * Stable functions are OK though. (We need not check this for - * the comparison operator itself: anything that belongs to a - * partitioning operator family must be at least stable.) - */ - if (contain_volatile_functions((Node *) expr)) - return PARTCLAUSE_UNSUPPORTED; - } - - /* - * Only allow strict operators. This will guarantee nulls are - * filtered. - */ - if (!op_strict(opno)) - return PARTCLAUSE_UNSUPPORTED; - - /* * See if the operator is relevant to the partitioning opfamily. * * Normally we only care about operators that are listed as being part @@ -1740,6 +1741,95 @@ match_clause_to_partition_key(RelOptInfo *rel, } /* + * Only allow strict operators. This will guarantee nulls are + * filtered. (This test is likely useless, since btree and hash + * comparison operators are generally strict.) + */ + if (!op_strict(opno)) + return PARTCLAUSE_UNSUPPORTED; + + /* + * OK, we have a match to the partition key and a suitable operator. + * Examine the other argument to see if it's usable for pruning. + * + * In most of these cases, we can return UNSUPPORTED because the same + * failure would occur no matter which partkey it's matched to. (In + * particular, now that we've successfully matched one side of the + * opclause to a partkey, there is no chance that matching the other + * side to another partkey will produce a usable result, since that'd + * mean there are Vars on both sides.) + * + * Also, if we reject an argument for a target-dependent reason, set + * appropriate fields of *context to report that. We postpone these + * tests until after matching the partkey and the operator, so as to + * reduce the odds of setting the context fields for clauses that do + * not end up contributing to pruning steps. + * + * First, check for non-Const argument. (We assume that any immutable + * subexpression will have been folded to a Const already.) + */ + if (!IsA(expr, Const)) + { + Bitmapset *paramids; + + /* + * When pruning in the planner, we only support pruning using + * comparisons to constants. We cannot prune on the basis of + * anything that's not immutable. (Note that has_mutable_arg and + * has_exec_param do not get set for this target value.) + */ + if (context->target == PARTTARGET_PLANNER) + return PARTCLAUSE_UNSUPPORTED; + + /* + * We can never prune using an expression that contains Vars. + */ + if (contain_var_clause((Node *) expr)) + return PARTCLAUSE_UNSUPPORTED; + + /* + * And we must reject anything containing a volatile function. + * Stable functions are OK though. + */ + if (contain_volatile_functions((Node *) expr)) + return PARTCLAUSE_UNSUPPORTED; + + /* + * See if there are any exec Params. If so, we can only use this + * expression during per-scan pruning. + */ + paramids = pull_exec_paramids(expr); + if (!bms_is_empty(paramids)) + { + context->has_exec_param = true; + if (context->target != PARTTARGET_EXEC) + return PARTCLAUSE_UNSUPPORTED; + } + else + { + /* It's potentially usable, but mutable */ + context->has_mutable_arg = true; + } + } + + /* + * Check whether the comparison operator itself is immutable. (We + * assume anything that's in a btree or hash opclass is at least + * stable, but we need to check for immutability.) + */ + if (op_volatile(opno) != PROVOLATILE_IMMUTABLE) + { + context->has_mutable_op = true; + + /* + * When pruning in the planner, we cannot prune with mutable + * operators. + */ + if (context->target == PARTTARGET_PLANNER) + return PARTCLAUSE_UNSUPPORTED; + } + + /* * Now find the procedure to use, based on the types. If the clause's * other argument is of the same type as the partitioning opclass's * declared input type, we can use the procedure cached in @@ -1822,7 +1912,6 @@ match_clause_to_partition_key(RelOptInfo *rel, List *elem_exprs, *elem_clauses; ListCell *lc1; - bool contradictory; if (IsA(leftop, RelabelType)) leftop = ((RelabelType *) leftop)->arg; @@ -1833,54 +1922,8 @@ match_clause_to_partition_key(RelOptInfo *rel, return PARTCLAUSE_NOMATCH; /* - * Matched with this key. Check various properties of the clause to - * see if it can sanely be used for partition pruning (this is - * identical to the logic for a plain OpExpr). - */ - if (context->forplanner) - { - /* - * When pruning in the planner, we only support pruning using - * comparisons to constants. Immutable subexpressions will have - * been folded to constants already, and we cannot prune on the - * basis of anything that's not immutable. - */ - if (!IsA(rightop, Const)) - return PARTCLAUSE_UNSUPPORTED; - - /* - * Also, the comparison operator itself must be immutable. - */ - if (op_volatile(saop_op) != PROVOLATILE_IMMUTABLE) - return PARTCLAUSE_UNSUPPORTED; - } - else - { - /* - * Otherwise, non-consts are allowed, but we can't prune using an - * expression that contains Vars. - */ - if (contain_var_clause((Node *) rightop)) - return PARTCLAUSE_UNSUPPORTED; - - /* - * And we must reject anything containing a volatile function. - * Stable functions are OK though. (We need not check this for - * the comparison operator itself: anything that belongs to a - * partitioning operator family must be at least stable.) - */ - if (contain_volatile_functions((Node *) rightop)) - return PARTCLAUSE_UNSUPPORTED; - } - - /* - * Only allow strict operators. This will guarantee nulls are - * filtered. - */ - if (!op_strict(saop_op)) - return PARTCLAUSE_UNSUPPORTED; - - /* + * See if the operator is relevant to the partitioning opfamily. + * * In case of NOT IN (..), we get a '<>', which we handle if list * partitioning is in use and we're able to confirm that it's negator * is a btree equality operator belonging to the partitioning operator @@ -1911,12 +1954,89 @@ match_clause_to_partition_key(RelOptInfo *rel, } /* - * First generate a list of Const nodes, one for each array element - * (excepting nulls). + * Only allow strict operators. This will guarantee nulls are + * filtered. (This test is likely useless, since btree and hash + * comparison operators are generally strict.) + */ + if (!op_strict(saop_op)) + return PARTCLAUSE_UNSUPPORTED; + + /* + * OK, we have a match to the partition key and a suitable operator. + * Examine the array argument to see if it's usable for pruning. This + * is identical to the logic for a plain OpExpr. + */ + if (!IsA(rightop, Const)) + { + Bitmapset *paramids; + + /* + * When pruning in the planner, we only support pruning using + * comparisons to constants. We cannot prune on the basis of + * anything that's not immutable. (Note that has_mutable_arg and + * has_exec_param do not get set for this target value.) + */ + if (context->target == PARTTARGET_PLANNER) + return PARTCLAUSE_UNSUPPORTED; + + /* + * We can never prune using an expression that contains Vars. + */ + if (contain_var_clause((Node *) rightop)) + return PARTCLAUSE_UNSUPPORTED; + + /* + * And we must reject anything containing a volatile function. + * Stable functions are OK though. + */ + if (contain_volatile_functions((Node *) rightop)) + return PARTCLAUSE_UNSUPPORTED; + + /* + * See if there are any exec Params. If so, we can only use this + * expression during per-scan pruning. + */ + paramids = pull_exec_paramids(rightop); + if (!bms_is_empty(paramids)) + { + context->has_exec_param = true; + if (context->target != PARTTARGET_EXEC) + return PARTCLAUSE_UNSUPPORTED; + } + else + { + /* It's potentially usable, but mutable */ + context->has_mutable_arg = true; + } + } + + /* + * Check whether the comparison operator itself is immutable. (We + * assume anything that's in a btree or hash opclass is at least + * stable, but we need to check for immutability.) + */ + if (op_volatile(saop_op) != PROVOLATILE_IMMUTABLE) + { + context->has_mutable_op = true; + + /* + * When pruning in the planner, we cannot prune with mutable + * operators. + */ + if (context->target == PARTTARGET_PLANNER) + return PARTCLAUSE_UNSUPPORTED; + } + + /* + * Examine the contents of the array argument. */ elem_exprs = NIL; if (IsA(rightop, Const)) { + /* + * For a constant array, convert the elements to a list of Const + * nodes, one for each array element (excepting nulls). + */ Const *arr = (Const *) rightop; ArrayType *arrval = DatumGetArrayTypeP(arr->constvalue); int16 elemlen; @@ -1968,6 +2088,9 @@ match_clause_to_partition_key(RelOptInfo *rel, if (arrexpr->multidims) return PARTCLAUSE_UNSUPPORTED; + /* + * Otherwise, we can just use the list of element values. + */ elem_exprs = arrexpr->elements; } else @@ -2000,10 +2123,8 @@ match_clause_to_partition_key(RelOptInfo *rel, elem_clauses = list_make1(makeBoolExpr(OR_EXPR, elem_clauses, -1)); /* Finally, generate steps */ - *clause_steps = - gen_partprune_steps_internal(context, rel, elem_clauses, - &contradictory); - if (contradictory) + *clause_steps = gen_partprune_steps_internal(context, elem_clauses); + if (context->contradictory) return PARTCLAUSE_MATCH_CONTRADICT; else if (*clause_steps == NIL) return PARTCLAUSE_UNSUPPORTED; /* step generation failed */ @@ -2985,89 +3106,38 @@ pull_exec_paramids_walker(Node *node, Bitmapset **context) } /* - * analyze_partkey_exprs - * Loop through all pruning steps and identify which ones require - * executor startup-time or executor run-time pruning. - * - * Returns true if any executor partition pruning should be attempted at this - * level. Also fills fields of *pinfo to record how to process each step. + * get_partkey_exec_paramids + * Loop through given pruning steps and find out which exec Params + * are used. * - * Note: when this is called, not much of *pinfo is valid; but that's OK - * since we only use it as an output area. + * Returns a Bitmapset of Param IDs. */ -static bool -analyze_partkey_exprs(PartitionedRelPruneInfo *pinfo, List *steps, - int partnatts) +static Bitmapset * +get_partkey_exec_paramids(List *steps) { - bool doruntimeprune = false; + Bitmapset *execparamids = NULL; ListCell *lc; - /* - * Steps require run-time pruning if they contain EXEC_PARAM Params. - * Otherwise, if their expressions aren't simple Consts or they involve - * non-immutable comparison operators, they require startup-time pruning. - * (Otherwise, the pruning would have been done at plan time.) - * - * Notice that what we actually check for mutability is the comparison - * functions, not the original operators. This relies on the support - * functions of the btree or hash opfamily being marked consistently with - * the operators. - */ - pinfo->nexprs = list_length(steps) * partnatts; - pinfo->hasexecparam = (bool *) palloc0(sizeof(bool) * pinfo->nexprs); - pinfo->do_initial_prune = false; - pinfo->do_exec_prune = false; - pinfo->execparamids = NULL; - foreach(lc, steps) { PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc); ListCell *lc2; - ListCell *lc3; - int keyno; if (!IsA(step, PartitionPruneStepOp)) continue; - keyno = 0; - Assert(list_length(step->exprs) == list_length(step->cmpfns)); - forboth(lc2, step->exprs, lc3, step->cmpfns) + foreach(lc2, step->exprs) { Expr *expr = lfirst(lc2); - Oid fnoid = lfirst_oid(lc3); + /* We can be quick for plain Consts */ if (!IsA(expr, Const)) - { - Bitmapset *execparamids = pull_exec_paramids(expr); - bool hasexecparams; - int stateidx = PruneCxtStateIdx(partnatts, - step->step.step_id, - keyno); - - Assert(stateidx < pinfo->nexprs); - hasexecparams = !bms_is_empty(execparamids); - pinfo->hasexecparam[stateidx] = hasexecparams; - pinfo->execparamids = bms_join(pinfo->execparamids, - execparamids); - - if (hasexecparams) - pinfo->do_exec_prune = true; - else - pinfo->do_initial_prune = true; - - doruntimeprune = true; - } - else if (func_volatile(fnoid) != PROVOLATILE_IMMUTABLE) - { - /* No exec params here, but must do initial pruning */ - pinfo->do_initial_prune = true; - doruntimeprune = true; - } - keyno++; + execparamids = bms_join(execparamids, + pull_exec_paramids(expr)); } } - return doruntimeprune; + return execparamids; } /* @@ -3125,56 +3195,54 @@ perform_pruning_base_step(PartitionPruneContext *context, Expr *expr; Datum datum; bool isnull; + Oid cmpfn; expr = lfirst(lc1); stateidx = PruneCxtStateIdx(context->partnatts, opstep->step.step_id, keyno); - if (partkey_datum_from_expr(context, expr, stateidx, - &datum, &isnull)) - { - Oid cmpfn; + partkey_datum_from_expr(context, expr, stateidx, + &datum, &isnull); - /* - * Since we only allow strict operators in pruning steps, any - * null-valued comparison value must cause the comparison to - * fail, so that no partitions could match. - */ - if (isnull) - { - PruneStepResult *result; - - result = (PruneStepResult *) palloc(sizeof(PruneStepResult)); - result->bound_offsets = NULL; - result->scan_default = false; - result->scan_null = false; + /* + * Since we only allow strict operators in pruning steps, any + * null-valued comparison value must cause the comparison to fail, + * so that no partitions could match. + */ + if (isnull) + { + PruneStepResult *result; - return result; - } + result = (PruneStepResult *) palloc(sizeof(PruneStepResult)); + result->bound_offsets = NULL; + result->scan_default = false; + result->scan_null = false; - /* Set up the stepcmpfuncs entry, unless we already did */ - cmpfn = lfirst_oid(lc2); - Assert(OidIsValid(cmpfn)); - if (cmpfn != context->stepcmpfuncs[stateidx].fn_oid) - { - /* - * If the needed support function is the same one cached - * in the relation's partition key, copy the cached - * FmgrInfo. Otherwise (i.e., when we have a cross-type - * comparison), an actual lookup is required. - */ - if (cmpfn == context->partsupfunc[keyno].fn_oid) - fmgr_info_copy(&context->stepcmpfuncs[stateidx], - &context->partsupfunc[keyno], - context->ppccontext); - else - fmgr_info_cxt(cmpfn, &context->stepcmpfuncs[stateidx], - context->ppccontext); - } + return result; + } - values[keyno] = datum; - nvalues++; + /* Set up the stepcmpfuncs entry, unless we already did */ + cmpfn = lfirst_oid(lc2); + Assert(OidIsValid(cmpfn)); + if (cmpfn != context->stepcmpfuncs[stateidx].fn_oid) + { + /* + * If the needed support function is the same one cached in + * the relation's partition key, copy the cached FmgrInfo. + * Otherwise (i.e., when we have a cross-type comparison), an + * actual lookup is required. + */ + if (cmpfn == context->partsupfunc[keyno].fn_oid) + fmgr_info_copy(&context->stepcmpfuncs[stateidx], + &context->partsupfunc[keyno], + context->ppccontext); + else + fmgr_info_cxt(cmpfn, &context->stepcmpfuncs[stateidx], + context->ppccontext); } + values[keyno] = datum; + nvalues++; + lc1 = lnext(lc1); lc2 = lnext(lc2); } @@ -3392,16 +3460,17 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey, * partkey_datum_from_expr * Evaluate expression for potential partition pruning * - * Evaluate 'expr', whose ExprState is stateidx of the context exprstate - * array; set *value and *isnull to the resulting Datum and nullflag. - * Return true if evaluation was possible, otherwise false. + * Evaluate 'expr'; set *value and *isnull to the resulting Datum and nullflag. + * + * If expr isn't a Const, its ExprState is in stateidx of the context + * exprstate array. * * Note that the evaluated result may be in the per-tuple memory context of * context->planstate->ps_ExprContext, and we may have leaked other memory * there too. This memory must be recovered by resetting that ExprContext * after we're done with the pruning operation (see execPartition.c). */ -static bool +static void partkey_datum_from_expr(PartitionPruneContext *context, Expr *expr, int stateidx, Datum *value, bool *isnull) @@ -3413,35 +3482,20 @@ partkey_datum_from_expr(PartitionPruneContext *context, *value = con->constvalue; *isnull = con->constisnull; - return true; } else { + ExprState *exprstate; + ExprContext *ectx; + /* * We should never see a non-Const in a step unless we're running in * the executor. */ Assert(context->planstate != NULL); - /* - * When called from the executor we'll have a valid planstate so we - * may be able to evaluate an expression which could not be folded to - * a Const during planning. Since run-time pruning can occur both - * during initialization of the executor or while it's running, we - * must be careful here to evaluate expressions containing PARAM_EXEC - * Params only when told it's OK. - */ - if (context->evalexecparams || !context->exprhasexecparam[stateidx]) - { - ExprState *exprstate; - ExprContext *ectx; - - exprstate = context->exprstates[stateidx]; - ectx = context->planstate->ps_ExprContext; - *value = ExecEvalExprSwitchContext(exprstate, ectx, isnull); - return true; - } + exprstate = context->exprstates[stateidx]; + ectx = context->planstate->ps_ExprContext; + *value = ExecEvalExprSwitchContext(exprstate, ectx, isnull); } - - return false; } diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h index b363aba..465a975 100644 --- a/src/include/executor/execPartition.h +++ b/src/include/executor/execPartition.h @@ -62,24 +62,24 @@ typedef struct PartitionRoutingInfo * subpart_map Subpart index by partition index, or -1. * present_parts A Bitmapset of the partition indexes that we * have subplans or subparts for. - * context Contains the context details required to call - * the partition pruning code. - * pruning_steps List of PartitionPruneSteps used to - * perform the actual pruning. - * do_initial_prune true if pruning should be performed during - * executor startup (for this partitioning level). - * do_exec_prune true if pruning should be performed during - * executor run (for this partitioning level). + * initial_pruning_steps List of PartitionPruneSteps used to + * perform executor startup pruning. + * exec_pruning_steps List of PartitionPruneSteps used to + * perform per-scan pruning. + * initial_context If initial_pruning_steps isn't NIL, contains + * the details needed to execute those steps. + * exec_context If exec_pruning_steps isn't NIL, contains + * the details needed to execute those steps. */ typedef struct PartitionedRelPruningData { int *subplan_map; int *subpart_map; Bitmapset *present_parts; - PartitionPruneContext context; - List *pruning_steps; - bool do_initial_prune; - bool do_exec_prune; + List *initial_pruning_steps; + List *exec_pruning_steps; + PartitionPruneContext initial_context; + PartitionPruneContext exec_context; } PartitionedRelPruningData; /* diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h index 1cce762..1241245 100644 --- a/src/include/nodes/plannodes.h +++ b/src/include/nodes/plannodes.h @@ -1109,21 +1109,23 @@ typedef struct PartitionedRelPruneInfo { NodeTag type; Index rtindex; /* RT index of partition rel for this level */ - List *pruning_steps; /* List of PartitionPruneStep, see below */ Bitmapset *present_parts; /* Indexes of all partitions which subplans or - * subparts are present for. */ - int nparts; /* Length of subplan_map[] and subpart_map[] */ - int nexprs; /* Length of hasexecparam[] */ + * subparts are present for */ + int nparts; /* Length of the following arrays: */ int *subplan_map; /* subplan index by partition index, or -1 */ int *subpart_map; /* subpart index by partition index, or -1 */ Oid *relid_map; /* relation OID by partition index, or 0 */ - bool *hasexecparam; /* true if corresponding pruning_step contains - * any PARAM_EXEC Params. */ - bool do_initial_prune; /* true if pruning should be performed - * during executor startup. */ - bool do_exec_prune; /* true if pruning should be performed during - * executor run. */ - Bitmapset *execparamids; /* All PARAM_EXEC Param IDs in pruning_steps */ + + /* + * initial_pruning_steps shows how to prune during executor startup (i.e., + * without use of any PARAM_EXEC Params); it is NIL if no startup pruning + * is required. exec_pruning_steps shows how to prune with PARAM_EXEC + * Params; it is NIL if no per-scan pruning is required. + */ + List *initial_pruning_steps; /* List of PartitionPruneStep */ + List *exec_pruning_steps; /* List of PartitionPruneStep */ + Bitmapset *execparamids; /* All PARAM_EXEC Param IDs in + * exec_pruning_steps */ } PartitionedRelPruneInfo; /* diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h index 2f75717..b906ae1 100644 --- a/src/include/partitioning/partprune.h +++ b/src/include/partitioning/partprune.h @@ -44,10 +44,6 @@ struct RelOptInfo; * exprstates Array of ExprStates, indexed as per PruneCtxStateIdx; one * for each partition key in each pruning step. Allocated if * planstate is non-NULL, otherwise NULL. - * exprhasexecparam Array of bools, each true if corresponding 'exprstate' - * expression contains any PARAM_EXEC Params. (Can be NULL - * if planstate is NULL.) - * evalexecparams True if it's safe to evaluate PARAM_EXEC Params. */ typedef struct PartitionPruneContext { @@ -61,8 +57,6 @@ typedef struct PartitionPruneContext MemoryContext ppccontext; PlanState *planstate; ExprState **exprstates; - bool *exprhasexecparam; - bool evalexecparams; } PartitionPruneContext; /* diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out index 723fc70..841bd8b 100644 --- a/src/test/regress/expected/partition_prune.out +++ b/src/test/regress/expected/partition_prune.out @@ -3149,6 +3149,45 @@ select * from mc3p where a < 3 and abs(b) = 1; Filter: ((a < 3) AND (abs(b) = 1)) (7 rows) +-- +-- Check that pruning with composite range partitioning works correctly when +-- a combination of runtime parameters is specified, not all of whose values +-- are available at the same time +-- +set plan_cache_mode = force_generic_plan; +prepare ps1 as + select * from mc3p where a = $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps1(1); + QUERY PLAN +------------------------------------------------- + Append (actual rows=1 loops=1) + InitPlan 1 (returns $0) + -> Result (actual rows=1 loops=1) + Subplans Removed: 2 + -> Seq Scan on mc3p1 (actual rows=1 loops=1) + Filter: ((a = $1) AND (abs(b) < $0)) +(6 rows) + +deallocate ps1; +prepare ps2 as + select * from mc3p where a <= $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps2(1); + QUERY PLAN +------------------------------------------------- + Append (actual rows=2 loops=1) + InitPlan 1 (returns $0) + -> Result (actual rows=1 loops=1) + Subplans Removed: 1 + -> Seq Scan on mc3p0 (actual rows=1 loops=1) + Filter: ((a <= $1) AND (abs(b) < $0)) + -> Seq Scan on mc3p1 (actual rows=1 loops=1) + Filter: ((a <= $1) AND (abs(b) < $0)) +(8 rows) + +deallocate ps2; +reset plan_cache_mode; drop table mc3p; -- Ensure runtime pruning works with initplans params with boolean types create table boolvalues (value bool not null); diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql index 2373bd8..071e28d 100644 --- a/src/test/regress/sql/partition_prune.sql +++ b/src/test/regress/sql/partition_prune.sql @@ -809,6 +809,24 @@ insert into mc3p values (0, 1, 1), (1, 1, 1), (2, 1, 1); explain (analyze, costs off, summary off, timing off) select * from mc3p where a < 3 and abs(b) = 1; +-- +-- Check that pruning with composite range partitioning works correctly when +-- a combination of runtime parameters is specified, not all of whose values +-- are available at the same time +-- +set plan_cache_mode = force_generic_plan; +prepare ps1 as + select * from mc3p where a = $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps1(1); +deallocate ps1; +prepare ps2 as + select * from mc3p where a <= $1 and abs(b) < (select 3); +explain (analyze, costs off, summary off, timing off) +execute ps2(1); +deallocate ps2; +reset plan_cache_mode; + drop table mc3p; -- Ensure runtime pruning works with initplans params with boolean types diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index a6c6de7..a06f28a 100644 *** a/src/backend/commands/explain.c --- b/src/backend/commands/explain.c *************** static void show_sort_keys(SortState *so *** 82,87 **** --- 82,89 ---- ExplainState *es); static void show_merge_append_keys(MergeAppendState *mstate, List *ancestors, ExplainState *es); + static void show_pruning_info(PartitionPruneInfo *pinfo, List *ancestors, + ExplainState *es); static void show_agg_keys(AggState *astate, List *ancestors, ExplainState *es); static void show_grouping_sets(PlanState *planstate, Agg *agg, *************** ExplainNode(PlanState *planstate, List * *** 1841,1849 **** --- 1843,1857 ---- show_sort_keys(castNode(SortState, planstate), ancestors, es); show_sort_info(castNode(SortState, planstate), es); break; + case T_Append: + show_pruning_info(((Append *) plan)->part_prune_info, + ancestors, es); + break; case T_MergeAppend: show_merge_append_keys(castNode(MergeAppendState, planstate), ancestors, es); + show_pruning_info(((MergeAppend *) plan)->part_prune_info, + ancestors, es); break; case T_Result: show_upper_qual((List *) ((Result *) plan)->resconstantqual, *************** show_merge_append_keys(MergeAppendState *** 2198,2203 **** --- 2206,2258 ---- } /* + * Show runtime pruning info, if any + */ + static void + show_pruning_info(PartitionPruneInfo *ppinfo, List *ancestors, + ExplainState *es) + { + ListCell *lc; + + if (ppinfo == NULL) + return; + Assert(IsA(ppinfo, PartitionPruneInfo)); + foreach(lc, ppinfo->prune_infos) + { + List *prune_infos = lfirst(lc); + ListCell *l2; + + foreach(l2, prune_infos) + { + PartitionedRelPruneInfo *pinfo = lfirst(l2); + + if (pinfo->initial_pruning_steps || + pinfo->exec_pruning_steps) + { + Index rti = pinfo->rtindex; + RangeTblEntry *rte; + char *refname; + + rte = rt_fetch(rti, es->rtable); + refname = (char *) list_nth(es->rtable_names, rti - 1); + if (refname == NULL) + refname = rte->eref->aliasname; + + if (es->format == EXPLAIN_FORMAT_TEXT) + { + appendStringInfoSpaces(es->str, es->indent * 2); + appendStringInfo(es->str, + "Runtime Pruning: %s has %d initial steps, %d exec steps\n", + refname, + list_length(pinfo->initial_pruning_steps), + list_length(pinfo->exec_pruning_steps)); + } + } + } + } + } + + /* * Show the grouping keys for an Agg node. */ static void diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index a6c6de7..19708ab 100644 *** a/src/backend/commands/explain.c --- b/src/backend/commands/explain.c *************** static void show_sort_keys(SortState *so *** 82,87 **** --- 82,89 ---- ExplainState *es); static void show_merge_append_keys(MergeAppendState *mstate, List *ancestors, ExplainState *es); + static void show_pruning_info(PartitionPruneInfo *pinfo, List *ancestors, + ExplainState *es); static void show_agg_keys(AggState *astate, List *ancestors, ExplainState *es); static void show_grouping_sets(PlanState *planstate, Agg *agg, *************** ExplainNode(PlanState *planstate, List * *** 1841,1849 **** --- 1843,1857 ---- show_sort_keys(castNode(SortState, planstate), ancestors, es); show_sort_info(castNode(SortState, planstate), es); break; + case T_Append: + show_pruning_info(((Append *) plan)->part_prune_info, + ancestors, es); + break; case T_MergeAppend: show_merge_append_keys(castNode(MergeAppendState, planstate), ancestors, es); + show_pruning_info(((MergeAppend *) plan)->part_prune_info, + ancestors, es); break; case T_Result: show_upper_qual((List *) ((Result *) plan)->resconstantqual, *************** show_merge_append_keys(MergeAppendState *** 2198,2203 **** --- 2206,2258 ---- } /* + * Show runtime pruning info, if any + */ + static void + show_pruning_info(PartitionPruneInfo *ppinfo, List *ancestors, + ExplainState *es) + { + ListCell *lc; + + if (ppinfo == NULL) + return; + Assert(IsA(ppinfo, PartitionPruneInfo)); + foreach(lc, ppinfo->prune_infos) + { + List *prune_infos = lfirst(lc); + ListCell *l2; + + foreach(l2, prune_infos) + { + PartitionedRelPruneInfo *pinfo = lfirst(l2); + + if (pinfo->do_initial_prune || + pinfo->do_exec_prune) + { + Index rti = pinfo->rtindex; + RangeTblEntry *rte; + char *refname; + + rte = rt_fetch(rti, es->rtable); + refname = (char *) list_nth(es->rtable_names, rti - 1); + if (refname == NULL) + refname = rte->eref->aliasname; + + if (es->format == EXPLAIN_FORMAT_TEXT) + { + appendStringInfoSpaces(es->str, es->indent * 2); + appendStringInfo(es->str, + "Runtime Pruning: %s has %d initial steps, %d exec steps\n", + refname, + pinfo->do_initial_prune ? list_length(pinfo->pruning_steps) : 0, + pinfo->do_exec_prune ? list_length(pinfo->pruning_steps) : 0); + } + } + } + } + } + + /* * Show the grouping keys for an Agg node. */ static void
On Sat, 18 May 2019 at 09:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Since time is growing short before beta1 wrap, I'm going to go ahead > and push this in hopes of getting buildfarm cycles on it. But feel > free to review and look for further improvements. There's plenty > of time to rethink the data-structure details for the v11 backpatch, > also. Thanks for working on this. I did have a look and was a bit disappointed to see the code will end up calling pull_exec_paramids() 3 times for any expression with exec param IDs. I can knock that down to just 2 by caching the exec Param IDs of the used quals in GeneratePruningStepsContext. I imagine you didn't do this since its possible we reject the qual even after has_exec_param is set, but if we just delay recording the param IDs until after we're certain we're using the qual then it should be safe to do this. This also gets rid of the entire get_partkey_exec_paramids() function, so has the additional saving of not looping through all the steps once more. I had thought reducing duplicate work like this was one of the driving factors for this change, although I understand the main one for this was fixing the bug Amit was working on. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
David Rowley <david.rowley@2ndquadrant.com> writes: > Thanks for working on this. I did have a look and was a bit > disappointed to see the code will end up calling pull_exec_paramids() > 3 times for any expression with exec param IDs. I can knock that down > to just 2 by caching the exec Param IDs of the used quals in > GeneratePruningStepsContext. I imagine you didn't do this since its > possible we reject the qual even after has_exec_param is set, Exactly. > but if > we just delay recording the param IDs until after we're certain we're > using the qual then it should be safe to do this. I thought about that too, but it seems awfully messy/fragile; it's not clear to me at what point this code is really guaranteed to have accepted a clause. The recursion for AND/OR, in particular, seems to make that kind of questionable. If we could be sure about that I'd be inclined to apply the principle to the has_mutable_arg and has_exec_param flags as well as exec_paramids, and change the order of tests in match_clause_to_partition_key back to what it was, relying on the outer control layer to not propagate that data into the overall state until we knew whether the clause was really going to contribute to the steps. regards, tom lane
On 2019/05/18 23:58, Tom Lane wrote: > David Rowley <david.rowley@2ndquadrant.com> writes: >> Thanks for working on this. +1. Thanks a lot Tom for the cleanup. >> I did have a look and was a bit >> disappointed to see the code will end up calling pull_exec_paramids() >> 3 times for any expression with exec param IDs. I can knock that down >> to just 2 by caching the exec Param IDs of the used quals in >> GeneratePruningStepsContext. I imagine you didn't do this since its >> possible we reject the qual even after has_exec_param is set, > > Exactly. Yeah, the point at which we can be sure that any given expression (Params, etc.) will be in a pruning step is when we actually build the step containing that expression. So, going over the resulting pruning steps to collect execparamids is as good as collecting them in the context struct (a Bitmapset field of it) as the steps are built. For example, gen_prune_step_op() itself may add pull_exec_paramids(exprs) to context->execparamids when building the individual steps, instead of get_partkey_exec_paramids() going over steps and collecting them from the individual expressions of PartitionPruneStepOp steps. However, doing that only changes the place where that work is done, not reduce the amount of work that needs to be done. >> but if >> we just delay recording the param IDs until after we're certain we're >> using the qual then it should be safe to do this. > > I thought about that too, but it seems awfully messy/fragile; it's not > clear to me at what point this code is really guaranteed to have accepted > a clause. The recursion for AND/OR, in particular, seems to make that > kind of questionable. > > If we could be sure about that I'd be inclined to apply the principle > to the has_mutable_arg and has_exec_param flags as well as exec_paramids, > and change the order of tests in match_clause_to_partition_key back to > what it was, relying on the outer control layer to not propagate that > data into the overall state until we knew whether the clause was really > going to contribute to the steps. Per what I mentioned above, this idea means we'll want to update all of has_mutable_arg, has_exec_param and exec_paramids in gen_prune_step_op(), by which point it's clear that the clause(s) having those properties really do contribute to steps. However, make_partitionedrel_pruneinfo() throws these steps away if it can determine from those properties that run-time pruning will be useless. So, I was thinking maybe there is no point in building the steps in the first place. IOW, setting those properties in match_clause_to_partition_key() as is done now and skipping the part of gen_partprune_steps_internal() where clauses are turned into steps, if the combination of context->target and the properties suggests that those step will be useless anyway. However, inventing a new meaning for gen_partprune_steps_internal() returning NIL isn't without problems, because it's interpreted in a certain way when it's invoked recursively for an individual arm of an OR clause. I couldn't find an easy way to get around that, so I had to scrap the idea. Thanks, Amit