Обсуждение: literal vs dynamic partition constraint in plan execution

Поиск
Список
Период
Сортировка

literal vs dynamic partition constraint in plan execution

От
Luca Ferrari
Дата:
Ok, the title is a little buzz, however I've got a partitioned table
and one "leaf" has a set of checks against a timestamp field to ensure
that tuples within such table belongs to the year and month:

testdb=# \d respi.y2019m08
...
Partition of: respi.y2019 FOR VALUES IN ('8')
Check constraints:
    "y2019_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
2019::double precision)
    "y2019_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2019, 1,
1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2019, 12,
31, 23, 59, 59::double precision))
    "y2019m08_mis_ora_check" CHECK (date_part('month'::text, mis_ora)
= 8::double precision)
    "y2019m08_mis_ora_check1" CHECK (date_part('year'::text, mis_ora)
= 2019::double precision)
    "y2019m08_mis_ora_check2" CHECK (mis_ora >= make_timestamp(2019,
8, 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2019,
8, 31, 23, 59, 59::double precision))
    "y2019m08_mis_ora_check3" CHECK (mis_ora >= make_timestamp(2019,
8, 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2019,
8, 31, 23, 59, 59::double precision))

So y2019m08 accepts only tuples where 'mis_ora' has a timestamp that
is contained into the eigth month of the year.
Now if I look at the plan for this query everything works as expected
(I disabled parallel scans for better see the plan):

testdb=# explain select * from respi.root where ts >= '2019-08-28
23:35:00.007245' and mis_ora >= '2019-08-29 16:28:48.711482'   order
by ts;

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=353986.27..353991.59 rows=2129 width=40)
   Sort Key: y2019m08.ts
   ->  Append  (cost=0.00..353868.58 rows=2129 width=40)
         ->  Seq Scan on y2019m08  (cost=0.00..353409.93 rows=1 width=40)
               Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp
without time zone) AND (mis_ora >= '2019-08-29
16:28:48.711482'::timestamp without time zone))
         ->  Seq Scan on y2019m09  (cost=0.00..28.00 rows=133 width=40)
               Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp
without time zone) AND (mis_ora >= '2019-08-29
16:28:48.711482'::timestamp without time zone))
         ->  Seq Scan on y2019m10  (cost=0.00..28.00 rows=133 width=40)
               Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp
without time zone) AND (mis_ora >= '2019-08-29
16:28:48.711482'::timestamp without time zone))
         ->  Seq Scan on y2019m11  (cost=0.00..28.00 rows=133 width=40)
               Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp
without time zone) AND (mis_ora >= '2019-08-29
16:28:48.711482'::timestamp without time zone))
         ->  Seq Scan on y2019m12  (cost=0.00..28.00 rows=133 width=40)
               Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp
without time zone) AND (mis_ora >= '2019-08-29
16:28:48.711482'::timestamp without time zone))


The "as I expected" means that the system starts scanning from
y2019m08 and following (in time) tables, and does not scan previous
time tables.
This works if the mis_ora is compared against a literal timestamp, but
if I simply change it with a dynamic timestamp:

testdb=# explain select * from respi.root where ts >= '2019-08-28
23:35:00.007245' and   mis_ora >= current_timestamp   order by ts;
   QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=4654860.37..4654865.25 rows=1952 width=36)
   Sort Key: r.ts
   ->  Nested Loop  (cost=0.00..4654753.69 rows=1952 width=36)
         Join Filter: (r.sen_id = s.sen_id)
         ->  Append  (cost=0.00..4638927.56 rows=3204 width=32)
               ->  Seq Scan on y2018m01 r  (cost=0.00..31.00 rows=133 width=32)
                     Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND (mis_ora >=
CURRENT_TIMESTAMP))
               ->  Seq Scan on y2018m02 r_1  (cost=0.00..31.00
rows=133 width=32)
                     Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND (mis_ora >=
CURRENT_TIMESTAMP))
               ->  Seq Scan on y2018m03 r_2  (cost=0.00..31.00
rows=133 width=32)
                     Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND (mis_ora >=
CURRENT_TIMESTAMP))
               ->  Seq Scan on y2018m04 r_3  (cost=0.00..31.00
rows=133 width=32)
                     Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND (mis_ora >=
CURRENT_TIMESTAMP))

also the tables for the past year are scanned. Moreover, the planner
thinks I will get 133 rows out of, for instance, y2018m01 which is
impossible.
So, do I have defined the constraint on each table in a wrong manner?


testdb=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

testdb=# show constraint_exclusion ;
 constraint_exclusion
----------------------
 partition



Thanks,
Luca



Re: literal vs dynamic partition constraint in plan execution

От
Luca Ferrari
Дата:
On Thu, Aug 29, 2019 at 4:45 PM Luca Ferrari <fluca1978@gmail.com> wrote:
>
> Ok, the title is a little buzz, however I've got a partitioned table
> and one "leaf" has a set of checks against a timestamp field to ensure
> that tuples within such table belongs to the year and month:

Of course, all the siblings have similar constraints. So my partition
starts at a table named "root", then it it has a level for the year,
and each year has subpartitions for months:
- root
   - y2018
     - y2018m01, y2018m02, ...
  - y2019
   - y2019m01, y2019m02 ....

All partitions have been created equally, and constraints seem fine to me:

testdb=# \d respi.y2018m01
...
Partition of: respi.y2018 FOR VALUES IN ('1')
Check constraints:
    "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
2018::double precision)
    "y2018_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2018, 1,
1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2018, 12,
31, 23, 59, 59::double precision))
    "y2018m01_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
2018::double precision)
    "y2018m01_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2018,
1, 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2018,
1, 31, 23, 59, 59::double precision))



testdb=># \d+ respi.y2018
...
Partition of: respi.root FOR VALUES IN ('2018')
Partition constraint: ((date_part('year'::text, mis_ora) IS NOT NULL)
AND (date_part('year'::text, mis_ora) = '2018'::double precision))
Partition key: LIST (date_part('month'::text, mis_ora))
Check constraints:
    "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
2018::double precision)
    "y2018_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2018, 1,
1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2018, 12,
31, 23, 59, 59::double precision))
Partitions: respi.y2018m01 FOR VALUES IN ('1'),
            respi.y2018m02 FOR VALUES IN ('2'),
            respi.y2018m03 FOR VALUES IN ('3'),
            respi.y2018m04 FOR VALUES IN ('4'),
            respi.y2018m05 FOR VALUES IN ('5'),
            respi.y2018m06 FOR VALUES IN ('6'),
            respi.y2018m07 FOR VALUES IN ('7'),
            respi.y2018m08 FOR VALUES IN ('8'),
            respi.y2018m09 FOR VALUES IN ('9'),
...

With the above constraint, all the branch starting at y2018 should be
excluded when selecting with
mis_ora >= CURRENT_TIMESTAMP
(the date of the server is right, of course).
Why is instead scanned (as reported by the execution plan in the
previous email)?

Thanks,
Luca



Re: literal vs dynamic partition constraint in plan execution

От
Luca Ferrari
Дата:
On Fri, Aug 30, 2019 at 8:29 AM Luca Ferrari <fluca1978@gmail.com> wrote:
> testdb=># \d+ respi.y2018
> ...
> Partition of: respi.root FOR VALUES IN ('2018')
> Partition constraint: ((date_part('year'::text, mis_ora) IS NOT NULL)
> AND (date_part('year'::text, mis_ora) = '2018'::double precision))
> Partition key: LIST (date_part('month'::text, mis_ora))
> Check constraints:
>     "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
> 2018::double precision)
>     "y2018_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2018, 1,
> 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2018, 12,
> 31, 23, 59, 59::double precision))
> Partitions: respi.y2018m01 FOR VALUES IN ('1'),
>             respi.y2018m02 FOR VALUES IN ('2'),
>             respi.y2018m03 FOR VALUES IN ('3'),
>             respi.y2018m04 FOR VALUES IN ('4'),
>             respi.y2018m05 FOR VALUES IN ('5'),
>             respi.y2018m06 FOR VALUES IN ('6'),
>             respi.y2018m07 FOR VALUES IN ('7'),
>             respi.y2018m08 FOR VALUES IN ('8'),
>             respi.y2018m09 FOR VALUES IN ('9'),
> ...
>


While the condition
mis_ora >= current_timestamp
does not cut off the 2018 branch, the following does

=# explain select * from respi.root where ts >= '2019-08-28 23:35:00.007245'
and  extract( year from mis_ora ) = extract( year from current_timestamp )
and extract( month from mis_ora ) >= extract( month from
current_timestamp )    order by ts;

 Sort  (cost=7246692.21..7246692.28 rows=26 width=36)
   Sort Key: r.ts
   ->  Nested Loop  (cost=0.00..7246691.60 rows=26 width=36)
         Join Filter: (r.sen_id = s.sen_id)
         ->  Seq Scan on sensori s  (cost=0.00..13.57 rows=329 width=16)
               Filter: interesting
         ->  Materialize  (cost=0.00..7246465.93 rows=43 width=32)
               ->  Append  (cost=0.00..7246465.72 rows=43 width=32)
                     Subplans Removed: 31
                     ->  Seq Scan on y2019m08 r  (cost=0.00..623008.30
rows=2 width=32)
                           Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND
(date_part('month'::text, mis_ora) >= date_part('month'::text,
CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) =
date_part('year'::text, CURRENT_TIMESTAMP)))
                     ->  Seq Scan on y2019m09 r_1  (cost=0.00..49.00
rows=1 width=32)
                           Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND
(date_part('month'::text, mis_ora) >= date_part('month'::text,
CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) =
date_part('year'::text, CURRENT_TIMESTAMP)))
                     ->  Seq Scan on y2019m10 r_2  (cost=0.00..49.00
rows=1 width=32)
                           Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND
(date_part('month'::text, mis_ora) >= date_part('month'::text,
CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) =
date_part('year'::text, CURRENT_TIMESTAMP)))
                     ->  Seq Scan on y2019m11 r_3  (cost=0.00..49.00
rows=1 width=32)
                           Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND
(date_part('month'::text, mis_ora) >= date_part('month'::text,
CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) =
date_part('year'::text, CURRENT_TIMESTAMP)))

The fact that making explicit the condition against the year and the
month, which are the top level partition constraint, makes me think
that the executor will try to go down all the branches to the leaf if
the condition is not filtered at the top level. Even if I don't
understand why.

Luca



I've done a simple test case, and find out that probably the problem I
got was due to the partition schema I'm using.
I want a table to be partitioned by a timestamp field with a first
level partition by year, and a second level by month. Therefore, I did
a BY LIST partitioning, but that produces a wrong constraint check
when executing a query.
This is a reproducible example.

BEGIN;

CREATE TABLE root( pk int generated always as identity, v int, ts
timestamp default current_timestamp )
PARTITION BY LIST( extract( year from ts ) );

CREATE TABLE y2018
PARTITION OF root
FOR VALUES IN ( 2018 );

CREATE TABLE y2019
PARTITION OF root
FOR VALUES IN ( 2019 );

 ALTER TABLE y2018    ADD CHECK( ts >= make_timestamp( 2018, 1, 1, 0, 0, 0 )
 AND ts <= make_timestamp( 2018, 12, 31, 23, 59, 59 ) );

 ALTER TABLE y2019    ADD CHECK( ts >= make_timestamp( 2019, 1, 1, 0, 0, 0 )
 AND ts <= make_timestamp( 2019, 12, 31, 23, 59, 59 ) );

INSERT INTO root( v )
SELECT generate_series( 1, 100 ); -- same ts here

COMMIT;

Now if I try to explain a query with the current timestamp (which is
of course in 2019):

testdb=# explain select * from root where ts = current_timestamp;
QUERY PLAN
-------------------------------------------------------------
Append  (cost=0.00..75.59 rows=18 width=16)
->  Seq Scan on y2018  (cost=0.00..37.75 rows=9 width=16)
Filter: (ts = CURRENT_TIMESTAMP)
->  Seq Scan on y2019  (cost=0.00..37.75 rows=9 width=16)
Filter: (ts = CURRENT_TIMESTAMP)
(5 rows)

I got y2018 scanned too, which of course could not be the case since
y2018 cannot contain values that are equal to current_timestamp.
However, if I use a literal the query works fine:

testdb=# explain select * from root where ts = '2019-09-01 09:00:00.000000';
QUERY PLAN
---------------------------------------------------------------------------
Append  (cost=0.00..33.17 rows=9 width=16)
->  Seq Scan on y2019  (cost=0.00..33.12 rows=9 width=16)
Filter: (ts = '2019-09-01 09:00:00'::timestamp without time zone)
(3 rows)


Now, if I change the partition schema using a range, the query works
fine with current_timestamp too:

CREATE TABLE root( pk int generated always as identity, v int, ts
timestamp default current_timestamp )
PARTITION BY RANGE( ts );

CREATE TABLE y2018
PARTITION OF root
FOR VALUES FROM ('2018-01-01 00:00:00.000000')
TO ('2018-12-31 23:59:59.000000');

CREATE TABLE y2019
PARTITION OF root
FOR VALUES FROM ('2019-01-01 00:00:00.000000')
TO ('2019-12-31 23:59:59.000000');


testdb=# explain select * from root where ts = current_timestamp;
                         QUERY PLAN
-------------------------------------------------------------
 Append  (cost=0.00..75.59 rows=18 width=16)
   Subplans Removed: 1
   ->  Seq Scan on y2019  (cost=0.00..37.75 rows=9 width=16)
         Filter: (ts = CURRENT_TIMESTAMP)
(4 rows)



So my end with this is that:
- the list partitioning prevents the current_timestamp to be compared
against the list of possible values (extract year from
current_timestamp) and therefore the planner has no chance but to get
into all the tables, even if the constraints on the ts field
explicitly state some tables can be removed;
- in range partitioning, since the partition is built on the very
range of values, the planner gets the correct path.

I still don't get why using a literal in the first case can lead to a
"more correct" plan.
And I'm curious to know if there's a way to force constraints in the
list partitioning to make the planner really aware of tables that can
be excluded.

Luca



Luca Ferrari <fluca1978@gmail.com> writes:
> I've done a simple test case, and find out that probably the problem I
> got was due to the partition schema I'm using.
> I want a table to be partitioned by a timestamp field with a first
> level partition by year, and a second level by month. Therefore, I did
> a BY LIST partitioning,

> CREATE TABLE root( pk int generated always as identity, v int, ts
> timestamp default current_timestamp )
> PARTITION BY LIST( extract( year from ts ) );

No, that's not going to work, unless your queries all explicitly use
"extract( year from ts ) = something" in their WHERE clauses.  There
is nothing in the system that would derive a constraint like that
from a constraint that just mentions ts.

In your example, the partition routing logic was entirely ineffective
because of this, so you tried to make up for that by adding
CHECK constraints.  But pruning based on CHECK constraints is done
at planning time, so it can't do anything with run-time-mutable
values such as CURRENT_TIMESTAMP.

> CREATE TABLE root( pk int generated always as identity, v int, ts
> timestamp default current_timestamp )
> PARTITION BY RANGE( ts );

> CREATE TABLE y2018
> PARTITION OF root
> FOR VALUES FROM ('2018-01-01 00:00:00.000000')
> TO ('2018-12-31 23:59:59.000000');

This is mostly the right way to do it; you forgot that range bounds
use "low <= variable < high" logic.  So the correct way to declare
the partitions is like

regression=# CREATE TABLE y2018
regression-# PARTITION OF root
regression-# FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
CREATE TABLE
regression=# CREATE TABLE y2019
PARTITION OF root
FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
CREATE TABLE

That's easier and doesn't leave a gap between partitions.

This is already sufficient to allow partition routing based on
equality or simple inequality involving "ts", so you don't need
extra CHECK constraints.

> I still don't get why using a literal in the first case can lead to a
> "more correct" plan.

With a literal, plan-time pruning based on the CHECK constraints
was possible.

> And I'm curious to know if there's a way to force constraints in the
> list partitioning to make the planner really aware of tables that can
> be excluded.

No.  The short answer here is that your query WHERE clauses have to
be things that the planner or partition routing code can relate to
the partitioning rules.  In the case of LIST partitioning, that means
there had better be WHERE constraints on the values specified in
the LIST clause, not values that perhaps could be shown to be related
to those values given extensive knowledge about the behaviors of
certain functions.  By and large, the system doesn't have such
knowledge.

            regards, tom lane



Re: literal vs dynamic partition constraint in plan execution

От
Michael Lewis
Дата:
I think I would expect this behavior with how you have defined the constraints as the function results rather than just being strict comparisons to the timestamp field.

Instead of this-
Check constraints:
>     "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
> 2018::double precision)

I would expect this-
Check constraints:
>     "y2018_mis_ora_check" CHECK mis_ora) >= make_timestamp(2018, 1,
> 1, 0, 0, 0::double precision) AND < make_timestamp(2019, 1,
> 1, 0, 0, 0::double precision)