Обсуждение: [GENERAL] Partitioning

Поиск
Список
Период
Сортировка

[GENERAL] Partitioning

От
Krithika Venkatesh
Дата:
Hi,

I have a table that is partitioned on a numeric column (ID).

Partitioning works when I query the table with no joins. 

SELECT * FROM TABLE A a WHERE ID IN (SELECT ID FROM TABLE B b WHERE CREATED_TS = CURRENT_TIMESTAMP)

Partitioning doesn't work when I do join.

SELECT A.* FROM TABLE A a INNER JOIN TABLE B b ON  a.ID = b.ID.

Is there any other option that would work.

Thanks in Advance..


Re: [GENERAL] Partitioning (constraint exclusion involving joins)

От
Justin Pryzby
Дата:
On Tue, Jul 25, 2017 at 06:21:43PM +0530, Krithika Venkatesh wrote:
> I have a table that is partitioned on a numeric column (ID).
>
> Partitioning works when I query the table with no joins.
>
> SELECT * FROM TABLE A a WHERE ID IN (SELECT ID FROM TABLE B b WHERE
> CREATED_TS = CURRENT_TIMESTAMP)
>
> Partitioning doesn't work when I do join.
>
> SELECT A.* FROM TABLE A a INNER JOIN TABLE B b ON  a.ID = b.ID.

I think you mean "constraint exclusion doesn't work when yo do a join",

which is because it only works on simple values compiled before the planner
gets to see them:

main=# explain SELECT COUNT(1) FROM eric_enodeb_metrics WHERE start_time>now(); -- -'999 minutes'::interval;
                                                       QUERY PLAN
 

-------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=62.44..62.45 rows=1 width=8)
   ->  Append  (cost=0.00..62.40 rows=14 width=0)
         ->  Seq Scan on eric_enodeb_metrics  (cost=0.00..0.00 rows=1 width=0)
               Filter: (start_time > now())
         ->  Index Only Scan using eric_enodeb_201607_idx on eric_enodeb_201607  (cost=0.42..4.44 rows=1 width=0)
               Index Cond: (start_time > now())
         ->  Index Only Scan using eric_enodeb_201608_idx on eric_enodeb_201608  (cost=0.42..4.44 rows=1 width=0)
               Index Cond: (start_time > now())
         ->  Index Only Scan using eric_enodeb_201609_idx on eric_enodeb_201609  (cost=0.42..4.44 rows=1 width=0)
               Index Cond: (start_time > now())

https://www.postgresql.org/docs/current/static/ddl-partitioning.html
|The following caveats apply to constraint exclusion:
| Constraint exclusion only works when the query's WHERE clause contains
|constants (or externally supplied parameters). For example, a comparison
|against a non-immutable function such as CURRENT_TIMESTAMP cannot be optimized,
|since the planner cannot know which partition the function value might fall
|into at run time.
[..]


.. and see an early mail on its implementation, here:
https://www.postgresql.org/message-id/1121251997.3970.237.camel@localhost.localdomain

Justin


Re: [GENERAL] Partitioning (constraint exclusion involving joins)

От
Krithika Venkatesh
Дата:
We understand the constraints exclusion will work only on constant values. But in our case we will never pass a constant value to the partitioning key when we query the partition tables. Will the partition be beneficial in this case. If yes, can you please explain.

Thanks

On 25-Jul-2017 6:46 PM, "Justin Pryzby" <pryzby@telsasoft.com> wrote:
On Tue, Jul 25, 2017 at 06:21:43PM +0530, Krithika Venkatesh wrote:
> I have a table that is partitioned on a numeric column (ID).
>
> Partitioning works when I query the table with no joins.
>
> SELECT * FROM TABLE A a WHERE ID IN (SELECT ID FROM TABLE B b WHERE
> CREATED_TS = CURRENT_TIMESTAMP)
>
> Partitioning doesn't work when I do join.
>
> SELECT A.* FROM TABLE A a INNER JOIN TABLE B b ON  a.ID = b.ID.

I think you mean "constraint exclusion doesn't work when yo do a join",

which is because it only works on simple values compiled before the planner
gets to see them:

main=# explain SELECT COUNT(1) FROM eric_enodeb_metrics WHERE start_time>now(); -- -'999 minutes'::interval;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=62.44..62.45 rows=1 width=8)
   ->  Append  (cost=0.00..62.40 rows=14 width=0)
         ->  Seq Scan on eric_enodeb_metrics  (cost=0.00..0.00 rows=1 width=0)
               Filter: (start_time > now())
         ->  Index Only Scan using eric_enodeb_201607_idx on eric_enodeb_201607  (cost=0.42..4.44 rows=1 width=0)
               Index Cond: (start_time > now())
         ->  Index Only Scan using eric_enodeb_201608_idx on eric_enodeb_201608  (cost=0.42..4.44 rows=1 width=0)
               Index Cond: (start_time > now())
         ->  Index Only Scan using eric_enodeb_201609_idx on eric_enodeb_201609  (cost=0.42..4.44 rows=1 width=0)
               Index Cond: (start_time > now())

https://www.postgresql.org/docs/current/static/ddl-partitioning.html
|The following caveats apply to constraint exclusion:
| Constraint exclusion only works when the query's WHERE clause contains
|constants (or externally supplied parameters). For example, a comparison
|against a non-immutable function such as CURRENT_TIMESTAMP cannot be optimized,
|since the planner cannot know which partition the function value might fall
|into at run time.
[..]


.. and see an early mail on its implementation, here:
https://www.postgresql.org/message-id/1121251997.3970.237.camel@localhost.localdomain

Justin

Re: [GENERAL] Partitioning

От
George Neuner
Дата:
On Tue, 25 Jul 2017 18:21:43 +0530, Krithika Venkatesh
<krithikavenkatesh31@gmail.com> wrote:

>I have a table that is partitioned on a numeric column (ID).
>
>Partitioning works when I query the table with no joins.
>
>SELECT * FROM TABLE A a WHERE ID IN (SELECT ID FROM TABLE B b WHERE
>CREATED_TS = CURRENT_TIMESTAMP)
>
>Partitioning doesn't work when I do join.
>
>SELECT A.* FROM TABLE A a INNER JOIN TABLE B b ON  a.ID = b.ID.
>
>Is there any other option that would work.
>
>Thanks in Advance..

The subselect is constraining the set of ID value(s) to be matched in
A, which (at least potentially) permits identifying the relevant
partition(s).

The join must include all partitions of A because the set of ID values
to be matched with B are not constrained.

Also, the join query is not equivalent because it does not include the
timestamp constraint on B.  I don't think that will make any
difference to the query plan ... AFAICS, it still needs to consider
all partitions of A ... but it may improve performance.

George