Hi,
I'm trying to implement some range partitioning on timeseries data. But it
looks some queries involving date_trunc() doesn't make use of partitioning.
BEGIN;
CREATE TABLE test (
time TIMESTAMP WITHOUT TIME ZONE NOT NULL,
value FLOAT NOT NULL
) PARTITION BY RANGE (time);
CREATE INDEX test_time_idx ON test(time DESC);
CREATE TABLE test_y2010 PARTITION OF test FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
CREATE TABLE test_y2011 PARTITION OF test FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');
CREATE VIEW vtest AS SELECT DATE_TRUNC('year', time) AS time, SUM(value) AS value FROM test GROUP BY 1;
EXPLAIN (COSTS OFF) SELECT * FROM vtest WHERE time >= TIMESTAMP '2021-01-01';
ROLLBACK;
The plan query all partitions:
HashAggregate
Group Key: (date_trunc('year'::text, test."time"))
-> Append
-> Seq Scan on test_y2010 test_1
Filter: (date_trunc('year'::text, "time") >= '2021-01-01 00:00:00'::timestamp without time zone)
-> Seq Scan on test_y2011 test_2
Filter: (date_trunc('year'::text, "time") >= '2021-01-01 00:00:00'::timestamp without time zone)
The view is there so show the use case, but we get almost similar plan with SELECT * FROM test WHERE DATE_TRUNC('year',
time)>= TIMESTAMP '2021-01-01';
I tested a variation with timescaledb which seem using trigger based
partitioning:
BEGIN;
CREATE EXTENSION IF NOT EXISTS timescaledb;
CREATE TABLE test (
time TIMESTAMP WITHOUT TIME ZONE NOT NULL,
value FLOAT NOT NULL
);
SELECT create_hypertable('test', 'time', chunk_time_interval => INTERVAL '1 year');
CREATE VIEW vtest AS SELECT time_bucket('1 year', time) AS time, SUM(value) AS value FROM test GROUP BY 1;
-- insert some data as partitions are created on the fly
INSERT INTO test VALUES (TIMESTAMP '2020-01-15', 1.0), (TIMESTAMP '2021-12-15', 2.0);
\d+ test
EXPLAIN (COSTS OFF) SELECT * FROM vtest WHERE time >= TIMESTAMP '2021-01-01';
ROLLBACK;
The plan query a single partition:
GroupAggregate
Group Key: (time_bucket('1 year'::interval, _hyper_1_2_chunk."time"))
-> Result
-> Index Scan Backward using _hyper_1_2_chunk_test_time_idx on _hyper_1_2_chunk
Index Cond: ("time" >= '2021-01-01 00:00:00'::timestamp without time zone)
Filter: (time_bucket('1 year'::interval, "time") >= '2021-01-01 00:00:00'::timestamp without time zone)
Note single partition query only works with time_bucket(), not with date_trunc(), I guess
there is some magic regarding this in time_bucket() implementation.
I wonder if there is a way with a reasonable amount of SQL code to achieve this
with vanilla postgres ?
Maybe by taking assumption that DATE_TRUNC(..., time) <= time ?
Thanks!