Linux, CentOS 5.2, Postgres 8.3.4, 8.3.5. System tables and user tables li=
sted below have been VACUUM'd, ANALYZE'd and REINDEX'd.
Summary:
Simple update / delete queries that hit a parent table fa=E7ade of a large =
partitioned database are taking 15 to 20 seconds to plan (and a couple ms t=
o execute).
Worse, the backend will consume about 7GB of memory while planning (measure=
d with top as: Resident memory - shared memory ; it is released after its d=
one).
The particular update or delete has a where clause that causes it to only a=
ffect one table out of many, and going directly against the child table as =
named will parse and plan the query in less than 1ms and consumes very litt=
le memory. Triggers / rules are not used to modify behavior at all.
Workaround: Execute all queries against the table partitions, not the fa=
=E7ade. Non-trivial for anything spanning more than 1 partition.
Table information and definitions at the end.
Queries below, run locally with \timing on. The time it takes to explain t=
hem is the same as it is to execute, there is nothing special about the act=
ual explain process causing a delay. I have simplified the case to minimal=
ist forms that demonstrate the issue.
Total tables in the system:
select count (*) from pg_tables;
count
-------
53427
Child tables of the table in question:
select count (*) from pg_tables where tablename like pp_logs%';
count
-------
6062
A simple select against one such table (a very small one). ~1 sec to plan,=
a couple ms to execute. No noticeable jump in memory use while planning.
explain analyze select att from log.pp_logs WHERE s_id=3D23 AND date=3D'200=
8-12-01' AND p_id =3D 3;
=
QUERY PLAN
---------------------------------------------------------------------------=
---------------------------------------------------------------------------=
-----------
Result (cost=3D0.00..24.12 rows=3D3 width=3D1) (actual time=3D0.267..0.26=
7 rows=3D0 loops=3D1)
-> Append (cost=3D0.00..24.12 rows=3D3 width=3D1) (actual time=3D0.266=
..0.266 rows=3D0 loops=3D1)
-> Seq Scan on p_p_logs (cost=3D0.00..4.65 rows=3D1 width=3D1) (=
actual time=3D0.048..0.048 rows=3D0 loops=3D1)
Filter: ((s_id =3D 23) AND (date =3D '2008-12-01'::date) AND=
(p_id =3D 3))
-> Seq Scan on p_p_logs_023_2008_12_01 p_p_logs (cost=3D0.00..19=
.47 rows=3D2 width=3D1) (actual time=3D0.218..0.218 rows=3D0 loops=3D1)
Filter: ((s_id =3D 23) AND (date =3D '2008-12-01'::date) AND=
(p_id =3D 3))
Total runtime: 4.393 ms
(7 rows)
Time: 1134.866 ms
The same in an UPDATE form, 20 seconds to plan, a couple ms to execute. 7GB=
of memory used while planning (then released).
The memory and time consumed does not differ for explain versus explain ana=
lyze.
explain analyze UPDATE log.p_p_logs SET att=3Dtrue
WHERE s_id=3D23 AND date=3D'2008-12-01' AND p_id =3D 3::int;
QU=
ERY PLAN
---------------------------------------------------------------------------=
---------------------------------------------------------------------------=
-------
Append (cost=3D0.00..24.12 rows=3D3 width=3D181) (actual time=3D0.269..0.=
269 rows=3D0 loops=3D1)
-> Seq Scan on p_p_logs (cost=3D0.00..4.65 rows=3D1 width=3D124) (actu=
al time=3D0.045..0.045 rows=3D0 loops=3D1)
Filter: ((s_id =3D 23) AND (date =3D '2008-12-01'::date) AND (p_id=
=3D 3))
-> Seq Scan on p_p_logs_023_2008_12_01 p_p_logs (cost=3D0.00..19.47 ro=
ws=3D2 width=3D181) (actual time=3D0.221..0.221 rows=3D0 loops=3D1)
Filter: ((s_id =3D 23) AND (date =3D '2008-12-01'::date) AND (p_id=
=3D 3))
Total runtime: 4.246 ms
(6 rows)
Time: 17194.092 ms
A DELETE form. Similar to the above, uses ~ 6.7GB memory.
explain DELETE from log.p_p_logs WHERE s_id=3D23 AND date=3D'2008-12-01' AN=
D p_id=3D-321;
QUERY PLAN
---------------------------------------------------------------------------=
--------------------------------------
Append (cost=3D0.00..24.12 rows=3D3 width=3D6)
-> Seq Scan on p_p_logs (cost=3D0.00..4.65 rows=3D1 width=3D6)
Filter: ((s_id =3D 23) AND (date =3D '2008-12-01'::date) AND (p_id=
=3D (-321)))
-> Seq Scan on p_p_logs_023_2008_12_01 p_p_logs (cost=3D0.00..19.47 ro=
ws=3D2 width=3D6)
Filter: ((s_id =3D 23) AND (date =3D '2008-12-01'::date) AND (p_id=
=3D (-321)))
(5 rows)
Time: 16680.702 ms
We have no triggers or rules for INSERT on the parent table fa=E7ade. No r=
ules or triggers at all in relation to any partitioned tables, in fact. Fo=
r INSERT we go directly to the child tables. It would appear that we have =
to do this for DELETE and UPDATE as well, and SELECT is expensive too. How=
ever, what seems most worrying here is how much more expensive, and HUGELY =
memory consuming it is for DELETE and UPDATE than SELECT. I would expect a=
ll of these to have the same ~1 second of time to identify the candidate ta=
bles based on the table constraints. But it seems like this table identifi=
cation process is somewhat slow for SELECT, and extremely bad for DELETE an=
d UPDATE. Setting work_mem or maintenance_work_mem to 10MB does not change=
the ~7GB of RAM used to plan the query (our current settings are 800MB and=
400MB, respectively).
When we go directly to the partition corresponding to the query, there is v=
irtually no query planning time at all. For example:
explain analyze SELECT att from p_log.p_p_logs_023_2008_12_01 WHERE p_id =
=3D 3;
QUERY PLAN
---------------------------------------------------------------------------=
----------------------------------------------------
Seq Scan on p_p_logs_023_2008_12_01 (cost=3D0.00..17.34 rows=3D2 width=3D=
1) (actual time=3D0.125..0.125 rows=3D0 loops=3D1)
Filter: (p_id =3D 3)
Total runtime: 0.148 ms
(3 rows)
Time: 0.861 ms
explain analyze DELETE from p_log.p_p_logs_023_2008_12_01 WHERE p_id =3D 3;
QUERY PLAN
---------------------------------------------------------------------------=
----------------------------------------------------
Seq Scan on p_p_logs_023_2008_12_01 (cost=3D0.00..17.34 rows=3D2 width=3D=
6) (actual time=3D0.125..0.125 rows=3D0 loops=3D1)
Filter: (p_id =3D 3)
Total runtime: 0.144 ms
(3 rows)
Time: 0.454 ms
explain analyze UPDATE p_log.p_p_logs_023_2008_12_01 SET att=3Dtrue
WHERE p_id =3D 3::int;
QUERY PLAN
---------------------------------------------------------------------------=
------------------------------------------------------
Seq Scan on p_p_logs_023_2008_12_01 (cost=3D0.00..17.34 rows=3D2 width=3D=
181) (actual time=3D0.119..0.119 rows=3D0 loops=3D1)
Filter: (p_id =3D 3)
Total runtime: 0.153 ms
(3 rows)
Time: 0.645 ms
Table definition, minus columns not in the above queries (about 12 other co=
lumns, unimportant here):
p_log.p_p_logs_023_2008_12_01
Table "p_log.p_p_logs_023_2008_12_01"
Column | Type | Modifiers
---------------------+-----------------------------+---------------
s_id | bigint |
p_id | bigint |
date | date |
att | boolean | default false
Check constraints:
"p_p_logs_023_2008_12_01_check" CHECK (s_id =3D 23 AND date =3D '2008-1=
2-01'::date)
Inherits: log.p_p_logs
All ~6000 of the child tables are of this form, partitioned by one day, and=
one s_id.