We have two database servers running streaming replication between them:
Primary server
==============
OS: Linux version 2.6.18-371.3.1.el5
PostgreSQL: 9.1.11
HOT standby server
==================
OS: Linux version 2.6.32-431.11.2.el6.x86_64
PostgreSQL: 9.1.11
Since July 1, one SP suddenly runs slowly in HOT STANDBY server. After
investigation, I can narrow the problem to one particular query in SP. The
weird things are:
(1) The SP takes about 25 seconds to run in HOT STANDBY only, but only 0.5
second in primary
(2) If I extract the query in the SP and run it in a psql session, it runs
fine even in HOT STANDBY
(3) The SP is:
CREATE OR REPLACE FUNCTION tmp_test (p_beacon_id bigint, p_rpt_start_ts
bigint, p_rpt_end_ts bigint) RETURNS bigint AS $$
DECLARE
--
v_min_locate_id bigint;
--
BEGIN
--
SELECT MIN(locate_id) INTO v_min_locate_id
FROM event_startstop
WHERE beacon_id = p_beacon_id
AND locate_id IS NOT NULL
AND network_timestamp BETWEEN p_rpt_start_ts AND p_rpt_end_ts;
--
RETURN v_min_locate_id;
--
EXCEPTION
WHEN OTHERS THEN
RAISE EXCEPTION 'tmp_test %, %', SQLSTATE, SQLERRM;
END
$$ LANGUAGE 'plpgsql' STABLE;
(4) explain analyze buffers in HOT STANDBY:
DB=# explain (analyze, buffers true) select * from tmp_test (55627,
1403989199, 1404187199);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Function Scan on tmp_test (cost=0.25..0.26 rows=1 width=8) (actual
time=25300.000..25300.002 rows=1 loops=1)
Buffers: shared hit=25357218 read=880466 written=4235
Total runtime: 25300.067 ms
(3 rows)
(5) if running the SQL from psql:
DB=# explain (analyze, buffers true) select min(locate_id) from
event_startstop where beacon_id=55627 and locate_id is not null and
network_timestamp between 1403989199 and 1404187199;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=79.11..79.12 rows=1 width=8) (actual time=0.342..0.342
rows=1 loops=1)
Buffers: shared hit=8 read=7
-> Append (cost=0.00..79.06 rows=20 width=8) (actual time=0.190..0.326
rows=11 loops=1)
Buffers: shared hit=8 read=7
-> Seq Scan on event_startstop (cost=0.00..0.00 rows=1 width=8)
(actual time=0.002..0.002 rows=0 loops=1)
Filter: ((locate_id IS NOT NULL) AND (network_timestamp >=
1403989199) AND (network_timestamp <= 1404187199) AND (beacon_id = 55627))
-> Bitmap Heap Scan on event_startstop_201406_b54to56k
event_startstop (cost=4.71..79.06 rows=19 width=8) (actual
time=0.186..0.310 rows=11 loops=1)
Recheck Cond: ((beacon_id = 55627) AND (network_timestamp >=
1403989199) AND (network_timestamp <= 1404187199))
Filter: (locate_id IS NOT NULL)
Buffers: shared hit=8 read=7
-> Bitmap Index Scan on
event_startstop_201406_b54to56k_bidntslid_idx (cost=0.00..4.71 rows=19
width=0) (actual time=0.170..0.170 rows=11 loops=1)
Index Cond: ((beacon_id = 55627) AND (network_timestamp
>= 1403989199) AND (network_timestamp <= 1404187199))
Buffers: shared hit=5 read=1
Total runtime: 0.485 ms
(14 rows)
Time: 159.359 ms
(6) the event_startstop is a parent table with 406 children tables
--
View this message in context:
http://postgresql.1045698.n5.nabble.com/stored-procedure-suddenly-runs-slowly-in-HOT-STANDBY-but-fast-in-primary-tp5810599.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.