Обсуждение: speeding up grafana sensor-data query on raspberry pi 3
Hi,

I am currently trying to migrate an InfluxDB 1.7 smarthome database to PostgreSQL (13.9) running on my Raspberry Pi 3. It works quite well, but for the queries executed by Grafana I get somewhat higher execution times than I'd hoped for.

Example: table smartmeter with a non-null column ts (timestamp with time zone) and a BRIN index on ts; no primary key, to avoid a btree index. Sensor values are stored every 5s, so for 1 month there are about 370k rows - and in total the table currently holds about 3M rows. The query to display the values for 1 month takes ~3s, with the bitmap heap scan as well as the aggregation taking up most of the time, with a sort in between.

Is there anything that could be improved? With InfluxDB I was able to view 3- and 6-month graphs; with PostgreSQL it simply takes too long.

I am currently running the 32-bit ARMv6 build - would it be a big improvement to run ARMv8/64-bit?

Thank you in advance,
Clemens

smarthomedb=> explain analyze SELECT floor(extract(epoch from ts)/10800)*10800 AS "time", AVG(stromL1) as l1, AVG(stromL2) as l2, AVG(stroml3) as l3 FROM smartmeter WHERE ts BETWEEN '2023-03-16T09:51:28.397Z' AND '2023-04-16T08:51:28.397Z' GROUP BY time order by time;

                                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=117490.70..132536.10 rows=376135 width=32) (actual time=2061.253..2974.336 rows=236 loops=1)
   Group Key: ((floor((date_part('epoch'::text, ts) / '10800'::double precision)) * '10800'::double precision))
   ->  Sort  (cost=117490.70..118431.04 rows=376135 width=20) (actual time=2058.407..2410.467 rows=371810 loops=1)
         Sort Key: ((floor((date_part('epoch'::text, ts) / '10800'::double precision)) * '10800'::double precision))
         Sort Method: external merge  Disk: 10960kB
         ->  Bitmap Heap Scan on smartmeter  (cost=112.09..74944.93 rows=376135 width=20) (actual time=88.336..1377.862 rows=371810 loops=1)
               Recheck Cond: ((ts >= '2023-03-16 10:51:28.397+01'::timestamp with time zone) AND (ts <= '2023-04-16 10:51:28.397+02'::timestamp with time zone))
               Rows Removed by Index Recheck: 2131
               Heap Blocks: lossy=4742
               ->  Bitmap Index Scan on smartmeter_ts_idx  (cost=0.00..18.05 rows=377166 width=0) (actual time=1.376..1.377 rows=47420 loops=1)
                     Index Cond: ((ts >= '2023-03-16 10:51:28.397+01'::timestamp with time zone) AND (ts <= '2023-04-16 10:51:28.397+02'::timestamp with time zone))
 Planning Time: 0.419 ms
 JIT:
   Functions: 9
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 8.734 ms, Inlining 0.000 ms, Optimization 2.388 ms, Emission 83.137 ms, Total 94.259 ms
 Execution Time: 2990.772 ms
(17 rows)
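For reference, the schema described above could look roughly like this - a minimal sketch; the exact types of the value columns are assumptions based on the query:

```sql
-- Sketch of the schema described in the thread (value column types assumed).
CREATE TABLE smartmeter (
    ts      timestamptz NOT NULL,
    stroml1 double precision,
    stroml2 double precision,
    stroml3 double precision
);

-- BRIN index on ts; no primary key, to avoid a btree index.
CREATE INDEX smartmeter_ts_idx ON smartmeter USING brin (ts);
```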
On Sun, Apr 16, 2023 at 07:00:33PM +0200, Clemens Eisserer wrote:
> Hi,
>
> I am currently trying to migrate an influxdb 1.7 smarthome database to
> postgresql (13.9) running on my raspberry 3.
> It works quite well, but for the queries executed by grafana I get a
> bit highter execution times than I'd hoped for.

Suggestions:

- enable track_io_timing and show explain (analyze,buffers,settings) - or otherwise show your non-default settings;
- show \d of your table(s);
- show the query plan for the 6 months query. The query plan may be different, or (if you can run it with "analyze") it may be illuminating to see how the query "scales";
- consider trying postgres 15 (btw, v16 will have a beta release next month).

https://wiki.postgresql.org/wiki/Slow_Query_Questions

-- 
Justin
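The first two suggestions above could be tried like this (a sketch; the query is the one from the original post):

```sql
-- Record I/O timings so EXPLAIN (BUFFERS) can report them.
SET track_io_timing = on;

-- SETTINGS additionally lists any non-default planner/resource settings.
EXPLAIN (ANALYZE, BUFFERS, SETTINGS)
SELECT floor(extract(epoch FROM ts)/10800)*10800 AS "time",
       avg(stroml1) AS l1, avg(stroml2) AS l2, avg(stroml3) AS l3
FROM smartmeter
WHERE ts BETWEEN '2023-03-16T09:51:28.397Z' AND '2023-04-16T08:51:28.397Z'
GROUP BY "time" ORDER BY "time";
```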
On Mon, 17 Apr 2023 at 05:00, Clemens Eisserer <linuxhippy@gmail.com> wrote:
> Example:
> table smartmeter with non-null column ts (timestamp with time zone)
> and brinc index on ts, no pk to avoid a btree index.
> Sensor values are stored every 5s, so for 1 month there are about 370k
> rows - and in total the table currently holds about 3M rows.
> The query to display the values for 1 month takes ~3s, with the bitmap
> heap scan as well as aggregation taking up most of the time, with
> sorting in between.

I know you likely don't have much RAM to spare here, but more work_mem might help; even just 16MB might be enough. This would help the Sort and, to a lesser extent, the Bitmap Heap Scan too.

Also, if you'd opted to use PostgreSQL 14 or 15, then you could have performed CREATE STATISTICS on your GROUP BY clause expression and then run ANALYZE. That might cause the planner to flip to a Hash Aggregate, which would eliminate the Sort before aggregation. You'd only need to sort 236 rows after the Hash Aggregate for the ORDER BY.

Plus, what Justin said.

David
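On PostgreSQL 14 or later, the extended statistics suggested above could be created roughly like this (the statistics name is arbitrary; the expression must match the GROUP BY expression):

```sql
-- Expression statistics on the GROUP BY expression (PostgreSQL 14+),
-- so the planner can estimate the number of distinct groups.
CREATE STATISTICS smartmeter_time_stats
    ON (floor(extract(epoch FROM ts)/10800)*10800)
    FROM smartmeter;

ANALYZE smartmeter;  -- populate the new statistics
```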
Hi,

On 2023-04-16 19:00:33 +0200, Clemens Eisserer wrote:
> I am currently trying to migrate an influxdb 1.7 smarthome database to
> postgresql (13.9) running on my raspberry 3.
> It works quite well, but for the queries executed by grafana I get a
> bit highter execution times than I'd hoped for.
>
> Example:
> table smartmeter with non-null column ts (timestamp with time zone)
> and brinc index on ts, no pk to avoid a btree index.
> Sensor values are stored every 5s, so for 1 month there are about 370k
> rows - and in total the table currently holds about 3M rows.
> The query to display the values for 1 month takes ~3s, with the bitmap
> heap scan as well as aggregation taking up most of the time, with
> sorting in between.
>
> Is there anything that could be improved?
> With influxdb I was able to view 3 and 6 months graphs, with
> postgresql it simply takes too long.
>
> I am currently running the 32-bit ARMv6 build, would it be a big
> improvement running ARMv8/64-bit?

Yes, I suspect so. On a 64bit system most of the datatypes you're dealing with are going to be pass-by-value, i.e. not incur memory allocation overhead. Whereas timestamps, doubles, etc will all require allocations on a 32bit system.
> smarthomedb=> explain analyze SELECT floor(extract(epoch from
> ts)/10800)*10800 AS "time", AVG(stromL1) as l1, AVG(stromL2) as l2,
> AVG(stroml3) as l3 FROM smartmeter WHERE ts BETWEEN '2023-03-16
> T09:51:28.397Z' AND '2023-04-16T08:51:28.397Z' GROUP BY time order by time;
>
>  GroupAggregate  (cost=117490.70..132536.10 rows=376135 width=32) (actual time=2061.253..2974.336 rows=236 loops=1)
>    Group Key: ((floor((date_part('epoch'::text, ts) / '10800'::double precision)) * '10800'::double precision))
>    ->  Sort  (cost=117490.70..118431.04 rows=376135 width=20) (actual time=2058.407..2410.467 rows=371810 loops=1)
>          Sort Key: ((floor((date_part('epoch'::text, ts) / '10800'::double precision)) * '10800'::double precision))

Given the number of rows you're sorting on a somewhat slow platform, the complexity of the expression here might be a relevant factor. Particularly on a 32bit system (see above), due to the memory allocations we'll end up doing.

I don't know how much control over the query generation you have. Consider rewriting

    floor(extract(epoch from ts)/10800)*10800 AS "time"

to something like

    date_bin('3h', ts, '2001-01-01 00:00')

>          Sort Method: external merge  Disk: 10960kB
>          ->  Bitmap Heap Scan on smartmeter  (cost=112.09..74944.93 rows=376135 width=20) (actual time=88.336..1377.862 rows=371810 loops=1)

Given the time spent in the bitmap heap scan, it might be beneficial to increase effective_io_concurrency some.

>                Recheck Cond: ((ts >= '2023-03-16 10:51:28.397+01'::timestamp with time zone) AND (ts <= '2023-04-16 10:51:28.397+02'::timestamp with time zone))
>                Rows Removed by Index Recheck: 2131
>                Heap Blocks: lossy=4742

The lossiness might also incur some overhead, so increasing work_mem a bit will help some.

Greetings,

Andres Freund
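Putting these suggestions together, the Grafana query could be rewritten roughly as follows. This is a sketch: date_bin() requires PostgreSQL 14+, and the work_mem / effective_io_concurrency values are guesses to be tuned for the Pi, not recommendations from the thread:

```sql
SET work_mem = '16MB';              -- enough to keep the sort in memory
SET effective_io_concurrency = 32;  -- more prefetching for the bitmap heap scan

SELECT date_bin('3 hours', ts, TIMESTAMPTZ '2001-01-01') AS "time",
       avg(stroml1) AS l1, avg(stroml2) AS l2, avg(stroml3) AS l3
FROM smartmeter
WHERE ts BETWEEN '2023-03-16T09:51:28.397Z' AND '2023-04-16T08:51:28.397Z'
GROUP BY 1
ORDER BY 1;
```

date_bin() returns a timestamptz directly, which avoids the per-row float arithmetic (and, on 32-bit, the allocations) of the floor(extract(epoch ...)) expression.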
Is partitioning the table by month an option? If your reports are month-based, you can improve performance by partitioning.
Felipph
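A sketch of the declarative monthly range partitioning suggested above, using the column names from the thread (one partition shown; a month-long query would then only scan a single partition):

```sql
CREATE TABLE smartmeter_part (
    ts      timestamptz NOT NULL,
    stroml1 double precision,
    stroml2 double precision,
    stroml3 double precision
) PARTITION BY RANGE (ts);

-- One partition per month.
CREATE TABLE smartmeter_2023_04 PARTITION OF smartmeter_part
    FOR VALUES FROM ('2023-04-01') TO ('2023-05-01');
```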
Hi again,

Thanks for the suggestions.

- I increased work_mem to 64MB, which caused the disk-based sorting to be replaced with quicksort and resulted in a modest speedup. However, I have to admit I didn't understand why more work_mem speeds up the heap scan.
- The suggestion regarding "create statistics on group by" is awesome; getting rid of the sort is probably the best thing that could happen to this query.
- ARMv8 instead of ARMv6 could have a positive impact.

I'll migrate to a containerized PostgreSQL running on Raspbian OS 64-bit as soon as I find some time to spare, and report back.

Thanks again, Clemens
Hi,

I was too lazy to switch to an ARMv8 rootfs, so instead I used an ARMv7 PostgreSQL 15.2 docker image running via podman to try the suggestions. The performance improvements offered by the new PostgreSQL features are really impressive!

5318.382 ms: original query
2372.618 ms: with date_bin
2154.530 ms: date_bin + work_mem=64MB (quicksort instead of disk-based sorting)
0826.196 ms: date_bin + work_mem + create-statistics
0589.445 ms: date_bin + work_mem + create-statistics + max_workers=2 (instead of 1)

So evaluating the complex/old expression was indeed really slow; using date_bin alone already reduced query time to 50%. Hash-based aggregation further more than halved the remaining 2.1s - down to ~825ms!

826ms for one month, and ~8s for a whole year, is actually great - as far as I can remember, even InfluxDB, which is actually optimized for this kind of data, didn't perform nearly as well as PostgreSQL - awesome!

Thanks again for all the suggestions, and for such a great DBMS!

- Clemens