Re: Sampling-based timing for EXPLAIN ANALYZE

Поиск
Список
Период
Сортировка
От Lukas Fittl
Тема Re: Sampling-based timing for EXPLAIN ANALYZE
Дата
Msg-id CAP53Pkw97c7q-8e6BBab4aXS=1ROMtKpnHEeK0RDnPx7yiirog@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Sampling-based timing for EXPLAIN ANALYZE  (Jelte Fennema <me@jeltef.nl>)
Список pgsql-hackers
On Fri, Jan 6, 2023 at 1:19 AM Jelte Fennema <me@jeltef.nl> wrote:
Nice addition! And the code looks pretty straight forward.

Thanks for reviewing!

The current patch triggers warnings: https://cirrus-ci.com/task/6016013976731648 Looks like you need to add void as the argument.

Fixed in v2 attached. This also adds a simple regression test, as well as fixes the parallel working handling.

Do you have some performance comparison between TIMING ON and TIMING SAMPLING?

Here are some benchmarks of auto_explain overhead on my ARM-based M1 Macbook for the following query run with pgbench on a scale factor 100 data set:

SELECT COUNT(*) FROM pgbench_branches JOIN pgbench_accounts USING (bid) JOIN pgbench_tellers USING (bid) WHERE bid = 42;

(the motivation is to use a query that is more complex than the standard pgbench select-only test query)

avg latency (best of 3), -T 300, -c 4, -s 100, shared_buffers 2GB, fsync off, max_parallel_workers_per_gather 0:

master, log_timing = off: 871 ms (878 / 877 / 871)
patch, log_timing = off: 869 ms (882 / 880 / 869)
patch, log_timing = on: 890 ms (917 / 930 / 890)
patch, log_timing = sampling, samplefreq = 1000: 869 ms (887 / 869 / 894)

Additionally, here is Andres' benchmark from [1], with the sampling option added:

%  psql -Xc 'DROP TABLE IF EXISTS t; CREATE TABLE t AS SELECT * FROM generate_series(1, 100000) g(i);' postgres && pgbench -n -r -t 100 -f <(echo -e "SELECT COUNT(*) FROM t;EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t;EXPLAIN (ANALYZE, TIMING SAMPLING) SELECT COUNT(*) FROM t;EXPLAIN (ANALYZE, TIMING ON) SELECT COUNT(*) FROM t;") postgres |grep '^ '
DROP TABLE
SELECT 100000
         3.507           0  SELECT COUNT(*) FROM t;
         3.476           0  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t;
         3.576           0  EXPLAIN (ANALYZE, TIMING SAMPLING) SELECT COUNT(*) FROM t;
         5.096           0  EXPLAIN (ANALYZE, TIMING ON) SELECT COUNT(*) FROM t;

My pg_test_timing data for reference:

% pg_test_timing
Testing timing overhead for 3 seconds.
Per loop time including overhead: 23.65 ns
Histogram of timing durations:
  < us   % of total      count
     1     97.64472  123876325
     2      2.35421    2986658
     4      0.00022        277
     8      0.00016        202
    16      0.00064        815
    32      0.00005         64

In InstrStartSampling there's logic to increase/decrease the frequency of an already existing timer. It's not clear to me when this can occur. I'd expect sampling frequency to remain constant throughout an explain plan. If it's indeed needed, I think a code comment would be useful to explain why this edge case is necessary.

Clarified in a code comment in v2. This is needed for handling nested statements which could have different sampling frequencies for each nesting level, i.e. a function might want to sample it's queries at a higher frequency than its caller.

Thanks,

--
Lukas Fittl
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Drouvot, Bertrand"
Дата:
Сообщение: Helper functions for wait_for_catchup() in Cluster.pm
Следующее
От: "Takamichi Osumi (Fujitsu)"
Дата:
Сообщение: RE: Time delayed LR (WAS Re: logical replication restrictions)