Re: few ideas for pgbench

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: few ideas for pgbench
Дата
Msg-id CAFj8pRCXtLCyYD2TN+DR9X0h4mNonZCxZQPqUv0fr69MUqq4UQ@mail.gmail.com
обсуждение исходный текст
Ответ на few ideas for pgbench  (Pavel Stehule <pavel.stehule@gmail.com>)
Ответы Re: few ideas for pgbench  (Fabien COELHO <coelho@cri.ensmp.fr>)
Список pgsql-hackers


pá 7. 5. 2021 v 9:46 odesílatel Fabien COELHO <fabien.coelho@mines-paristech.fr> napsal:

Hello,

>>> When I run pgbench, I usually work with more releases together, so the
>>> server version is important info.
>>
>> Ok. Yes.

Here is a putative patch for this simple part.

+1


>>> What about CSV? Any run can produce one row.
>>
>> Yep, CSV is simple and nice. It depends on what information you would
>> like. For instance for progress report (-P 1) or logs/sampling (-l) would
>> be relevant candidates for CSV. Not so much for the final report, though.
>
> I think so there can be almost all information. We have to ensure
> consistency of columns.
>
> The basic usage can be
>
> for ....
> do
>  pg_bench ... >> logfile
> done
>
> and log file can looks like
>
> start time, rowno, serverver, clientver, connections, scale, readonly,
> jobs, tps, latency, ...
>
> The header row can be optional

Hmmm. It is less clear how to do that with minimal code impact on the
code, as some options which change the report structure, eg when using
multiple scripts (-b/-f) or having detailed per-op informations (-r), as
show below:

  sh> pgbench -P 1 -T 10 -M prepared -c 2 -b se@9 -b si -r
  starting vacuum...end.
  progress: 1.0 s, 10666.9 tps, lat 0.186 ms stddev 0.454
  progress: 2.0 s, 9928.0 tps, lat 0.201 ms stddev 0.466
  progress: 3.0 s, 10314.8 tps, lat 0.193 ms stddev 0.469
  progress: 4.0 s, 10042.7 tps, lat 0.198 ms stddev 0.466
  progress: 5.0 s, 11084.3 tps, lat 0.180 ms stddev 0.408
  progress: 6.0 s, 9804.1 tps, lat 0.203 ms stddev 0.474
  progress: 7.0 s, 10271.5 tps, lat 0.194 ms stddev 0.463
  progress: 8.0 s, 10511.5 tps, lat 0.190 ms stddev 0.424
  progress: 9.0 s, 10005.7 tps, lat 0.199 ms stddev 0.501
  progress: 10.0 s, 10512.4 tps, lat 0.190 ms stddev 0.428
  pgbench (PostgreSQL) 14.0
  server version: 13.2
  transaction type: multiple scripts
  scaling factor: 1
  query mode: prepared
  number of clients: 2
  number of threads: 1
  duration: 10 s
  number of transactions actually processed: 103144
  latency average = 0.193 ms
  latency stddev = 0.455 ms
  initial connection time = 5.043 ms
  tps = 10319.361549 (without initial connection time)
  SQL script 1: <builtin: select only>
  - weight: 9 (targets 90.0% of total)
  - 92654 transactions (89.8% of total, tps = 9269.856947)
  - latency average = 0.052 ms
  - latency stddev = 0.018 ms
  - statement latencies in milliseconds:
          0.000  \set aid random(1, 100000 * :scale)
          0.052  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
  SQL script 2: <builtin: simple update>
  - weight: 1 (targets 10.0% of total)
  - 10490 transactions (10.2% of total, tps = 1049.504602)
  - latency average = 1.436 ms
  - latency stddev = 0.562 ms
  - statement latencies in milliseconds:
          0.001  \set aid random(1, 100000 * :scale)
          0.000  \set bid random(1, 1 * :scale)
          0.000  \set tid random(1, 10 * :scale)
          0.000  \set delta random(-5000, 5000)
          0.027  BEGIN;
          0.065  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
          0.045  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
          0.048  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
          1.249  END

The nature of columns would change depending on options, eg "initial
connection time" does not make sense under -C, so that under a loop around
pgbench scenario rows would not necessarily be consistent…

Also, I'm not sure whether such a report can/should include all inputs
options.

Finally it is unclear how to add such a feature with minimal impact on the
source code.

It is a question if this is possible without more changes or without compatibility break :( Probably not. All output should be centralized.


What I usually do is to put each pgbench run output in a separate file and
write a small shell/perl/python script to process these, possibly
generating CSV on the way.

The goal of my proposal was a reduction of necessity to write auxiliary scripts. The produced document should not be "nice", but should be very easy to import it to some analytical tools.

There is an analogy with Postgres's CSV logs. It is the same. We can see the result of pgbench like some log.

Pavel

 

--
Fabien.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrey Lepikhov
Дата:
Сообщение: Re: Removing unneeded self joins
Следующее
От: Peter Smith
Дата:
Сообщение: Re: AlterSubscription_refresh "wrconn" wrong variable?