Re: pgbench randomness initialization

From Fabien COELHO
Subject Re: pgbench randomness initialization
Date
Msg-id alpine.DEB.2.10.1604071147420.11001@sto
In reply to pgbench randomness initialization  (Andres Freund <andres@anarazel.de>)
Responses Re: pgbench randomness initialization
Re: pgbench randomness initialization
List pgsql-hackers
Hello Andres,

> et al I was wondering why it's a good idea for pgbench to do
>     INSTR_TIME_SET_CURRENT(start_time);
>     srandom((unsigned int) INSTR_TIME_GET_MICROSEC(start_time));
> to initialize randomness and then
>     for (i = 0; i < nthreads; i++)
>         thread->random_state[0] = random();
>         thread->random_state[1] = random();
>         thread->random_state[2] = random();
> to initialize the individual thread random state which is then used by
> pg_erand48().
>
> To me it seems better to instead initialize srandom() with a known value
> (say, uh, 0). Or even better don't use random() at all, and fill a
> global pg_erand48() with a known state; and use pg_erand48() to
> initialize the thread states.
>
> Obviously that doesn't make pgbench entirely reproducible, but it seems
> a lot better than now. Individual threads would do work in a
> reproducible order.
>
> I see very little reason to have the current behaviour, or at the very
> least not by default.

I think that it depends on what you want, which may vary:
 (1) "exactly" reproducible runs, but one run may hit a particular
     steady state not representative of what happens in general.

 (2) runs which really vary from one to the next, so as to get an idea
     of how much results may vary, i.e. how stable performance is.

Currently pgbench focusses on (2), which may or may not be fine depending
on what you are doing. From a personal point of view I think that (2) is
more meaningful for collecting performance data, even if the results are
less stable: that simply reflects reality and its intrinsic variations, so
I'm fine with that as the default.

Now for those interested in (1) for some reason, I would suggest relying on a
PGBENCH_RANDOM_SEED environment variable or a --random-seed option, which
could be used to get an oxymoronic "deterministic randomness", if desired.
I do not think that it should be the default, though.
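
To illustrate, here is a minimal sketch of what such seeding could look like.
It is not pgbench code: choose_seed() and init_thread_states() are
hypothetical names, and the standard C srand()/rand() stand in for the
srandom()/random() calls quoted above, only to keep the example portable.

```c
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: use the seed text (e.g. the value of
 * PGBENCH_RANDOM_SEED) when it parses as an unsigned integer, which gives
 * behaviour (1); otherwise fall back to the clock, which gives behaviour (2).
 * Sets *from_env to 1 when the supplied text was used, 0 otherwise.
 */
static unsigned int
choose_seed(const char *env_value, int *from_env)
{
    if (env_value != NULL && *env_value != '\0')
    {
        char           *end;
        unsigned long   v = strtoul(env_value, &end, 10);

        if (*end == '\0')
        {
            *from_env = 1;
            return (unsigned int) v;
        }
    }
    *from_env = 0;
    return (unsigned int) time(NULL);
}

/*
 * Fill the three 16-bit words of each thread's state from one seeded
 * generator, mirroring the loop quoted above. The same seed always
 * produces the same states.
 */
static void
init_thread_states(unsigned int seed, unsigned short states[][3], int nthreads)
{
    int     i, j;

    srand(seed);
    for (i = 0; i < nthreads; i++)
        for (j = 0; j < 3; j++)
            states[i][j] = (unsigned short) rand();
}
```

A caller would pass getenv("PGBENCH_RANDOM_SEED") to choose_seed() and could
print the chosen seed, so that a time-seeded run can still be replayed later.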

-- 
Fabien.


