Re: [PATCH] random_normal function

Поиск
Список
Период
Сортировка
От Paul Ramsey
Тема Re: [PATCH] random_normal function
Дата
Msg-id D1722BB1-78A1-48ED-BE61-EF49648BBD6B@cleverelephant.ca
обсуждение исходный текст
Ответ на Re: [PATCH] random_normal function  (Mark Dilger <mark.dilger@enterprisedb.com>)
Ответы Re: [PATCH] random_normal function  (Joe Conway <mail@joeconway.com>)
Список pgsql-hackers

> On Dec 9, 2022, at 10:39 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>> On Dec 8, 2022, at 1:53 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>>
>> Just a utility function to generate random numbers from a normal
>> distribution. I find myself doing this several times a year, and I am
>> sure I must not be the only one.
>
> Thanks for the patch.  What do you think about these results?

Angels on pins time! :)

> +-- The semantics of a negative stddev are not well defined
> +SELECT random_normal(mean := 0, stddev := -1);
> +    random_normal
> +---------------------
> + -1.0285744583010896
> +(1 row)

Question is does a negative stddev make enough sense? It is functionally using fabs(stddev),

SELECT avg(random_normal(mean := 0, stddev := -1)) from generate_series(1,1000);
         avg
---------------------
 0.03156106778729526

So could toss an invalid parameter on negative? Not sure if that's more helpful than just being mellow about this
input.


> +
> +SELECT random_normal(mean := 0, stddev := '-Inf');
> + random_normal
> +---------------
> +      Infinity
> +(1 row)

The existing logic around means and stddevs and Inf is hard to tease out:

SELECT avg(v),stddev(v) from (VALUES ('Inf'::float8, '-Inf'::float8)) a(v);
   avg    | stddev
----------+--------
 Infinity |

The return of NULL of stddev would seem to argue that null in this case means "does not compute" at some level. So
returnNULL on Inf stddev? 

> +
> +-- This result may be defensible...
> +SELECT random_normal(mean := '-Inf', stddev := 'Inf');
> + random_normal
> +---------------
> +     -Infinity
> +(1 row)
> +
> +-- but if so, why is this NaN?
> +SELECT random_normal(mean := 'Inf', stddev := 'Inf');
> + random_normal
> +---------------
> +           NaN
> +(1 row)

An Inf mean only implies that one value in the distribution is Inf, but running the function in reverse (generating
values)and only generating one value from the distribution implies we have to always return Inf (except in this case
stddevis also Inf, so I'd go with NULL, assuming we accept the NULL premise above. 

How do you read the tea leaves?

P.





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nathan Bossart
Дата:
Сообщение: Re: add \dpS to psql
Следующее
От: Joe Conway
Дата:
Сообщение: Re: [PATCH] random_normal function