Обсуждение: sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

Поиск
Список
Период
Сортировка

sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

От
Michael Paquier
Дата:
Hi all,
(Petr in CC)

Coverity is pointing out that anl_random_fract and
sampler_random_fract can return 0, causing in some code paths math
errors, aka division by 0 or even log(0) in the case of TABLESAMPLE or
even ANALYZE.

In 9.4, anl_random_fract is careful enough to use random() + 1 to
prevent that, but that's not the case of 9.5 where we begin to use
pg_erand48, that returns a double in range [0.0,1.0).

I think that we should change the returned double to be (0.0,1.0]
instead like in the patch attached (bernouilli and system methods need
a brush-up as well). I haven't updated tsm_system_rows and
tsm_system_time but their regression diffs are attached.

This bug can be triggered when using TABLESAMPLE, now ANALYZE is more
worrying because it could happen during an auto-analyze.
Thoughts?
--
Michael

Вложения

Re: sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

От
Petr Jelinek
Дата:
On 2015-06-25 10:01, Michael Paquier wrote:
> Hi all,
> (Petr in CC)
>
> Coverity is pointing out that anl_random_fract and
> sampler_random_fract can return 0, causing in some code paths math
> errors, aka division by 0 or even log(0) in the case of TABLESAMPLE or
> even ANALYZE.
>
> In 9.4, anl_random_fract is careful enough to use random() + 1 to
> prevent that, but that's not the case of 9.5 where we begin to use
> pg_erand48, that returns a double in range [0.0,1.0).
>
> I think that we should change the returned double to be (0.0,1.0]

Agreed.

> instead like in the patch attached (bernouilli and system methods need
> a brush-up as well). I haven't updated tsm_system_rows and
> tsm_system_time but their regression diffs are attached.
>

I don't see need for the other 1 - x changes outside of the
sampler_random_fract() tbh. Yes it changes the regression tests output
but I don't think it changes correctness of the results.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

От
Michael Paquier
Дата:
On Sat, Jun 27, 2015 at 11:27 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> On 2015-06-25 10:01, Michael Paquier wrote:
>> instead like in the patch attached (bernouilli and system methods need
>> a brush-up as well). I haven't updated tsm_system_rows and
>> tsm_system_time but their regression diffs are attached.
>>
>
> I don't see need for the other 1 - x changes outside of the
> sampler_random_fract() tbh. Yes it changes the regression tests output but I
> don't think it changes correctness of the results.

OK, thanks for the confirmation. I thought first that things were
wanted this way in HEAD for the tablesample samplings, but if it does
not matter much, let's move on with your patch then it looks good to
me.
--
Michael
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 2015-06-25 10:01, Michael Paquier wrote:
>> I think that we should change the returned double to be (0.0,1.0]

> Agreed.

I find this to be a pretty bad idea.  That definition is simply weird;
where else in the world will you find a random number generator that does
that?  What are the odds that any callers are actually designed for that
behavior?

Another problem is that we consider anl_random_fract() to be an exported
API, and the very longstanding definition of that is that the result is
in (0,1), excluding both endpoints.  Whatever we do with
sampler_random_fract(), we'd better make sure that anl_random_fract()
preserves that behavior, else we are likely to break third-party modules.

A simple fix would be to adjust sampler_random_fract to disallow 0
as result, say by repeating the pg_erand48 call if it produces 0.
I'm not sure if that would throw off any of the math in the new
tablesample-related callers.  If it would, I'm inclined to fix the
problem call-site-by-call-site, rather than inventing a definition
of sampler_random_fract() that fails to satisfy the POLA.

            regards, tom lane
I wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
>> On 2015-06-25 10:01, Michael Paquier wrote:
>>> I think that we should change the returned double to be (0.0,1.0]

>> Agreed.

> I find this to be a pretty bad idea.  That definition is simply weird;
> where else in the world will you find a random number generator that does
> that?  What are the odds that any callers are actually designed for that
> behavior?

And, in fact, a bit of looking quickly finds a counterexample, in
analyze.c:

            int     k = (int) (targrows * sampler_random_fract(rstate.randstate));

            Assert(k >= 0 && k < targrows);

You can't just whack this around to satisfy some new call sites without
considering the behavior of existing use-cases.

            regards, tom lane

Re: Re: sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

От
Petr Jelinek
Дата:
On 2015-06-30 18:21, Tom Lane wrote:
> I wrote:
>> Petr Jelinek <petr@2ndquadrant.com> writes:
>>> On 2015-06-25 10:01, Michael Paquier wrote:
>>>> I think that we should change the returned double to be (0.0,1.0]
>
>>> Agreed.
>
>> I find this to be a pretty bad idea.  That definition is simply weird;
>> where else in the world will you find a random number generator that does
>> that?  What are the odds that any callers are actually designed for that
>> behavior?
>
> And, in fact, a bit of looking quickly finds a counterexample, in
> analyze.c:
>
>              int     k = (int) (targrows * sampler_random_fract(rstate.randstate));
>
>              Assert(k >= 0 && k < targrows);
>
> You can't just whack this around to satisfy some new call sites without
> considering the behavior of existing use-cases.
>

Right, very good point.

So, we used to protect against this problem by using long value and doing:
((double) random() + 1) / ((double) MAX_RANDOM_VALUE + 2)

Maybe best solution is to have pg_lrand48() variant which accepts seed
as parameter same way pg_erand48() does and use the same logic we used
to have before sampling was added.

The attached patch does that and it does not even need changes in the
regression tests output which I consider to be a nice bonus.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

Вложения
Petr Jelinek <petr@2ndquadrant.com> writes:
> Right, very good point.

> So, we used to protect against this problem by using long value and doing:
> ((double) random() + 1) / ((double) MAX_RANDOM_VALUE + 2)

> Maybe best solution is to have pg_lrand48() variant which accepts seed
> as parameter same way pg_erand48() does and use the same logic we used
> to have before sampling was added.

I'm unimpressed with this coding because it presumes that MAX_RANDOM_VALUE
(which is defined as the maximum value returned by random()) has something
to do with the output range of pg_lrand48().  While that might
accidentally fail to fail on all known platforms, it's not good, because
those functions don't have the same provenance and so there's no good
reason to assume that they produce the same output range.

Rather than expanding the API of port.h still further, I continue to think
that the best answer is just to repeat if we get a zero from pg_erand48.

            regards, tom lane

Re: Re: sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

От
Petr Jelinek
Дата:
On 2015-06-30 20:52, Tom Lane wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
>> Right, very good point.
>
>> So, we used to protect against this problem by using long value and doing:
>> ((double) random() + 1) / ((double) MAX_RANDOM_VALUE + 2)
>
>> Maybe best solution is to have pg_lrand48() variant which accepts seed
>> as parameter same way pg_erand48() does and use the same logic we used
>> to have before sampling was added.
>
> I'm unimpressed with this coding because it presumes that MAX_RANDOM_VALUE
> (which is defined as the maximum value returned by random()) has something
> to do with the output range of pg_lrand48().  While that might
> accidentally fail to fail on all known platforms, it's not good, because
> those functions don't have the same provenance and so there's no good
> reason to assume that they produce the same output range.

Well, by this logic all the other places which are using
MAX_RANDOM_VALUE would be broken as well as our port version of random()
just calls pg_lrand48(). Not to mention that we hardwire
MAX_RANDOM_VALUE to be same as PG_INT32_MAX (which we could use in the
code I proposed instead of MAX_RANDOM_VALUE to be safe).

>
> Rather than expanding the API of port.h still further, I continue to think
> that the best answer is just to repeat if we get a zero from pg_erand48.
>

Looping over function call until you get value you like seems quite ugly
to me tbh.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

Re: Re: sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

От
Michael Paquier
Дата:
On Wed, Jul 1, 2015 at 1:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
>> On 2015-06-25 10:01, Michael Paquier wrote:
>>> I think that we should change the returned double to be (0.0,1.0]
>
>> Agreed.
>
> I find this to be a pretty bad idea.  That definition is simply weird;
> where else in the world will you find a random number generator that does
> that?  What are the odds that any callers are actually designed for that
> behavior?
>
> Another problem is that we consider anl_random_fract() to be an exported
> API, and the very longstanding definition of that is that the result is
> in (0,1), excluding both endpoints.  Whatever we do with
> sampler_random_fract(), we'd better make sure that anl_random_fract()
> preserves that behavior, else we are likely to break third-party modules.

Wait a minute... Yes you are right I clearly missed the fact that in
~9.4 the range of values returned by anl_random_fract() does not
include 1. I thought it did, visibly I misread the code...

> A simple fix would be to adjust sampler_random_fract to disallow 0
> as result, say by repeating the pg_erand48 call if it produces 0.
> I'm not sure if that would throw off any of the math in the new
> tablesample-related callers.  If it would, I'm inclined to fix the
> problem call-site-by-call-site, rather than inventing a definition
> of sampler_random_fract() that fails to satisfy the POLA.

Agreed. Disallowing 0 in sampler_random_fract looks like a good answer
to that. Looking at the tablesample code, for the bernouilli trial I
recall that the range of probability success and failure is actually
(0,1), (if p is a success rate, the failure is 1 - p). I am not sure
for Knuth Algo S though for the system sampling but that looks OK from
a pure logical viewpoint.
--
Michael

Вложения

Re: Re: sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

От
Michael Paquier
Дата:
On Wed, Jul 1, 2015 at 4:56 PM, Michael Paquier wrote:
> On Wed, Jul 1, 2015 at 1:17 AM, Tom Lane wrote:
>> A simple fix would be to adjust sampler_random_fract to disallow 0
>> as result, say by repeating the pg_erand48 call if it produces 0.
>> I'm not sure if that would throw off any of the math in the new
>> tablesample-related callers.  If it would, I'm inclined to fix the
>> problem call-site-by-call-site, rather than inventing a definition
>> of sampler_random_fract() that fails to satisfy the POLA.
>
> Agreed. Disallowing 0 in sampler_random_fract looks like a good answer
> to that. Looking at the tablesample code, for the bernouilli trial I
> recall that the range of probability success and failure is actually
> (0,1), (if p is a success rate, the failure is 1 - p). I am not sure
> for Knuth Algo S though for the system sampling but that looks OK from
> a pure logical viewpoint.

For the sake of the archives, this has been fixed by commit d7c19d68.

(Thanks Tom for pushing it!)
--
Michael

Re: Re: sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

От
Michael Paquier
Дата:
On Wed, Jul 1, 2015 at 3:52 AM, Tom Lane wrote:
> Rather than expanding the API of port.h still further
> [...]

The current trend, as far as I am understanding it, is not to increase
the amount of code in src/port, but actually to reduce it and to rely
a maximum on what modern platforms offer. See for example the recent
discussion about rint and numeric rounding here:

http://www.postgresql.org/message-id/flat/20150320194337.2573.72944@wrigleys.postgresql.org#20150320194337.2573.72944@wrigleys.postgresql.org
We are keeping rint around because MSVC just added it recently in
MS2013 even if the POSIX spec has it for a long time, but I am sure
that we would drop it asap from the code tree if we could.
--
Michael

Re: Re: sampling.c and potential divisions by 0 ang log(0) with tablesample and ANALYZE in 9.5

От
Petr Jelinek
Дата:
On 2015-07-02 02:51, Michael Paquier wrote:
> On Wed, Jul 1, 2015 at 3:52 AM, Tom Lane wrote:
>> Rather than expanding the API of port.h still further
>> [...]
>
> The current trend, as far as I am understanding it, is not to increase
> the amount of code in src/port, but actually to reduce it and to rely
> a maximum on what modern platforms offer. See for example the recent
> discussion about rint and numeric rounding here:
>
http://www.postgresql.org/message-id/flat/20150320194337.2573.72944@wrigleys.postgresql.org#20150320194337.2573.72944@wrigleys.postgresql.org
> We are keeping rint around because MSVC just added it recently in
> MS2013 even if the POSIX spec has it for a long time, but I am sure
> that we would drop it asap from the code tree if we could.
>

Which is completely moot point here because the sampling depends on
pg_erand48 on every platform no matter what the platform provides.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services