Обсуждение: BUG #15758: regr_* return wrong answers for some valid data

Поиск
Список
Период
Сортировка

BUG #15758: regr_* return wrong answers for some valid data

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      15758
Logged by:          Dingyuan Wang
Email address:      gumblex@aosc.io
PostgreSQL version: 11.2
Operating system:   Debian
Description:

This can reproduce on at least version 10 or 11, on Debian amd64.
Given the following data:

CREATE TABLE testdata (
    x double precision,
    t1 double precision,
    t2 double precision
);
INSERT INTO testdata VALUES (79.7439999999999998, 1506705739.38499999,
1506705817.38100004);
INSERT INTO testdata VALUES (123.903999999999996, 1506705766.89499998,
1506705824.0079999);
INSERT INTO testdata VALUES (87.2960099999999954, 1506705746.29299998,
1506705818.24300003);
INSERT INTO testdata VALUES (116.352000000000004, 1506705761.87299991,
1506705823.05500007);
INSERT INTO testdata VALUES (67.7120100000000065, 1506705734.74300003,
1506705815.5539999);
INSERT INTO testdata VALUES (72.9600099999999969, 1506705735.35100007,
1506705815.6500001);
INSERT INTO testdata VALUES (101.632000000000005, 1506705756.25999999,
1506705820.70600009);
INSERT INTO testdata VALUES (108.927999999999997, 1506705761.30699992,
1506705821.59500003);
INSERT INTO testdata VALUES (94.0799999999999983, 1506705747.37199998,
1506705819.82999992);

This select:

SELECT
  regr_slope(x, t1) a1, regr_intercept(x, t1) b1, regr_r2(x, t1) r1,
  regr_slope(x, t2) a2, regr_intercept(x, t2) b2, regr_r2(x, t2) r2,
  regr_slope(t1, x) a3, regr_intercept(t1, x) b3, regr_r2(t1, x) r3,
  regr_slope(t2, x) a4, regr_intercept(t2, x) b4, regr_r2(t2, x) r4
FROM testdata;

Outputs:

null | null | null |
0.131221234798431 | -197711616 | 0.020541283900245 |
0.613632754607929 | 1506705691.81008 |  1 |
0.156539327890021 | 1506705804.72837 | 0.020541283900245

Some numbers are obviously wrong. Using Python's Decimal arithmetic, the
reference answer should be:

1.5950991071, -2403344901.6928107374, 0.9788051681
6.3409218262, -9553903722.1136632550, 0.9926034164
0.6136328230, 1506705691.8100808484, 0.9788051681
0.1565392925, 1506705804.7283713567, 0.9926034164

I understand that there may be some floating point overflows/underflows, but
the major spreadsheets can calculate this well (to some precision). I doubt
other regr_* functions have similar problems.


Re: BUG #15758: regr_* return wrong answers for some valid data

От
Jeff Janes
Дата:
On Tue, Apr 16, 2019 at 3:55 AM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      15758
Logged by:          Dingyuan Wang
Email address:      gumblex@aosc.io
PostgreSQL version: 11.2
Operating system:   Debian
Description:       

This can reproduce on at least version 10 or 11, on Debian amd64.
Given the following data:

This will be improved in version 12 when that is released later this year.  See the below commit:

commit e954a727f0c8872bf5203186ad0f5312f6183746
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Date:   Sat Oct 6 11:20:09 2018 +0100

    Improve the accuracy of floating point statistical aggregates.



Cheers,

Jeff

Re: BUG #15758: regr_* return wrong answers for some valid data

От
Dingyuan Wang
Дата:
Pg 12 does fix the problem, and the numbers are slightly better than 
that of Gnumeric and LibreOffice Calc.

2019/4/16 21:41, Jeff Janes:
> On Tue, Apr 16, 2019 at 3:55 AM PG Bug reporting form 
> <noreply@postgresql.org <mailto:noreply@postgresql.org>> wrote:
> 
>     The following bug has been logged on the website:
> 
>     Bug reference:      15758
>     Logged by:          Dingyuan Wang
>     Email address: gumblex@aosc.io <mailto:gumblex@aosc.io>
>     PostgreSQL version: 11.2
>     Operating system:   Debian
>     Description:
> 
>     This can reproduce on at least version 10 or 11, on Debian amd64.
>     Given the following data:
> 
> 
> This will be improved in version 12 when that is released later this 
> year.  See the below commit:
> 
> commit e954a727f0c8872bf5203186ad0f5312f6183746
> Author: Dean Rasheed <dean.a.rasheed@gmail.com 
> <mailto:dean.a.rasheed@gmail.com>>
> Date:   Sat Oct 6 11:20:09 2018 +0100
> 
>      Improve the accuracy of floating point statistical aggregates.
> 
> 
> 
> Cheers,
> 
> Jeff