Hello-
While computing standard deviation on a float8 column, I noticed that
sometimes STDDEV returned "NaN". I've tracked down the cause and
thought I'd show everyone. This may or may not be a bug, I don't know.
Notice that the second insert statement is putting a ::float4 into a
float8 column.
The reason for the NaN is probably due to some precision issue between
float4 and float8 which is causing the "variance" of the mixed ::float4
::float8 column to be negative.
template1=# create table test (a float4, b float8);
CREATE
template1=# insert into test (a, b) values (1/11::float4, 1/11::float8);
INSERT 62077086 1
template1=# insert into test (a, b) values (1/11::float4, 1/11::float4);
INSERT 62077087 1
template1=# select * from test;
a | b
-----------+--------------------
0.0909091 | 0.0909090909090909
0.0909091 | 0.0909090909090909
(2 rows)
template1=# select stddev(a), stddev(b) from test;
stddev | stddev
--------+--------
0 | NaN
(1 row)
template1=# select stddev(a::float4), stddev(b::float8) from test;
stddev | stddev
--------+--------
0 | NaN
(1 row)
By explicitly casting column b to ::float4, the NaN disappears.
template1=# select stddev(a::float4), stddev(b::float4) from test;
stddev | stddev
--------+--------
0 | 0
(1 row)
The variance of the columns shows the problem (standard deviation is the
sqrt of variance):
template1=# select variance(a), variance(b) from test;
variance | variance
----------+-----------------------
0 | -4.59091857411831e-19
(1 row)
template1=# select variance(a::float4), variance(b::float4) from test;
variance | variance
----------+----------
0 | 0
(1 row)
--
-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
Jon Lapham
Extracta Moléculas Naturais, Rio de Janeiro, Brasil
email: lapham@extracta.com.br web: http://www.extracta.com.br/
***-*--*----*-------*------------*--------------------*---------------