Why is the average in this array greater than the maximum?
I found myself with a very confusing array in Python. Below is the output from IPython (started with the `--pylab` flag) when I work with it:
```
In [1]: x = np.load('x.npy')

In [2]: x.shape
Out[2]: (504000,)

In [3]: x
Out[3]:
array([ 98.20354462,  98.26583099,  98.26529694, ...,  98.20297241,
        98.29492188], dtype=float32)

In [4]: min(x), mean(x), max(x)
Out[4]: (97.950058, 98.689438, 98.329773)
```
I have no idea what's going on. Why does the `mean()` function give what is obviously the wrong answer?
I don't even know where to start debugging this problem.
I am using Python 2.7.6.
I can share the file if that would help.
Probably because of accumulated round-off error when computing `mean()`. float32 has a relative precision of ~1e-7 and you have ~500,000 elements, so a naive element-by-element sum can be off by roughly 1e-7 × 500,000 ≈ 5%.
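To see this effect concretely, here is a sketch with hypothetical data standing in for the asker's `x.npy` (a constant array near 98.2, not the real file): a naive running float32 sum rounds after every addition, and once the partial sum reaches tens of millions, each float32 step (ulp) is 2-4, so the error grows with every element.

```python
import numpy as np

# Hypothetical stand-in for the asker's data: 500,000 float32 values near 98.2.
x = np.full(500_000, 98.2, dtype=np.float32)

# Naive left-to-right accumulation, kept in float32 the whole way.
naive = np.float32(0.0)
for v in x:
    naive = naive + v          # float32 + float32 stays float32

# Reference sum computed in float64.
exact = x.astype(np.float64).sum()

print(naive / len(x))   # drifts visibly away from 98.2
print(exact / len(x))   # ~98.2
```

The naive float32 mean ends up several tenths away from the true value, which is exactly the kind of discrepancy shown in the question.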
The latest NumPy release, 1.9.0, uses a more accurate algorithm (pairwise summation) for `sum()` and `mean()`:
```
>>> import numpy
>>> numpy.__version__
'1.9.0'
>>> x = numpy.random.random(500000).astype("float32") + 300
>>> min(x), numpy.mean(x), max(x)
(300.0, 300.50024, 301.0)
```
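Pairwise summation recursively splits the array and sums the two halves, so the partial sums being added are always of similar magnitude and rounding errors grow roughly as O(log n) instead of O(n). A minimal sketch of the idea (an illustration only, not NumPy's actual C implementation, which uses a larger, unrolled base case):

```python
import numpy as np

def pairwise_sum(a):
    # Recursively split and sum halves; errors grow ~O(log n) instead of
    # the O(n) of a left-to-right loop.
    n = len(a)
    if n <= 8:                      # small base case: plain loop
        s = np.float32(0.0)
        for v in a:
            s = s + v
        return s
    mid = n // 2
    return pairwise_sum(a[:mid]) + pairwise_sum(a[mid:])

x = np.full(500_000, 98.2, dtype=np.float32)
print(pairwise_sum(x) / len(x))     # stays close to 98.2 even in float32
```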
In the meantime, you can use a higher-precision accumulator type via the `dtype` argument, e.g. `numpy.mean(x, dtype=numpy.float64)`.
I've included a snippet from the `numpy.mean` documentation below. You should try the `dtype` keyword it describes.
> The arithmetic mean is the sum of the elements along the axis divided by the number of elements. Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for `float32` (see example below). Specifying a higher-precision accumulator using the `dtype` keyword can alleviate this issue.
>
> In single precision, `mean` can be inaccurate:
>
>     >>> a = np.zeros((2, 512*512), dtype=np.float32)
>     >>> a[0, :] = 1.0
>     >>> a[1, :] = 0.1
>     >>> np.mean(a)
>     0.546875
>
> Computing the mean in float64 is more accurate:
>
>     >>> np.mean(a, dtype=np.float64)
>     0.55000000074505806