# Python numpy.var returning wrong values

I'm trying to do a simple variance calculation on a set of three numbers:

```
numpy.var([0.82159889, 0.26007962, 0.09818412])
```
which returns

```
0.09609366366174843
```

However, when I calculate the variance by hand, it should be

```
0.1441405
```

Seems like such a simple thing, but I haven't been able to find an answer yet.


The documentation explains:

```
ddof : int, optional
    "Delta Degrees of Freedom": the divisor used in the calculation is
    ``N - ddof``, where ``N`` represents the number of elements. By
    default `ddof` is zero.
```

And you have:

```
>>> numpy.var([0.82159889, 0.26007962, 0.09818412], ddof=0)
0.09609366366174843
>>> numpy.var([0.82159889, 0.26007962, 0.09818412], ddof=1)
0.14414049549262264
```

Both conventions are common, so you always need to check which one is used by whatever package you're working with, in whatever language.
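For example, Python's standard-library `statistics` module ships both conventions under different names, and its default (`statistics.variance`) is the sample convention, the opposite of NumPy's default:

```python
import statistics

vals = [0.82159889, 0.26007962, 0.09818412]

# Population variance: divisor N, same convention as numpy.var(vals)
print(statistics.pvariance(vals))  # ~0.0961

# Sample variance: divisor N - 1, same convention as numpy.var(vals, ddof=1)
print(statistics.variance(vals))   # ~0.1441
```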


`np.var` calculates the population variance by default.

The sum of squared errors can be calculated as follows:

```
>>> vals = [0.82159889, 0.26007962, 0.09818412]
>>> mean = sum(vals)/3.0
>>> mean
0.3932875433333333
>>> sse = sum((mean-val)**2 for val in vals)
>>> sse
0.2882809909852453
```

This is the population variance:

```
>>> sse/3
0.09609366366174843
>>> np.var(vals)
0.09609366366174843
```

This is the sample variance:

```
>>> sse/(3-1)
0.14414049549262264
>>> np.var(vals, ddof=1)
0.14414049549262264
```
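The interpreter session above can be wrapped into a small pure-Python function that mirrors `numpy.var`'s `ddof` convention (the helper name `var` is just for illustration):

```python
def var(vals, ddof=0):
    """Variance with divisor N - ddof, mirroring numpy.var's convention."""
    n = len(vals)
    mean = sum(vals) / n
    sse = sum((mean - v) ** 2 for v in vals)  # sum of squared errors
    return sse / (n - ddof)

vals = [0.82159889, 0.26007962, 0.09818412]
print(var(vals))          # population variance: 0.09609366366174843
print(var(vals, ddof=1))  # sample variance:     0.14414049549262264
```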