Pandas DataFrame.describe(): What is the standard deviation?

Question


Using the Python pandas library, the DataFrame.describe() function reports the standard deviation of each column in the dataset. However, the documentation page does not say whether this is the "uncorrected" standard deviation (dividing by N) or the "corrected" standard deviation (dividing by N-1).

Can someone tell me which one it is returning?

+3

python pandas dataframe standard-deviation

hlin117 08 Sep 14 at 6:02


2 answers

DataFrame.describe() calls Series.std() to get the standard deviation. As the documentation for Series.std() tells us:

Returns the unbiased standard deviation along the requested axis.

Normalized to N-1 by default. This can be changed with the ddof argument.

Thus, the standard deviation returned by describe() is the "corrected sample standard deviation".
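A quick way to check this is to compare the "std" row of describe() with Series.std() directly. The following is a minimal sketch; the DataFrame, its column name "x", and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only to compare the two results.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})

# The "std" row of the summary table produced by describe().
described_std = df.describe().loc["std", "x"]

# Series.std() with its default, and with the default spelled out (ddof=1).
default_std = df["x"].std()
explicit_corrected = df["x"].std(ddof=1)

print(described_std, default_std, explicit_corrected)
```

All three values should agree, which is consistent with describe() reporting the N-1 normalized value.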

+3

Carsten 08 Sep 14 at 6:53


Andy Hayden · Accepted Answer · 2014-09-08T06:53:39+0000

This is the corrected sample standard deviation.
You can verify this with a simple Series by applying the formula yourself:

In [10]: import pandas as pd

In [11]: s = pd.Series([1, 2])

In [12]: s.std()
Out[12]: 0.70710678118654757

In [13]: from math import sqrt
   ....:  sqrt(0.5)
Out[13]: 0.7071067811865476

and the formula for the corrected sample standard deviation:

In [14]: sqrt(1./(len(s)-1) * ((s - s.mean()) ** 2).sum())
Out[14]: 0.7071067811865476
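For contrast, the same Series can be run through both normalizations. This sketch assumes the same two-element Series as above; the names n, squared_dev, corrected and uncorrected are illustrative. It also shows that NumPy's np.std defaults to ddof=0, which is a common source of mismatches when comparing against pandas.

```python
from math import sqrt

import numpy as np
import pandas as pd

s = pd.Series([1, 2])
n = len(s)

# Sum of squared deviations from the mean: (1 - 1.5)**2 + (2 - 1.5)**2
squared_dev = ((s - s.mean()) ** 2).sum()

# Corrected (sample) standard deviation: divide by N - 1.
corrected = sqrt(squared_dev / (n - 1))

# Uncorrected (population) standard deviation: divide by N.
uncorrected = sqrt(squared_dev / n)

print(corrected, s.std())                    # pandas default is ddof=1
print(uncorrected, s.std(ddof=0))            # request the uncorrected value
print(uncorrected, np.std(s.to_numpy()))     # NumPy default is ddof=0
```

If the uncorrected value is what you need, pass ddof=0 to Series.std() or DataFrame.std() rather than reading it from describe().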
