Python: Pandas: using slice with .describe() in versions 0.20 and later
Using this example because it's convenient:
http://nbviewer.jupyter.org/gist/aflaxman/436cde71f85b93638959
import pandas as pd
df = pd.DataFrame({'A': [0,0,0,0,1,1],
                   'B': [1,2,3,4,5,6],
                   'C': [8,9,10,11,12,13]})
It used to work!
Now:
>>> pandas.__version__
u'0.20.3'
df.groupby('A').describe().unstack()\
.loc[:,(slice(None),['count','mean']),]
gives:
TypeError: '['count', 'mean']' is an invalid key
1 answer
To select by columns, remove unstack, because in version 0.20.0 the output format of groupby().describe() changed (the statistics are now a second level of the columns):
# keep all rows; in the columns keep every first-level label, but only count and mean
df1 = df.groupby('A').describe().loc[:, (slice(None), ['count','mean'])]
print(df1)
      B          C
  count mean count  mean
A
0   4.0  2.5   4.0   9.5
1   2.0  5.5   2.0  12.5
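If only a couple of statistics are needed, one alternative is to skip describe() and ask groupby for them directly with agg. This is not part of the answer above, just a minimal sketch on the same sample frame; the name stats is made up for illustration. The column layout matches the output above, although count comes back as an integer rather than a float.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 0, 0, 0, 1, 1],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [8, 9, 10, 11, 12, 13]})

# Compute only the two statistics of interest. The result has
# two-level columns: (original column, statistic).
stats = df.groupby('A').agg(['count', 'mean'])
print(stats)
```

This avoids computing std, min, max and the quartiles only to discard them afterwards.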
If you keep unstack, the result is a Series and the MultiIndex is in the index, so the leading : is removed (it only selected all index values). A slice(None) is also added, because the MultiIndex now has 3 levels:
# one indexer per index level: column name, statistic, group key A
df1 = df.groupby('A').describe().unstack()\
        .loc[(slice(None), ['count','mean'], slice(None))]
print(df1)
          A
B  count  0     4.0
          1     2.0
   mean   0     2.5
          1     5.5
C  count  0     4.0
          1     2.0
   mean   0     9.5
          1    12.5
dtype: float64
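The point about levels can be checked by inspecting the objects directly. The sketch below uses the same sample frame; the names described and stacked are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 0, 0, 0, 1, 1],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [8, 9, 10, 11, 12, 13]})

described = df.groupby('A').describe()
stacked = described.unstack()

# describe() alone: a DataFrame whose columns are a 2-level MultiIndex
# (original column, statistic).
print(type(described).__name__, described.columns.nlevels)

# After unstack(): a Series whose index has 3 levels
# (original column, statistic, group key A).
print(type(stacked).__name__, stacked.index.nlevels)
print(stacked.index.names)
```

Since a Series has only one axis, every indexer passed to .loc has to address one of those three index levels; there is no column axis left to slice.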
Alternative solutions:
# pd.IndexSlice allows the usual : syntax instead of slice(None)
idx = pd.IndexSlice
df1 = df.groupby('A').describe().unstack()\
        .loc[idx[:, ['count','mean'], :]]
print(df1)
          A
B  count  0     4.0
          1     2.0
   mean   0     2.5
          1     5.5
C  count  0     4.0
          1     2.0
   mean   0     9.5
          1    12.5
dtype: float64
# same selection, with the axis stated explicitly
df1 = df.groupby('A').describe().unstack()\
        .loc(axis=0)[:, ['count','mean'], :]
print(df1)
          A
B  count  0     4.0
          1     2.0
   mean   0     2.5
          1     5.5
C  count  0     4.0
          1     2.0
   mean   0     9.5
          1    12.5
dtype: float64
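As a quick sanity check, the three spellings can be compared against each other. This is only a sketch on the same sample frame; the variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 0, 0, 0, 1, 1],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [8, 9, 10, 11, 12, 13]})

stacked = df.groupby('A').describe().unstack()
wanted = ['count', 'mean']
idx = pd.IndexSlice

# Three ways of writing the same selection on the 3-level index.
by_tuple = stacked.loc[(slice(None), wanted, slice(None))]
by_index_slice = stacked.loc[idx[:, wanted, :]]
by_axis = stacked.loc(axis=0)[:, wanted, :]

# Series.equals compares both the values and the index.
print(by_tuple.equals(by_index_slice))
print(by_tuple.equals(by_axis))
```

Which form to use is mostly a matter of readability; pd.IndexSlice tends to be the easiest to scan when there are several levels.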
More information is in the pandas documentation, in the section on using slicers.