Python: Pandas use slice with .describe() in versions 0.20 and later

Question

I'm using the example below because it's convenient.

http://nbviewer.jupyter.org/gist/aflaxman/436cde71f85b93638959

df = pd.DataFrame({'A': [0,0,0,0,1,1],
                   'B': [1,2,3,4,5,6],
                   'C': [8,9,10,11,12,13]})

It used to work!

Now:

>>> pandas.__version__
u'0.20.3'

df.groupby('A').describe().unstack()\
    .loc[:,(slice(None),['count','mean']),]

gives:

TypeError: '['count', 'mean']' is an invalid key

python pandas slice

Merlin, Jul 10 '17 at 4:31

1 answer

jezrael · Accepted Answer · 2017-07-10T04:33:47+0000

To select from the columns, remove unstack, because version 0.20.0 changed the output format of groupby describe:

# assign to a new name so the original df stays available for the later snippets
df1 = df.groupby('A').describe().loc[:,(slice(None),['count','mean'])]
print(df1)

      B          C      
  count mean count  mean
A                       
0   4.0  2.5   4.0   9.5
1   2.0  5.5   2.0  12.5
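
To see why the column selection works without unstack, it can help to inspect the frame that describe() returns. This is only a minimal sketch, assuming pandas 0.20 or later and the sample frame from the question; the names desc and picked are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 0, 0, 0, 1, 1],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [8, 9, 10, 11, 12, 13]})

# Since 0.20, describe() on a groupby keeps the group key in the row index
# and puts (column, statistic) pairs in a two-level column MultiIndex.
desc = df.groupby('A').describe()
print(desc.index.name)       # A
print(desc.columns.nlevels)  # 2

# One selector per column level: every column name, then two statistics.
picked = desc.loc[:, (slice(None), ['count', 'mean'])]
print(list(picked.columns))
```

Because the MultiIndex is already on the columns, the row selector stays `:` and the tuple needs only two entries.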

If you keep unstack, the result is a Series and the MultiIndex is in the index, so the leading : is removed, because it only selected all index values. A slice(None) is also added, because the MultiIndex has 3 levels:

# assign to a new name so the original df stays available for the later snippets
df1 = df.groupby('A').describe().unstack()\
    .loc[(slice(None),['count','mean'],slice(None))]

print(df1)

          A
B  count  0     4.0
          1     2.0
   mean   0     2.5
          1     5.5
C  count  0     4.0
          1     2.0
   mean   0     9.5
          1    12.5
dtype: float64
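
The three-level claim can be checked directly on the unstacked result. The following is a rough sketch under the same assumptions (pandas 0.20 or later, the question's sample frame); s and picked are hypothetical names used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 0, 0, 0, 1, 1],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [8, 9, 10, 11, 12, 13]})

# unstack() moves both column levels into the index, so the result is a
# Series indexed by (column, statistic, A).
s = df.groupby('A').describe().unstack()
print(type(s).__name__)  # Series
print(s.index.nlevels)   # 3

# A Series has no column axis, so the key is a single tuple with one
# selector per index level.
picked = s.loc[(slice(None), ['count', 'mean'], slice(None))]
print(picked.loc[('B', 'mean', 0)])
```

This is why the two-part key from the question no longer fits: there is nothing left on the column axis to select.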

Alternative solutions:

idx = pd.IndexSlice
# assign to a new name so the original df stays available for the later snippets
df1 = df.groupby('A').describe().unstack()\
    .loc[idx[:,['count','mean'],:]]

print(df1)
          A
B  count  0     4.0
          1     2.0
   mean   0     2.5
          1     5.5
C  count  0     4.0
          1     2.0
   mean   0     9.5
          1    12.5
dtype: float64

# assign to a new name so the original df stays available
df1 = df.groupby('A').describe().unstack()\
    .loc(axis=0)[:,['count','mean'],:]

print(df1)
          A
B  count  0     4.0
          1     2.0
   mean   0     2.5
          1     5.5
C  count  0     4.0
          1     2.0
   mean   0     9.5
          1    12.5
dtype: float64
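
As a quick sanity check, the three spellings can be compared against each other. This is a sketch, again assuming pandas 0.20 or later and the question's sample frame; the names by_tuple, by_index_slice and by_axis are made up for this comparison.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 0, 0, 0, 1, 1],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [8, 9, 10, 11, 12, 13]})

# Build the unstacked Series once and keep df untouched.
s = df.groupby('A').describe().unstack()
idx = pd.IndexSlice

# Three ways to write the same selection on the 3-level index.
by_tuple = s.loc[(slice(None), ['count', 'mean'], slice(None))]
by_index_slice = s.loc[idx[:, ['count', 'mean'], :]]
by_axis = s.loc(axis=0)[:, ['count', 'mean'], :]

print(by_tuple.equals(by_index_slice) and by_tuple.equals(by_axis))
```

pd.IndexSlice and .loc(axis=0) only provide the : shorthand in place of slice(None); which one to use is a matter of readability.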

More information is in the pandas documentation, in the section on using slicers.
