Python pandas: how can I pass a colon ":" to an indexer through a variable?

I am working on a method that will eventually operate on slices of data from a large pandas DataFrame with a MultiIndex. I can create masks for each indexer (basically lists of values that define a slice):

df.loc[idx[a_mask,b_mask],idx[c_mask,d_mask]]

      

That works nicely, but in some scenarios I would really like to select everything along some of these axes, which is equivalent to:

df.loc[idx[a_mask,b_mask],idx[:,d_mask]]

      

Is there a way to pass the colon ":" that replaces c_mask in the second example as a variable? Ideally, I would just set c_mask to ":", but of course that doesn't work (and it shouldn't, because what if we had a column with that name?). But is there a way to pass, via a variable, a value that selects the entire axis along one of these indexers?

I realize that I could create a mask that selects everything by collecting all the values along the corresponding axis, but that is non-trivial and adds a lot of code. Likewise, I could split the data access into five scenarios (one for each of the four indexers replaced by ":", and one with all four masks), but that doesn't adhere to the DRY principle and is still fragile, because it cannot handle selecting the whole axis in more than one direction at once.

So, is there anything I can pass via a variable that will select an entire direction in the indexer, the way ":" would? Or is there a more elegant way to optionally select an entire direction?



1 answer


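idx[slice(None)] is equivalent to idx[:], so you can store slice(None) in a variable and pass it wherever you would otherwise write the colon.

The three selections below are all equivalent.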



In [11]: df = pd.DataFrame({'A' : np.random.randn(9)},index=pd.MultiIndex.from_product([range(3),list('abc')],names=['first','second']))

In [12]: df
Out[12]: 
                     A
first second          
0     a      -0.668344
      b      -1.679159
      c       0.061876
1     a      -0.237272
      b       0.136495
      c      -1.296027
2     a       0.554533
      b       0.433941
      c      -0.014107

In [13]: idx = pd.IndexSlice

In [14]: df.loc[idx[:,'b'],]
Out[14]: 
                     A
first second          
0     b      -1.679159
1     b       0.136495
2     b       0.433941

In [15]: df.loc[idx[slice(None),'b'],]
Out[15]: 
                     A
first second          
0     b      -1.679159
1     b       0.136495
2     b       0.433941

In [16]: df.loc[(slice(None),'b'),]
Out[16]: 
                     A
first second          
0     b      -1.679159
1     b       0.136495
2     b       0.433941
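
Applied to the four-mask case from the question, this might look like the sketch below. The frame, its level names, and the select helper are hypothetical and exist only for illustration; the point is that slice(None) can serve as the default value for any mask that should select the entire axis.

```python
import numpy as np
import pandas as pd

idx = pd.IndexSlice

# Hypothetical frame with a two-level MultiIndex on both rows and columns.
rows = pd.MultiIndex.from_product([range(3), list("abc")], names=["a", "b"])
cols = pd.MultiIndex.from_product([["x", "y"], ["p", "q"]], names=["c", "d"])
df = pd.DataFrame(np.arange(36).reshape(9, 4), index=rows, columns=cols)

# slice(None) is the object that a bare ":" turns into inside idx[...].
ALL = slice(None)

def select(frame, a_mask=ALL, b_mask=ALL, c_mask=ALL, d_mask=ALL):
    """Hypothetical helper: any mask left at its default selects the whole level."""
    return frame.loc[idx[a_mask, b_mask], idx[c_mask, d_mask]]

# c_mask is omitted, so this is the same as idx[:, ["q"]] on the columns.
sub = select(df, a_mask=[0, 2], b_mask=["b"], d_mask=["q"])
print(sub)
```

A single code path then covers every combination of "mask" and "everything", so the five-scenario split described in the question is not needed. Note that slicing a MultiIndex this way generally expects the index to be lexsorted, which from_product guarantees here.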

      
