query() for MultiIndex columns in a pandas DataFrame?

With a simple (single-level) column index, you can filter rows on a column of a pandas DataFrame using .query() like this:

import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.random.rand(10,2),index=range(10),columns=['A','B'])
df1.query('A > 0.5')

      

I am struggling to do the same thing in a DataFrame with MultiIndex columns:

df2 = pd.DataFrame(np.random.rand(10,2),index=range(10),columns=[['A','B'],['C','D']])
df2.query('(A,C) > 0.5') # fails
df2.query('"(A,C)" > 0.5') # fails
df2.query('("A","C") > 0.5') # fails
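For reference, the selection I am after can be written without query() using plain boolean indexing with a tuple key. This is only a minimal sketch of the expected result, not a query()-based solution; the names mask and expected are made up for illustration:

```python
import numpy as np
import pandas as pd

# Same construction as df2 above: two column levels, giving the
# columns ('A', 'C') and ('B', 'D').
df2 = pd.DataFrame(np.random.rand(10, 2), index=range(10),
                   columns=[['A', 'B'], ['C', 'D']])

# A tuple key selects a single column from the MultiIndex as a Series.
mask = df2[('A', 'C')] > 0.5

# Keep only the rows where that column exceeds 0.5.
expected = df2[mask]
print(expected)
```

This works, but it loses the conciseness of a query() expression, which is what I am looking for.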

      

Is this possible with query()? Thanks.

(As for the motivation: query() seems to allow very concise selection on a DataFrame with a MultiIndex on the rows and a single-level column index, like so:

df3 = pd.DataFrame(np.random.rand(6,2),index=[[0]*3+[1]*3,range(2,8)],columns=['A','B'])
df3.index.names=['one','two']
df3.query('one==0 & two<4 & A>0.5')

      

I would like to do something similar with a DataFrame that has a MultiIndex on both axes.)
