query() for MultiIndex columns in a pandas DataFrame?
With a simple (single-level) column index, you can filter on a column in a pandas DataFrame using .query() like this:
import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.random.rand(10,2),index=range(10),columns=['A','B'])
df1.query('A > 0.5')
I am struggling to do the same thing in a DataFrame whose columns are a MultiIndex:
df2 = pd.DataFrame(np.random.rand(10,2),index=range(10),columns=[['A','B'],['C','D']])
df2.query('(A,C) > 0.5') # fails
df2.query('"(A,C)" > 0.5') # fails
df2.query('("A","C") > 0.5') # fails
Is this doable? Thanks.
(As for the motivation: query() seems to allow very concise selection on a DataFrame with a MultiIndex on the rows and a single-level column index, like so:
df3 = pd.DataFrame(np.random.rand(6,2),index=[[0]*3+[1]*3,range(2,8)],columns=['A','B'])
df3.index.names=['one','two']
df3.query('one==0 & two<4 & A>0.5')
I would like to do something similar with a DataFrame that has a MultiIndex on both axes.)
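For reference, here is a minimal sketch of two workarounds that avoid passing a tuple column name to query(). Neither is claimed to be the intended pandas approach. The first uses plain boolean indexing with the tuple key. The second assumes it is acceptable to temporarily flatten the column levels into single strings; the names `flat`, `A_C` and `B_D` are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Same shape as df2 above, but seeded so the example is reproducible.
rng = np.random.default_rng(0)
df2 = pd.DataFrame(rng.random((10, 2)), index=range(10),
                   columns=[['A', 'B'], ['C', 'D']])

# Workaround 1: boolean indexing, selecting the column by its tuple key.
mask_result = df2[df2[('A', 'C')] > 0.5]

# Workaround 2 (assumption: flattening the columns is acceptable).
# Join the two levels into one string per column so query() sees
# ordinary identifiers, then restore the original MultiIndex columns.
flat = df2.copy()
flat.columns = ['_'.join(col) for col in df2.columns]  # 'A_C', 'B_D'
query_result = flat.query('A_C > 0.5')
query_result.columns = df2.columns
```

Both results select the same rows. The second form keeps the query() string syntax, so it can be combined with index-level names as in the df3 example.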