How to use pivot to get 2 columns in a multiindex column in pandas

I have a data frame with 4 columns (a, b, c, d are the column names):

df = 
a   b   c    d
1   2   3    4
5   2   7    8

      

Can I use df.pivot() to get 2 columns in a multiindex column? The following doesn't work:

df.pivot('a', ['b', 'c'])

      

I want to get:

b  2
c  3   7
a  
1  4   NA
5  NA  8

      

I know I can do this easily with pivot_table ( pd.pivot_table(df, index='a', columns=['b', 'c']) ), but I am curious about the flexibility of pivot, as the documentation is not clear.



3 answers


Clearly some bits of the implementation are missing, and I think you found them. We have workarounds, but you're right: the documentation says that the columns parameter can be an object, yet nothing seems to work. I trust @MaxU and @jezrael gave it a good try, and none of us seem able to make it work the way the documentation says it should. I call this a bug! I may report it if nobody else has by the time I get to it.


That said, I found something odd. My plan was to pass the list as the index parameter instead and then transpose. But then the strings 'c' and 'b' themselves are used as index values, which is not what I wanted.

This is the strange result:

df.pivot(['c', 'b'], 'a', 'd')

a    1    5
b  NaN  8.0
c  4.0  NaN

      


Also, this looks fine:

df.pivot('a', 'b', 'd')

b  2
a   
1  4
5  8

      

But the error here is confusing:



print(df.pivot('a', ['b'], 'd'))

      

KeyError: 'Level b not found'

      

The quest continues ...
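
A note for readers on a more recent pandas: to my knowledge, releases from 1.1.0 onward accept a list for index and columns in pivot, so the call from the question should work there when written with keywords. A minimal sketch, assuming pandas 1.1 or later and the example frame from the question (the name result is mine):

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({'a': [1, 5], 'b': [2, 2], 'c': [3, 7], 'd': [4, 8]})

# Assumption: pandas >= 1.1, where `columns` may be a list of column names.
# Keyword arguments are used because newer releases no longer accept
# positional arguments in pivot.
result = df.pivot(index='a', columns=['b', 'c'], values='d')
print(result)
```

On older versions, such as the one these answers were written against, this call is expected to fail as described above, so the workarounds in the other answers still apply there.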


OP's own answer
ignore

Using pivot_table

df.pivot_table(values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')

df.pivot_table('d', 'a', ['b', 'c'])

b    2     
c    3    7
a          
1  4.0  NaN
5  NaN  8.0

      



The closest solution without aggregation is set_index + unstack:

# Move b, c and a into the index, then unstack b and c (levels 0 and 1) into the columns
out = df.set_index(['b','c','a'])['d'].unstack([0,1])
print (out)
b    2     
c    3    7
a          
1  4.0  NaN
5  NaN  8.0
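
The "without aggregation" distinction starts to matter once the key columns contain duplicates. A small sketch, using a hypothetical frame dup (not from the question) in which one (a, b, c) combination appears twice: pivot_table averages the duplicates by default, while set_index + unstack refuses to reshape.

```python
import pandas as pd

# Hypothetical data: the combination (a=1, b=2, c=3) appears twice
dup = pd.DataFrame({'a': [1, 1, 5],
                    'b': [2, 2, 2],
                    'c': [3, 3, 7],
                    'd': [4, 6, 8]})

# pivot_table aggregates duplicates (the default aggfunc is the mean)
agg = dup.pivot_table(values='d', index='a', columns=['b', 'c'])
print(agg)

# set_index + unstack does not aggregate, so duplicate keys raise an error
try:
    dup.set_index(['b', 'c', 'a'])['d'].unstack(['b', 'c'])
    unstack_failed = False
except ValueError as exc:
    unstack_failed = True
    print('unstack failed:', exc)
```

Which behaviour is preferable depends on the data: silent averaging is convenient, but an error can be the safer signal when duplicates are not expected.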

      



A solution with pivot, but it is a little crazy: you need to create a MultiIndex first and transpose at the end:

# Put b and c into a MultiIndex, pivot a into the columns, then transpose
out = df.set_index(['b','c'])
out = out.pivot(columns='a')['d'].T
print (out)
b    2     
c    3    7
a          
1  4.0  NaN
5  NaN  8.0

      



We can also use pd.crosstab:

In [80]: x
Out[80]:
   a  b  c  d
0  1  2  3  4
1  5  2  7  8

In [81]: pd.crosstab(x.a, [x.b, x.c], x.d, aggfunc='mean')
Out[81]:
b    2
c    3    7
a
1  4.0  NaN
5  NaN  8.0
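
As a quick sanity check, the approaches from these answers can be compared on the example frame. This is only a sketch; the variable names are mine, and it compares labels and values rather than asserting that the frames are identical in every detail.

```python
import numpy as np
import pandas as pd

# Example frame from the question
df = pd.DataFrame({'a': [1, 5], 'b': [2, 2], 'c': [3, 7], 'd': [4, 8]})

via_pivot_table = df.pivot_table(values='d', index='a', columns=['b', 'c'])
via_unstack = df.set_index(['b', 'c', 'a'])['d'].unstack(['b', 'c'])
via_crosstab = pd.crosstab(df['a'], [df['b'], df['c']],
                           values=df['d'], aggfunc='mean')

# Compare row labels, column labels and cell values;
# equal_nan=True treats the NaN cells as matching each other
for other in (via_unstack, via_crosstab):
    assert list(other.index) == list(via_pivot_table.index)
    assert list(other.columns) == list(via_pivot_table.columns)
    assert np.allclose(other.to_numpy(dtype=float),
                       via_pivot_table.to_numpy(dtype=float),
                       equal_nan=True)
```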

      
