How to use pivot to get 2 columns in a multiindex column in pandas

Question

I have a data frame with 4 columns (a, b, c, d are the column names):

df = 
a   b   c    d
1   2   3    4
5   2   7    8

Can I use df.pivot() to get 2 columns in a multiindex column? The following doesn't work:

df.pivot('a', ['b', 'c'])

I want to

I know I can use pivot_table to do this easily (pd.pivot_table(df, index='a', columns=['b', 'c'])), but I am curious about the flexibility of pivot, as the documentation is not clear.

+3

pandas dataframe pivot

Zhang18 01 june 17 at 17:45

3 answers

The nearest solution without aggregation is set_index + unstack:

df = df.set_index(['b','c','a'])['d'].unstack([0,1])
print (df)
b    2     
c    3    7
a          
1  4.0  NaN
5  NaN  8.0
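
The same idea can be written with level names instead of positions, which may be easier to read. This is only a sketch: the name `wide` is my own, and the frame is rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({'a': [1, 5], 'b': [2, 2], 'c': [3, 7], 'd': [4, 8]})

# Move 'a', 'b' and 'c' into the index and keep 'd' as the values,
# then unstack 'b' and 'c' by name into a two-level column index.
wide = df.set_index(['a', 'b', 'c'])['d'].unstack(['b', 'c'])
print(wide)
```

Because nothing is aggregated here, unstack raises a ValueError if the same (a, b, c) combination occurs more than once.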

A solution with pivot is possible, but a little crazy: starting from the original df, you need to create a MultiIndex first and transpose at the end:

df = df.set_index(['b','c'])
df = df.pivot(columns='a')['d'].T
print (df)
b    2     
c    3    7
a          
1  4.0  NaN
5  NaN  8.0

+2

jezrael 01 june 17 at 17:55

We can also use pd.crosstab:

In [80]: x
Out[80]:
   a  b  c  d
0  1  2  3  4
1  5  2  7  8

In [81]: pd.crosstab(x.a, [x.b, x.c], x.d, aggfunc='mean')
Out[81]:
b    2
c    3    7
a
1  4.0  NaN
5  NaN  8.0
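
For reference, here is a self-contained version of the same call with keyword arguments spelled out. The variable name `res` is my own. Note that crosstab requires aggfunc whenever values is passed, so this route always aggregates, even when every cell holds a single value.

```python
import pandas as pd

# Sample frame from the question
x = pd.DataFrame({'a': [1, 5], 'b': [2, 2], 'c': [3, 7], 'd': [4, 8]})

# Rows come from 'a'; the list of two Series produces the
# two-level column index (b, c); 'd' is aggregated with the mean.
res = pd.crosstab(
    index=x['a'],
    columns=[x['b'], x['c']],
    values=x['d'],
    aggfunc='mean',
)
print(res)
```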

+2

MaxU 01 june 17 at 18:20

piRSquared · Accepted Answer · 2017-06-01T18:05:59+0000

Clearly some implementation bits are missing, and I think you found them. We have workarounds, but you're right: the documentation says that the columns parameter can be an object, yet nothing seems to work. I trust @MaxU and @jezrael gave it a good try, and none of us seem to be able to get it to work as the documentation says. I call this a bug! I may report it if no one else does, or has, before I get to it.

That said, I found something quirky. My plan was to pass the list to the index parameter instead and then transpose. But instead, the strings 'c' and 'b' themselves are used as index values, which is not what I wanted.

This is strange:

df.pivot(['c', 'b'], 'a', 'd')

a    1    5
b  NaN  8.0
c  4.0  NaN

This, on the other hand, looks fine:

df.pivot('a', 'b', 'd')

b  2
a   
1  4
5  8

But the error here is confusing:

print(df.pivot('a', ['b'], 'd'))

KeyError: 'Level b not found'

The quest continues ...
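
A note for readers on newer pandas: as far as I know, list-like index and columns arguments to pivot were added in pandas 1.1, and since pandas 2.0 the arguments have to be passed by keyword. Assuming such a version, a minimal sketch of the call the question was after (the name `wide` is my own):

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({'a': [1, 5], 'b': [2, 2], 'c': [3, 7], 'd': [4, 8]})

# Assumption: pandas >= 1.1, where `columns` may be a list of column names.
# Keyword arguments are used throughout because positional arguments
# are no longer accepted by pivot in pandas 2.x.
wide = df.pivot(index='a', columns=['b', 'c'], values='d')
print(wide)
```

On the older version discussed in this thread, the same call fails, which is what prompted the workarounds above.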

OP's own answer

Using pivot_table

df.pivot_table(values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')

df.pivot_table('d', 'a', ['b', 'c'])

b    2     
c    3    7
a          
1  4.0  NaN
5  NaN  8.0
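
One difference from pivot worth keeping in mind is that pivot_table aggregates duplicate keys instead of raising. The sketch below illustrates this with a hypothetical extra row that is not part of the question's data; the names `wide`, `dup` and `wide_dup` are my own.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({'a': [1, 5], 'b': [2, 2], 'c': [3, 7], 'd': [4, 8]})

# Keyword form of the call above
wide = df.pivot_table(values='d', index='a', columns=['b', 'c'])

# Hypothetical extra row that repeats the key (a=1, b=2, c=3)
extra = pd.DataFrame({'a': [1], 'b': [2], 'c': [3], 'd': [6]})
dup = pd.concat([df, extra], ignore_index=True)

# The two 'd' values for the repeated key (4 and 6) are averaged
wide_dup = dup.pivot_table(values='d', index='a', columns=['b', 'c'], aggfunc='mean')
print(wide_dup)
```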
