Pandas reshaping by index with duplicate indexes
I have a dataframe like this:
questions = ['What color?', 'What day?', 'How cold?', 'What color?', 'What color?']
answers = ['red', 'tuesday', '45', 'blue', 'red']
ids = [0, 1, 2, 3, 0]
df = pd.DataFrame({'id': [0, 1, 2, 0, 0], 'questions': questions, 'answers': answers})
>>> id questions answers
0 What color? red
1 What day? tuesday
2 How cold? 45
0 What color? blue
0 What color? red
I want it:
How cold? What color? What day?
id
0 None red None
2 None None tuesday
3 45 None None
4 None blue None
0 None red None
I tried:
df.pivot(values='answers', index='id', columns='questions')
However, rotation always results in an error due to duplicates in the index.
+3
source to share
2 answers
You can achieve this using the pivot method:
df.pivot(columns="questions",values="answers")
Output
How cold? What color? What day?
0 NaN red NaN
1 NaN NaN tuesday
2 45 NaN NaN
3 NaN blue NaN
4 NaN red NaN
EDIT if you want to keep the indices you had you can do:
new_df = df.pivot(columns="questions",values="answers")
new_df.index = df.index
+5
source to share
If you need duplicates:
df['g'] = df.groupby('id').cumcount()
df = df.set_index(['id','g', 'questions']).unstack().reset_index(level=1, drop=True)
print (df)
questions How cold? What color? What day?
id
0 None red None
0 None blue None
0 None red None
1 None None tuesday
2 45 None None
+3
source to share