Pandas: split column of lists of unequal length into multiple columns
I have a Pandas DataFrame that looks like this:
                   codes
1                [71020]
2                [77085]
3                [36415]
4         [99213, 99287]
5  [99233, 99233, 99233]
I am trying to split the lists in df['codes'] into columns, like below:
   code_1  code_2  code_3
1   71020
2   77085
3   36415
4   99213   99287
5   99233   99233   99233
where columns that have no value (because the list wasn't that long) are filled with blanks or NaN or something.
I've seen answers like this and others similar to it, and while they work on lists of equal length, they all throw errors when I apply them to lists of unequal length. Is there a good way to do this?
2 answers
Try:
pd.DataFrame(df.codes.values.tolist()).add_prefix('code_')
   code_0   code_1   code_2
0   71020      NaN      NaN
1   77085      NaN      NaN
2   36415      NaN      NaN
3   99213  99287.0      NaN
4   99233  99233.0  99233.0
To keep the original index, pass it as the second argument:
pd.DataFrame(df.codes.values.tolist(), df.index).add_prefix('code_')
   code_0   code_1   code_2
1   71020      NaN      NaN
2   77085      NaN      NaN
3   36415      NaN      NaN
4   99213  99287.0      NaN
5   99233  99233.0  99233.0
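The `99287.0` values above appear because NaN is a float, so any column that contains a missing value is stored as float64. If whole numbers matter, one option is to convert to the nullable integer dtype. The following is a minimal sketch, assuming a pandas version that supports `Int64`; the names `wide` and `wide_int` are only illustrative, and the DataFrame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the example data from the question (index 1..5).
df = pd.DataFrame(
    {"codes": [[71020], [77085], [36415], [99213, 99287], [99233, 99233, 99233]]},
    index=[1, 2, 3, 4, 5],
)

# Expand the lists; shorter lists are padded with NaN, which forces float64.
wide = pd.DataFrame(df["codes"].tolist(), index=df.index).add_prefix("code_")

# Convert to the nullable integer dtype so values stay whole numbers
# and missing entries become <NA> instead of NaN.
wide_int = wide.astype("Int64")
print(wide_int)
```

This keeps the missing cells as real missing values, which is usually easier to work with later than empty strings.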
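To match the requested layout exactly (column names starting at 1, blanks instead of NaN), we can put it all together like this:
# Name columns code_1, code_2, ... instead of 0, 1, ...
f = lambda x: 'code_{}'.format(x + 1)
pd.DataFrame(
    df.codes.values.tolist(),
    df.index, dtype=object  # object dtype keeps the integers from becoming floats
).fillna('').rename(columns=f)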
  code_1 code_2 code_3
1  71020
2  77085
3  36415
4  99213  99287
5  99233  99233  99233
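If this is needed in more than one place, the same steps could be wrapped in a small function and joined back onto the original frame. This is only a sketch: `split_list_column` and its parameters `prefix` and `fill` are hypothetical names, not part of pandas, and the example data is rebuilt so the snippet is self-contained.

```python
import pandas as pd


def split_list_column(frame, column, prefix="code_", fill=""):
    """Hypothetical helper: expand a column of lists into numbered columns."""
    # object dtype keeps the integers as integers despite the padding.
    wide = pd.DataFrame(frame[column].tolist(), index=frame.index, dtype=object)
    # Replace the padding for short lists with the chosen fill value.
    wide = wide.fillna(fill)
    # Number the new columns from 1: code_1, code_2, ...
    wide.columns = ["{}{}".format(prefix, i + 1) for i in range(wide.shape[1])]
    return wide


df = pd.DataFrame(
    {"codes": [[71020], [77085], [36415], [99213, 99287], [99233, 99233, 99233]]},
    index=[1, 2, 3, 4, 5],
)

# Keep the original column and add the expanded ones alongside it.
result = df.join(split_list_column(df, "codes"))
print(result)
```

Passing `fill=float("nan")` or leaving the NaN values in place may be preferable if the columns will be used in numeric work afterwards, since empty strings make the columns mixed-type.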