Pandas DataFrame from a nested list of lists
I have a dataset that follows this format:
data = [[[1, 0, 1000], [2, 1000, 2000]], [[1, 0, 1500], [2, 1500, 2500], [2, 2500, 4000]]]
var1 = [10.0, 20.0]
var2 = ['ref1', 'ref2']
I want to convert it to a DataFrame:
dic = {'var1': var1, 'var2': var2, 'data': data}
import pandas as pd
pd.DataFrame(dic)
Result:
However, I'm trying to get something like this:
I tried to flatten the dictionary and its lists, but with no success:
pd.DataFrame([[col1, col2] for col1, d in dic.items() for col2 in d])
See the result:
The inner lists have different lengths, which makes it hard to unpack one more level. I'm not sure whether pandas can take care of this, or whether it needs to be done before loading the data into pandas.
2 answers
Building a flat list of rows first, with var1 and var2 appended to each inner list, works:
new_data = []
for x, v1, v2 in zip(data, var1, var2):
    # one row per inner list, with this group's var1 and var2 appended
    new_data.extend([y + [v1] + [v2] for y in x])
pd.DataFrame(new_data, columns=['data', 'min', 'max', 'var1', 'var2'])
gives:
   data   min   max  var1  var2
0     1     0  1000    10  ref1
1     2  1000  2000    10  ref1
2     1     0  1500    20  ref2
3     2  1500  2500    20  ref2
4     2  2500  4000    20  ref2
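The same idea can be sketched as a single nested comprehension. This is only a self-contained illustration using the data, var1 and var2 from the question; the names rows and flat are made up for the example.

```python
import pandas as pd

# Sample input copied from the question.
data = [[[1, 0, 1000], [2, 1000, 2000]],
        [[1, 0, 1500], [2, 1500, 2500], [2, 2500, 4000]]]
var1 = [10.0, 20.0]
var2 = ['ref1', 'ref2']

# One output row per inner triple, with the group's var1 and var2 appended.
# The groups may have different lengths; zip pairs each group with its values.
rows = [
    triple + [v1, v2]
    for group, v1, v2 in zip(data, var1, var2)
    for triple in group
]

flat = pd.DataFrame(rows, columns=['data', 'min', 'max', 'var1', 'var2'])
print(flat)
```

Because the flattening happens in plain Python before pandas sees the data, the differing lengths of the inner lists are not a problem.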
You can iterate over the rows of your temporary DataFrame and build one small DataFrame per row:
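df = pd.DataFrame(dic)
result = []
for i, d in df.iterrows():
    # expand this row's nested list into its own three-column frame
    temp = pd.DataFrame(d['data'], columns=['data', 'min', 'max'])
    # scalar assignment repeats var1 and var2 on every row of temp
    temp['var1'] = d['var1']
    temp['var2'] = d['var2']
    result.append(temp)
pd.concat(result)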
This creates
   data   min   max  var1  var2
0     1     0  1000    10  ref1
1     2  1000  2000    10  ref1
0     1     0  1500    20  ref2
1     2  1500  2500    20  ref2
2     2  2500  4000    20  ref2
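If pandas should do the unpacking itself, one possible alternative (not part of either answer) is DataFrame.explode. This is a sketch that assumes pandas 0.25 or newer; the names exploded, parts and flat are only illustrative.

```python
import pandas as pd

# Sample input copied from the question.
data = [[[1, 0, 1000], [2, 1000, 2000]],
        [[1, 0, 1500], [2, 1500, 2500], [2, 2500, 4000]]]
var1 = [10.0, 20.0]
var2 = ['ref1', 'ref2']
df = pd.DataFrame({'var1': var1, 'var2': var2, 'data': data})

# explode() gives each element of the list in 'data' its own row and
# repeats var1 and var2 alongside it (assumes pandas >= 0.25).
exploded = df.explode('data').reset_index(drop=True)

# Each 'data' cell is now a 3-element list; spread it into three columns.
parts = pd.DataFrame(exploded['data'].tolist(), columns=['data', 'min', 'max'])

# Both frames share the index 0..4, so they line up side by side.
flat = pd.concat([parts, exploded[['var1', 'var2']]], axis=1)
print(flat)
```

Note that the loop-based output above repeats index labels (0, 1, 0, 1, 2) because each temporary frame starts at 0; passing ignore_index=True to pd.concat yields a continuous 0 to 4 index instead.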