How can I convert a dict inside a list to a DataFrame in Python?

Python beginner here. I am struggling to dump a list of dicts into a pandas.DataFrame at once. My data has the following structure:

a = {'Scores': {'s1': [{'Math': '95',
                        'Science': '74.5',
                        'English': '60.5'},
                       {'Math': '87.9',
                        'Science': '97.3',
                        'English': '78.3'}],
                's2': [{'Math': '67.2',
                        'Science': '74.2',
                        'English': '89'}]}}

      

The columns of my pandas.DataFrame should be Math, Science and English, and the rows should be the grades. The columns are created dynamically, so I cannot specify the column names explicitly. All I need are the values of the keys s1 ... sn.

This is what I have tried so far:

import pandas as pd

b = a.pop('Scores')
c = list(b.values())
df = pd.DataFrame(c)

      

This gives me a DataFrame that looks like this:

                                               0  \
0  {'Math': '95', 'Science': '74.5', 'English': '...
1  {'Math': '67.2', 'Science': '74.2', 'English':...

                                               1
0  {'Math': '87.9', 'Science': '97.3', 'English':...
1                                               None

      

Instead, I'm looking for:

Math  Science  English
95    74.5     60.5
87.9  97.3     78.3
67.2  74.2     89

      

Any help would be greatly appreciated.



3 answers


You can use sum() to flatten the lists after iterating over the dict values.

Code:

import pandas as pd

# sum(..., []) concatenates the per-key lists into one flat list of dicts
data = sum([x for x in a['Scores'].values()], [])
print(pd.DataFrame(data, columns=['Math', 'Science', 'English']))

      

Test data:



a = {'Scores': {'s1': [{'Math': '95',
                        'Science': '74.5',
                        'English': '60.5'},
                       {'Math': '87.9',
                        'Science': '97.3',
                        'English': '78.3'}],
                's2': [{'Math': '67.2',
                        'Science': '74.2',
                        'English': '89'}]}}

      

Result:

   Math Science English
0  67.2    74.2      89
1    95    74.5    60.5
2  87.9    97.3    78.3
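
Since the question says the column names are not known in advance, here is a minimal sketch of the same flattening idea without the hard-coded columns argument. It swaps sum() for itertools.chain.from_iterable (a different technique from the one above, chosen here because it does not build intermediate lists). The names rows and df are only illustrative.

```python
from itertools import chain

import pandas as pd

a = {'Scores': {'s1': [{'Math': '95', 'Science': '74.5', 'English': '60.5'},
                       {'Math': '87.9', 'Science': '97.3', 'English': '78.3'}],
                's2': [{'Math': '67.2', 'Science': '74.2', 'English': '89'}]}}

# Flatten the per-key lists (s1, s2, ...) into one list of row dicts.
rows = list(chain.from_iterable(a['Scores'].values()))

# No columns= argument: pandas takes the column names from the dict keys.
df = pd.DataFrame(rows)
print(df)
```

The row order follows the iteration order of the dict, so it may differ from the result shown above depending on the Python version.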

      



You can simply extract all the scores with a list comprehension / generator expression:



>>> pd.DataFrame(s for k, v in a['Scores'].items() for s in v)
  English  Math Science
0    60.5    95    74.5
1    78.3  87.9    97.3
2      89  67.2    74.2
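
Note that the grades in the sample data are strings, so the resulting columns hold text rather than numbers. If numeric grades are wanted, a possible follow-up is sketched below. It assumes every value parses as a number; astype(float) would raise an error otherwise.

```python
import pandas as pd

a = {'Scores': {'s1': [{'Math': '95', 'Science': '74.5', 'English': '60.5'},
                       {'Math': '87.9', 'Science': '97.3', 'English': '78.3'}],
                's2': [{'Math': '67.2', 'Science': '74.2', 'English': '89'}]}}

# Same flattening as above, written as a list comprehension.
df = pd.DataFrame([s for v in a['Scores'].values() for s in v])

# Assumption: every cell is a numeric string, so the whole frame can be cast.
df = df.astype(float)
print(df.dtypes)
```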

      



You can chain apply(pd.Series) calls:

pd.Series(a['Scores']).apply(pd.Series).stack().apply(pd.Series)

     English  Math Science
s1 0    60.5    95    74.5
   1    78.3  87.9    97.3
s2 0      89  67.2    74.2
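
If keeping the s1 ... sn keys in the index is useful, a similar result can probably be reached with pd.concat on a dict of DataFrames. This is a sketch of an alternative, not the method used above; the name frames is only illustrative.

```python
import pandas as pd

a = {'Scores': {'s1': [{'Math': '95', 'Science': '74.5', 'English': '60.5'},
                       {'Math': '87.9', 'Science': '97.3', 'English': '78.3'}],
                's2': [{'Math': '67.2', 'Science': '74.2', 'English': '89'}]}}

# Build one small DataFrame per key (s1, s2, ...).
frames = {key: pd.DataFrame(rows) for key, rows in a['Scores'].items()}

# Passing a dict to pd.concat uses its keys as the outer index level.
df = pd.concat(frames)
print(df)
```

Selecting df.loc['s1'] then returns only the rows that came from the s1 list.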

      
