Json formatting from pandas dataframe
I'm trying to create a JSON file from my dataframe that looks something like this:
{'249' : [
{'candidateId': 751,
'votes':7528,
'vote_pct':0.132
},
{'candidateId': 803,
'votes':7771,
'vote_pct':0.138
}...
],
'274': [
{'candidateId': 891,
....
My dataframe looks like this:
officeId candidateId votes vote_pct
0 249 751 7528 0.132198
1 249 803 7771 0.136465
2 249 818 7569 0.132918
3 249 827 9089 0.159610
4 249 856 2271 0.039881
5 249 877 7491 0.131548
6 249 878 8758 0.153798
7 249 895 6267 0.110054
8 249 1161 201 0.003530
9 274 736 4664 0.073833
10 274 737 6270 0.099256
11 274 757 4953 0.078407
12 274 769 5239 0.082935
13 274 770 7134 0.112933
14 274 783 7673 0.121466
15 274 862 6361 0.100697
16 274 901 7671 0.121434
Using the function, I can flip the dataframe index and return it as a JSON string for each office id like:
def clean_results(votes):
#trying to get a well structured json file
return votes.reset_index().to_json(orient='index', double_precision=2)
res_json = results.groupby(['officeId']).apply(clean_results)
But when I do that, I end up with a new dataframe with a JSON object for each officeID, and JSON uses the numbered index as the top level, like this:
{"0":{"index":0.0,"officeId":249.0,"candidateId":751.0,"total_votes":7528.0,"vote_pct":0.13},"1":{"index":1.0,"officeId":249.0,"candidateId":803.0,"total_votes":7771.0,"vote_pct":0.14},"2":...
+3
source to share
1 answer
This is one approach, maybe something cleaner.
results = {}
for key, df_gb in df.groupby('officeId'):
results[str(key)] = df_gb.to_dict('records')
import json
print json.dumps(results, indent=4)
####
{
"274": [
{
"votes": 4664.0,
"candidateId": 736.0,
"vote_pct": 0.07383300000000001,
"officeId": 274.0
},
{
"votes": 6270.0,
"candidateId": 737.0,
"vote_pct": 0.099255999999999997,
"officeId": 274.0
......
+3
source to share