Pandas json_normalize and null values in JSON
I have this JSON sample
{
  "name": "John",
  "age": 30,
  "cars": [
    { "name": "Ford", "models": [ "Fiesta", "Focus", "Mustang" ] },
    { "name": "BMW", "models": [ "320", "X3", "X5" ] },
    { "name": "Fiat", "models": [ "500", "Panda" ] }
  ]
}
To convert the JSON to a pandas DataFrame, I use the following code:
import json
from pandas.io.json import json_normalize
from pprint import pprint
with open('example.json', encoding="utf8") as data_file:
    data = json.load(data_file)
normalized = json_normalize(data['cars'])
This code works well, but when some of the cars are empty (null values), json_normalize fails.
JSON example:
{
  "name": "John",
  "age": 30,
  "cars": [
    { "name": "Ford", "models": [ "Fiesta", "Focus", "Mustang" ] },
    null,
    { "name": "Fiat", "models": [ "500", "Panda" ] }
  ]
}
The error that was thrown
AttributeError: 'NoneType' object has no attribute 'keys'
I tried to ignore errors in json_normalize, but it didn't help:
normalized = json_normalize(data['cars'], errors='ignore')
How do I handle null values in JSON?
I agree with Vozman: filling the null entries with empty {} dictionaries will solve the problem. However, I had the same problem in my own project, so I created a package for working with this kind of DataFrame. Take a look at flat_table; it uses json_normalize but also expands rows and columns.
import pandas as pd
import flat_table

# data is the dictionary loaded from the JSON file in the question
df = pd.DataFrame(data)
flat_table.normalize(df)
This outputs the following. Lists are expanded into separate rows, and dictionary keys are expanded into separate columns.
   index name_x  age name_y   models
0      0   John   30   Ford   Fiesta
1      0   John   30   Ford    Focus
2      0   John   30   Ford  Mustang
3      1   John   30    NaN      NaN
4      2   John   30   Fiat      500
5      2   John   30   Fiat    Panda
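If you would rather not add a dependency, the simpler fix mentioned at the start of this answer (replacing each null with an empty dictionary) can be sketched as below. This is a minimal example, not the only way to do it. It assumes pandas 1.0 or later, where json_normalize is available as pd.json_normalize, and it inlines the JSON from the question as a string (the names raw and cars are just for illustration) instead of reading example.json.

```python
import json

import pandas as pd

# The JSON from the question, inlined so the example is self-contained.
raw = """
{
  "name": "John",
  "age": 30,
  "cars": [
    {"name": "Ford", "models": ["Fiesta", "Focus", "Mustang"]},
    null,
    {"name": "Fiat", "models": ["500", "Panda"]}
  ]
}
"""
data = json.loads(raw)

# JSON null becomes Python None, which has no .keys(); swap each None
# for an empty dict so every record is a mapping.
cars = [car if car is not None else {} for car in data["cars"]]

# Assumes pandas >= 1.0; older versions import json_normalize
# from pandas.io.json as in the question.
normalized = pd.json_normalize(cars)
print(normalized)
```

The null entry becomes a row of NaN values. If you would prefer to drop it entirely, filter instead of filling: cars = [car for car in data["cars"] if car is not None].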