How to remove comment lines from a JSON file in Python
I am getting a JSON file with the following format:
// 20170407
// http://info.employeeportal.org
{
"EmployeeDataList": [
{
"EmployeeCode": "200005ABH9",
"Skill": CT70,
"Sales": 0.0,
"LostSales": 1010.4
}
]
}
It is necessary to remove the extra lines of comments present in the file.
I've tried with the following code:
import json
import commentjson

with open('EmployeeDataList.json') as json_data:
    employee_data = json.load(json_data)
    '''employee_data = json.dump(json.load(json_data))'''
    '''employee_data = commentjson.load(json_data)'''
print(employee_data)
I am still unable to remove the comments from the file and load the JSON in the correct format.
Where am I going wrong? Any direction in this regard is highly appreciated. Thanks in advance.
You are not using commentjson correctly. It has the same interface as the json module:
import commentjson

with open('EmployeeDataList.json', 'r') as handle:
    employee_data = commentjson.load(handle)
print(employee_data)
That said, your comments are simple enough in this case that you probably don't need to install an additional module to remove them:
import json

with open('EmployeeDataList.json', 'r') as handle:
    fixed_json = ''.join(line for line in handle if not line.startswith('//'))
    employee_data = json.loads(fixed_json)
print(employee_data)
Note that the difference between the two code snippets is that json.loads is used instead of json.load, since you are processing a string instead of a file object.
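The second approach can be wrapped in a small helper. This is only a minimal sketch, not part of the answer above: the function name loads_without_line_comments is hypothetical, it also tolerates indentation before // by calling lstrip(), and the sample quotes the Skill value ("CT70") on the assumption that the real file is valid JSON once the comments are gone.

```python
import json


def loads_without_line_comments(text):
    """Parse JSON text after dropping lines that start with '//'.

    Hypothetical helper: only whole-line comments are removed, so a '//'
    inside a string value (for example a URL) is left untouched.
    """
    kept = [
        line for line in text.splitlines(keepends=True)
        if not line.lstrip().startswith('//')
    ]
    return json.loads(''.join(kept))


# Assumed sample, modelled on the file in the question (Skill quoted).
sample = '''// 20170407
// http://info.employeeportal.org
{
  "EmployeeDataList": [
    {"EmployeeCode": "200005ABH9", "Skill": "CT70", "Sales": 0.0, "LostSales": 1010.4}
  ]
}
'''

employee_data = loads_without_line_comments(sample)
print(employee_data["EmployeeDataList"][0]["EmployeeCode"])
```

Because the helper takes a string, it goes through json.loads; to read from disk you would pass it the result of handle.read().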
Well, this is not valid JSON, so just open it as a plain text document and delete everything from // to \n.
with open("EmployeeDataList.json", "r") as rf, open("output.json", "w") as wf:
    for line in rf:
        # Skip lines that start with a // comment
        if line.startswith("//"):
            continue
        wf.write(line)
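The same idea, deleting everything from // to the end of the line, can also be expressed with a regular expression. This is a sketch under assumptions: the helper name strip_line_comments is made up for illustration, and the pattern is anchored to the start of a line so that a // inside a value such as a URL is not touched.

```python
import re

# Matches an optional indent, '//', the rest of the line, and its newline.
LINE_COMMENT = re.compile(r'^[ \t]*//.*\n?', flags=re.MULTILINE)


def strip_line_comments(text):
    """Return text with whole-line '//' comments removed (hypothetical helper)."""
    return LINE_COMMENT.sub('', text)


raw = '// 20170407\n// http://info.employeeportal.org\n{"Sales": 0.0}\n'
cleaned = strip_line_comments(raw)
print(cleaned)
```

The cleaned string can then be written to output.json or handed straight to json.loads.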
Try JSON-minify:
JSON-minify minifies blocks of JSON-like content into valid JSON, removing all whitespace and JS-style comments (single-line // and multi-line /* ... */).
Your file can be parsed as HOCON, for example with pyhocon:
pip install pyhocon
>>> from pyhocon import ConfigFactory
>>> conf = ConfigFactory.parse_file('data.txt')
>>> conf
ConfigTree([('EmployeeDataList',
[ConfigTree([('EmployeeCode', '200005ABH9'),
('Skill', 'CT70'),
('Sales', 0.0),
('LostSales', 1010.4)])])])
I usually read JSON as a regular file, remove the comments and then parse it as a JSON string. This can be done in one line with the following snippet:
with open(path, 'r') as f: jsonDict = json.loads(''.join(row for row in f if '//' not in row))
IMHO this is very handy because it doesn't require CommentJSON or any other non-standard library.
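One caveat with this one-liner: its filter drops every line that contains // anywhere, not only comment lines, so a line holding a URL value would be removed as well. The sketch below, which uses a made-up sample and hypothetical function names, contrasts that filter with one that only drops lines beginning with //.

```python
import json

# Made-up sample: a comment line plus a value that contains '//'.
text = '// comment\n{\n"Portal": "http://info.employeeportal.org",\n"Sales": 0.0\n}\n'


def drop_lines_containing_slashes(source):
    """Filter used by the one-liner: removes any line that contains '//'."""
    return ''.join(row for row in source.splitlines(keepends=True)
                   if len(row.split('//')) == 1)


def drop_comment_lines(source):
    """Stricter filter: removes only lines whose first non-blank text is '//'."""
    return ''.join(row for row in source.splitlines(keepends=True)
                   if not row.lstrip().startswith('//'))


lost = json.loads(drop_lines_containing_slashes(text))   # "Portal" line is gone
kept = json.loads(drop_comment_lines(text))              # "Portal" survives
print(sorted(lost), sorted(kept))
```

For a file like the one in the question, where // only appears in the leading comment lines, both filters give the same result.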