How to remove comment lines from JSON file in python

Question

I am getting a JSON file with the following format:

// 20170407
// http://info.employeeportal.org

{
 "EmployeeDataList": [
{
 "EmployeeCode": "200005ABH9",
 "Skill": CT70,
 "Sales": 0.0,
 "LostSales": 1010.4
} 
 ]
}

It is necessary to remove the extra lines of comments present in the file.

I've tried with the following code:

import json
import commentjson

with open('EmployeeDataList.json') as json_data:
    employee_data = json.load(json_data)
    # Alternatives I also tried:
    # employee_data = json.dump(json.load(json_data))
    # employee_data = commentjson.load(json_data)
    print(employee_data)

I am still unable to remove the comments from the file and load the JSON in the correct format.

Where does it go wrong? Any direction in this regard is highly appreciated. Thanks in advance.

json python

user4569636 09 Apr '17 at 3:50

6 answers

Blender · Answer 1 · 2017-04-09T04:16:52+0000

You are not using commentjson correctly. It has the same interface as the json module:

import commentjson

with open('EmployeeDataList.json', 'r') as handle:
    employee_data = commentjson.load(handle)

print(employee_data)

That said, your comments are simple enough in this case that you probably don't need to install an additional module to remove them:

import json

with open('EmployeeDataList.json', 'r') as handle:
    fixed_json = ''.join(line for line in handle if not line.startswith('//'))
    employee_data = json.loads(fixed_json)

print(employee_data)

Note that the difference between the two code snippets is that json.loads is used instead of json.load, since you are parsing a string instead of a file object.
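A slightly more forgiving variant of the same idea, as a minimal sketch: it also drops comment lines that are indented. The function name loads_ignoring_line_comments is made up for illustration, and the sample assumes the Skill value is quoted ("CT70"), since the json module rejects the unquoted form shown in the question.

```python
import json


def loads_ignoring_line_comments(text):
    """Parse JSON text after dropping lines whose first non-blank characters are //."""
    kept = [line for line in text.splitlines() if not line.lstrip().startswith("//")]
    return json.loads("\n".join(kept))


# Sample mirrors the question's file; "CT70" is quoted here (an assumption).
sample = """// 20170407
   // http://info.employeeportal.org

{
  "EmployeeDataList": [
    {"EmployeeCode": "200005ABH9", "Skill": "CT70", "Sales": 0.0, "LostSales": 1010.4}
  ]
}
"""

employee_data = loads_ignoring_line_comments(sample)
print(employee_data["EmployeeDataList"][0]["LostSales"])
```

Because only lines that start with // are dropped, a value such as "http://..." inside the JSON body is left alone.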

kpie · Answer 2 · 2017-04-09T04:04:33+0000

If it's the same number of comment lines every time, you can simply skip them:

with open('EmployeeDataList.NOTjson', "r") as fh:
    rawText = fh.read()
# Split off the first three lines (two comments and a blank line) and keep the rest.
json_data = rawText.split("\n", 3)[3]

So json_data is now a string of text without the first three lines.
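If you would rather not read the whole file before trimming it, the same fixed-offset idea can be sketched with itertools.islice. The helper name load_skipping_header and the in-memory sample are hypothetical; a real file handle from open() would work the same way.

```python
import io
import json
from itertools import islice


def load_skipping_header(handle, header_lines):
    """Parse JSON from a file-like object after skipping a fixed number of lines."""
    # islice consumes and discards the first header_lines lines of the handle.
    return json.loads("".join(islice(handle, header_lines, None)))


# io.StringIO stands in for the opened file in this sketch.
sample = io.StringIO('// 20170407\n// http://info.employeeportal.org\n\n{"Sales": 0.0}\n')
data = load_skipping_header(sample, 3)
print(data)
```

This only works while the header length really is constant; a single extra comment line will make json.loads fail.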

spicypumpkin · Answer 3 · 2017-04-09T04:05:10+0000

Well, this is not valid JSON, so just open it as you would a text document and delete everything from // to \n.

with open("EmployeeDataList.json", "r") as rf:
    with open("output.json", "w") as wf:
        for line in rf.readlines():
            if line[0:2] == "//":
                continue
            wf.write(line)

hailinzeng · Answer 4 · 2017-04-09T04:06:17+0000

Try JSON-minify:

JSON-minify minifies blocks of JSON-like content into valid JSON, removing all whitespace and JS-style comments (single-line // and multi-line /* .. */).
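To illustrate the kind of comment removal described there, here is a rough standard-library sketch. It is not JSON-minify's implementation or API; strip_json_comments is a hypothetical helper that removes // and /* */ comments only when they appear outside string literals, and it leaves whitespace alone.

```python
import json


def strip_json_comments(text):
    """Remove // and /* */ comments that appear outside of JSON string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so an escaped quote does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            # Skip past the closing */ (or to the end if it is missing).
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


sample = '''// header comment
{
  "url": "http://info.employeeportal.org", /* inline */
  "Sales": 0.0 // trailing
}'''
data = json.loads(strip_json_comments(sample))
print(data)
```

Tracking whether the scanner is inside a string is what keeps the // in the URL value from being treated as a comment.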

cricket_007 · Answer 5 · 2017-04-09T04:53:19+0000

Your file can be parsed using HOCON.

pip install pyhocon

>>> from pyhocon import ConfigFactory
>>> conf = ConfigFactory.parse_file('data.txt')
>>> conf
ConfigTree([('EmployeeDataList',
             [ConfigTree([('EmployeeCode', '200005ABH9'),
                          ('Skill', 'CT70'),
                          ('Sales', 0.0),
                          ('LostSales', 1010.4)])])])

Jabba the hut · Answer 6 · 2018-02-10T13:15:39+0000

I usually read JSON as a regular file, remove the comments and then parse it as a JSON string. This can be done in one line with the following snippet:

import json
with open(path, 'r') as f: jsonDict = json.loads(''.join(row for row in f if '//' not in row))

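IMHO this is very handy because it doesn't require CommentJSON or any other non-standard library. Be aware that it drops every line containing //, so it will also discard data lines whose values contain //, such as URLs.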
