Comma separator between JSON objects with json.dump
I was messing around with writing a JSON file that holds some attributes of the files in a directory. My problem is that there is no delimiter between the objects when they are written to the file. I could just add a comma after each "f" and remove the last one, but that seems like a messy job.
import os
import os.path
import json

# Create and open file_data.txt and append
with open('file_data.txt', 'a') as outfile:
    files = os.listdir(os.curdir)
    for f in files:
        extension = os.path.splitext(f)[1][1:]
        base = os.path.splitext(f)[0]
        name = f
        data = {
            "file_name" : name,
            "extension" : extension,
            "base_name" : base
        }
        json.dump(data, outfile)
Output:
{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"}{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"}{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"}{"file_name": ".git", "base_name": ".git", "extension": ""}
I would like the JSON:
{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"},{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"},{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"},{"file_name": ".git", "base_name": ".git", "extension": ""}
What you are getting is not a JSON object, but a stream of separate JSON objects.
What you want is still not a JSON object, but a stream of separate JSON objects with commas between them. That is not going to be any more parseable. *
* The JSON specification is simple enough to parse by hand, and it should be pretty clear that an object followed by a comma and another object doesn't match any valid production.
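To make the point about parseability concrete, here is a minimal sketch that feeds both layouts to the standard json module. The helper name is_valid_json is made up for this illustration.

```python
import json


def is_valid_json(text):
    """Return True if text parses as exactly one JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Objects back to back, as in the question's current output
print(is_valid_json('{"a": 1}{"b": 2}'))
# Objects separated by commas, as in the question's desired output
print(is_valid_json('{"a": 1},{"b": 2}'))
# The same objects wrapped in brackets form a JSON array
print(is_valid_json('[{"a": 1},{"b": 2}]'))
```

Only the bracketed version parses as one document, which is why adding commas alone does not help.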
If you are trying to create a JSON array, you can do that. The obvious way, if there are no memory issues, is to create a list of dicts and then dump everything at once:
output = []
for f in files:
    # ...
    output.append(data)
json.dump(output, outfile)
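Filled in with the question's own loop, the list approach might look like the sketch below. The function names collect_file_info and write_file_info are made up for this example, and sorting the directory listing is an extra assumption that only serves to make the output order predictable.

```python
import json
import os


def collect_file_info(directory):
    """Return one dict per directory entry, using the question's three keys."""
    records = []
    for name in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(name)
        records.append({
            "file_name": name,
            "extension": ext[1:],  # drop the leading dot
            "base_name": base,
        })
    return records


def write_file_info(directory, out_path):
    # Mode 'w' rather than the question's 'a': appending a second array
    # to the same file would no longer give a single JSON document.
    with open(out_path, "w") as outfile:
        json.dump(collect_file_info(directory), outfile)
```

Calling write_file_info(os.curdir, 'file_data.txt') would then write one array holding every entry.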
If memory is a problem, you have a few options:
- For a quick and dirty solution, you can fake it by writing the "[", the "," separators, and the "]" by hand. (But note that a trailing comma after the last value is not valid JSON, even though some decoders accept it.)
- You can wrap your loop in a generator function that yields each data dict, and extend JSONEncoder to convert iterators to arrays. (Note that this is the example the docs use to show why and how to extend JSONEncoder, although you may want to write a more memory-efficient implementation.)
- You can look for a third-party JSON library that has a built-in iterative streaming API.
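The first two options could be sketched roughly as follows. The names dump_json_array and IterableEncoder are invented for this example. The encoder follows the pattern shown in the json module docs and builds a full list in memory, so only the hand-written version actually saves memory.

```python
import json


def dump_json_array(items, outfile):
    """Write items to outfile as one JSON array, one element at a time."""
    outfile.write("[")
    for index, item in enumerate(items):
        if index:
            # Separator goes before every element except the first,
            # so no trailing comma is ever written
            outfile.write(",")
        json.dump(item, outfile)
    outfile.write("]")


class IterableEncoder(json.JSONEncoder):
    """Encode arbitrary iterables (such as generators) as JSON arrays."""

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            # Not iterable: let the base class raise its usual TypeError
            return super().default(o)
        # Note: this materializes the whole iterable in memory
        return list(iterable)
```

Both accept a generator: dump_json_array(gen, outfile) streams it, while json.dump(gen, outfile, cls=IterableEncoder) converts it to a list first.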
However, it is worth considering what you are trying to do. Maybe a stream of separate JSON objects actually is the right file format / protocol / API for your purpose. Because JSON is self-delimiting, there is really no reason to add a delimiter between separate values. (And it doesn't even help much with robustness, unless you use a delimiter that isn't going to appear all over the actual JSON, as "," does.)
For example, what you've got is exactly what JSON-RPC is supposed to look like. If you're only asking for something different because you don't know how to parse such a file, that's pretty easy. For example (using a string rather than a file, for simplicity):
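import json

def iter_json_objects(s):
    """Yield each JSON value from a string of concatenated JSON values."""
    d = json.JSONDecoder()
    i = 0
    while i < len(s):
        try:
            # raw_decode returns the parsed value and the index where it ended
            obj, i = d.raw_decode(s, i)
        except ValueError:
            # Stop at the first position that does not start a valid JSON value
            return
        yield obj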
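One thing to watch for: raw_decode does not skip leading whitespace, so the loop above stops early if the values are separated by newlines or spaces. A variant that tolerates whitespace might look like this sketch. The name iter_json_values is made up here, and unlike the version above it lets the decoder's error propagate on malformed input instead of stopping silently.

```python
import json


def iter_json_values(text):
    """Yield each JSON value from text that holds values back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # Step over any whitespace between values
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        # Raises json.JSONDecodeError if no valid value starts at pos
        value, pos = decoder.raw_decode(text, pos)
        yield value


stream = '{"file_name": "a.txt"}{"file_name": "b.py"}\n{"file_name": ".git"}'
for record in iter_json_values(stream):
    print(record["file_name"])
```

For a file, the simplest use is iter_json_values(open('file_data.txt').read()), which assumes the whole file fits in memory.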