Fast and efficient way to read large JSON files line by line in Java

I have a JSON file with 100 million records, and I need a fast, efficient way to read an array of arrays from it into Java.

The JSON file looks like this:

[["XYZ",...,"ABC"],["XYZ",...,"ABC"],["XYZ",...,"ABC"],...,["XYZ",...,"ABC"],
 ["XYZ",...,"ABC"],["XYZ",...,"ABC"],["XYZ",...,"ABC"],...,["XYZ",...,"ABC"],
 ...
 ...
 ...
 ,["XYZ",...,"ABC"],["XYZ",...,"ABC"],["XYZ",...,"ABC"]]

      

I want to read this JSON file line by line, like this:

first read:

["XYZ",...,"ABC"]

      

then

["XYZ",...,"ABC"]

      

and so on:

...
...
...
["XYZ",...,"ABC"]

      

How can I read a JSON file like this? I know it doesn't look exactly like a typical JSON file, but I need to read it in this format, and it is saved as .JSON.



3 answers


You can use the JSON Processing API (JSR 353) to process your data in a streaming way:



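import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import javax.json.Json;
import javax.json.stream.JsonParser;

...

String dataPath = "data.json";

// The parser pulls events from the file on demand, so only the current row is held in memory
try (JsonParser parser = Json.createParser(new FileReader(dataPath))) {
    List<String> row = new ArrayList<>();

    while (parser.hasNext()) {
        JsonParser.Event event = parser.next();
        switch (event) {
            case START_ARRAY:
                // Start of the outer array or of a row: nothing to collect yet
                continue;
            case VALUE_STRING:
                row.add(parser.getString());
                break;
            case END_ARRAY:
                // Skip empty rows, which also covers the closing bracket of the outer array
                if (!row.isEmpty()) {
                    // Do something with the current row of data
                    System.out.println(row);

                    // Reset it (prepare for the new row)
                    row.clear();
                }
                break;
            default:
                // Anything other than arrays of strings is not expected in this file
                throw new IllegalStateException("Unexpected JSON event: " + event);
        }
    }
}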

      



Please take a look at the Jackson Streaming API.

I think you are looking for something like this: https://www.ngdata.com/parsing-a-large-json-file-efficiently-and-easily/

and this: fooobar.com/questions/1198261/...

The main point is that with a large file you need to read and process it lazily, piece by piece.
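To make "lazily, piece by piece" concrete, here is a minimal dependency-free sketch. It is not the Jackson API; it only illustrates the idea using `java.io.Reader`. The class name `LazyRowReader` and the method `forEachRow` are made up for this example, and it assumes the file is a well-formed array of arrays as shown in the question. For real use, a proper streaming parser such as Jackson is probably the safer choice.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

// Hypothetical helper, not part of any library.
public class LazyRowReader {

    /**
     * Reads a JSON array of arrays one character at a time and passes each
     * inner array to the callback as raw JSON text, e.g. ["XYZ","ABC"].
     * Only the current inner array is buffered in memory.
     * Assumes well-formed input; no validation is performed.
     */
    public static void forEachRow(Reader in, Consumer<String> onRow) throws IOException {
        StringBuilder row = new StringBuilder();
        int depth = 0;            // 1 = inside the outer array, 2+ = inside a row
        boolean inString = false; // true while between double quotes
        boolean escaped = false;  // true right after a backslash inside a string
        int c;
        while ((c = in.read()) != -1) {
            char ch = (char) c;
            boolean inRow = depth >= 2;
            if (inString) {
                // Brackets inside string values must not change the depth
                if (inRow) row.append(ch);
                if (escaped) escaped = false;
                else if (ch == '\\') escaped = true;
                else if (ch == '"') inString = false;
            } else if (ch == '"') {
                inString = true;
                if (inRow) row.append(ch);
            } else if (ch == '[') {
                depth++;
                if (depth >= 2) row.append(ch);
            } else if (ch == ']') {
                if (inRow) row.append(ch);
                depth--;
                if (depth == 1) {
                    // A row just closed: hand it over and reuse the buffer
                    onRow.accept(row.toString());
                    row.setLength(0);
                }
            } else if (inRow && !Character.isWhitespace(ch)) {
                row.append(ch);
            }
        }
    }
}
```

It could be called as `LazyRowReader.forEachRow(Files.newBufferedReader(Paths.get("data.json")), System.out::println);`. Passing a buffered reader matters here, because the sketch reads a single character per call.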



You can use JsonSurfer to extract each inner JSON array with the JsonPath $[*]:

    JsonSurfer surfer = JsonSurferJackson.INSTANCE;
    surfer.configBuilder().bind("$[*]", new JsonPathListener() {
        @Override
        public void onValue(Object value, ParsingContext context) {
            System.out.println(value);
        }
    }).buildAndSurf(json);

      

It won't load the whole JSON document into memory; the inner arrays are processed one at a time.
