Is it better to handle JSON in Java or JavaScript?

Simple enough. I have an awful lot of JSON to process, about 100 GB. This 100 GB is split into files that are typically 1 MB each.

So this got me thinking: would it generally be easier to parse the JSON files in JavaScript, or would I get similar results processing them in Java with one of the JSON jars?

Now, obviously I'll have to multithread all of this and so on.



4 answers

Use whichever technology you are most adept with; the chances of a huge performance difference are low. V8 (Google's JavaScript engine, best known from the Chrome browser and from Node.js outside the browser, but which can also run standalone) is very fast, as is the Sun/Oracle JVM with its excellent HotSpot optimization technology. You can even use JavaScript on the JVM if you like (Rhino).

"Now, obviously I'll have to multithread all of this and so on."

It's not obvious at all. If the process is I/O bound (and if you're reading roughly a hundred thousand 1 MB files, it probably will be, depending on what you do with them), adding more threads won't help you.



I think it would be simpler, faster, and easier to scale (with a ThreadPoolExecutor) if you do the processing in Java. How did you plan to do it with JavaScript? Standalone V8?
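As a rough illustration of the thread-pool approach, here is a minimal sketch that submits one task per file to a fixed-size pool. The class name `PooledJsonProcessing` and the methods `processFile` and `processAll` are hypothetical, and the "processing" is only a stand-in that counts characters; a real version would call whichever JSON library you pick at that point.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PooledJsonProcessing {

    // Hypothetical stand-in for real work: a JSON library would parse `text` here.
    static int processFile(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return text.length();
    }

    // Submits one task per file to a fixed-size pool and sums the per-file results.
    static long processAll(List<Path> files, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (Path file : files) {
                Callable<Integer> task = () -> processFile(file);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // blocks until that file has been processed
            }
            return total;
        } finally {
            pool.shutdown(); // stop accepting tasks; already submitted ones still finish
        }
    }
}
```

The pool size is a tuning parameter: if the job turns out to be I/O bound, a larger pool is unlikely to help.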



If you know it, I would use Node.js. JSON objects are easier to handle in a JavaScript-based environment.



Both languages run in virtual machines, so execution speed will depend mostly on the virtual machine in use, and recent virtual machines have become really fast, especially on recent hardware.

As far as I know, JavaScript has no native support for threading. Multithreading was emulated through "time sharing" to prevent blocking. This no longer seems to be the case with Web Workers, however. You can also just split your files across separate processes that handle them independently, but that will generate a lot of concurrent disk access, which will most likely be your bottleneck.

So, I suggest you go with the language you are most comfortable with.

By the way, mind telling us what kind of processing you will be doing on the JSON files?

If I were to implement this: to limit parallel I/O, I would have one thread that prefetches one file at a time, reads it into memory, and queues a worker to process that file (if the processing is heavy, a thread pool will certainly improve processing speed).
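One way the single-reader idea could look in Java is sketched below. It is a variant rather than exactly the design described: instead of queuing tasks on a thread pool, a single reader puts file contents on a bounded `BlockingQueue` that a few worker threads drain, so only one thread ever touches the disk and memory use stays capped. The names `PrefetchPipeline`, `process`, and `run` are hypothetical, and `process` only counts bytes where a real JSON parser would go.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

final class PrefetchPipeline {

    // Sentinel ("poison pill") that tells a worker no more files will arrive.
    private static final byte[] DONE = new byte[0];

    // Hypothetical stand-in for real work: a JSON library would parse `content` here.
    static long process(byte[] content) {
        return content.length;
    }

    // The calling thread reads files one at a time; `workers` threads process them.
    static long run(List<Path> files, int workers, int queueCapacity)
            throws IOException, InterruptedException {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(queueCapacity);
        AtomicLong total = new AtomicLong();

        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    // Identity comparison: only the DONE instance itself ends the loop.
                    for (byte[] c = queue.take(); c != DONE; c = queue.take()) {
                        total.addAndGet(process(c));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }

        try {
            for (Path file : files) {
                // put() blocks while the queue is full, which throttles the reader.
                queue.put(Files.readAllBytes(file));
            }
        } finally {
            for (int i = 0; i < workers; i++) {
                queue.put(DONE); // one pill per worker
            }
        }
        for (Thread t : pool) {
            t.join();
        }
        return total.get();
    }
}
```

The queue capacity bounds how many files sit in memory waiting for a worker; with 1 MB files, even a capacity of a few dozen keeps the footprint small.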


