Mozilla JavaScript performance: new OS.File vs. old nsIFile over 3000 files

I have a directory containing small XML files (each file is 170 ~ 200 bytes) and I want to read all the content of each file and combine them into one XML file displayed in a tree.

OLD

FileUtils.File + NetUtil.asyncFetch + NetUtil.readInputStreamToString

Time to read 3000 XML files 1112.3642930000005 ms

NEW

OS.File.DirectoryIterator + OS.File.read

Time to read 3000 XML files 5330.708094999999 ms

I noticed a huge difference in the time to read a single file: OLD takes 0.08 ~ 0.12 ms, while NEW takes 0.5 ~ 6.0 ms (6.0 is not a typo; I have seen several spikes compared to OLD).

I know OLD is backed by C++, but the documentation at https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/OSFile.jsm says:

OS.File is a new API designed to efficiently manipulate files outside the main thread using privileged JavaScript code.

I don't see the efficiency of the new API. Is there something wrong with my code?

Note: dbgPerf is a performance-debugging helper that collects timestamps and comments in an array of objects and does all of its calculations only when I call its final function at the very end, so it does not affect the measurements.

Code using nsIFile:

this._readDir2 = function (pathToTarget, callbackEndLoad) {

    var _content = '';
    dbgPerf.add("2 start read dir");

    var fuDir = new FileUtils.File(pathToTarget);
    var entries = fuDir.directoryEntries;
    var files = [];
    while (entries.hasMoreElements()) {

        var entry = entries.getNext();
        entry = entry.QueryInterface(OX.LIB.Ci.nsIFile);

        if (entry.isFile()) {

            var channel = NetUtil.newChannel(entry);
            files.push(channel);
            dbgPerf.add("ADD file" + entry.path);
        } else {
            dbgPerf.add("NOT a file" + entry.path);
        }
    }

    var totalFiles = files.length;
    var totalFetched = 0;

    for (var a = 0; a < files.length; a++) {

        var entry = files[a];

        dbgPerf.add("start asynch file " + entry.name);
        NetUtil.asyncFetch(entry, function (inputStream, status) {

            totalFetched++;

            if (!Components.isSuccessCode(status)) {
                dbgPerf.add('asyncFetch failed for reason ' + status);
                return;
            } else {

                _content += NetUtil.readInputStreamToString(inputStream, inputStream.available());
                dbgPerf.add("process end file " + entry.name);
            }

            if (totalFetched == files.length) {

                var parser = new DOMParser();

                _content = _content.replace(/<root>/g, '');
                _content = _content.replace(/<\/root>/g, '');
                _content = _content.replace(/<catalog>/g, '');
                _content = _content.replace(/<\/catalog>/g, '');
                _content = _content.replace(/<\?xml[\s\S]*?\?>/g, '');

                xmlDoc = parser.parseFromString('<?xml version="1.0" encoding="utf-8"?><root>' + _content + '</root>', "text/xml");
                //dbgPerf.add("2 fine parsing XML file " + arrFileData);

                var response = {};
                response.total = totalFiles;
                response.xml = xmlDoc;

                callbackEndLoad(response);
            }
        });
    }

    dbgPerf.add("2 AFTER REQUEST ALL FILE");
};

      

Code using OS.File:

this._readDir = function (pathToTarget, callbackEndLoad) {

    dbgPerf.add("1 inizio read dir");

    var xmlDoc;
    var arrFileData = '';

    var iterator = new OS.File.DirectoryIterator(pathToTarget);

    var files = [];
    iterator.forEach(function onEntry(entry) {
        if (!entry.isDir) {
            files.push(entry.path);
        }
    });

    var totalFetched = 0;

    files.forEach(function (fpath) {

        Task.spawn(function () {

            arrFileData += OS.File.read(fpath, {
                encoding: "utf-8"
            });

            totalFetched++;

            if (totalFetched == files.length) {

                var parser = new DOMParser();

                arrFileData = arrFileData.replace(/<root>/g, '');
                arrFileData = arrFileData.replace(/<\/root>/g, '');
                arrFileData = arrFileData.replace(/<catalog>/g, '');
                arrFileData = arrFileData.replace(/<\/catalog>/g, '');
                arrFileData = arrFileData.replace(/<\?xml[\s\S]*?\?>/g, '');

                xmlDoc = parser.parseFromString('<?xml version="1.0" encoding="utf-8"?><root>' + arrFileData + '</root>', "text/xml");
                dbgPerf.add("1 fine parsing XML file " + arrFileData);

                var response = {};
                response.xml = xmlDoc;

                callbackEndLoad(response);
            }
        });
    });
};

      



3 answers


I am the author of OS.File.

We ran some benchmarks of nsIFile against OS.File back in the day. If you were to rewrite either nsIFile to run on a background thread (which is not possible, by design of XPConnect) or OS.File to run on the main thread (which we made impossible, to avoid blocking the UX), I recall that in most cases you would find that OS.File is faster.

As mentioned, OS.File is specifically designed not to do any work on the main thread. This is because I/O tasks have unpredictable duration: in extreme cases, the simple act of closing a file can block the thread for a few seconds, which is unacceptable on the main thread.

One consequence is that what you are actually measuring with OS.File is the following sequence:

  (1) Serialize the request and send it to the OS.File thread;
  (2) Do the actual I/O;
  (3) Serialize the response and send it to the main thread;
  (4) Wait for the next tick of the main thread (which is when the main thread actually receives the response);
  (5) Deserialize the response;
  (6) Trigger the then callback and wait for the next tick of the main thread (as specified for Promise).

I/O efficiency is about step (2): OS.File is often much smarter than nsIFile about that, so it performs less I/O than nsIFile. That is better for the battery, better for being a good citizen and playing well with other processes, and better for other I/O done on the same thread. Responsiveness comes from the fact that we do as little work as possible on the main thread. But if your code is running on the main thread, the overall throughput will often be much lower than with nsIFile, because of steps (1), (3), (4), (5) and (6).
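
The deferral in step (6) can be seen in isolation with a plain promise. This is a minimal sketch in standard JavaScript, not OS.File code: even when a promise is already resolved, its then callback does not run until the currently executing code has finished, so every asynchronous read pays at least one such deferral on top of the round trip to the I/O thread.

```javascript
// Records the order in which the two pieces of code run.
const order = [];

// The promise is already resolved, yet its callback is still deferred.
const done = Promise.resolve("file contents").then(function (data) {
    order.push("callback received: " + data);
});

// This line runs first, because the callback above is queued, not run inline.
order.push("synchronous code finished");

done.then(function () {
    console.log(order.join("\n"));
});
```

With 3000 files, this per-callback deferral is paid once per file, in addition to the serialization steps.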

Hope this answers your question.

P.S. Your code snippets are wrong. First, they are inverted. Also, you forgot yield in the call to OS.File.read.
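
To illustrate the missing yield: OS.File.read returns a promise, so concatenating its return value directly appends the promise object converted to a string rather than the file contents. The sketch below is only an illustration under assumptions. fakeRead is a hypothetical stand-in for OS.File.read so that the code runs outside Firefox, and async/await is used where Task.jsm code would use a generator function with yield.

```javascript
// Hypothetical stand-in for OS.File.read: resolves to the file contents.
function fakeRead(path) {
    return Promise.resolve("<item>" + path + "</item>");
}

// Without waiting: the promise object itself is converted to a string.
let wrong = "";
wrong += fakeRead("a.xml"); // appends "[object Promise]"

// With waiting: all reads are started, then their results are joined in order.
async function readAll(paths) {
    const parts = await Promise.all(paths.map(function (path) {
        return fakeRead(path);
    }));
    return parts.join("");
}

const result = readAll(["a.xml", "b.xml"]);
result.then(function (content) {
    console.log(wrong);
    console.log(content);
});
```

Collecting the parts and joining them once also keeps the files in a fixed order, which concatenating inside each callback does not guarantee.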



OS.File is efficient because it does not block. Of course, this lowers raw throughput, but the user gets an uninterrupted experience and even an increase in perceived speed.





What you have demonstrated is that the new OS.File approach is much slower than the old one in your test, but that does not necessarily contradict the claim that the new method is more efficient.

The fact that the I/O runs on a different thread means that other parts of the application can still do useful work while the I/O thread is waiting for (often incredibly slow) storage to deliver data. This translates directly into noticeable improvements such as a smoother UI, so in almost all cases users will benefit from the new approach.

However, the cost of this kind of efficiency gain is that your code no longer has immediate access to the file it requested, so the total time you have to wait for the data to reach your code will be higher.

It might be worth trying a third approach, where you run your code from a worker. This gives you access to the synchronous file API and may therefore let you regain some of the speed you saw with the old nsIFile approach, while keeping the advantage of not blocking the main thread.

https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/OSFile.jsm/OS.File_for_workers
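
A rough sketch of that idea, with the file access injected so it can run outside Firefox: the worker reads every file synchronously and hands back one combined result, so only a single message crosses to the main thread instead of one per file. listFiles and readFileSync are hypothetical placeholders for the synchronous calls available in a worker; they are not part of the OS.File API, and fakeDisk is made-up data.

```javascript
// Runs inside the worker. listFiles and readFileSync are hypothetical
// placeholders for the synchronous file API available to workers.
function readDirectoryInWorker(dirPath, listFiles, readFileSync) {
    const parts = [];
    for (const path of listFiles(dirPath)) {
        parts.push(readFileSync(path));
    }
    // One result for the whole directory means one message to the main thread.
    return { total: parts.length, content: parts.join("") };
}

// In-memory stand-ins so the sketch runs without Firefox.
const fakeDisk = {
    "/data/a.xml": "<item>a</item>",
    "/data/b.xml": "<item>b</item>"
};

const combined = readDirectoryInWorker(
    "/data",
    function (dir) {
        return Object.keys(fakeDisk).filter(function (p) {
            return p.startsWith(dir + "/");
        });
    },
    function (path) {
        return fakeDisk[path];
    }
);

console.log(combined.total, combined.content);
```

In a real worker, the returned object would be sent back with postMessage, and the main thread would do the XML parsing once on the combined string.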
