Writing to disk slows down over time with Node.js
I am trying to write large files (500 MB each) to disk using Node.js. Although the first few files are each written within a few seconds (usually 3 to 5 seconds), from about the 10th file onward the writes become slower, and the speed does not recover.
The setup consists of a server that accepts files over a TCP/IP socket and writes them to disk:
    var fs = require('fs'),
        net = require('net'),
        path = require('path');

    var counter = 0;

    net.createServer(function (socket) {
      console.time('received');
      console.time('written');

      // Each connection is written to its own file: temp1.tmp, temp2.tmp, ...
      counter++;
      var filename = path.join(__dirname, 'temp' + counter + '.tmp');
      var file = fs.createWriteStream(filename, { encoding: 'utf8' });

      socket.pipe(file);

      // 'end' fires once the client has sent all of its data.
      socket.once('end', function () {
        console.timeEnd('received');
      });

      // 'finish' fires once the write stream has flushed all data
      // to the underlying system.
      file.once('finish', function () {
        console.timeEnd('written');
      });
    }).listen(3000);
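One caveat about the measurement: both timers use fixed labels, so they can collide if one connection's 'finish' has not fired yet when the next connection arrives. A minimal sketch of per-connection timing, assuming a hypothetical helper createStopwatch built on process.hrtime (not part of the original server):

```javascript
// Hypothetical helper: a stopwatch that belongs to exactly one connection,
// so overlapping connections cannot overwrite each other's start time.
function createStopwatch(label) {
  var startedAt = process.hrtime();
  return {
    label: label,
    elapsedMs: function () {
      var diff = process.hrtime(startedAt); // [seconds, nanoseconds]
      return diff[0] * 1e3 + diff[1] / 1e6;
    }
  };
}

// Example: give every connection its own numbered stopwatch.
var connectionId = 0;
function nextStopwatch() {
  connectionId++;
  return createStopwatch('file ' + connectionId);
}

var watch = nextStopwatch();
console.log(watch.label + ': ' + watch.elapsedMs().toFixed(3) + ' ms');
```

Inside the connection handler, one such stopwatch could be created per socket and read in the 'end' and 'finish' callbacks in place of console.time and console.timeEnd.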
I am sending data from the terminal using nc, like this:
$ while true; do cat input.tmp | nc localhost 3000; done
Running
$ time cat input.tmp > /dev/null
shows that cat always takes the same amount of time to read the file. If I replace the output path in the Node.js script with /dev/null, the writes also always take the same amount of time.
So the problem appears to be with the actual writing to disk.
At first I thought it might be a concurrent read and write problem, but the problem persists even when I run
$ while true; do cat input.tmp | nc localhost 3000; sleep 5; done
If I run the same test with an even larger file (twice as large, i.e. 1 GB), it takes about half as long until the writes get slower.
UPDATE
I modified my Node.js app to write everything into a single file that gets appended to again and again. The server now looks like this:
    var fs = require('fs'),
        net = require('net'),
        path = require('path');

    // A single write stream is shared by all connections.
    var filename = path.join(__dirname, 'temp.tmp');
    var file = fs.createWriteStream(filename, { encoding: 'utf8' });

    net.createServer(function (socket) {
      console.time('received');

      // end: false keeps the shared file stream open when this socket ends.
      socket.pipe(file, { end: false });

      // 'end' fires once the client has sent all of its data.
      socket.once('end', function () {
        console.timeEnd('received');
      });
    }).listen(3000);
Now the problem is gone, so apparently it is related to writing several files in a row. However, I can't see where I would be writing multiple files at the same time (am I?), so I can't think of why this should happen, especially since the sleep 5 is there to give the OS time to actually write everything to disk.
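Note that a pause such as sleep 5 does not by itself guarantee that the data has reached the disk; the OS decides when to flush its cache. A minimal sketch, assuming a hypothetical helper writeAndSync and a throwaway file in the OS temp directory, of requesting the flush explicitly with fs.fsyncSync so that the measured time includes the actual disk write:

```javascript
var fs = require('fs'),
    os = require('os'),
    path = require('path');

// Hypothetical helper: writes `data` (a Buffer) to `filename` and returns
// only after fsync has asked the OS to flush the file to the storage device.
function writeAndSync(filename, data) {
  var fd = fs.openSync(filename, 'w');
  try {
    var offset = 0;
    // writeSync may write fewer bytes than requested, so loop until done.
    while (offset < data.length) {
      offset += fs.writeSync(fd, data, offset, data.length - offset);
    }
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
}

// Example: time a 1 MB write including the flush (path is an assumption).
var demoFile = path.join(os.tmpdir(), 'fsync-demo-' + process.pid + '.tmp');
var start = Date.now();
writeAndSync(demoFile, Buffer.alloc(1024 * 1024));
console.log('written and synced in ' + (Date.now() - start) + ' ms');
fs.unlinkSync(demoFile);
```

This is a synchronous sketch for measurement purposes only; in the streaming server the asynchronous fs.fsync on the file descriptor would be the closer equivalent.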
UPDATE 2
I originally tested with Node.js 0.10.32. After switching to 0.11.13, the effect does not disappear completely, but it takes longer to show up. In the original setup the problem appeared after about 10 iterations of the loop; with Node.js 0.11.13 it first appears around iteration 30.
Any idea what might be causing this behavior?
I had a similar problem a while ago. There is a limit on the number of concurrent I/O operations, so Node will start writing as many files at the same time as it can, and the rest are queued until a slot becomes free.
file 1 |-----------------------------------|
file 2 |-----------------------------------|
file 3 |-----------------------------------|
file 4                                     |-------------------------------------|
The above is just an example, but it shows that writing 4 files in this case will take twice as long as writing only 3 files.
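The queueing effect in the diagram can be reproduced with a small model. This is only a sketch under stated assumptions: simulateSlots and makespan are hypothetical names, the slot count of 3 is taken from the diagram rather than from Node itself, and every job is given the same duration.

```javascript
// Hypothetical model: at most `slots` jobs run at once; each further job
// starts as soon as the earliest-finishing running job frees its slot.
function simulateSlots(durations, slots) {
  var freeAt = []; // time at which each slot becomes free
  for (var i = 0; i < slots; i++) {
    freeAt.push(0);
  }
  return durations.map(function (duration) {
    // Pick the slot that becomes free first.
    var slot = 0;
    for (var j = 1; j < freeAt.length; j++) {
      if (freeAt[j] < freeAt[slot]) {
        slot = j;
      }
    }
    var start = freeAt[slot];
    var end = start + duration;
    freeAt[slot] = end;
    return { start: start, end: end };
  });
}

// Total time until the last job has finished.
function makespan(durations, slots) {
  return simulateSlots(durations, slots).reduce(function (max, job) {
    return Math.max(max, job.end);
  }, 0);
}

console.log(makespan([1, 1, 1], 3));    // 1
console.log(makespan([1, 1, 1, 1], 3)); // 2
```

With three slots, three equal jobs finish in one time unit, while a fourth has to wait for a free slot and doubles the total time, which matches the diagram.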