Are disk file operations faster when done in parallel?

Suppose N files are to be written fully to disk (i.e. flushed out of all file buffers). For each file, we write a small (relative to the HDD seek time) amount of data, say 64 KB, with WriteFile, and then call FlushFileBuffers on that file to make sure its data has been completely flushed to the hard drive.

If we write and flush the files sequentially, one by one, then I expect it to take approximately N*seekTime + N*writeTime, where seekTime is the time to position the disk head over the corresponding sector (which may take up to a full disk rotation), and writeTime is the time it takes to write 64 KB of data to disk sequentially. With this sequential approach, we give the OS no room to optimize, because we dictate the order in which the files are flushed.
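As a rough sketch of the sequential baseline described above, here it is in Python rather than against the Win32 API. os.fsync stands in for FlushFileBuffers (the Python documentation says it uses the MS _commit() function on Windows). The names write_and_flush, sequential and estimated_seconds are made up for illustration, and the timing model is just the N*seekTime + N*writeTime estimate from the question, not a measured result.

```python
import os
import time

CHUNK = 64 * 1024  # 64 KB per file, as in the question


def write_and_flush(path, data):
    """Write data to path and ask the OS to commit it before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's own user-space buffer
        os.fsync(f.fileno())  # ask the OS to push the file to the device


def sequential(paths, data):
    """Write and flush the files one by one; return elapsed seconds."""
    start = time.perf_counter()
    for path in paths:
        write_and_flush(path, data)
    return time.perf_counter() - start


def estimated_seconds(n, seek_time, write_time):
    """The question's model: N * seekTime + N * writeTime."""
    return n * seek_time + n * write_time
```

Comparing sequential(paths, data) with estimated_seconds(len(paths), seek_time, write_time) for plausible drive figures is one way to check whether the model holds on a given disk; write caching in the drive or controller can make the measured time much lower.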

With some OS support, performance could be improved by reordering the file writes and flushes so that, given the disk rotation (i.e. the current position of the disk head), the operations start with those that need almost no rotation (i.e. the ones closest to the current head position) and end with those that require almost a full rotation of the disk.

So the question is: does the operating system (Windows in particular) provide such an optimization? In other words, is it possible to improve performance by doing the write and flush in parallel on N threads, one thread per file? Or would that cause additional seek operations that degrade performance (a kind of context switching for the hard drive)?
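One way to find out on a particular machine is simply to try the one-thread-per-file variant and time it. The following is a minimal Python sketch, assuming os.fsync is an acceptable stand-in for FlushFileBuffers; write_and_flush and parallel are hypothetical names. It only measures wall-clock time and says nothing about how the OS or the drive actually orders the requests.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def write_and_flush(path, data):
    """Write data to path and ask the OS to commit it before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's own user-space buffer
        os.fsync(f.fileno())  # ask the OS to push the file to the device


def parallel(paths, data):
    """Write and flush every file on its own thread; return elapsed seconds."""
    start = time.perf_counter()
    # One worker per file, as proposed in the question.
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        # list() waits for every task and re-raises any worker exception.
        list(pool.map(lambda p: write_and_flush(p, data), paths))
    return time.perf_counter() - start
```

Running this against the same set of paths as a one-by-one loop, on the drive in question and with enough files to average out noise, gives a direct answer for that hardware; the result may well differ between an HDD, an SSD and a RAID array.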

+3




3 answers


First you should ask yourself, and explain here, why you need to flush at all. What you want to achieve is not necessarily what actually happens.

If you really want to optimize your application to produce a particular access pattern on the physical device, you make your solution very hardware dependent. What looks like an optimization in your test cases might have the opposite effect in another scenario. For example, what about file fragmentation? What about RAID arrays? What about networked file systems? What about SSDs? What about concurrent access to the same disk by other processes running on the same computer?



The key to fast disk access is buffering. Don't defeat it unless you have to.

+3




It depends on the operating system, the file system, and the hardware. On my Linux system, a lot of file operations go through the page cache, so if two programs (or two runs of the same program) access the same file at about the same time, the later access may not involve any physical disk I/O. Linux and POSIX even have some system calls that give hints to the page cache (posix_fadvise(2), madvise(2), readahead(2), ...).

I don't know Windows, but I have heard, and believe, the rumor that it is less efficient than Linux at this kind of caching.
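To illustrate the kind of hint mentioned above, here is a small Python sketch that calls posix_fadvise through os.posix_fadvise before reading a file sequentially. The function name read_with_hint is made up, the hint is skipped on platforms where the call is unavailable (for example Windows), and whether the kernel acts on the advice is up to the kernel.

```python
import os


def read_with_hint(path, block_size=64 * 1024):
    """Read a file front to back after hinting sequential access.

    Returns the number of bytes read.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            # offset=0, len=0 means "the whole file"; this is advice only.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        total = 0
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:  # empty bytes object signals end of file
                break
            total += len(chunk)
        return total
    finally:
        os.close(fd)
```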



Hardware limitations are often a very significant bottleneck. Replacing your disk with an SSD could be worthwhile.

AFAIK, the old BSD, SunOS and Linux disk drivers did the optimization you suggest (reordering I/O to reduce seek and rotational latency). Today it doesn't really matter, since the disk controller itself maps "logical" sectors to "physical" ones.

+2




I believe Windows doesn't do any I/O scheduling at all; in fact, it even splits large I/Os into 256 KB pieces. Linux has built-in I/O scheduling.

That said, some drivers and disks do some reordering themselves. Typically, the I/O rate increases, up to a point, with higher queue depth. CrystalDiskMark has a QD32 mode for this.

SSDs certainly do, as is easy to see from high-queue-depth tests. SSDs also have hardware parallelism: they get faster as you increase the queue depth for random reads.

What I found with my desktop drive on Windows is that sequences of small write/flush operations complete faster than the disk's seek time would allow. Either the controller is caching writes, or the disk geometry really lends itself to sequential writes even when they are not cached.

+1








