How to use the Disk IO queue

I need to read small chunks of data from a 3.7 GB file. The positions I need to read are not contiguous, but I can order the IO so that the file is read from start to finish.

The file is stored on an iSCSI SAN, which should be able to handle and optimize queued requests.

The question is: how can I issue a single request for all the data/positions I need in one go? Is that possible? I don't think async IO is an option because the reads are very small (20-200 bytes).

Currently, the code looks like this:

using (var fileStream = new FileStream(dataStorePath, FileMode.Open, FileAccess.Read, FileShare.Read))
{
    // Seek to each record in turn and deserialize it
    for (int i = 0; i < internalIds.Count(); i++)
    {
        fileStream.Position = seekPositions[i].SeekPosition;
        ... = Serializer.DeserializeWithLengthPrefix<...>(fileStream, PrefixStyle.Base128);
    }
    ...
}

      

I am looking for ways to improve this I/O because read performance is rather poor. It seems that the seek time spent moving the head adds up.

3 answers


Have you run Process Monitor (from Microsoft Sysinternals)?

I'm not sure what the problem is, but I'll take a guess. If you are reading from a SAN, I think each disk access turns into a network request under the hood. The first read sends a request to locate, read and buffer the data, and then the Serializer builds the objects. By the time you send your second request, the SAN's disks have kept spinning, so you have to wait for the data to come around again.

Have you tried multithreading? I'm curious how it would perform if you set up a queue of file reads to be processed in sequential order, start some threads, have each of them open the file separately (with FileShare.Read so they can all access the file at the same time), and then let them grab work from the queue. Write the results to another collection. If the order of the output matters, sort the results back into the order in which you queued the reads.
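A minimal sketch of that idea, under some assumptions: `ParallelReader`, `ReadAll` and their parameters are hypothetical names, and each read is treated as a fixed-length block of raw bytes rather than the length-prefixed protobuf record from the question. Each worker opens its own FileStream, and results are stored by index so the original order is kept without a separate sort.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

static class ParallelReader
{
    // Hypothetical helper: reads `length` bytes at every offset, using
    // `workerCount` workers that each hold their own handle to the file.
    public static byte[][] ReadAll(string path, long[] offsets, int length, int workerCount)
    {
        var results = new byte[offsets.Length][];

        // The queue holds indexes into `offsets`, so each result can be
        // written straight back to its original slot.
        var work = new ConcurrentQueue<int>();
        for (int i = 0; i < offsets.Length; i++) work.Enqueue(i);

        var workers = new Task[workerCount];
        for (int w = 0; w < workerCount; w++)
        {
            workers[w] = Task.Run(() =>
            {
                using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    while (work.TryDequeue(out int index))
                    {
                        fs.Position = offsets[index];
                        var buffer = new byte[length];
                        int total = 0, n;
                        // Read may return fewer bytes than asked for, so loop.
                        while (total < length && (n = fs.Read(buffer, total, length - total)) > 0)
                            total += n;
                        Array.Resize(ref buffer, total); // shorter only at end of file
                        results[index] = buffer;
                    }
                }
            });
        }
        Task.WaitAll(workers);
        return results;
    }
}
```

Whether this helps depends on how well the SAN handles several outstanding requests, so it is worth measuring with a few different worker counts.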



--- EDIT ---

Have you tried the ReadFileScatter API? Here's a P/Invoke signature from pinvoke.net.



Create one background thread as a proxy. Submit all your reads to it and have it sort and merge them. If two or more regions are close together, read the whole block containing them and slice out the sub-sections you need. Return the data asynchronously.
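The sort-and-merge step could look roughly like the sketch below. It is only an illustration: `CoalescingReader`, `Merge`, `Read` and the `maxGap` threshold are hypothetical names, requests are assumed to be plain (offset, length) pairs, and the background thread and asynchronous hand-off are left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class CoalescingReader
{
    // Sorts the requests by offset and merges neighbours whose gap is at
    // most maxGap bytes into one [Start, End) range.
    public static List<(long Start, long End)> Merge(
        IEnumerable<(long Offset, int Length)> requests, long maxGap)
    {
        var merged = new List<(long Start, long End)>();
        foreach (var r in requests.OrderBy(x => x.Offset))
        {
            long end = r.Offset + r.Length;
            if (merged.Count > 0 && r.Offset - merged[merged.Count - 1].End <= maxGap)
            {
                var last = merged[merged.Count - 1];
                merged[merged.Count - 1] = (last.Start, Math.Max(last.End, end));
            }
            else
            {
                merged.Add((r.Offset, end));
            }
        }
        return merged;
    }

    // Reads each merged range once, then slices every request out of it.
    // Results are keyed by request offset (assumes offsets are unique).
    public static Dictionary<long, byte[]> Read(
        string path, IList<(long Offset, int Length)> requests, long maxGap)
    {
        var result = new Dictionary<long, byte[]>();
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            foreach (var range in Merge(requests, maxGap))
            {
                var block = new byte[range.End - range.Start];
                fs.Position = range.Start;
                int total = 0, n;
                while (total < block.Length && (n = fs.Read(block, total, block.Length - total)) > 0)
                    total += n;

                foreach (var r in requests)
                {
                    if (r.Offset < range.Start || r.Offset + r.Length > range.End) continue;
                    var slice = new byte[r.Length];
                    Array.Copy(block, (int)(r.Offset - range.Start), slice, 0, r.Length);
                    result[r.Offset] = slice;
                }
            }
        }
        return result;
    }
}
```

A larger `maxGap` means fewer, bigger reads at the cost of transferring bytes that are thrown away, so the right value depends on the storage and is best found by measurement.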





For the record only:

In a POSIX environment, you can query multiple regions of a file with a single (sys-)call using the readv function. Another option in a POSIX environment is non-blocking IO.
