How does read(2) work in Linux C?

According to the man page, we can specify the number of bytes we want to read from the file descriptor.

But in a read implementation, how many read requests will be made to perform the read?

For example, if I want to read 4MB, will it only create one request for 4MB, or split it into multiple small requests? e.g. 4KB per request?

+3




5 answers


  • read(2) is a system call, so the C library wrapper switches into the kernel to issue it (in the very old days this was done with a software interrupt; today there are faster mechanisms, such as the vDSO entry point on 32-bit x86 or the syscall instruction on x86-64).

  • Inside the kernel, the call is first handled by the VFS (virtual file system). The VFS provides a common interface for inodes (structures that represent files) and a common way to interact with the underlying file system.

  • The VFS dispatches to the underlying filesystem (mount(8) will tell you which mount points exist and which filesystem is in use at each). See http://www.inf.fu-berlin.de/lehre/SS01/OS/Lectures/Lecture16.pdf for more information.

  • The file system can do its own caching, so the number of disk reads depends on what is in the cache, on how the file system allocates blocks to store a specific file, and on how the file is divided into disk blocks. All of these are specific to the file system in use.

  • If you want to do your own caching, open the file with the O_DIRECT flag; in that case the kernel tries not to use the cache. However, all reads must then start at a 512-byte-aligned offset and have a length that is a multiple of 512 (this is required to transfer your buffer via DMA to the backing store; see http://www.quora.com/Why-does-O_DIRECT-require-IO-to-be-512-byte-aligned ).
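The O_DIRECT alignment constraints in the last bullet can be sketched roughly as follows. This is only an illustration, assuming a 512-byte alignment as stated above (the real requirement depends on the device's logical block size and on the filesystem). The names `DIRECT_ALIGN`, `direct_io_ok`, `alloc_direct_buffer` and `open_direct` are made up for this example.

```c
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on Linux */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Assumed alignment taken from the answer; some devices need 4096. */
#define DIRECT_ALIGN 512

/* Returns 1 if buffer address, file offset and length are all
 * multiples of align, 0 otherwise. */
int direct_io_ok(const void *buf, off_t offset, size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
    return ((uintptr_t)buf % align == 0) &&
           ((uintmax_t)offset % align == 0) &&
           (len % align == 0);
}

/* Allocates len bytes aligned to DIRECT_ALIGN; NULL on failure.
 * Release the buffer with free(). */
void *alloc_direct_buffer(size_t len)
{
    void *buf = NULL;
    if (posix_memalign(&buf, DIRECT_ALIGN, len) != 0)
        return NULL;
    return buf;
}

/* Opens path for unbuffered reading; returns -1 if the filesystem
 * does not support O_DIRECT (or on any other open() error). */
int open_direct(const char *path)
{
    return open(path, O_RDONLY | O_DIRECT);
}
```

A caller would typically check `direct_io_ok(buf, offset, len, DIRECT_ALIGN)` before passing the buffer to read() or pread() on a descriptor returned by `open_direct`.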



+2




If there is data, the read will return as much data as is available and will fit into the buffer without waiting. If there is no data available, it will wait until something comes along and return what it can without waiting any longer.



How much that is depends on what the file descriptor refers to. If it is a socket, it will be whatever is in the socket buffer. If it is a file, it will be whatever is in the buffer cache.
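To make the "returns what is available" behaviour concrete, here is a small sketch using a pipe, which behaves like a socket in this respect. The helper name `pipe_short_read_demo` is hypothetical, and the message is assumed to be much smaller than the pipe's capacity.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Writes msg into a fresh pipe, then asks read() for up to cap bytes.
 * Returns whatever read() returned, or -1 if the setup failed. */
ssize_t pipe_short_read_demo(const char *msg, char *out, size_t cap)
{
    assert(msg != NULL && out != NULL);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = -1;
    if (write(fds[1], msg, len) == (ssize_t)len)
        n = read(fds[0], out, cap);  /* returns the bytes available, not cap */

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling it with a 5-byte message and a 4096-byte buffer makes read() return 5 rather than wait for the buffer to fill.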

+1




When you call read, it makes only one request to fill the buffer, and if it cannot fill the whole buffer (no more data, or the data has not arrived yet, as with sockets), it returns the number of bytes it actually wrote into your buffer.

As the manual says:

RETURN VALUE

Upon successful completion, these functions shall return a non-negative integer indicating the number of bytes actually read. Otherwise, the functions shall return -1 and set errno to indicate the error.
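Because the return value may be smaller than the requested size, callers that need the full amount usually loop. A minimal sketch of such a loop follows; `read_full` is a hypothetical helper name, not a standard function.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Keeps calling read() until count bytes have arrived, EOF is hit,
 * or an error occurs. Returns the number of bytes read (which may be
 * less than count at EOF), or -1 with errno set. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    assert(buf != NULL || count == 0);

    char *p = buf;
    size_t done = 0;
    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n > 0)
            done += (size_t)n;       /* partial progress, keep going */
        else if (n == 0)
            break;                   /* end of file */
        else if (errno == EINTR)
            continue;                /* interrupted by a signal, retry */
        else
            return -1;               /* real error, errno is set */
    }
    return (ssize_t)done;
}
```

For the 4MB case in the question, `read_full(fd, buf, 4u << 20)` would issue as many read() calls as it takes, which for a regular file is often just one.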

+1




It depends on how deep you go.

The C library just passes the size you gave it straight to the kernel in one read() system call, so at this level it's just one request.
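To illustrate how thin the C library layer is, the same request can be issued through the generic syscall(2) interface. This is a sketch for Linux only; `raw_read` is a hypothetical name.

```c
#define _GNU_SOURCE          /* for the syscall() declaration */
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issues the read system call directly, bypassing the libc read()
 * wrapper. The size is handed to the kernel unchanged, in one call. */
long raw_read(int fd, void *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    return syscall(SYS_read, fd, buf, count);
}
```

Like read(), it returns the number of bytes read, or -1 with errno set.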

Internally, for a regular file in standard buffered mode, the 4MB you requested will be copied from multiple pagecache pages (4kB each) that are unlikely to be contiguous in memory. Any part of the file that is not already in the pagecache must be read from disk. The file may not be stored contiguously on the disk either, so the 4MB can result in multiple requests to the underlying block device.
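As a rough worked example of the page arithmetic, the number of pagecache pages touched by a read can be computed like this. `pages_spanned` is a hypothetical helper, and the 4kB page size is the assumption from the text; the actual value is available from `sysconf(_SC_PAGESIZE)`.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pages touched by a read of len bytes starting at offset,
 * for the given page size. */
size_t pages_spanned(size_t offset, size_t len, size_t page_size)
{
    assert(page_size != 0);
    if (len == 0)
        return 0;
    size_t first = offset / page_size;              /* page holding the first byte */
    size_t last = (offset + len - 1) / page_size;   /* page holding the last byte */
    return last - first + 1;
}
```

A page-aligned 4MB read with 4kB pages touches 1024 pages; starting at an unaligned offset adds one more.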

+1




In fact, there is no single correct answer other than "as many as necessary", at whichever layer the request is handled. Typically, a single request is passed to the kernel. That may result in no further requests reaching the lower layers, because all the data is already in memory. But if the data has to be read from, say, a software RAID, requests to several physical devices may be needed to satisfy it.

I don't think you can really give a better answer than "what the developer thought was the best way".

+1

