Vnode and file descriptor in XNU: where is the file operations vector stored?

In XNU, we have an object, vnode_t, which represents the file globally (one per file, kernel-wide).

Each process can access the file (assuming it has permission) by creating a new file descriptor and storing the vnode in its fileglob's fg_data field:

fp->f_fglob->fg_data = vp;


The vnode contains a vector of handlers for all relevant operations, set according to the filesystem the file belongs to; that is, the HFS+ driver implements such a vector and installs it in the vnodes it creates:

int     (**v_op)(void *);       /* vnode operations vector */


It is a vector of function pointers covering every operation that can run on a vnode.
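
As an illustration of how a filesystem driver supplies that vector, here is an abridged sketch modeled on bsd/hfs/hfs_vnops.c (the exact entry list varies by XNU version): the driver declares a table of (operation descriptor, handler) pairs, and the VFS layer compiles it into the v_op array that each HFS+ vnode gets.

int (**hfs_vnodeop_p)(void *);

const struct vnodeopv_entry_desc hfs_vnodeop_entries[] = {
    { &vnop_default_desc, (VOPFUNC)vn_default_error },  /* fallback */
    { &vnop_lookup_desc,  (VOPFUNC)hfs_vnop_lookup },
    { &vnop_read_desc,    (VOPFUNC)hfs_vnop_read },
    { &vnop_write_desc,   (VOPFUNC)hfs_vnop_write },
    /* ... many more entries ... */
    { NULL, (VOPFUNC)NULL }
};

const struct vnodeopv_desc hfs_vnodeop_opv_desc = {
    &hfs_vnodeop_p, hfs_vnodeop_entries
};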

In addition, we have a fileops structure, hanging off the file descriptor's fileglob (fp->f_fglob), which describes a minimal common subset of these operations.

Here's a typical definition:

const struct fileops vnops = {
 .fo_type = DTYPE_VNODE,
 .fo_read = vn_read,
 .fo_write = vn_write,
 .fo_ioctl = vn_ioctl,
 .fo_select = vn_select,
 .fo_close = vn_closefile,
 .fo_kqfilter = vn_kqfilt_add,
 .fo_drain = NULL,
};


and we install it here:

fp->f_fglob->fg_ops = &vnops;


I've seen that when reading a regular file on the local filesystem (HFS+), the call goes through the file descriptor's fileops, not the vnode's v_op:

 * frame #0: 0xffffff801313c67c kernel`vn_read(fp=0xffffff801f004d98, uio=0xffffff807240be70, flags=0, ctx=0xffffff807240bf10) at vfs_vnops.c:978 [opt]
frame #1: 0xffffff801339cc1a kernel`dofileread [inlined] fo_read(fp=0xffffff801f004d98, uio=0xffffff807240be70, flags=0, ctx=0xffffff807240bf10) at kern_descrip.c:5832 [opt]
frame #2: 0xffffff801339cbff kernel`dofileread(ctx=0xffffff807240bf10, fp=0xffffff801f004d98, bufp=140222138463456, nbyte=282, offset=<unavailable>, flags=<unavailable>, retval=<unavailable>) at sys_generic.c:365 [opt]
frame #3: 0xffffff801339c983 kernel`read_nocancel(p=0xffffff801a597658, uap=0xffffff801a553cc0, retval=<unavailable>) at sys_generic.c:215 [opt]
frame #4: 0xffffff8013425695 kernel`unix_syscall64(state=<unavailable>) at systemcalls.c:376 [opt]
frame #5: 0xffffff8012e9dd46 kernel`hndl_unix_scall64 + 22
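
(The fo_read frame inlined above is just a thin dispatcher through fg_ops. Paraphrased from memory after bsd/kern/kern_descrip.c, abridged:)

int
fo_read(struct fileproc *fp, struct uio *uio, int flags, vfs_context_t ctx)
{
    /* Whatever the descriptor refers to, call its table's read handler. */
    return (*fp->f_fglob->fg_ops->fo_read)(fp, uio, flags, ctx);
}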


My question is: why is this duality needed, and in which cases does an operation go through the file descriptor's vector (fg_ops), and in which cases through the vnode's vector (vp->v_op)?

Thanks!

1 answer


[...] in which cases does it go through the file descriptor's vector (fg_ops), and in which cases through the vnode's vector (vp->v_op)?

I'll answer this second part of the question first: if you trace your call stack a bit further and look inside the function vn_read, you'll see that it contains this line:

    error = VNOP_READ(vp, uio, ioflag, ctx);


The function VNOP_READ (in kpi_vfs.c), in turn, boils down to this dispatch:

_err = (*vp->v_op[vnop_read_desc.vdesc_offset])(&a);
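
(The &a being passed is a vnop_read_args structure, declared in bsd/sys/vnode_if.h and shown here abridged from memory; it packs the arguments so that every entry in v_op can share the int (*)(void *) signature:)

struct vnop_read_args {
    struct vnodeop_desc *a_desc;  /* identifies the operation (vnop_read_desc) */
    vnode_t a_vp;                 /* the vnode being read */
    struct uio *a_uio;            /* describes the destination buffer */
    int a_ioflag;
    vfs_context_t a_context;
};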


So the answer to this part of the question is: for your typical file, both tables are used to dispatch an operation. The full chain for read(2) on an HFS+ file is read → dofileread → fo_read (via fg_ops) → vn_read → VNOP_READ (via vp->v_op) → the filesystem's read handler.

With this in mind:



My question is why this duality is needed [...]

Not everything a process can hold a file descriptor for is also represented on a filesystem. For example, pipes do not have to be named, so a vnode makes no sense in that context. Accordingly, in sys_pipe.c you will see a different fileops table:

static const struct fileops pipeops = {
    .fo_type = DTYPE_PIPE,
    .fo_read = pipe_read,
    .fo_write = pipe_write,
    .fo_ioctl = pipe_ioctl,
    .fo_select = pipe_select,
    .fo_close = pipe_close,
    .fo_kqfilter = pipe_kqfilter,
    .fo_drain = pipe_drain,
};


The same goes for sockets.
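
(From bsd/kern/sys_socket.c, abridged; the exact field set varies between XNU versions:)

static const struct fileops socketops = {
    .fo_type = DTYPE_SOCKET,
    .fo_read = soo_read,
    .fo_write = soo_write,
    .fo_ioctl = soo_ioctl,
    .fo_select = soo_select,
    .fo_close = soo_close,
    .fo_kqfilter = soo_kqfilter,
    .fo_drain = soo_drain,
};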

File descriptors track a process's view of a file or other file-like object, i.e. the state that lets file-style operations be performed on it: things like the current file position. Different processes can open the same file, and each of them must have its own read/write position, so vnode:fileglob is a 1:many relationship.
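
(A quick userland illustration of that 1:many relationship; this is a hypothetical demo, not XNU code, with error checking omitted. Opening the same file twice yields two independent fileglobs referring to one vnode, each tracking its own offset:)

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd1 = open("/etc/hosts", O_RDONLY);
    int fd2 = open("/etc/hosts", O_RDONLY);  /* same vnode, new fileglob */
    char buf[16];

    read(fd1, buf, sizeof(buf));             /* advances only fd1's offset */
    printf("fd1 offset: %lld\n", (long long)lseek(fd1, 0, SEEK_CUR));
    printf("fd2 offset: %lld\n", (long long)lseek(fd2, 0, SEEK_CUR));
    /* Prints 16 and 0: each open() created its own file position. */
    return 0;
}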

Conversely, using vnodes to keep track of objects that don't live on a filesystem makes no sense either. Moreover, the v_op table is filesystem-specific, whereas vn_read/VNOP_READ contain code that applies to any file on any filesystem.
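
(To make that split concrete, here is a rough, abridged paraphrase of vn_read from bsd/vfs/vfs_vnops.c, written from memory with locking and error handling omitted; the generic layer manages per-descriptor state, then delegates to the filesystem:)

int
vn_read(struct fileproc *fp, struct uio *uio, int flags, vfs_context_t ctx)
{
    vnode_t vp = (vnode_t)fp->f_fglob->fg_data;      /* back to the vnode */
    int ioflag = 0, error;

    if (fp->f_fglob->fg_flag & FNONBLOCK)
        ioflag |= IO_NDELAY;
    if ((flags & FOF_OFFSET) == 0)                   /* plain read(2), not pread(2) */
        uio_setoffset(uio, fp->f_fglob->fg_offset);  /* this descriptor's position */

    error = VNOP_READ(vp, uio, ioflag, ctx);         /* filesystem-specific part */

    if ((flags & FOF_OFFSET) == 0)
        fp->f_fglob->fg_offset = uio_offset(uio);    /* remember the new position */
    return error;
}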

So, in general, they are really just different layers in the I/O stack.
