dup(2) and shared file descriptors
In my entry on how sharing file descriptors with child processes is a clever Unix decision, I said:
This full sharing is probably easier to implement in the kernel than making an independent copy of the file descriptor (unless you also changed how dup() works). [...]
Currently, dup() specifically shares the file offset between the old file descriptor and the new duplicated version. This implies shared file descriptor state within the kernel for at least file descriptors in the current process, and along with it some way to keep track of when the last reference to a particular shared state goes away (because only then can the kernel actually close the file and potentially trigger things like pending deletes).
Once you have to have this shared descriptor state within a single process, it's relatively straightforward to extend this to multiple processes, especially in the kind of plain uniprocessor kernel environment that Unix had for a long time. Basically, instead of having a per-process data structure for shared file descriptor state, you have a single global one, and everyone manipulates entries in it. You need reference counting regardless of whether file descriptor state is shared within a process or across processes.
(Then each process has a mapping from file descriptor number to the shared state. In early Unixes, this was a small fixed size array, the u_ofile array in the user structure. Naturally, early Unixes also had a fixed size array for the actual file structures for open files, as seen in V7's c.c and param.h. You can see V7's shared file structure here.)
PS: The other attraction of this in small kernel environments, as seen in the V7 implementation, is that if file descriptor state is shared across all processes, you need significantly fewer copies of the state for a given file that's passed to children a lot, as is common for standard input, standard output, and standard error.