2022-01-04
Some ways to implement /dev/fd in Unix kernels
The idea of /dev/fd, which gives filesystem names to file
descriptors, is the core of the modern implementation of process
substitution. There are several ways
to implement this idea in the Unix kernel, ranging from an old,
simple, brute force method to the modern methods that generally
use some form of virtual filesystem, for reasons
that we'll get to.
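To make "filesystem names for file descriptors" concrete, here is a minimal user-level sketch. It assumes a system that provides /dev/fd at all, and the function name is made up for illustration. A pipe has no filename of its own, but its read end can still be opened by path through /dev/fd/N.

```python
import os


def read_through_dev_fd(payload: bytes) -> bytes:
    """Write payload into a pipe, then read it back by name via /dev/fd/N."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, payload)
        os.close(write_fd)  # no writers left, so the reader will see EOF
        write_fd = -1
        # The pipe has no path of its own; /dev/fd/N supplies one.
        with open(f"/dev/fd/{read_fd}", "rb") as by_name:
            return by_name.read()
    finally:
        os.close(read_fd)
        if write_fd != -1:
            os.close(write_fd)
```

This is roughly what a program sees when a shell hands it a /dev/fd/N argument for process substitution: it opens what looks like an ordinary path and gets the pipe's data.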
The simple but brute force way to implement /dev/fd is with a real
directory containing a bunch of miscellaneous character devices, somewhat
similar to /dev/null. Inside the kernel, the device driver for these
miscellaneous devices can arrange to do the necessary magic when they're
opened, including failing to open if your process doesn't have that
particular file descriptor. This implementation has been possible for a
very long time (since before V7 Unix), but it has two drawbacks. First,
the /dev/fd directory has to contain character device inodes for all
of the potentially available file descriptors, regardless of whether or
not the current process has those file descriptors available. Second,
you potentially need a lot of minor device numbers, since you need one
minor device number for every potential file descriptor number.
Together, these two issues generally made this brute force approach
unpopular and, I believe, pretty much never implemented in Unix. The
closest people came was /dev/stdin, /dev/stdout, and /dev/stderr,
which were sometimes implemented this way. Having only these three
common file descriptors available wasn't anywhere near as useful, but
it was a lot more feasible.
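As a rough user-space model of the brute force scheme (not kernel code), the sketch below treats each device's minor number as the file descriptor number it stands for. Both function names are hypothetical, and it assumes that opening the device behaves like dup() of that descriptor, which is one plausible form of the "necessary magic".

```python
import errno
import os


def static_dev_fd_nodes(max_fds: int) -> list[tuple[str, int]]:
    """The (path, minor number) pairs a real /dev/fd directory would need.

    Every potential descriptor gets a node, whether or not any process
    currently has that descriptor open.
    """
    return [(f"/dev/fd/{n}", n) for n in range(max_fds)]


def fd_device_open(minor: int) -> int:
    """Hypothetical model of the driver's open routine for /dev/fd/<minor>.

    Assumption: open hands back a duplicate of descriptor `minor`.
    os.dup() raises OSError with errno EBADF when the calling process
    has no such descriptor, which models the open failing.
    """
    return os.dup(minor)
```

Both drawbacks show up directly: static_dev_fd_nodes() has to be sized for the largest possible descriptor number, and each entry uses up its own minor number.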
The second possible approach is to have /dev/fd be a virtual
filesystem but have the nodes in the filesystem be miscellaneous character
devices. Modern Unixes generally allow really large minor device
numbers, so that side's not a problem, and as a virtual filesystem
/dev/fd can materialize only the file descriptors that the current
process actually has. I'm not certain if anyone actually implements
/dev/fd this way. Although FreeBSD can sometimes have character
devices appear in /dev/fd, I think that FreeBSD's fdescfs
is implemented differently and the character device stat() result
is basically an illusion.
(For FreeBSD fdescfs, see fdesc_vnops.c.)
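One way to see what a given system claims its /dev/fd entries are is to open a fresh descriptor, check that it appears in the directory listing, and lstat() its entry. The sketch below assumes /dev/fd is a virtual filesystem of some kind, and the helper names are made up for illustration. What kind of file is reported varies between systems, which is the point.

```python
import os
import stat


def file_kind(mode: int) -> str:
    """Turn an st_mode value into a short description of the file type."""
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block device"
    return "unknown"


def inspect_dev_fd() -> tuple[bool, str]:
    """Open a pipe, then report whether its read end is listed in /dev/fd
    and what kind of file lstat() says the entry is."""
    read_fd, write_fd = os.pipe()
    try:
        listed = str(read_fd) in os.listdir("/dev/fd")
        kind = file_kind(os.lstat(f"/dev/fd/{read_fd}").st_mode)
        return listed, kind
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

On a static, real-directory /dev/fd the new descriptor would only be listed if a node for that number had been created in advance.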
The third approach is to have both /dev/fd and /dev/fd/N
be completely virtual, as a full virtual filesystem or as part
of one. Modern Linux effectively works this way; /dev/fd
is a symbolic link to /proc/self/fd, which is a procfs directory with
magical contents. Linux makes this very magical; the files in
/proc/[pid]/fd are nominally symbolic links (which is what stat() and
ls will report), but when you open them they have special behavior
instead of being followed as normal symlinks would be.
(We'll wave our hands about how the virtual filesystem reaches into the depths of the kernel to get access to your process's file descriptors. Let's just assume that the kernel developers make it all work.)
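The "nominally symbolic links" behavior on Linux can be observed from user space. This sketch is Linux-specific, assumes procfs is mounted on /proc, and uses a made-up function name. It deletes a file while holding it open, then opens it again through its /proc/self/fd entry. The entry reports itself as a symlink, and its target path no longer exists, yet opening the entry still reaches the file.

```python
import os
import stat
import tempfile


def reopen_deleted_via_proc(payload: bytes) -> tuple[bool, bool, bytes]:
    """Return (entry is a symlink, link target exists, data read via entry)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)  # the file now has no name in any directory
        entry = f"/proc/self/fd/{fd}"
        is_symlink = stat.S_ISLNK(os.lstat(entry).st_mode)
        # The link text names a path that is gone from the filesystem.
        target_exists = os.path.exists(os.readlink(entry))
        # An ordinary symlink to a missing path could not be opened.
        # Opening this one goes to the open file itself.
        with open(entry, "rb") as reopened:
            data = reopened.read()
        return is_symlink, target_exists, data
    finally:
        os.close(fd)
```

On a typical Linux system this should return a true symlink flag, a false target-exists flag, and the original payload.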
Since both of the good approaches to /dev/fd need some sort of
virtual filesystem, both of them had to wait for the idea of
the virtual filesystem switch to be invented.
Before the days of the VFS, the only possible implementation of
/dev/fd was the unattractive brute force one of a real directory
with a lot of character devices in it.