2023-10-24
dup()'s shared file IO offset is a necessary part of Unix
In a recent entry I noted dup()'s somewhat weird-seeming behavior: the new file descriptor you get from dup() (and also from dup2(), its sometimes better sibling) shares the file's IO offset with the original file descriptor. This behavior is different from open()'ing the same file again, where you get a file descriptor with an independent file IO offset (sometimes called the seek offset or the seek position). In discussing this on the Fediverse, I wondered if this was only a convenient implementation choice. The answer, which I should have realized even at the time, is that dup()'s shared IO offset is a necessary part of Unix pipelines (especially in the context of older Unixes, such as V7 Unix).
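To make the difference concrete, here is a minimal sketch in Python, whose os module is a thin wrapper over the underlying system calls. The scratch file and the function name compare_offsets() are my own illustration, not anything from a real program. It reads through one file descriptor and then asks where a dup()'d descriptor and a separately open()'d descriptor think they are:

```python
import os
import tempfile


def compare_offsets():
    """Read 5 bytes via one fd, then report the offsets of a dup()'d fd
    and of an independently open()'d fd for the same file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "data")  # hypothetical scratch file
        with open(path, "wb") as f:
            f.write(b"0123456789")

        fd = os.open(path, os.O_RDONLY)
        dup_fd = os.dup(fd)                    # shares fd's IO offset
        reopened = os.open(path, os.O_RDONLY)  # has its own IO offset
        try:
            os.read(fd, 5)  # moves the offset shared by fd and dup_fd
            dup_offset = os.lseek(dup_fd, 0, os.SEEK_CUR)
            reopened_offset = os.lseek(reopened, 0, os.SEEK_CUR)
        finally:
            for d in (fd, dup_fd, reopened):
                os.close(d)
    return dup_offset, reopened_offset


if __name__ == "__main__":
    print(compare_offsets())
```

The dup()'d descriptor reports offset 5 even though nothing was read through it, while the re-opened descriptor is still at offset 0.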
Consider the following illustrative shell pipeline:
$ (cmd1 | cmd2 | cmd3) 2>/tmp/errors-log
Here we want to redirect any errors from these commands (and any sub-things they run) into /tmp/errors-log. We want all of the errors, and we want them to appear in errors-log in the order they were printed by the various commands (which is not necessarily pipeline order; cmd3 could write some complaints before cmd2 did, for example).
If the shell opens /tmp/errors-log once and dup()'s the resulting file descriptor to standard error for cmd1, cmd2, and cmd3, this is exactly what you get, and it's because of that shared file IO offset. Every time any of the commands writes to standard error, it advances the offset for the next write() by all of the commands at once. Today you could get the same effect for writes with O_APPEND, but that wasn't in V7 Unix.
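As a rough sketch of why this matters for writes, the following Python compares three ways of giving three "commands" a descriptor for the same log file. The message texts, the mode names, and the log_contents() helper are all hypothetical. For simplicity everything happens in one process; the descriptors stand in for each command's standard error.

```python
import os
import tempfile

# Hypothetical error messages, in the order the commands "print" them.
MESSAGES = [b"cmd1: first\n", b"cmd3: second\n", b"cmd2: third\n"]


def log_contents(mode):
    """Write MESSAGES through three descriptors and return the file contents.

    mode is "dup" (one open(), then dup()), "reopen" (three plain open()s),
    or "append" (three open()s with O_APPEND).
    """
    flags = os.O_WRONLY | os.O_CREAT
    if mode == "append":
        flags |= os.O_APPEND
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "errors-log")
        first = os.open(path, flags, 0o600)
        if mode == "dup":
            fds = [first, os.dup(first), os.dup(first)]
        else:
            fds = [first, os.open(path, flags), os.open(path, flags)]
        try:
            # Each "command" writes one message through its own descriptor.
            for fd, message in zip(fds, MESSAGES):
                os.write(fd, message)
        finally:
            for fd in fds:
                os.close(fd)
        with open(path, "rb") as f:
            return f.read()


if __name__ == "__main__":
    for mode in ("dup", "reopen", "append"):
        print(mode, log_contents(mode))
```

In "dup" and "append" mode the file ends up with all three messages in the order they were written. In "reopen" mode every descriptor starts at offset 0, so each message lands on top of the previous one.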
The shared offset also makes setting up standard input easier in some shell situations. Consider this:
$ (cmd1; cmd2; cmd3) <input-file
Implementing this without dup()'s shared IO offset would require that the parent shell set up standard input once, before it started forking children, so that it could pass the same file descriptor to all of them. With dup(), the parent can merely open input-file and then leave it to each child to dup() it on to standard input at an appropriate time.
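Here is a minimal sketch of that arrangement in Python, assuming a Unix system where os.fork() is available. The read_in_turn() helper, the pipe used to report results, and the fixed chunk sizes are my own scaffolding. The parent opens the input once, and each child in turn dup2()'s that descriptor on to its standard input and reads some of it:

```python
import os
import tempfile


def read_in_turn(content, sizes):
    """Fork one child per entry in sizes, one after another. Each child
    dup2()s the parent's single input fd on to fd 0 and reads that many
    bytes. Returns what each child read."""
    results = []
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "input-file")
        with open(path, "wb") as f:
            f.write(content)

        fd = os.open(path, os.O_RDONLY)  # opened once, by the parent
        try:
            for size in sizes:
                r, w = os.pipe()  # scaffolding: lets the child report back
                pid = os.fork()
                if pid == 0:
                    # Child: set up standard input, then act as the command.
                    try:
                        os.close(r)
                        os.dup2(fd, 0)
                        os.write(w, os.read(0, size))
                    finally:
                        os._exit(0)
                os.close(w)
                results.append(os.read(r, 4096))
                os.close(r)
                os.waitpid(pid, 0)
        finally:
            os.close(fd)
    return results


if __name__ == "__main__":
    print(read_in_turn(b"aaaabbbbcccc", [4, 4, 4]))
```

Each child picks up where the previous one stopped, because the standard input that each child creates with dup2() shares its IO offset with the parent's one open file.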
There's a closely related idiom that also requires these dup() semantics even in a single process. Consider:
$ command >/tmp/out 2>&1
You want both standard output and standard error in the same file, interleaved in the order they were written, but in the child process these are necessarily two different file descriptors. You need them to share the IO offset anyway, which is achieved by dup()'ing one to the other (and the redirections have to be done in a specific order).
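What the shell does for this can be sketched roughly as follows, again in Python and assuming os.fork() is available. The function name and the three messages are hypothetical; a real shell would exec the command where this sketch just writes directly.

```python
import os
import tempfile


def run_with_merged_output(path):
    """Fork a child that sets up '>path 2>&1' and then writes alternately
    to fd 1 and fd 2. Returns the resulting file contents."""
    pid = os.fork()
    if pid == 0:
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
            os.dup2(fd, 1)  # >path: standard output now goes to the file
            os.dup2(1, 2)   # 2>&1: standard error shares fd 1's offset
            if fd > 2:
                os.close(fd)
            # A real shell would exec the command here.
            os.write(1, b"out: one\n")
            os.write(2, b"err: two\n")
            os.write(1, b"out: three\n")
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        print(run_with_merged_output(os.path.join(tmpdir, "out")))
```

The order matters because shells process redirections left to right. Written as '2>&1 >/tmp/out', standard error would be dup()'d from the old standard output before that was pointed at the file.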
Even without these dup() semantics, sharing the file IO offset of the same (inherited) file descriptor between processes is basically essential. Consider:
$ make >/tmp/output
Make will write to standard output and it will pass its own standard output file descriptor on to children (ie, all of the commands that get run from your Makefile) unchanged. All of the writes by all of these processes, each to its own file descriptor 1, have to share a single IO offset, or they'd repeatedly write over each other at the start of the file.
(You can create similar but more contrived examples with standard input coming from a file.)
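The make example can be imitated with a small Python sketch, where the parent plays the role of make and runs two separate child processes. The "step1" and "step2" names and the messages are made up for illustration. Note that subprocess hands the descriptor to each child as its standard output, which is close to (but not literally) make passing its own file descriptor 1 along untouched.

```python
import os
import subprocess
import sys
import tempfile


def run_like_make(path):
    """Write to an output file from a parent and from two child processes
    that inherit the same open file, then return the file contents."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"make: starting\n")
        for name in ("step1", "step2"):  # hypothetical Makefile commands
            child_code = f"import os; os.write(1, b'{name}: done\\n')"
            # The child gets our open file as its standard output.
            subprocess.run([sys.executable, "-c", child_code],
                           stdout=fd, check=True)
        os.write(fd, b"make: finished\n")
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        print(run_like_make(os.path.join(tmpdir, "output")))
```

All four lines end up in the file in the order they were written, even though they came from three different processes. With per-process offsets, each child would have started writing at offset 0 and clobbered what was there.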
Before I started writing this entry, I don't think I appreciated how important Unix's separation of the file IO offset from file descriptors is, or how deep it goes.