2020-01-01
The good and bad of errno in a traditional Unix environment
I said recently in passing that errno was a generally good interface
in pre-threading Unix. That may raise some eyebrows, so today let's
talk about the good and the bad parts of errno in a traditional Unix
environment, such as V7 Unix.
The good part of errno is that it's about the simplest thing that can
work to provide multiple values from system calls in C, which doesn't
directly have multi-value returns (especially in early C). Using a
global variable to 'return' a second value is about the best you can do
in basic C unless you want to pass a pointer to every system call and
C library function that wants to provide an errno value (this would
include much of stdio, for example). Passing around such a pointer all
the time doesn't just create uglier code; it also creates more code
and more stack (or register) usage for the extra parameter.
(Modern C is capable of tricks like returning a two-element structure in a pair of registers, but this is not the case for the older and simpler version of C used by Research Unix through at least V7.)
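To make the tradeoff concrete, here is a minimal sketch of the two styles side by side. The open_with_errp() function and both try_open_*() helpers are hypothetical names invented for illustration; nothing like them existed in V7, and the code uses modern C and POSIX headers rather than V7-era C.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical alternative API: every call takes an extra pointer
   that receives the error code. Not a real Unix interface. */
int open_with_errp(const char *path, int flags, int *errp)
{
    int fd = open(path, flags);

    *errp = (fd == -1) ? errno : 0;
    return fd;
}

/* errno style: no extra argument, the error is read from the global. */
int try_open_errno(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}

/* Pointer style: each caller needs a local and an extra parameter. */
int try_open_errp(const char *path)
{
    int err;
    int fd = open_with_errp(path, O_RDONLY, &err);

    if (fd == -1)
        return err;
    close(fd);
    return 0;
}
```

Both helpers return 0 on success or the error number on failure. The pointer version costs one more local variable and one more argument at every call site, which is the overhead that a global errno avoids.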
Some Unix C library system call functions in V7 could have signaled
errors through special return values, but perhaps not all of them. V7
didn't allow many open files, so file descriptors left plenty of room
for error codes, but the PDP-11 series had a quite constrained address
space, which left few spare values for calls that return addresses.
Even if they had, this would have resulted in more code when you
actually had to check the return value of things like open() or
sbrk(), since the C code would have had to check the range or other
qualities of the return value.
(The actual system calls in V7 Unix and before used an error signaling
method designed for assembly language, where the kernel arranged
to return either the system call result or the error number in
register r0 and set a condition code depending on which it was. You
can read this documented in eg the V4 dup manpage, from the era when
Unix still had significant assembly language documentation. The
V7 C library arranged to turn this into setting errno and returning
-1 on errors; see eg libc/sys/dup.s along with libc/crt/cerror.s.)
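As a rough model of that translation, the sketch below expresses in C what the assembly stubs conceptually did. The struct raw_result type and the libc_return() function are assumptions made up for this illustration; the real V7 code is PDP-11 assembly, not C, and has no such structure.

```c
#include <errno.h>

/* Hypothetical model of the raw kernel convention: one register
   value plus a condition code saying which kind of value it is. */
struct raw_result {
    int r0;      /* result on success, error number on failure */
    int failed;  /* stands in for the condition code */
};

/* What a C library stub conceptually does with the raw result. */
int libc_return(struct raw_result r)
{
    if (r.failed) {
        errno = r.r0;   /* stash the error number in the global */
        return -1;      /* and hand the caller a uniform failure value */
    }
    return r.r0;        /* success: errno is left untouched */
}
```

Note that the success path doesn't clear errno, which is why errno is only meaningful right after a call has reported failure.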
The bad part of errno is that it's a single global value, which
means that it can be accidentally overwritten by new errors between
the time it's set by a failing system call and when you want to use
it. The simple way to have this happen is just to do another failing
system call in your regular code, either directly or indirectly. A
classic mistake was to check whether standard output (or standard
error) was a terminal by trying to do a TTY ioctl() on it; when the
ioctl failed, your original errno value would be overwritten by
ENOTTY and the reason your open() or whatever failed would be
reported as the mysterious 'not a typewriter' message.
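Here is a small sketch of that mistake and the usual fix. Both function names are hypothetical, and I'm using isatty() as a stand-in for the TTY ioctl(); on common modern systems it fails with ENOTTY on a non-terminal, but treat that as an assumption about your platform.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Buggy: the terminal check runs between the failure and the read of
   errno, so the reported error is the check's, not open()'s. */
int open_error_clobbered(const char *path, int check_fd)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        (void)isatty(check_fd);  /* may fail and overwrite errno */
        return errno;
    }
    close(fd);
    return 0;
}

/* Fixed: copy errno into a local immediately after the failing call. */
int open_error_saved(const char *path, int check_fd)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved = errno;       /* capture before anything else runs */
        (void)isatty(check_fd);
        return saved;
    }
    close(fd);
    return 0;
}
```

If check_fd is a pipe or a file instead of a terminal, the first version reports the terminal check's error for a missing file, while the second reports the real reason open() failed.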
Even if you avoided this trap you could have issues with signals,
since signals can interrupt your program at arbitrary points,
including immediately after you returned from a system call and
before you've looked at errno. These days you're basically not
supposed to do anything in signal handlers, but in the older days
of Unix it was common to do any number of things in them. For
instance, a SIGCHLD handler might call wait() to collect the exit
status of children until it failed with some errno, which would
overwrite your original one if the timing was bad. A signal handler
could arrange to deal with this by saving and restoring errno, if
the programmer remembered the issue, but that was easy to forget;
people often overlook timing races, especially if they have narrow
windows and rarely happen.
(SIGCHLD wasn't in V7, but it was in BSD; this is because BSD
introduced job control, which made it necessary. But that's another
entry.)
On the whole I think that errno was a good interface for the
constraints of traditional Unix, where you didn't have threads or
good ways of returning multiple values from C function calls. While
it had drawbacks and bad sides, it was generally possible to work
around them and they usually didn't come up too often. The errno
API only started to get really awkward when threads were introduced
and you could have multiple things making system calls at once in
the same address space. Like much of Unix (especially in the Research
Unix era through V7), it's not perfect but it's good enough.