The good and bad of errno in a traditional Unix environment

January 1, 2020

I said recently in passing that errno was a generally good interface in pre-threading Unix. That may raise some eyebrows, so today let's talk about the good and the bad parts of errno in a traditional Unix environment, such as V7 Unix.

The good part of errno is that it's about the simplest thing that can work to provide multiple values from system calls in C, which doesn't directly have multi-value returns (especially in early C). Using a global variable to 'return' a second value is about the best you can do in basic C unless you want to pass a pointer to every system call and C library function that wants to provide an errno value (this would include much of stdio, for example). Passing around such a pointer all the time doesn't just create uglier code; it also creates more code and more stack (or register) usage for the extra parameter.
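To make the comparison concrete, here is a minimal sketch of the two styles in modern C. The open_with_err() function and the two probe functions are hypothetical names invented for this illustration; nothing like them existed in V7, and real V7 C would look somewhat different.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical alternative interface: the error comes back through an
 * extra pointer parameter instead of through the global errno. */
int open_with_err(const char *path, int flags, int *err)
{
    int fd = open(path, flags);
    *err = (fd < 0) ? errno : 0;
    return fd;
}

/* errno style: one return value, and errno is consulted only on failure.
 * Returns 0 if the path could be opened, otherwise the error number. */
int probe_errno_style(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}

/* Pointer style: every call site needs a local variable for the error and
 * has to pass its address, whether or not it cares about the error. */
int probe_pointer_style(const char *path)
{
    int err;
    int fd = open_with_err(path, O_RDONLY, &err);
    if (fd < 0)
        return err;
    close(fd);
    return 0;
}
```

Both probe functions report the same error for the same path; the difference is purely in how much machinery each call site carries around.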

(Modern C is capable of tricks like returning a two-element structure in a pair of registers, but this is not the case for the older and simpler version of C used by Research Unix through at least V7.)

Some Unix C library system call functions in V7 could have returned error values as special in-band values, but perhaps not all of them (V7 didn't allow many open files, so file descriptors left plenty of room for error values, but it did have quite constrained address space on the PDP-11 series, which leaves fewer spare values for something that returns an address). Even if they all could have, this would result in more code when you actually had to check the return value of things like open() or sbrk(), since the C code would have had to check the range or other qualities of the return value.
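As a rough illustration of what such a range check looks like for a function that returns an address, here is a sketch that reserves the top few values of the address range for error numbers. All of the names here (MAX_ERR, err_to_ptr(), ptr_is_err(), ptr_to_err()) are hypothetical and the cutoff of 255 is an arbitrary assumption; this is not how V7 or any particular C library does it.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical: the top MAX_ERR address values are reserved for errors. */
#define MAX_ERR 255

/* Encode an error number as an otherwise-unused address near the top
 * of the address space. */
void *err_to_ptr(int err)
{
    return (void *)(uintptr_t)(-(intptr_t)err);
}

/* The range check every caller would need before using the result. */
int ptr_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-(intptr_t)MAX_ERR);
}

/* Decode the error number back out of an error-valued pointer. */
int ptr_to_err(const void *p)
{
    return (int)(-(intptr_t)(uintptr_t)p);
}
```

A caller of an sbrk()-like function using this convention would have to call ptr_is_err() on every result, which is more work than comparing against a single failure value, and the reserved addresses can never be handed out as real memory.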

(The actual system calls in V7 Unix and before used an error signaling method designed for assembly language, where the kernel arranged to return either the system call result or the error number in register r0 and set a condition code depending on which it was. You can read this documented in, e.g., the V4 dup manpage, from a time when Unix still had significant assembly language documentation. The V7 C library arranged to turn this into setting errno and returning -1 on errors; see e.g. libc/sys/dup.s along with libc/crt/cerror.s.)
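The translation the C library does can be modeled in C, although the real V7 code is PDP-11 assembly. In this sketch the struct raw_result type and the to_c_convention() function are hypothetical stand-ins: the struct models 'a value in r0 plus a condition code', and the function models the job that the system call stubs and cerror.s do between them.

```c
#include <errno.h>

/* Hypothetical model of what the kernel hands back to the library. */
struct raw_result {
    int r0;      /* the result on success, the error number on failure */
    int failed;  /* stands in for the condition code set by the kernel */
};

/* Turn the assembly-level convention into the C-level one: on failure,
 * stash the error number in errno and return -1; otherwise pass the
 * result through untouched. */
int to_c_convention(struct raw_result raw)
{
    if (raw.failed) {
        errno = raw.r0;
        return -1;
    }
    return raw.r0;
}
```

Note that on success errno is left alone, which is why errno is only meaningful after a call has reported failure.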

The bad part of errno is that it's a single global value, which means that it can be accidentally overwritten by new errors between the time it's set by a failing system call and when you want to use it. The simple way to have this happen is just to do another failing system call in your regular code, either directly or indirectly. A classic mistake was to check whether standard output (or standard error) was a terminal by trying to do a TTY ioctl() on it; when the ioctl failed, your original errno value would be overwritten by ENOTTY and the reason your open() or whatever failed would be listed as the mysterious 'not a typewriter' message.
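Here is a small sketch of that mistake and the usual fix, which is to copy errno into a local variable immediately after the failing call. The function names are made up for this example, and I'm using isatty() as the modern stand-in for doing a TTY ioctl() by hand; the exact error it leaves behind on a non-terminal is ENOTTY on the systems I'm familiar with, but treat that as an assumption.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Buggy pattern: something that can fail runs between the failing open()
 * and the point where errno is finally looked at. */
int open_error_clobbered(const char *path, int probe_fd)
{
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        close(fd);
        return 0;
    }
    (void)isatty(probe_fd);  /* on a non-terminal this fails and sets errno */
    return errno;            /* may no longer be open()'s error */
}

/* Fixed pattern: capture errno right after the call that set it. */
int open_error_saved(const char *path, int probe_fd)
{
    int fd = open(path, O_RDONLY);
    int saved_errno = errno;
    if (fd >= 0) {
        close(fd);
        return 0;
    }
    (void)isatty(probe_fd);  /* free to clobber errno now */
    return saved_errno;
}
```

If probe_fd is something like a file or a pipe, the first function reports the terminal check's error for a missing file, while the second reports the actual reason open() failed.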

Even if you avoided this trap you could have issues with signals, since signals can interrupt your program at arbitrary points, including immediately after you returned from a system call and before you've looked at errno. These days you're basically not supposed to do anything in signal handlers, but in the older days of Unix it was common to do any number of things in them. For instance, a SIGCHLD handler might call wait() to collect the exit status of children until it failed with some errno, which would overwrite your original one if the timing was bad. A signal handler could arrange to deal with this if the programmer remembered the issue, but they might not; people often overlook timing races, especially if they have narrow windows and rarely happen.

(SIGCHLD wasn't in V7, but it was in BSD; this is because BSD introduced job control, which made it necessary. But that's another entry.)

On the whole I think that errno was a good interface for the constraints of traditional Unix, where you didn't have threads or good ways of returning multiple values from C function calls. While it had drawbacks and bad sides, it was generally possible to work around them and they usually didn't come up too often. The errno API only started to get really awkward when threads were introduced and you could have multiple things making system calls at once in the same address space. Like much of Unix (especially in the Research Unix era through V7), it's not perfect but it's good enough.

Written on 01 January 2020.