How job control made the
SIGCHLD signal useful for (BSD) Unix
On modern versions of Unix, the
SIGCHLD signal is sent to a process
when one of its child processes terminates or some other status
changes happen. Catching
SIGCHLD is a handy way to find out about
child processes exiting while your program is doing other things,
and ignoring it will make them vanish instead of turning into
zombies. Despite all of these useful things,
SIGCHLD was not in
V7 Unix; it was added in 4.2 BSD and independently in System III.
In V7, programs like the shell had fairly straightforward handling
of waiting for children that was basically synchronous. When they
ran something (or you ran the '
wait' shell builtin), they called wait(), possibly repeatedly,
until either it returned an error or the process ID they
were looking for came up. If you wanted to interrupt this, you used
^C and the shell's signal handler for
SIGINT did some magic. This
was sufficient in V7 because the V7 shell didn't really need to
know about changes in child status except when you asked for it.
When BSD added job control, it also made it so that background
programs that tried to do terminal input or output could be
automatically suspended. This is necessary when programs can move
between being foreground programs and background ones; you might
start a program in the foreground, background it when it takes too
long to respond, and then want to foreground it again once it's
ready to talk to you. However, adding this feature means that a job
control shell now needs to know about process state changes
asynchronously. If the shell is waiting at the shell prompt for you
to enter the next command (ie, reading from the terminal itself and
blocked in read()) and a background program gets suspended for
terminal activity, you probably want the shell to tell you about
this right away, not when you next provide a line of input to the shell.
To get this immediate notification, you need a signal that's sent
when child processes change their status, including when they get
suspended this way. That's SIGCHLD.
SIGCHLD enables programs to react to their children
exiting even when they're doing other things too. This is useful in
general, which is probably why System III added its own version even
though it didn't have job control.
(Shells that use job control don't have to immediately react this way, and some don't. It may even depend on shell settings, which is sensible; some people don't like getting interrupted by shell messages when they're typing a command or thinking, and would rather tap the Return key when they want to see.)
PS: 4.2 BSD also introduced wait3(),
which for the first time allowed processes to check for child process
status without blocking. This is what you'd use in a shell if you're
only checking for exited and suspended children right before you
print the next prompt.
The good and bad of
errno in a traditional Unix environment
I said recently in passing that
errno was a generally good
interface in pre-threading Unix. That may
raise some eyebrows, so today let's talk about the good and the
bad parts of
errno in a traditional Unix environment, such as V7.
The good part of
errno is that it's about the simplest thing that can
work to provide multiple values from system calls in C, which doesn't
directly have multi-value returns (especially in early C). Using a
global variable to 'return' a second value is about the best you can do
in basic C unless you want to pass a pointer to every system call and
C library function that wants to provide an
errno value (this would
include much of stdio, for example). Passing around such a pointer all
the time doesn't just create uglier code; it also creates more code
and more stack (or register) usage for the extra parameter.
(Modern C is capable of tricks like returning a two-element structure in a pair of registers, but this is not the case for the older and simpler version of C used by Research Unix through at least V7.)
Some Unix C library system call functions in V7 could have returned
error values as special values, but perhaps not all of them (V7 didn't
allow many open files, but it did have quite constrained address space on the
PDP-11 series). Even if they did, this would result in more code when
you actually had to check the return value of things like
sbrk(), since the C code would have had to check
the range or other qualities of the return value.
(The actual system calls in V7 Unix and before used an error signaling
method designed for assembly language, where the kernel arranged
to return either the system call result or the error number in
register r0 and set a condition code depending on which it was. You
can read this documented in eg the V4
dup manpage, where
Unix still had significant assembly language documentation. The
V7 C library arranged to turn this into setting
errno and returning
-1 on errors; see eg libc/sys/dup.s
along with libc/crt/cerror.s.)
The bad part of
errno is that it's a single global value, which
means that it can be accidentally overwritten by new errors between
the time it's set by a failing system call and when you want to use
it. The simple way to have this happen is just to do another failing
system call in your regular code, either directly or indirectly. A
classical mistake was to do something that checked whether standard
output (or standard error) was a terminal by trying to do a TTY
ioctl() on it; when the ioctl failed, your original errno
would be overwritten by
ENOTTY and the reason your
operation failed would be listed as the mysterious 'not a typewriter'
error.
Even if you avoided this trap you could have issues with signals,
since signals can interrupt your program at arbitrary points,
including immediately after you returned from a system call and
before you've looked at
errno. These days you're basically not
supposed to do anything in signal handlers, but in the older days
of Unix it was common to perform any number of things in them.
Especially, for instance, a
SIGCHLD handler might call wait()
to collect the exit status of children until it failed with some
errno, which would overwrite your original one if the timing was
bad. A signal handler could arrange to deal with this if the
programmer remembered the issue, but you might not; people often
overlook timing races, especially if they have narrow windows and
so rarely strike.
SIGCHLD wasn't in V7, but it was in BSD; this is because BSD
introduced job control, which made it necessary. But that's another
entry.
On the whole I think that
errno was a good interface for the
constraints of traditional Unix, where you didn't have threads or
good ways of returning multiple values from C function calls. While
it had drawbacks and bad sides, it was generally possible to work
around them and they usually didn't come up too often. The errno
API only started to get really awkward when threads were introduced
and you could have multiple things making system calls at once in
the same address space. Like much of Unix (especially in the Research
Unix era through V7), it's not perfect but it's good enough.
The Unix C library API can only be reliably used from C
To fully implement system call origin verification, OpenBSD would like Go to make
system calls through the C library instead of directly making system
calls from its own runtime (which it has some reasons for doing). On the surface, this sounds like
only a moderately small issue; sure, it's a bit awkward, but a
language like Go should be able to just make calls to the usual C
library functions like
open() (and using the C calling ABI).
Unfortunately it's not that simple, because very often parts of
the normal C library API are actually implemented in the C preprocessor.
Because of this, the C library API cannot be reliably and generally
used without actually writing your own C glue code.
This sounds extreme, so let me illustrate it with everyone's favorite
errno, which you consult to get the error value from a
failed system call (and from some failed library calls). As covered
in yesterday's entry, in a world with threads,
errno must be implemented so that different threads
can have different values for it, because they may be making different
system calls at the same time. This requires thread local storage, and
generally thread local storage cannot be accessed as a plain variable;
it must be accessed through some special tricks supported by the C
runtime. So here are the definitions of '
errno' from OpenBSD 6.6
and a current Fedora Linux with glibc:
/* OpenBSD */
int *__errno(void);
#define errno (*__errno())

/* Fedora glibc */
extern int *__errno_location (void) __THROW __attribute_const__;
# define errno (*__errno_location ())
In both of these cases,
errno is actually a preprocessor definition.
The definitions refer to non-public and undocumented C library functions
(that's what the leading double underscores signal) that are not part
of the public API. If you compile C code against this (with
'#include <errno.h>' in your program), it will work, but that's
the only officially supported way of doing it. There is no useful
errno variable to load in your own language's runtime after a
call to, say, the
open() function, and if you call
__errno_location in your runtime, you are using a non-public API
and it could break tomorrow (although it probably won't). To build a
reliable language runtime that sticks to the public C library API, it's
not enough to just call exported functions like
open(); you also
need to write and compile your own little C function that just returns
errno to your runtime.
(There may be other important cases besides
errno; I will leave them
to interested parties to find.)
This is not a new issue in Unix, of course. From the
beginning of stdio in V7, some of the stdio 'functions'
were implemented as preprocessor macros in stdio.h.
But for a long time, people didn't insist that the C library was the
only officially supported way of making system calls, so you could
bypass things like the whole modern
errno mess unless you needed to be
compatible with C code for some reason.
(Before threading came into the Unix picture,
errno was a plain
variable and a generally good interface, although not perfect.)
The BSD and Linux approaches to development put coherence in different places
I said yesterday that both the BSD 'base system' approach and the Linux distribution 'assemblage of parts' approach to building systems each have their own advantages. One of the large scale differences between the two approaches is what winds up being relatively coherent and unified.
In the BSD approach where the base system is all developed by a single group under its own control, the base system is coherent and unified among almost all of the various parts (and exceptions tend to stick out and be awkward, like GCC). However, this internal coherence means that there's relatively less coherence across different BSDs, so FreeBSD is by now reasonably different from OpenBSD, NetBSD, and DragonflyBSD (and others), especially in the kernel. To a significant extent, having a coherent base system means not sharing parts of that base system with anyone except people who are already very close to your whole system.
In the Linux approach where the system is assembled from a disparate collection of parts, any particular Linux is far less coherent within itself simply because the parts are developed by different people, with different approaches and styles, to different priorities and standards, and so on. However, Linux shares those parts fairly widely across Linux distributions, and so Linux distributions are far more coherent with each other than the BSDs are. This sharing and reuse of parts is really why we can talk of 'Linux distributions' instead of 'Linuxes'; they are, for the most part, distributing more or less the same things assembled in more or less the same way, although with different tools, different priorities about customization and versions and updates, and so on. One Linux distribution is much more similar to most other ones than, say, FreeBSD is to NetBSD or OpenBSD.
(Some Linux distributions make different choices at various levels, including in the core C library, but distributions that go off the common path tend to wind up relatively different.)
There is no right or wrong approach here, merely different advantages and different priorities. Sometimes these priorities are necessary to get certain sorts of results; for example, OpenBSD really has to be a BSD. And people are certainly drawn to the coherence and unity of FreeBSD over the relative chaos of basically all Linuxes.
(Meanwhile, Linux probably makes it easier to experiment with alternatives to some pieces, because you don't have to sail off entirely on your own. Alpine Linux gets to reuse a lot of components that other people develop for them, from the kernel on up, while still swapping out some core pieces.)
OpenBSD has to be a BSD Unix and you couldn't duplicate it with Linux
OpenBSD has a well deserved reputation for putting security and a
clean system (for code, documentation, and so on) first, and
everything else second. OpenBSD is of course based on BSD (it's
right there in the name) and descends from
NetBSD (you can read the history here). But one of the questions
you could ask about it is whether it had to be that way, and in
particular if you could build something like OpenBSD on top of
Linux. I believe that the answer is no.
Linux and the *BSDs have a significantly different model of what they are. BSDs have a 'base system' that provides an integrated and fully operational core Unix, covering the kernel, C library and compiler, and the normal Unix user level programs, all maintained and distributed by the particular BSD. Linux is not a single unit this way, and instead all of the component parts are maintained separately and assembled in various ways by various Linux distributions. Both approaches have their advantages, but one big one for the BSD approach is that it enables global changes.
Making global changes is an important part of what makes OpenBSD's approach to improving security, code maintenance, and so on work. Because it directly maintains everything as a unit, OpenBSD is in a position to introduce new C library or kernel APIs (or change them) and then immediately update all sorts of things in user level programs to use the new API. This takes a certain amount of work, of course, but it's possible to do it at all. And because OpenBSD can do this sort of ambitious global change, it does.
This goes further than just the ability to make global changes, because in theory you can patch in global changes on top of a bunch of separate upstream projects. Because OpenBSD is in control of its entire base system, it's not forced to try to reconcile different development priorities or integrate clashing changes. OpenBSD can decide (and has) that only certain sorts of changes will be accepted into its system at all, no matter what people want. If there are features or entire programs that don't fit into what OpenBSD will accept, they just lose out.
(I suspect that this decision on priorities gives OpenBSD more leverage to push other people in directions that it wants, because the OpenBSD developers are clearly willing to remove support for something if they feel strongly enough about it. For example, I suspect that their new system call origin verification is going to eventually force Go to make system calls only through OpenBSD's C library, contrary to what Go prefers to do.)
In the old days, we didn't use multiple Unixes by choice (mostly)
One of the possible reactions to the fading out of multi-architecture Unix environments is to lament the modern hegemony of 64-bit x86 Linux and yearn for the good old days of multiple Unixes and heterogeneous Unix environments. However, my view is that this is false nostalgia. Back in the days, most people did not work on or run multiple Unixes and multiple architectures because they wanted to; they did it because they had to. In fact sensible places usually tried hard to be Unix monocultures (a Sun SPARC monoculture, for example), because that made your life much easier.
The reason that there was a flourishing bunch of Unixes back in the days and people often had so many of them was simple; there was no architecture or hardware standard, so every hardware vendor had their own hardware-specific Unix. If you wanted that vendor's hardware you pretty much had to take their Unix, and if you wanted their Unix you definitely had to take their hardware (much as with Apple today). Unless you and everyone else in your organization could stick to a single Unix and a single sort of hardware, you had to have a multi-Unix environment. Even if you stuck with a single vendor and their Unix, you could still wind up with multiple architectures as the vendor went through an architecture transition. Sometimes the vendor also put you through a Unix transition, for example when DEC changed from Ultrix to OSF/1, or Sun from SunOS to Solaris.
(There could be all sorts of reasons that you 'wanted' a vendor's hardware or Unix, including that they were offering you the best price on Unix servers at the moment or that some software you really needed ran best or only on their Unix or their hardware. And needless to say, different groups within your organization could have different needs, different budgets, or different salespeople and so wind up with different Unixes. Universities were especially prone to this back in the days, and were also prone to keeping old hardware (and its old or different Unix) running for as long as possible.)
Once there was a common hardware standard in the form of x86 PC hardware, the march towards a Unix monoculture on that hardware was probably inevitable. Unixes are just not that different from each other (more or less by design), and there are real benefits to eliminating those remaining differences in your environment by just picking one. For example, you only have to build and have around one set of architecture and ABI dependent files, remember one way of doing things and administering your systems, and so on.
The fading out of multi-'architecture' Unix environments
When I started in the Unix world, it was relatively common to have overall Unix environments where you had multiple binary architectures and user home directories that were shared between machines with different architectures. Sometimes you had multiple architectures because you were in the process of an architecture transition from one Unix vendor (for example, from Motorola 68K based Sun 3s to SPARC based Sun 4s, with perhaps a SunOS version jump in the process). Sometimes you had multiple architectures because you'd bought systems from different vendors; perhaps you still had some old Vaxes that you were nursing along for one reason or another but all your new machines were from Sun.
All of this pushed both Unix systems and user home directories in a
certain direction, one where it was sensible to have both
per-architecture directories and something like /csri/lib, the latter of which
was magically arranged to be a symlink to the right architecture for
your current machine. For user home directories, you needed to somehow
separate out personal binaries for architecture A from the same binaries
for architecture B; usually this was different directories and having
.profile adjust your
$PATH to reflect the current machine's architecture.
Those days have been fading out for some time. People's Unix environments have become increasingly single-architecture over the years, especially as far as user home directories are concerned. Partly this is just that there is much less diversity of Unixes and Unix vendors, and partly this is because cross machine shared home directories have gone out of style in general (along with NFS). The last gasp of a multi-architecture environment here was when we were still running both 32-bit and 64-bit Linux machines with NFS-mounted user home directories, but we got rid of our last 32-bit Linux machines more than half a decade ago.
(We had Solaris and then OmniOS machines, and we still have OpenBSD ones, but neither are used by users or have shared home directories.)
A certain amount of modern software and systems still sort of believe
in a multi-architecture Unix environment, and so will do things
like automatically install compiled libraries with an architecture
dependent name (I was recently pleased to discover that Python's
pip package installer does this). However, an increasing amount
doesn't unless you go out of your way. For example, both Rust's
cargo and Go's
go command install their compiled binaries into
a fixed directory by default, which only works if your home directory
isn't shared between architectures. In practice, this is fine, or
at least fine enough that both projects have been doing this for
some time. And it's certainly more convenient to just have a
$HOME/go/bin and a
$HOME/.cargo/bin than to have ones with
longer and more obscure names involving an architecture-specific snippet.
(By 'architecture' here I mean the overall ABI, which depends on both the machine architecture itself and the Unix you're running on it. In the old days there tended to be one Unix per architecture in practice, but these days there's only a few machine architectures left in common use, so a major point of ABI difference is which Unix you're using.)
Groups of processes are a frequent and fundamental thing in Unix
Recently, I wrote about a gotcha when catching Control-C in programs that are run from scripts, where things could go wrong because the Control-C was delivered not just to the program but also to the shell script, which wasn't expecting it (while the program was). From the way I wrote that entry (which focused on a gotcha involving this group signalling behavior), you might wind up with the impression that this behavior of Unix signals is a wart in Unix. In fact, it's not; that signals from things like Control-C behave this way is an important part of Unix shell usability.
The core reason for this is that in Unix, it's very common for a group of processes to be one entity as far as you're concerned. Unix likes processes and it likes assembling things out of groups and trees of processes, and so you wind up with what people think of as one entity that is actually composed of multiple processes. When you do things like type a Control-C, you almost always want to operate on the entity as a whole, not any specific process in it, and so Unix supports this by sending terminal signals to its best guess at the group of processes that are one thing.
That sounds pretty abstract, so let's make it concrete. One simple case of a group of processes acting as one entity is the shell pipeline:
$ prog1 <somefile | prog2 | prog3 | prog4
If you type a Control-C, almost everyone wants the entire pipeline
to be interrupted and exit. It's not sufficient for the kernel to
just signal one process, let it exit, and hope that this causes all
of the other ones to hit pipe IO errors,
because one of those programs (say
prog2) could be engaged in a
long, slow computation before it reads or writes to a pipe.
(As a sysadmin, one of my common cases here is '
big-file | tail -10', and then if it takes too long I get impatient
and Ctrl-C the whole thing.)
Shell scripts are another obvious case; since the shell is such a relatively limited language, almost all shell scripts run plenty of external programs even when they're not using pipes. That creates at least two processes (the shell script and the external program), and again when you Ctrl-C the command you want both of them to be interrupted.
A final common case for a certain sort of person is running make.
Especially for large programs, a
make run can create quite
deep trees of processes (and go through quite a lot of them). And again, if you Ctrl-C your make run, you
want everything to be interrupted (and promptly).
(Unix could delegate this responsibility to some single process in
this situation, such as the master process for a shell script or
make itself. But for much the same reason that basic terminal
line editing belongs in the kernel, Unix
opts to have the kernel do it.)
Making changes to multiple files at once in Vim
We recently finished switching the last of our machines to a different client for Let's Encrypt, and as part of that switch the paths to our TLS certificates had to be updated in all of the configuration files using them. On a lot of our machines there's only a single configuration file, but on some of our Apache servers we have TLS certificate paths in multiple files. This made me quite interested in finding out how to do the same change across multiple files in Vim. It turns out that Vim has supported this for a long time and you can go about it in a variety of ways. Some of these ways expose what I would call a quirk of Vim and other people probably call long-standing design decisions.
Under most circumstances, or more specifically when I'm editing
only a moderate number of files, the easiest thing for me to do is
to use the very convenient '
:all' command to open a
window for every buffer, and then use '
:windo' to apply a command
for every window, eg '
:windo %s/.../.../'. Then I'd write out all
of the changed buffers with ':wa'.
The Vim tips on Search and replace in multiple buffers and
Run a command in multiple buffers
also cover '
:bufdo', which runs things on all buffers whether or
not they're in windows. That was in fact the first thing I tried,
except I left off the magic bit to write out changed buffers and
Vim promptly stopped after making my change in the first file. This
is what I consider a historical quirk, although we're stuck with it now.
Vi has long had multiple buffers, but it's always been pretty stubborn about getting you to write out your changes to the current buffer before you moved to another one. It's easy to see the attraction of getting people to do this on the small and relatively loaded machines that vi was originally written on, since a clean buffer is a buffer that you don't have to retain in memory or in a scratch file on disk (and it's also not at risk if either vi or the machine crashes). However, these days it's at odds with how most other multi-file editors approach the problem. Most of them will let you keep any number of modified buffers around without complaint, and merely stop you from quitting without saving them or actively discarding them. Not hassling you all of the time makes these editors a bit easier to use, and Vim is already a bit inconsistent here since windows are allowed to be changed without preventing you from switching away from them.
Given my views here, I probably want to set 'hidden'
on. Unless I'm very confident in my change, I don't want to add '
update' to the '
:bufdo' command to immediately write out updates,
and as noted '
hidden' being on makes Vim behave more like other
editors. The drawbacks that the Vim documentation notes don't apply
to me; I never use '
:q!' or '
:qa!' unless my intention is
explicitly to discard all unsaved changes.
It's possible to do this in a trickier way, with the ':hide' command modifier.
Because I just experimented with this, it should be done as:
:hide bufdo %s/.../.../
I don't think there's a perfect way to undo the effects of a
multi-file operation that didn't work out as I intended. If all
buffers were unchanged before the
bufdo, I can use it
again to invoke undo in each buffer, with '
:bufdo u'. With unchanged
buffers, this is harmless if my cross-file operation didn't actually
change a particular file. If there are unsaved changes in some
buffers, though, this becomes dangerous because the undo in each
buffer is blind; it will undo the most recent change whether or
not that came from the first ':bufdo'.
(All of this tells me that I should carefully (re)read the Vim buffer FAQ, because how Vim does buffers, files, tabs, and windows is kind of confusing. GNU Emacs is also confusing here in its own way, but at least with it I understand the history.)
On the whole, '
:all' and then '
:windo ...' is the easier to
remember and easier to use option, and it lets me immediately
inspect some of the changes across all of the files involved.
So it's likely to be what I normally use. It's not as elegant
as the various other options and I'm sure that Vim purists will
sigh, but I'm very much not a Vim purist.
(This is one of those entries that I write for my own future reference. Someday I'll write a page in CSpace to collect all of the Vim things that I want to remember and keep using regularly, but for now blog entries will have to do.)
Why chroot is a security feature for (anonymous) FTP
I recently ran across Is chroot a security feature? (via); following Betteridge's law of headlines, the article's answer is 'no', for good reasons that I will let you read in the article. However, I mildly disagree with the article on a philosophical level for the case of anonymous ftp and things like it. Chroot is a security feature for ftpd because ftpd does something special; anonymous ftp adds an additional security context to your system that wasn't there before.
Before you set up anonymous ftp, your system had the familiar Unix
security contexts of user, group, and 'all logins'. Anonymous ftp
adds the additional context of 'everyone on the network'. This
context is definitely not the same as 'everyone with a login on the
system' (it's much broader), and so there's good reasons to want
to distinguish between the two. This is especially the case if you
allow people to write things through anonymous ftp, since Unixes
traditionally have and rely on various generally writable directories
(/tmp and /var/tmp, but also things like queue submission
directories). You almost certainly don't want to open those up to
everyone on the network just because you opened them up to everyone
on the machine.
(The more your Unix machine is only used by a small group of people and the broader the scope of the network it's on, the more difference there is between these contexts. If you take a small research group's Unix machine and put it on the ARPANET, you have a relatively maximal case.)
Ftpd could implement this additional security context itself, as most web servers do. But as web servers demonstrate, this would be a bunch of code and configuration, and it wouldn't necessarily always work (over the years, various web servers and web environments have had various bugs here). Rolling your own access permission system is a complicated thing. Having the kernel do it for you in a simple and predictable way is much easier, and that way you get chroot.
(Now that I've followed this chain of thought, I don't think it's
a coincidence that the first use of
chroot() for security seems
to have been 4.2 BSD's ftpd.)