2012-03-26
Microkernels are quite attractive to academic computer science researchers
Recently there was a Stack Overflow question that asked why Tanenbaum was wrong in his predictions for the future in the Tanenbaum/Torvalds debate. This got me thinking about microkernels (one of Tanenbaum's predictions was that microkernels were the future of operating system design). In particular, it got me thinking that microkernels have a bunch of properties that make them almost uniquely attractive for academic operating system researchers. For background, I recommend Rob Pike's Systems Software Research is Irrelevant [PDF] for an idea of the constraints and environment that academic OS research exists in.
To start with, let's admit that the idea of microkernels is attractive in general. Of course we'd like there to be a minimal, elegant set of simple operations that we could use to easily compose operating systems from. It's no wonder that microkernels are by and large a dream that won't die. Beyond that, the attractive properties of microkernels for academic research include:
- microkernels are small in both scope and (hopefully) code size. Small
projects are achievable projects in academic research.
- a microkernel is naturally limited in functionality, so you have an
excellent reason to not implement a lot of things. This drastically
reduces the scope of your work and gives you a great excuse for
not coding a lot of boring, routine infrastructure that's necessary
for a usable system.
(Note that in academic research you basically cannot afford to implement things that will not result in papers, because papers are your primary and in fact only important output.)
- microkernels are (strongly) modular, which makes it easier to farm out
work among a bunch of students, postdocs, and research assistants.
A strong division of labour is important not just for the obvious
reason but because it increases the number of papers that all of
the people working on the project can collectively generate. If
everyone is involved in everything you probably get one monster
paper with a monster author list, but if people are working mostly
independently you can get a whole bunch of papers, each with a
small number of authors.
(And if someone wants to get an MSc or PhD thesis out of their work, they have to be working on their own.)
- microkernels have a minimal set of primitives, which makes it easy to
write papers about them. You don't have to try to describe a large,
complex OS and what makes it interesting; you can simply write a paper
about the microkernel primitives you chose (and perhaps some clever way
that you implemented them and made them acceptably fast).
- there are lots of choices for viable primitives and thus lots of
different microkernel designs, which means that it's easy to do
new work in the field; all you have to do is pick a set of
primitives that no one has used yet.
- there is a strong liking for elegant minimalism in basically all
parts of academic computer science. The minimalism of microkernels
plays right into this overall attitude.
- the whole 'normal kernel on microkernel' idea of porting an
existing OS kernel to live on top of your microkernel gives you
at least the hope of creating a usable environment on your
microkernel with a minimum amount of work (ie, without implementing
all of a POSIX+ layer and TCP/IP networking and so on). Plus some
grad student can probably get a paper out of it, which is a double
win.
- the drawbacks of microkernels are mostly at the pragmatic levels of
actual performance, which academics mostly don't care about and don't
get called on. You can excuse relative performance figures against
something like Linux or FreeBSD by saying that your microkernel has
not had the kind of optimization that those OSes have had, or that
the performance loss is worth it for some benefit that microkernels
are supposed to give you.
- in academic computer science you do not have to actually prove any claims that your microkernel is more reliable, easier to write software for, or the like than traditional kernels. In fact it's not clear how you would prove such claims; productivity and reliability claims are notoriously hard to validate because there are so many variables involved.
I don't know exactly why Tanenbaum thought that microkernels would clearly win, but I suspect that the unusual attractiveness of microkernels to academic researchers like him didn't hurt.
(It's also worth noting that back in the very early 1990s of the Tanenbaum/Torvalds debate, microkernels were much newer and much less thoroughly beaten to death than they are today. It was much easier to believe that they were the new hotness and solution to our operating system problems.)
What it means to become another user on Unix
Ignoring things like SELinux for the moment, the basic security state of Unix system calls has always been that root is allowed to become any other UID at will, but no one else is allowed to change their UID or other security attributes (setuid programs then provide an escape hatch from this). But what does it mean to become another user on Unix, beyond just setting your UID? In fact there are a whole series of things that it can mean, some of which you do not necessarily want.
Let's make a list:
- switching to the user's UID. This is the basic prerequisite of what
it means to become another user on Unix, and is done by setuid().
- switching your groups to the user's groups, which is done with a
combination of initgroups() and setgid().
- setting some environment variables like $HOME and $SHELL to values
appropriate for the new user.
(Most but not all versions of su do this. Versions of su that do not change $HOME can be exciting, especially when combined with shells that read initialization files from $HOME.)
- clearing various environment variables; in the extreme case, you will
clear all environment variables and give yourself a new set of safe
values for things like $PATH and so on.
(Aggressively scrubbing the environment is generally the default for sudo. In general there are any number of environment variables that are very dangerous to leave intact, such as all of the LD_* variables that influence the behavior of the dynamic loader.)
- running the user's shell, either interactively or with -c ... to
execute commands.
(su normally always runs the user's shell; sudo will often run commands directly without starting the user's shell.)
- changing to the user's home directory.
- running the user's shell as a login shell.
- allocating a new pseudo-tty as the user and attaching the user's shell
and so on to it as a new session. This is generally the domain
of things that are making real login sessions, like sshd, instead of
programs that just become the user.
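The two environment items in the list can be sketched in a few lines of Python. This is only an illustration: build_env(), SAFE_PATH and the KEEP allowlist are names and values I made up, not what any particular su or sudo actually uses, and setting $USER and $LOGNAME alongside $HOME and $SHELL is my assumption about what a reasonable tool would do.

```python
# Placeholder "safe" values; real tools have their own compiled-in or
# configured defaults, so treat these as assumptions.
SAFE_PATH = "/usr/local/bin:/usr/bin:/bin"
KEEP = ("TERM", "LANG", "TZ")  # hypothetical allowlist, not sudo's real one


def build_env(name, home, shell, inherited, scrub=True):
    """Return the environment a command run as `name` should see.

    scrub=False keeps the inherited environment and only overrides the
    per-user variables (the "set $HOME and $SHELL" item).
    scrub=True starts from nothing and copies over only allowlisted
    variables, so LD_* and similar never survive.
    """
    if scrub:
        env = {k: inherited[k] for k in KEEP if k in inherited}
        env["PATH"] = SAFE_PATH
    else:
        env = dict(inherited)
    # Per-user values always come from the target user, never the caller.
    env.update({"HOME": home, "SHELL": shell, "USER": name, "LOGNAME": name})
    return env
```

The point of building the scrubbed environment from an allowlist instead of deleting known-bad names is that you don't have to know every dangerous variable in advance.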
(Note that this list is not in the order that your code wants to
actually do these operations. For example, you want to set groups
before changing to the other UID because the moment you setuid() to a
non-root user you lose the power to set your own groups list.)
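As a minimal sketch of that ordering in Python (which exposes the same calls through its os module): become_user() is a made-up helper name, and the ops parameter is my own addition so the sequence can be exercised without being root. Real code would just call the os functions directly.

```python
import os


def become_user(name, uid, gid, ops=os):
    """Irreversibly switch the current process to the given user.

    Must be called as root. `ops` is any object providing initgroups(),
    setgid() and setuid(); it defaults to the os module.
    """
    # 1. Supplementary groups, looked up in the group database by name.
    ops.initgroups(name, gid)
    # 2. Primary group.
    ops.setgid(gid)
    # 3. UID last: once this succeeds we are no longer root, and steps
    #    1 and 2 would fail with a permission error.
    ops.setuid(uid)
```

In practice you would look the user up first, for example with pw = pwd.getpwnam("alice") and then become_user(pw.pw_name, pw.pw_uid, pw.pw_gid). Each call raises OSError on failure, which is what you want; silently continuing as root after a failed switch is the classic bug here.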
Many things that become another user deliberately run the target
user's shell even when they just want to run a command. This is done
partly so that the user's shell can do any special initialization that,
eg, may augment its $PATH and partly so that users with restricted
shells can't escape them in various clever ways. However, not everything
does this and sometimes running the user's shell is inconvenient.
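Mechanically, the difference between 'run the user's shell', 'run a command through the user's shell', and 'run it as a login shell' is just how argv is built. Here is a small sketch; shell_argv() is a hypothetical helper, and it relies on the long-standing convention that a shell treats itself as a login shell when its argv[0] starts with a '-'.

```python
import os


def shell_argv(shell, command=None, login=False):
    """Return (path, argv) suitable for os.execve(path, argv, env).

    shell   -- full path to the target user's shell, eg from their
               passwd entry
    command -- if given, run it via the shell's -c option instead of
               starting an interactive shell
    login   -- if true, mark the shell as a login shell by prefixing
               argv[0] with '-'
    """
    name = os.path.basename(shell)
    argv = ["-" + name if login else name]
    if command is not None:
        argv += ["-c", command]
    return shell, argv
```

Because the command always goes through the user's own shell, a user whose shell is a restricted one stays inside it, which is the second reason given above.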
There are a surprising number of cases where you want basically the
first two only; you want to run a command as a user but in a way
such that their environment (their shell, their shell initialization,
and so on) is ignored. It's possible to do this with standard
commands like su and sudo by carefully reading their manpages
and picking exactly the right options, but I find it easier to have
a very basic runas program sitting around.
(runas only works for root, so it can completely ignore authentication
and similar issues.)
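A runas-like program can be tiny. As a hypothetical sketch (this is not the actual runas mentioned above), Python 3.9 and later can do the 'first two only' through the user, group and extra_groups arguments of the subprocess module, which as far as I know applies the group changes in the child before the UID change. The function names here are my own.

```python
import os
import pwd
import subprocess


def runas_kwargs(uid, gid, groups):
    """Keyword arguments that make subprocess switch identity in the child."""
    return {"user": uid, "group": gid, "extra_groups": list(groups)}


def runas(username, argv):
    """Run argv directly as username and return its exit status.

    Only works when called as root. No shell is started, no
    initialization files are read, and the environment is inherited
    unchanged.
    """
    pw = pwd.getpwnam(username)
    # All groups the user belongs to, including the primary one.
    groups = os.getgrouplist(pw.pw_name, pw.pw_gid)
    kwargs = runas_kwargs(pw.pw_uid, pw.pw_gid, groups)
    return subprocess.run(argv, **kwargs).returncode
```

Called as root, something like runas("alice", ["id"]) runs id with alice's UID and groups. Since the environment is passed through untouched, $HOME and friends still point at root's, which is exactly the trade-off this kind of tool makes.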