2012-03-26
Microkernels are quite attractive to academic computer science researchers
Recently there was a Stackoverflow question that asked why Tanenbaum was wrong in his predictions for the future in the Tanenbaum/Torvalds debate. This got me thinking about microkernels (one of Tanenbaum's predictions was that microkernels were the future of operating system design). In particular, it got me thinking that microkernels have a bunch of properties that make them almost uniquely attractive for academic operating system researchers. For background, I recommend Rob Pike's System Software Research is Irrelevant [PDF] for an idea of the constraints and environment that academic OS research exists in.
To start with, let's admit that the idea of microkernels is attractive in general. Of course we'd like there to be a minimal, elegant set of simple operations that we could use to easily compose operating systems from. It's no wonder that microkernels are by and large a dream that won't die. Beyond that, the attractive properties of microkernels for academic research include:
- microkernels are small in both scope and (hopefully) code size. Small
projects are achievable projects in academic research.
- a microkernel is naturally limited in functionality, so you have an
excellent reason to not implement a lot of things. This drastically
reduces the scope of your work and gives you a great excuse for
not coding a lot of boring, routine infrastructure that's necessary
for a usable system.
(Note that in academic research you basically cannot afford to implement things that will not result in papers, because papers are your primary and in fact only important output.)
- microkernels are (strongly) modular, which makes it easier to farm out
work among a bunch of students, postdocs, and research assistants.
A strong division of labour is important not just for the obvious
reason but because it increases the number of papers that all of
the people working on the project can collectively generate. If
everyone is involved in everything you probably get one monster
paper with a monster author list, but if people are working mostly
independently you can get a whole bunch of papers, each with a
small number of authors.
(And if someone wants to get a MSc or PhD thesis out of their work, they have to be working on their own.)
- microkernels have a minimal set of primitives, which makes it easy to
write papers about them. You don't have to try to describe a large,
complex OS and what makes it interesting; you can simply write a paper
about the microkernel primitives you chose (and perhaps some clever way
that you implemented them and made them acceptably fast).
- there are lots of choices for viable primitives and thus lots of
different microkernel designs, which means that it's easy to do
new work in the field; all you have to do is pick a set of
primitives that no one has used yet.
- there is a strong liking for elegant minimalism in basically all
parts of academic computer science. The minimalism of microkernels
plays right into this overall attitude.
- the whole 'normal kernel on microkernel' idea of porting an
existing OS kernel to live on top of your microkernel gives you
at least the hope of creating a usable environment on your
microkernel with a minimum amount of work (ie, without implementing
all of a POSIX+ layer and TCP/IP networking and so on). Plus some
grad student can probably get a paper out of it, which is a double
win.
- the drawbacks of microkernels are mostly at the pragmatic levels of
actual performance, which academics mostly don't care about and don't
get called on. You can excuse poor performance figures relative to
something like Linux or FreeBSD by saying that your microkernel has
not had the kind of optimization that those OSes have had, or that
the performance loss is worth it for some benefit that microkernels
are supposed to give you.
- in academic computer science you do not have to actually prove any claims that your microkernel is more reliable, easier to write software for, or the like than traditional kernels. In fact it's not clear how you would prove such claims; productivity and reliability claims are notoriously hard to validate because there are so many variables involved.
I don't know exactly why Tanenbaum thought that microkernels would clearly win, but I suspect that the unusual attractiveness of microkernels to academic researchers like him didn't hurt.
(It's also worth noting that back in the very early 1990s of the Tanenbaum/Torvalds debate, microkernels were much newer and much less thoroughly beaten to death than they are today. It was much easier to believe that they were the new hotness and solution to our operating system problems.)
2012-03-24
Garbage collection and modern virtual memory systems
Every so often I run across a piece of writing that makes me slap my forehead because it points out something obvious that I'd still totally missed. Today's is Pat Shaughnessy's discussion of Ruby 2.0's new garbage collection system, in which he casually explains the problem with traditional non-compacting mark and sweep garbage collection in a modern Unix environment. As usual, I am now going to rephrase the general issue to make sure I have it straight in my head.
Mark and sweep GC works in two phases. In the first phase, it traverses
the graph of live objects and marks all of them as still alive with a
mark bit; in the second phase, it sweeps all of memory to free all
unmarked objects and resets all of the marks. The problem is that a
traditional implementation puts the mark bit in the object itself, which
means that you write to each live object during a garbage collection
pass. This is a great way to dirty a lot of virtual memory pages (and
cache lines). As Shaughnessy notes, this is especially unfortunate in
any copy on write environment (such as a fork()-based server on Unix),
because dirtying pages just to set and unset the mark bits also forces
them to be copied (and unshares them).
(How bad this is depends partly on how densely packed your objects are in memory, which has a lot of variables.)
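To make the write pattern concrete, here is a minimal, illustrative sketch of this sort of mark and sweep in Python (the Obj class, heap, and roots are invented for the example; a real collector works on raw memory and object headers, not Python objects):

    # Toy mark and sweep where the mark bit lives inside each object,
    # so the mark phase writes to every live object it visits.
    class Obj:
        def __init__(self, *children):
            self.marked = False            # the mark bit, stored in the object
            self.children = list(children)

    def mark(roots):
        # Phase 1: walk the graph of live objects, setting the mark bit.
        # Every one of these writes can dirty an otherwise untouched page.
        stack = list(roots)
        while stack:
            obj = stack.pop()
            if not obj.marked:
                obj.marked = True
                stack.extend(obj.children)

    def sweep(heap):
        # Phase 2: keep marked objects (clearing their mark bits, which is
        # another write to every live object) and drop the rest.
        survivors = []
        for obj in heap:
            if obj.marked:
                obj.marked = False
                survivors.append(obj)
        return survivors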
The fundamental problem is that simple mark and sweep dirties otherwise read-only objects, and it does so in scattered locations across memory. In the old days of simple memory systems (both physical and virtual), this was not such a big deal. In a modern memory system this can matter quite a bit.
(Note that a compacting mark and sweep GC has even worse problems from this perspective. Not only does it mark live objects, but it's probably going to move them around in memory and thus dirty even more virtual memory pages.)
While it's tempting to praise reference counting GC for avoiding this, it's not quite accurate; reference counting only avoids similar issues for live objects whose usage count never changes. Many otherwise read-only objects will have their usage count fluctuate while your code is running, which means writes to the object to update the count, which means all of these bad things with copy on write and so on.
(In other words, Python people should not feel smug here.)
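For instance, CPython keeps the reference count in the object's header, so merely taking or dropping a reference to an otherwise untouched object writes to it. A small illustration (the exact counts will vary between versions and runs):

    import sys

    s = "an otherwise read-only string" * 100
    print(sys.getrefcount(s))    # e.g. 2: our variable plus the call's argument

    holders = []
    for _ in range(5):
        holders.append(s)        # each new reference rewrites s's header
    print(sys.getrefcount(s))    # five higher, though s itself never 'changed'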
In a way this whole issue should not be surprising. We've known for a while that a lot of classic memory allocation related algorithms were not well suited for modern memory systems; for example, a while back there was a bunch of work on virtual memory friendly memory allocators (ones that looked at and dirtied as few pages as possible during operation). That simple classical garbage collection algorithms also have these issues is not particularly novel or startling.
As a pragmatic matter, this suggests that you should normally force a
garbage collection pass immediately before any fork() that will not
shortly be followed by an exec(). If you're running in an environment
where the fork() happens outside of your control, force a GC pass
after you've initialized your system and are ready to run.
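In Python terms, this is roughly the following sketch (initialize() and the commented-out worker loop are hypothetical stand-ins for your real setup and serving code):

    import gc
    import os

    def initialize():
        # ... load configuration, build caches, and so on (hypothetical) ...
        return {"cache": list(range(100000))}

    state = initialize()
    gc.collect()          # force a full GC pass in the parent, before any fork()

    children = []
    for _ in range(4):
        pid = os.fork()
        if pid == 0:
            # Child worker: the less of the inherited state it writes to,
            # the more pages stay shared with the parent.
            # serve_requests(state)    # hypothetical worker loop
            os._exit(0)
        children.append(pid)

    for pid in children:
        os.waitpid(pid, 0)

The idea is that the page-dirtying work of the collection happens once in the parent, where the resulting pages can still be shared, instead of separately in every child.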
2012-03-19
A modest suggestion: increase your text size
I've always used relatively small fonts in my desktop environment, as seen in MyDesktopTour (although they aren't as small as the fonts some people I know use). This wasn't particularly because I liked small fonts; it's more because those were the font sizes I needed to use in order to fit two side by side 80-column xterm windows into a 1280x1024 display. I used them for years and generally didn't think about it; I mean, they worked and everything was fine.
(Looking at that shot of my desktop, the visible xterm is using my 'small' font, which I used for secondary xterms. An 80 column xterm using the normal font would be about as wide as the visible browser window.)
Recently I moved to a larger 1920x1200 display, which started a cascade of font-related changes in my environment. I am not going to try to describe them in detail, but the short version is that I eliminated my smaller xterm font size and switched the 'normal' font to one that gives an 80x24 xterm window that is about 12% wider and 19% taller than it used to be.
Perhaps my old fonts were perfectly readable as they were. But having made the switch, I will say that the new larger fonts definitely seem even more readable than the old ones. In fact, they make me feel that I probably was quietly straining my eyes a bit for years without really being conscious of it. Or to put it directly, there is a difference between readable and easily readable. A lot of text sizes are readable, even quite small ones. But you don't necessarily want to spend all your time reading text in those sizes even though you can. Bigger text sizes really are good for you in many cases.
(It's my opinion that this doesn't really change based on the display resolution, either. Small text is still small text even if it's rendered very smoothly.)
Hence my modest suggestion of increasing your text size, even if you think your current text size is perfectly fine and perfectly readable. You may discover that you were like me.
(I can't really do a true comparison of the readability of my old fonts against my new one for various reasons, including that a true A/B comparison would have to have the old fonts on an old display, not the new one. I believe the displays are pretty close in DPI, but close is not quite the same thing as 'the same'.)
2012-03-18
The standard trajectory of a field
Over and over again, fields of work have followed a common trajectory (both inside and outside of computers). At first, everything is new and unknown, and the field is pioneered by people who have to be innovators and often deeply technical. There are no known tools and procedures, so they need to be developed from scratch; generally everything has to be hand-crafted and unique and you spend a lot of time building all of this yourself.
But that's just the beginning. Over time this pioneering shifts and the field becomes increasingly routinized. As the problems become well explored and their solutions are increasingly well known, the field mostly or entirely turns into 'apply known solution Y to problem X', often with standard, already existing tools. The people in the field no longer need to be deeply technical innovators, because they are not solving new problems and building things from scratch. These days, the next step after routinization is automation (provided that it is easy enough to automate, or at least lucrative enough).
This shift is inevitable in most fields. As long as the problems and issues in the field do in fact repeat, people will necessarily gain familiarity with them over time; as this happens, people stop having to invent solutions to the problems on the spot and can instead just reuse the same solution as before (or at least some close variant of it). And it is usually less technically demanding to apply existing solutions than to invent new ones.
There is an important corollary to this: if you want to work on interesting things and with problems that have not been routinized, you need to be at the frontier somewhere. This implies that you need to keep moving. If you stay still in that comfortable bit of your field that you like, it is all but inevitable that the pioneering frontier will move past you and around you will spring up the cultivated fields of known problems with known, routine solutions. Then one day you will look up and wonder why you are spending all day applying recipes instead of inventing things.
Applications to the field of your choice are left as an exercise for the reader (or in the case of system administration, at least to another entry).