2013-01-23
My Unix is a general purpose operating system
When you start thinking about the present and future of Unix, one of the questions you are confronted with is what Unix is for.
One vision of Unix is that its focus is text-mode or headless servers that have a basically static location (in both physical and network terms) and exist to run daemons and services: websites, databases, disk storage nodes, fileservers, and so on. Let me be blunt: this is a very popular thing to do with Unix and is probably the dominant use of Unix today. It runs all the way from a devops organization virtualized in the cloud to one person running a single do-it-all server machine under their desk in a small organization.
My view of Unix is broader than this. For me, Unix is a general purpose operating system, one that is not just for servers but also for simple graphical environments (such as mine), full blown fancy desktops, laptops whether minimal or fancy, little simple machines, and many other places. I keep an open mind and feel that what is fundamentally Unix is capable of wide scope and applicability. In short, it scales in many directions.
(I will skip trying to summarize my reasons why, but part of it is that I feel Unix has turned out to be a pretty good framework for interacting with computers.)
By Unix I mean more than the kernel and the core APIs. I mean, well, the metaphors and the framework and in general everything that makes Unix a familiar environment for general use. A machine can run a Unix kernel and have Unix-like APIs without being Unix; where the line is between Unix and non-Unix is ultimately one of feel (and varies from person to person).
There are environments where it's not clear if Unix fits. For example, I'm not sure much of the Unix framework really works in smartphones and it may not work well with tablets; we're going to have to see. Part of this is that the Unix framework is a general framework and is not necessarily a good fit to a very specialized, narrow device. Part of it is that the Unix metaphors may not be a good fit for some environments.
Sidebar: my views on some not-quite-Unixes
From what I know of it so far, Android seems clearly not a Unix in this sense although it uses a Unix kernel (yes, Linux is a Unix). Since I have low exposure to Android I may be wrong (and I'm open to having my mind changed).
Mac OS X is a fuzzy case but I consider it mostly not a Unix. If you use OS X as Apple intends you to, its Unix is simply a substrate for what it really is (in the same way its kernel uses Mach as a substrate without actually being Mach in any meaningful way). You can use an OS X box as a Unix machine but in some ways doing so seems to be swimming upstream.
(Part of this perception is based on the history of Apple and of Mac OS. To put it one way, I don't think that Apple has any interest in making Unix machines with a nice Apple desktop; that they arguably do is just a side effect, not a goal.)
What I want to know about kernel security updates
This is kind of a rant. The issue is on my mind because we spent a chunk of this evening applying kernel updates to our Ubuntu machines and rebooting them, something that we feel forced to do once every few months or so. One of the reasons that we don't do this more often, such as every time Ubuntu releases a kernel update, is that kernel updates are among the most disruptive updates there are; to make them take effect you must reboot the machine, which is completely disruptive to anyone using the machine (especially if they're logged in to it).
But another reason we don't apply Ubuntu kernel updates all that often is that Ubuntu's kernel updates are terrible at giving us useful information about how severe the issues are and how urgent doing an update is. Except in terribly obvious extreme situations (eg 'locally exploitable bug, gives root, an exploit is public') we wind up faced with a flurry of issues of extremely uncertain but generally low-seeming impact. Unsurprisingly, we wind up defaulting to not doing major disruptions on a regular basis, and then periodically we decide that we should get up to date just in case.
While Ubuntu has its specific failings here, this is not just an Ubuntu problem. I think every Linux distribution I've seen a kernel security update from has failed to include the information we'd need to make meaningful decisions. All of them irritate me.
As a sysadmin, here is what I want to know about every issue fixed in a kernel security update:
- how severe is the consequence of the issue? Does the exploit give
you root, disclose some sort of information (and if so, what sort
and can it be leveraged to disclose things like passwords), or
just allow you to lock the machine up?
- is this remotely exploitable or does it require running your code
on the machine? If it's remotely exploitable, how remote is remote;
'on the same LAN' is a lot different than 'anyone on the Internet'.
('Exploitable from inside a VM' is another case.)
The most common sort of issue that I see bugfixes for is a locally exploitable denial of service issue. While it's nice to fix these bugs, they are fundamentally unimportant for many sysadmins since any local user generally already has plenty of ways to lock up or crash a Unix system. But you'd never know this from how distributions phrase things in kernel update notices.
- is this exploitable on a default configuration machine? Or does
it require some specific hardware to be present or some specific,
non-default configuration or protocol to have been set up?
You would not believe how many updates don't make this clear. This matters hugely to whether a particular issue is even relevant to us and it makes me angry every time a distribution or vendor forces me to research this myself.
- how currently exploitable is this issue? This ranges from
'a weaponized exploit has been made public' all the way through
'we think that someone might someday be able to figure out how
to exploit this'.
Yes, yes, I'm sure that distribution security teams hate having to say anything about this (unless it's the former), but trust me, this is the kind of thing that my manager asks me when I say 'this seems pretty urgent, I think we need to do an emergency reboot without our usual one-week advance notice (if there are no conference or paper deadlines)'.
- what is the primary source for this issue, or at least what is an
index page with links to the primary source information? Many
kernel security issues are reported, disclosed, or announced on
things like public mailing lists, generally with far more technical
detail than the distribution wants to put in their update notice.
I want to read this primary source material and I become angry
when a distribution (which had all of this information itself)
hides it and forces me to do web searches.
And everyone should link to the CVE page for CVE issues as well. There is nothing I like quite so much as doing web searches for information that a distribution's security team already had but decided not to give me. Really.
I suspect that most distributions would want to put together their own information page in some standardized format. This is fine, just as long as they put a link to their own info page in the announcement and their info page links to the primary source (and the CVE information and so on). This would also be a good place to put extended discussions of things like how to tell if your particular system is potentially vulnerable to the issue.
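To make the wish list above concrete, here is a minimal sketch of what one entry in such a standardized info page might carry, along with the kind of crude triage it would let a sysadmin do. Everything in it is an assumption made up for illustration: the field names, the enum values, the `triage` policy, and the placeholder CVE IDs do not come from any real distribution's format.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Impact(Enum):
    DENIAL_OF_SERVICE = "locks up or crashes the machine"
    INFO_DISCLOSURE = "discloses information"
    ROOT = "gives root"


class Reach(Enum):
    LOCAL = "requires running code on the machine"
    FROM_VM_GUEST = "exploitable from inside a VM"
    SAME_LAN = "exploitable from the same LAN"
    INTERNET = "exploitable by anyone on the Internet"


class Exploitability(Enum):
    THEORETICAL = "someone might someday figure out an exploit"
    PUBLIC_EXPLOIT = "a weaponized exploit is public"


@dataclass
class KernelIssue:
    """One fixed issue in a kernel update notice (hypothetical format)."""
    cve_id: str
    impact: Impact
    reach: Reach
    default_config: bool          # exploitable on a default configuration?
    exploitability: Exploitability
    primary_sources: List[str] = field(default_factory=list)


def triage(issue: KernelIssue) -> str:
    """An illustrative local policy, not a recommendation."""
    # Local users can generally already crash a Unix machine some other way.
    if issue.impact is Impact.DENIAL_OF_SERVICE and issue.reach is Reach.LOCAL:
        return "defer"
    # Non-default setups need a check of whether we even run that setup.
    if not issue.default_config:
        return "check-config"
    if (issue.impact is Impact.ROOT
            and issue.exploitability is Exploitability.PUBLIC_EXPLOIT):
        return "emergency"
    return "scheduled"


# Placeholder IDs, not real CVEs.
local_dos = KernelIssue("CVE-0000-0001", Impact.DENIAL_OF_SERVICE,
                        Reach.LOCAL, True, Exploitability.THEORETICAL)
public_root = KernelIssue("CVE-0000-0002", Impact.ROOT,
                          Reach.LOCAL, True, Exploitability.PUBLIC_EXPLOIT)
print(triage(local_dos), triage(public_root))
```

The point is not this particular policy, which every site would write differently. It is that none of these decisions can be made at all unless the update notice actually supplies each of these fields.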
My excessively cynical side suspects that distribution security teams leave out a lot of this information in order to push people towards applying every kernel update as soon as possible. If so, I have news for those security teams: they have it exactly backwards. There are powerful forces pushing us (and anyone) against applying updates, especially disruptive updates like new kernels. Every doubt and quibble and uncertainty in a kernel update message feeds those forces and makes it less likely that the update will be applied. In order to get us to apply an important update on an urgent basis, it must be clear that it is urgent. If it is not clear, everyone loses.
Everything works much better when the security team is honest and clear about kernel updates. We'll still sit on all of the updates that are just yet more ways for local users to lock the machine up, but that's no different than what we're already doing. But when you release something that's genuinely dangerous we'll be much more likely to notice, understand, and update much earlier than we would otherwise.
(By the way, for everyone who is about to advise us that we should have dynamic load balancers and pools of machines where we can take some out of service on a rolling basis for kernel upgrades and so on: there is no such thing as a general dynamic load balancer for user login sessions, established sessions in general, or actual running user processes. Thanks.)