A surprising hazard of running as root all the time
We have some machines that are 'no user-operable parts inside' setups; as part of that, they have no user logins, just root. (Yes, yes, running as root all the time is bad, but on these boxes almost all we'd ever do with a plain login is su to root.)
I'm attuned to all of the regular hazards of this, but today I stumbled over a new one: how long it takes to notice that /var accidentally wound up mode 0750 (and owned by a group that had hardly anything in it) on a Solaris 2.4 machine.
Of course, root doesn't get permission denied messages, and most of the obvious things were running as root and kept on working. About the only sign was a large collection of files called things like 'mailAAAa00087' scribbled in /var/tmp. It turned out that these files were complaints from cron about being unable to run jobs because it couldn't change to lp's home directory, and bounce messages talking about 'lp... Can't create output'.
So I looked at lp's home directory, /usr/spool/lp, which looked perfectly fine, and I could even cd into it as root. Only when I did 'su lp' and tried it did I get a 'permission denied' error and start backtracking to discover the /var permissions problem.
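The backtracking step can be approximated without switching users. The following is a minimal sketch, not what was actually run on the machine: the function name blocked_components is made up for illustration, and it only looks at the 'other' permission bits, so it is a rough stand-in for what an account outside the owning group would hit.

```python
import os
import stat

def blocked_components(path):
    """Return (directory, mode string) pairs for every directory on the
    way to path that lacks 'other' search (x) permission.

    Root can traverse these directories anyway, which is why the problem
    is invisible from a root shell. Assumption: only the 'other' bits
    are checked; group membership and ACLs are ignored.
    """
    blocked = []
    current = os.path.abspath(path)
    while True:
        mode = os.stat(current).st_mode
        if stat.S_ISDIR(mode) and not (mode & stat.S_IXOTH):
            blocked.append((current, stat.filemode(mode)))
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    # Report from the top of the tree down, like reading the path left to right.
    return list(reversed(blocked))
```

Calling blocked_components("/usr/spool/lp") on a machine in this state would be expected to list /var (assuming /usr/spool lives under /var there), even though the lp directory itself looks fine.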
Sidebar: so how did it happen?
What I think happened is that someone built a tar file of a /var/named directory they wanted to move around, but instead of tarring up the directory, they cd'd into the directory and tarred up '.'. Then they moved it to this machine and accidentally untarred it in /var instead of making a /var/named directory and untarring it there. As part of unpacking, tar dutifully set the permissions on all of the files and directories in the tarball, including '.', which here was /var itself.
So the moral is: tarfiles that include '.' are annoying and dangerous in more than one way.
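One way to guard against this is to look inside a tarball before unpacking it. This is a small sketch using Python's standard tarfile module; the function name risky_members is invented for illustration, and it only flags the '.' entry, not other hazards such as absolute paths.

```python
import posixpath
import tarfile

def risky_members(tar_path):
    """Return (name, octal mode) for members that refer to the extraction
    directory itself, i.e. a '.' entry.

    Unpacking such a member makes tar apply its stored permissions to
    whatever directory you happen to be standing in.
    """
    risky = []
    with tarfile.open(tar_path) as archive:
        for member in archive:
            # normpath folds '.', './' and similar spellings into '.'
            if posixpath.normpath(member.name) == ".":
                risky.append((member.name, oct(member.mode)))
    return risky
```

An empty result does not make a tarball safe, but a non-empty one is a good reason to unpack into a fresh scratch directory first.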
Security versus resilience
A while back I wrote about an exception raised by the cgi module when crackers submitted XML-RPC calls instead of form POSTs. It makes a great example for discussing the difference between 'secure systems' and 'resilient systems'.
Put broadly, security is keeping people out, while resilience is keeping operating when people attack you. The cgi module example shows that you can have one without the other. Sometimes this may even be deliberate; an exceptionally paranoid system could shut itself down any time it saw unexpected input, just to be sure. This would be quite secure but not at all resilient.
(There are real systems that are close to this paranoid, for example the PAL systems that try to prevent unauthorized use of nuclear weapons.)
The cgi module seems to be secure (and I say 'seems' only because I haven't personally analyzed the code). To a large extent Python makes it easy to be secure; you are protected from basic issues like buffer overruns, and exceptions force you to handle errors one way or another. Python code may fail, but it almost always fails safely. (This does leave design issues, where the code is right but the algorithm is horribly wrong, but no language can really help there.)
However, resilience is much harder and less common, as the cgi module example demonstrates (and there are a number of other ways to make programs using the cgi module unhappy). If this is sloppy programming on the part of the cgi module, then such sloppy programming is practically endemic; truly paranoid programming, even for network applications, is still rare. (And I'm not going to claim that I've managed it.)
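To make the XML-RPC case concrete, here is a rough sketch of a form handler that turns unexpected input into an error status instead of an unhandled exception. It deliberately does not use the cgi module and is not how that module works; parse_form_post, FORM_TYPE and MAX_BODY are hypothetical names, and the size cap is an arbitrary illustrative value.

```python
from urllib.parse import parse_qs

FORM_TYPE = "application/x-www-form-urlencoded"
MAX_BODY = 64 * 1024  # arbitrary illustrative cap, not a recommendation

def parse_form_post(content_type, body):
    """Return (HTTP status, fields) for a POST body given as bytes.

    The goal is to answer every request with a status code rather than
    raise, whatever the client decided to send.
    """
    # Drop parameters such as '; charset=utf-8' before comparing.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type != FORM_TYPE:
        return 415, {}  # e.g. an XML-RPC call sent as text/xml
    if len(body) > MAX_BODY:
        return 413, {}
    try:
        fields = parse_qs(body.decode("utf-8"), strict_parsing=True)
    except (UnicodeDecodeError, ValueError):
        return 400, {}
    return 200, fields
```

Even this small function only covers the failures I thought to list, which is rather the point about resilience being open-ended.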
I think that resilience is in general harder than security. Security is all about confining things and making sure that things don't happen, whereas resilience is about thinking about everything that could go wrong. This makes resilience much more of an open-ended problem than security, with many more things to think about and keep track of.
Because resilience is about 'what can go wrong?', it also needs you to go behind the convenient abstractions, like 'network IO is just a stream of bytes'. (It is, but it's a stream of bytes that may come very slowly or very fast, not come at all, or be incomplete. What happens to your program in each case?)
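As an illustration of going behind the 'stream of bytes' abstraction, the sketch below reads a fixed number of bytes from a socket while handling the slow, missing, oversized, and incomplete cases explicitly. The names read_exactly and ReadProblem and the default limits are assumptions made up for this example.

```python
import socket
import time

class ReadProblem(Exception):
    """The peer was too slow, asked for too much, or stopped early."""

def read_exactly(sock, expected, deadline_seconds=10.0, max_bytes=1 << 20):
    """Read exactly `expected` bytes from sock or raise ReadProblem."""
    if expected > max_bytes:
        raise ReadProblem("refusing to read %d bytes" % expected)
    # One overall deadline, so a peer trickling a byte at a time
    # cannot keep us here forever.
    deadline = time.monotonic() + deadline_seconds
    chunks = []
    remaining = expected
    while remaining > 0:
        time_left = deadline - time.monotonic()
        if time_left <= 0:
            raise ReadProblem("peer too slow: deadline passed")
        sock.settimeout(time_left)
        try:
            data = sock.recv(min(4096, remaining))
        except socket.timeout as exc:
            raise ReadProblem("peer too slow: deadline passed") from exc
        if not data:  # orderly close before we got everything
            raise ReadProblem("peer closed with %d bytes missing" % remaining)
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)
```

This still leaves the question of what the rest of the program does when ReadProblem is raised, which is where most of the real work is.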
On a concrete level, I'm pretty confident in DWiki's security, and its design has a certain amount of thought put into the issues. I'm equally confident that DWiki is not resilient and that there are a bunch of ways (even without writing comments) to hammer it. (DWiki gets a certain amount of resilience from being run as a CGI-BIN by Apache, but this only goes so far.)