2015-11-30
A new piece of my environment: xcape, an X modifier key modifier
I was turned on to xcape by evaryont in a comment on my entry on my Backspace/Delete X key mapping shift. What xcape does is, well, let's quote from its own readme:
xcape allows you to use a modifier key as another key when pressed and released on its own.
Modifier keys here are things like Shift, Control, and Alt (and CapsLock if you turn it into a modifier, such as making it another Control). The common use of xcape is by vi people to make one of those keys act as Escape when it's tapped, so they don't have to make the long stretch off to the top left of the keyboard for a key that they use all the time; instead they can tap something much closer.
(At the same time they don't lose the normal use of a valuable
modifier key the way they would if they completely turned one of
the modifier keys into Escape with, say, xmodmap.)
It's not clear how xcape works from the manpage or the readme. Before I started reading the code (it's short), I had concerns that it actually intercepted the modifier key and did weird things with it, which might interfere with other programs. This is not how it works. Instead, xcape passively listens to all keyboard events; when it sees a press and a release of the modifier key alone fly by within its time window, it injects a synthetic key-down and key-up event for your chosen additional key. No existing events are touched, only new ones added.
(Xcape is listed as Linux specific, although it might not be; it only seems to use the X 'record' and 'XTest' extensions, and I think they're generic. The record extension is used to monitor key events, the XTest extension to inject the new ones.)
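To make the mechanism concrete, here is a minimal Python sketch of the tap-detection logic as described above. It is a pure simulation over a list of events, not xcape's actual code, and it does not talk to the X record or XTest extensions at all. The `KeyEvent` type, the `synthesize_taps` function, and the 500 ms window are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    time_ms: int
    key: str
    action: str  # "press" or "release"


def synthesize_taps(events, modifier="Control_L", tap_key="F5", timeout_ms=500):
    """Return the extra events to inject; observed events are never altered."""
    injected = []
    pressed_at = None  # when the modifier went down, or None while it is up
    alone = False      # stays True until some other key is pressed with it
    for ev in events:
        if ev.key == modifier:
            if ev.action == "press":
                if pressed_at is None:
                    pressed_at = ev.time_ms
                    alone = True
            elif ev.action == "release" and pressed_at is not None:
                # A tap: pressed and released on its own, inside the window.
                if alone and ev.time_ms - pressed_at <= timeout_ms:
                    injected.append(KeyEvent(ev.time_ms, tap_key, "press"))
                    injected.append(KeyEvent(ev.time_ms, tap_key, "release"))
                pressed_at = None
        elif ev.action == "press" and pressed_at is not None:
            # Another key was pressed while the modifier was held down,
            # so this is ordinary modifier use and nothing gets injected.
            alone = False
    return injected


tap = [KeyEvent(0, "Control_L", "press"), KeyEvent(80, "Control_L", "release")]
chord = [
    KeyEvent(0, "Control_L", "press"),
    KeyEvent(50, "c", "press"),
    KeyEvent(90, "c", "release"),
    KeyEvent(120, "Control_L", "release"),
]
slow = [KeyEvent(0, "Control_L", "press"), KeyEvent(900, "Control_L", "release")]
```

In this sketch only `tap` produces a synthetic F5 press and release. The `chord` and `slow` sequences produce nothing, which matches the behaviour described above: ordinary use of the modifier is left alone, and the original events always pass through untouched.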
What I'm using xcape for is a bit different from usual. Dmenu is a core part of my environment, and I have my window manager set to bring it up when I hit F5. F5 was in an easily reached location on my old keyboard, but on my new keyboard it's moved just enough so it's no longer a casual, rapid tap. So I'm using xcape to make tapping the CapsLock key (which I normally use as a Control key) also generate F5 and thereby bring up dmenu. The CapsLock key is of course in an extremely convenient and easily reached spot, which is great for this.
In general this works and achieves the goal of making bringing up dmenu be a fast, easy thing. The one drawback to reusing CapsLock is that I sometimes activate dmenu accidentally during normal typing; evidently I can plan to type a control character but then rapidly change my mind without thinking about it, which creates a CapsLock press and release close enough together to trigger xcape. If this turns out to be a long-term annoyance, I'll probably shift dmenu to being triggered off the much less used actual right Control key.
(This keyboard also has Windows keys, so I could go all the way to making the otherwise unused left Windows key trigger dmenu, which wouldn't need xcape at all. But on the whole I like being able to call up dmenu so easily and casually, so I'm inclined to keep things the way I have them now.)
It's possible that someday I'll add an xcape mapping for Escape, but I'm extremely used to hitting Escape in its current location now (it's basically a reflex action at this point) and I don't really find it a problem. Still, I acknowledge that I may be missing out by not doing so and devoting the time to acclimatize to a new Escape location.
(I'd probably put Escape on the left Shift.)
2015-11-27
Documentation should explain why things are security issues, at least briefly
In my discussion of Apache suexec I
mentioned the apache2-suexec-custom Debian package, which allows
you to change suexec's idea of its docroot and thus use suexec
to run virtual host CGIs that aren't located under /var/www. If
you're using suexec-custom, one of the obvious questions is what
it's safe to set the suexec docroot to. If you read the manpage,
you will hit this paragraph:
Do not set the [suexec] document root to a path that includes users' home directories (like /home or /var) or directories where users can mount removable media. Doing so would create local security issues. Suexec does not allow to set the document root to the root directory /.
This is all that the manpage has to say about this. In fact, this is all of the documentation you get about the security issues involved, period.
Perhaps the people who wrote this documentation felt that the
security issues created here are obvious to everyone. If so, they
were wrong. I at least have no idea what specifically makes including
user home directories dangerous. It seems unlikely to be that users
can create new executables, because if you're doing virtual hosting
and using suexec, you're presumably already giving all of those
different virtual hosting UIDs write access to their subdirectory
in /var/www so they can set up their own CGIs. After all, suexec
explicitly requires all of those CGIs and their containing
directories to be owned by the target user, not you. And after that,
what is there that applies to user home directories but not /var/www?
(It can't be that suexec will run arbitrary programs under user
home directories, because suexec has to be run through Apache and
you should not be telling Apache 'treat anything at all under this
entire general directory hierarchy as a CGI through these URLs'. If
you tell Apache that your CGI-BIN directory is /usr/bin or /home
or the like, you have already made a horrible mistake.)
This is a specific example of what is a general failing, namely not explaining why things are security issues. When you don't explain why things are a security problem, you leave people uncertain about what's safe and what isn't. Here, I've been left with no idea about what the important security properties of suexec's docroot actually are. The authors of the manpage have in mind some dangers, but I don't know what they are and as a result I don't know how to avoid them. It's quite possible that this will result in me accidentally configuring Apache and suexec in a subtly insecure way.
The explanation of why things are a security issue doesn't have to be deep and detailed; I don't demand, say, an example of how to exploit an issue. But it should be detailed enough that an outsider can see clearly what they need to avoid and broadly why. If you say 'avoid this general sort of setup', you need to explain what makes that setup dangerous so that people can avoid accidentally introducing a dangerous bit in another setup. Vagueness here doesn't help anyone.
(As a corollary, if you say that a general sort of setup is safe, you should probably explain why that's so. Otherwise you risk people making some small, harmless looking variant of the setup that is in fact not safe because it violates one of the assumptions.)
By the way, all of this applies to local system setup documentation too. If you know why something has to be done or not done in a particular way to preserve security, write it down specifically (even if it seems obvious to you now). Future readers of your documentation will thank you for being clear, and as usual this may well include your future self.
PS: It's possible that you don't know of any specific issues in your program but feel that it's probably not safe to use outside of certain narrow circumstances that you've considered in detail. If so, the documentation should just say this outright. Sysadmins and other people who care about the security properties of your program will appreciate the honesty.
2015-11-12
Don't have support registrations in the name of a specific sysadmin
Every so often at work you will buy something, sign up for a service, arrange a support contract, register for monitoring, or whatever with an outside company or organization. Not infrequently these things will ask you for an email address that will be both your organization's key to the service and the person who gets notifications about it (sometimes you have to pick a username or login too). Here is something that we have learned the awkward way: when you do this, don't just use your email address (and name, and so on). Instead, either use an existing generic group email address or make up a new service specific email address (often these will just be mail aliases that distribute the email to all of the relevant parties). There are two reasons for this.
The first reason is that it keeps a single person from being the critical path for things to do with the service. If things like password resets or approvals for some action go only to me because I used my own email address and you need to do one of these things when I'm sick or on vacation or very busy or whatever, well, we have a problem. Or at least an annoyance. Using a generic address that multiple people see avoids that problem; you don't need to wait for the single magic person to be able to deal with whatever you need done.
The second reason is that, well, to put it bluntly: people leave eventually. If person X leaves and there are things tied to their email address, using their customary personal login, and so on, life is at least a bit awkward. You can make it work, but take it from personal experience, it still feels weird and not entirely right to log in somewhere as ex-co-worker X because that's just how it was set up.
(I imagine you can have lots of fun if there have been several generations of turnover. 'Why do we have to log in to this site as 'jane'? Who's Jane? Oh, she was here ten years ago.')
Consistently registering everything with a generic email address, a suitable generic login or username, and so on avoids all of that. When someone leaves, nothing needs to change, and there are no uncomfortable feelings or awkward 'who is Jane?' explanations in a few years.
(There are exceptions to this, of course. Sometimes a service has been built with these issues in mind, so it has groups and supports multiple accounts that you manage and so on. Sometimes a registration is genuinely personal and will only ever be used by you and it's okay for it to go away if you leave. Sometimes it's just in the nature of the service that everyone needs an individual login in order for things to really work. And so on.)
PS: The flipside of this is that if you're a service provider who has people register accounts with you, this is yet another reason that you really want to support changing logins.
2015-11-10
Why I spent a lot of time agonizing over an error message recently
I recently spent an inordinate amount of time not so much writing a local script as repeatedly writing, rewriting, and modifying its error messages (the rest of the script is mostly simple). Now, I'll admit up front that I have a general habit of obsessing over small details of program output, and maybe some of the fidgeting with the error messages was for this. But I actually maintain that I had a completely sensible reason for caring so much about the script's error messages. You see, the script isn't supposed to fail.
More exactly, it's not supposed to fail but we think that it might someday do so because every so often something weird is going on with the operation the script is doing. In fact the script exists to automate certain workarounds we were doing when we did this particular operation 'by hand' (it's actually buried inside another script). So almost all of the time the script is supposed to work, and we certainly hope it works all the time, but there's a rare possibility of failure lurking in the underbrush.
What this means for the script is that by the time we get an error, we'll probably have long since forgotten exactly what's going on. It's likely that the script will work reliably for weeks and months, during which our knowledge of the entire problem will have been displaced by other things. This means it's important for the error message we get to be clear, so we don't have to try to remember all of the surrounding context from scratch. A cryptic error message would make perfect sense for us right now, when the context is clear in our minds, but it won't in six months.
When I was revising the error message, one part of what I did was to look for things that might be mis-remembered or misinterpreted by people who'd forgotten the context. A surprisingly large amount of my initial language was at least partially ambiguous when I took a step back and tried my best to read it without context. Things that were obvious or only had one meaning inside the context suddenly took on an uncomfortable new life outside it. The resulting error messages are significantly more verbose now, but at least I can hope that they'll still make sense in six months.
(This is of course a version of the problem of context in programming.)
2015-11-09
What sysadmins want out of logging means that it can't be too simple
Dave Cheney recently wrote "Let's talk about logging", about (Go) logging packages, where he advocates, well, I'm going to quote him directly:
I believe that there are only two things you should log:
- Things that developers care about when they are developing or debugging software.
- Things that users care about when using your software.
Obviously these are debug and info levels, respectively.
log.Info should simply write that line to the log output. There should not be an option to turn it off as the user should only be told things which are useful for them. [...]
My reaction is that this is too simple for real use. Ignoring things like (web) activity logs (which Dave Cheney agrees are a different case), there are clear divisions between what sysadmins need at different times and in different situations.
First, let's agree that programs should always be able to log their basic actions. If you're a web server, this is HTTP requests; if you're a mail server, this is email traffic; and so on. This tells sysadmins whether or not the system is doing anything, and if it's doing something what it's doing and how fast. Sysadmins will use this to do monitoring, to check if something happened (such as an email arriving or a request being processed), and so on.
Systems not infrequently encounter internal issues that are not fatal errors. They may experience timeouts, request errors, and so on. If we say that errors are fatal, these are all 'warnings' (even if some terminate the processing of the current whatever that the system is handling). They mark odd things that should not normally happen. Sysadmins like to have a record of these for obvious reasons.
Finally, when sysadmins are working to diagnose problems with services we want to be able to get detailed activity traces of exactly how the system processed requests. What did it look at? What did it find or not find as it stepped through things? Here we're looking for a description of why the system is acting as it is. This level of information is too voluminous to be logged routinely, and often it needs to be segmented up so that we can look only at certain aspects (because otherwise we'll drown in probably irrelevant information).
It's tempting to say that this level of information is the same as developer debug information, but it's my view that it's not. Developer debug information is internally focused and aimed at people who know the code and are making code changes. Sysadmin activity traces are externally focused and aimed at people who do not know the code and are not changing it. As a sysadmin, I don't care about internal state in the code; I'm going to assume there's no code bug and instead that I have either a misconfiguration or a malfunction somewhere in the overall system environment. I want to find that.
You can in theory run all of this through a simple log.Info
interface. But if you do so there are two problems. First, you need
to create internal standards in your program for formatting messages
so that sysadmins can tell the different sorts of messages apart
from each other. Second, you are spewing massive amounts of information
out all the time (since you're always dumping all activity traces),
which is not very friendly. My view is that a good logging package
should be able to do this for you. A too-simple logging package
throws both program authors and sysadmins to the wolves of ad-hoc
logging and log filtering.
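As a rough illustration of what 'doing this for you' could look like, here is a minimal sketch using Python's standard `logging` module rather than a Go package. Basic activity and warnings are always logged and are distinguishable by logger name and level, while detailed traces are off by default and can be switched on one aspect at a time. The `mailer` service name, the `dns` and `routing` aspects, and the `build_loggers` helper are all hypothetical.

```python
import io
import logging

TRACE_ASPECTS = ("dns", "routing")  # hypothetical subsystems of the service


def build_loggers(stream, enabled=()):
    """Send activity, warnings, and selected traces for 'mailer' to stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

    root = logging.getLogger("mailer")
    root.handlers = [handler]
    root.propagate = False
    root.setLevel(logging.INFO)  # basic activity and warnings are always on

    # Traces are silenced as a group; a sysadmin turns on only the
    # aspects that matter for the problem being diagnosed.
    logging.getLogger("mailer.trace").setLevel(logging.CRITICAL + 1)
    for aspect in TRACE_ASPECTS:
        level = logging.DEBUG if aspect in enabled else logging.NOTSET
        logging.getLogger(f"mailer.trace.{aspect}").setLevel(level)
    return root


out = io.StringIO()
log = build_loggers(out, enabled=("dns",))
activity = log.getChild("activity")
activity.info("delivered msg=42 to=user@example.org")
activity.warning("timeout talking to 192.0.2.1, will retry")
logging.getLogger("mailer.trace.dns").debug("MX lookup for example.org found 2 hosts")
logging.getLogger("mailer.trace.routing").debug("routing traces are off, so this is dropped")
```

With only the `dns` aspect enabled, the output contains the activity line, the warning, and the DNS trace, each tagged with its logger name and level. The routing trace is never written, so the volume stays manageable and nothing needs ad-hoc filtering afterwards.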
This is why real programs grow features to control what gets logged and to log different sorts of things in different places. Apache does not have separate request logs and error logs for arbitrary reasons, for example; real people wanted that separation because they find it quite useful.
2015-11-02
Status reporting commands should have script-oriented output too
There are a lot of status reporting programs out there on a typical system; they report on packages, on filesystems, on the status of ZFS pools or Linux software RAID or LVM, on boot environments, on all sorts of things. I've written before about these programs as tools or frontends, where I advocated for writing tools, but it's clear that battle is long since lost; almost no one writes programs that are tools instead of frontends. So today I have a more modest request: status reporting programs should have script oriented output as well as human oriented output.
The obvious reason is that this makes it easier for sysadmins to build scripts on top of your programs. Sysadmins do want to do this, especially these days when automation is increasingly important, and parsing your regular human-oriented output is more difficult and also more error-prone. Such script oriented output doesn't have to be very elaborate, either; it just has to be clear and easy to deal with in a script.
But there's a less obvious reason to have script oriented output; it's much easier to make script oriented output be stable (either de facto or explicitly documented as such). The thing about human oriented output is that it's quite prone to changing its format as additional information gets added and people rethink what the nicest presentation of information is. And it's hard to argue against better, more informative, more readable output (and in fact I don't think one should). But changed output is death on things that try to parse that output; scripts really want and need stable output, and will often break if they're parsing your human oriented output and you change it. When you explicitly split human oriented output from script oriented output, you can provide both the stability that scripts need and the changes that improve what people see. This is a win for both parties.
(As a side effect it may make it easier to change the human oriented output, because there shouldn't be many worries about scripts consuming it too. Assuming that you worried about that in the first place.)
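Here is a small Python sketch of the split, under invented assumptions: the `Pool` record, its fields, and the sample data are all hypothetical and do not come from any real tool. The human oriented formatter is free to change its columns, widths, and units, while the script oriented one commits to one record per line, tab-separated fields in a fixed order, no header, and exact values.

```python
from typing import NamedTuple


class Pool(NamedTuple):
    name: str
    size_bytes: int
    health: str


def human_size(n: int) -> str:
    """Render a byte count with a binary unit suffix, for people to read."""
    size = float(n)
    for unit in ("B", "K", "M", "G"):
        if size < 1024:
            return f"{size:.1f}{unit}"
        size /= 1024
    return f"{size:.1f}T"


def format_human(pools):
    # Free to change: header text, column widths, rounding, extra columns.
    lines = [f"{'NAME':<10} {'SIZE':>8}  HEALTH"]
    for p in pools:
        lines.append(f"{p.name:<10} {human_size(p.size_bytes):>8}  {p.health}")
    return "\n".join(lines)


def format_script(pools):
    # Stable contract: name, size in bytes, health; tab-separated, no header.
    return "\n".join(f"{p.name}\t{p.size_bytes}\t{p.health}" for p in pools)


pools = [
    Pool("tank", 2 * 1024**4, "ONLINE"),
    Pool("scratch", 500 * 1024**3, "DEGRADED"),
]
```

A script consuming `format_script(pools)` can split each line on tabs and never has to care whether the human view later gains a column or switches from `2.0T` to some other presentation.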
(This is the elaborated version of a tweet and the resulting conversation with Dan McDonald.)