I should remember that I can cast things in Go
I think of Go as a strongly typed language.
My broad and somewhat reflexive view of strongly typed languages is
that they mostly don't allow you to cast things around because most
casts will require expensive data conversion and the language wants
you to do that explicitly, with your own code. Go even sticks to this
view; you can cast numbers around (because it's too useful) and you
can go between string and
[]byte (because it's a core operation),
and that's mostly it.
(Then there's interfaces, which expose some tricks. Interface casting involves a bunch of potentially expensive magic, but it's a core feature of Go so it's an exception to the 'no expensive operations via casts' rule of thumb.)
However, there is an important practical exception to this, which
comes about because of another thing that Go encourages: lots of
named types that you derive from fundamental types. Rather than using,
say, int for all sorts of different things in your code, everyone
knows that you should instead create various specific types:
type Phase int
type Action int
type OneMap map[string]string
type TwoMap map[string]string
This way you can never accidentally use a Phase when the actual
function, field, or whatever is supposed to be an Action, or pass
a function that expects a OneMap a TwoMap, and so on. Go's strong
typing will
force them to be separate (even if this is sometimes irritating,
for example if you're dealing with cgo).
These derived types can be cast to each other and to their underlying type. This is not just if they're numbers; any derived type can be cast around like this, provided that the underlying 'real' types are the same (per the Conversions section of the language spec).
(At a mechanical level it's easy to see why this is okay; since the two derived types have exactly the same memory layout, you don't have to do expensive conversion to generate a type-safe result.)
Now, ordinarily you still don't want to cast a OneMap to a TwoMap
(or to a map[string]string). But there is one special case that
matters to me, and that's if I want to do the same operation on
both sorts of maps. Since I actually can cast them around, I don't
need to write two duplicated blocks of (type-specific) code to do
the same operation. Instead I can write one, perhaps one that's
generic to the
map[string]string type, and simply call it for
both cases through casts. This is not the only way to create common
code for a generic operation but it's probably the easiest one to
add on the fly without a bunch of code refactoring.
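As a minimal sketch of this, assuming the OneMap and TwoMap types from earlier (the countNonEmpty helper and the sample data are made up for illustration), one function written against map[string]string can serve both types:

```go
package main

import "fmt"

type OneMap map[string]string
type TwoMap map[string]string

// countNonEmpty is a hypothetical shared helper, written once against
// the underlying map[string]string type.
func countNonEmpty(m map[string]string) int {
	n := 0
	for _, v := range m {
		if v != "" {
			n++
		}
	}
	return n
}

func main() {
	one := OneMap{"a": "x", "b": ""}
	two := TwoMap{"c": "y", "d": "z"}

	// Explicit casts to the shared underlying type.
	fmt.Println(countNonEmpty(map[string]string(one))) // 1
	fmt.Println(countNonEmpty(map[string]string(two))) // 2

	// Casting directly between the two named types also works,
	// because their underlying types are identical.
	asTwo := TwoMap(one)
	fmt.Println(len(asTwo)) // 2
}
```

Go's assignability rules happen to let you pass a OneMap straight to a parameter of the unnamed type map[string]string, so the first two casts are optional here. The explicit cast is required between OneMap and TwoMap, and between named numeric types such as Phase and Action.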
So this is why I need to remember that casting types, even complex types, is something that I can do in Go. It's been kind of a reflexive blind spot in my Go code in the past, but hopefully writing this will avoid it in the future.
ZFS pool import needs much better error messages
One of the frustrating things about dealing with sufficiently damaged
ZFS pools is that 'zpool import' and friends do not generate very
detailed error messages. There are a lot of things that can go wrong
with a ZFS pool that will make it not importable, but 'zpool import'
has clear explanations for only some of them. For many others
all you get is a generic error in 'zpool import' status reporting:
The pool cannot be imported due to damaged devices or data.
(Here I'm talking about the results of just running 'zpool import'
to see available pools and their states and configuration, not
trying to actually import a pool. Here zpool has lots of room to
write explicit and detailed messages about what seems to be wrong
with your pool's configuration.)
This isn't just an issue of annoying and frustrating people with
opaque, generic error messages. Given that the error messages are
generic, it's quite easy for people to focus only on the obvious
problems that zpool import reports, even if those problems may
not be the reason the pool can't be imported. As it happens I have
a great example of this in action, in this SuperUser question.
When you read this question, can you figure out what's wrong? Both
the SuperUser ZFS community and the ZFS on Linux mailing list
(I believe that everything you need to figure out what's going on
is actually in the information in the question and the code behind
'zpool import' actually knows what the problem is. This assumes
that my diagnosis
is correct, of course.)
zpool import should not be fully verbose by default, as
there's a certain amount of information that may only make sense
to people who know a fair bit about how ZFS works. But it certainly
should be possible to get this information with, eg, a verbose
switch instead of having to reverse engineer it from
If nothing else, this means that you can get a verbose report and
show it to ZFS experts in the hope that they can tell you what's
On a purely pragmatic level I think that
zpool import should be
really verbose and detailed when a pool can't be imported. 'My pool
won't import' is one of the most stressful experiences you can have
with ZFS; to get unclear, generic errors at this point is extremely
frustrating and does not help one's mood in the least. This is
exactly the time when large amounts of detail are really, really
appreciated, even if they're telling you exactly how far up the
creek you are.
(This means that I would very much like a 'zpool import -v <pool>'
option that describes exactly what the import is doing or trying
to do and then covers all of the problems that it detected with the
pool configuration, all the things the kernel said to it, and so
on. A report of 'I am asking the kernel to import a pool made up
of the following devices in the following vdev structure' is not
PS: while this example is from ZFS on Linux and FreeBSD, I've looked at the current Illumos code for zpool and libzfs, and as far as I can see it would have exactly the same problem here.
(Part of the issue is that
zpool import and libzfs have what you
could call less than ideal reporting if a pool is marked as active
on some other system and also has configuration problems. But even
if it reported multiple errors I think that the real problem here
would remain obscure; the current 'zpool import' code appears to
deliberately suppress printing out parts of the information necessary.)
We killed off our SunSolve email contact address yesterday
Back in the days when Sun was Sun, Sun's patch access and support
system was imaginatively called SunSolve. If you had a support
contract with Sun (which often was only about the ability to get
patches and file bug reports), you had a SunSolve account. We had
one, of course (we have been using
Solaris for longer than it's been Solaris). In the very beginning
we made a classical mistake
and had it in the name and email of a specific sysadmin (who then
moved on), but in the early days of our Solaris 10 fileservers we switched this to a generic email address.
Yesterday, we removed that address.
Our Solaris machines have all been out of commission for a while now, but we left the address in place mostly because of inertia. What pushed me to remove it is the usual reason; we just couldn't get Oracle to stop mailing things to it. I don't think Oracle spammed it (unlike some people), but they did keep sending us information about patch clusters and quarterly updates and this and that, all of which is irrelevant to us these days.
(I managed to get Oracle to mostly knock it off, but the other day they decided that they had an update that was so urgent that they just had to mail it to us. Never mind that we don't have any of the software at issue; that Oracle had our email address was good enough for them.)
At one level this is an unimportant little bit of cleanup that we should have done long ago. With our Solaris machines gone and our grandfathered support contract let run down, the email address had no point; it was just another lingering bit of clutter, and we should get rid of that kind of thing while we remember what it is and why we can remove it.
(If you wait long enough on this sort of thing, you can easily forget whether or not there's some special, inobvious reason that you're keeping these old oddities around. So it's best to strike while everything is fresh in your mind.)
At another level, the
sunsolve email address was one of the last
lingering traces of what was (after all) a very long association
with Sun and Solaris. Just as with other things, letting
it go is yet another line drawn under all of that history, even if
SunSolve itself stopped existing years ago.
(Oracle decommissioned SunSolve and folded the functionality into their own support system not long after they bought Sun. The conversion was not entirely pleasant for support customers.)
PS: Since I just looked, it warms my heart a little bit that PCA is still trucking along. Oracle may have killed some very useful customer-done things but at least they left PCA alone. If we still had to deal with the mess that is Solaris patches, we'd be very thankful for that.
Don't have support registrations in the name of a specific sysadmin
Every so often at work you will buy something, sign up for a service, arrange a support contract, register for monitoring, or whatever with an outside company or organization. Not infrequently these things will ask you for an email address that will be both your organization's key to the service and the person who gets notifications about it (sometimes you have to pick a username or login too). Here is something that we have learned the awkward way: when you do this, don't just use your email address (and name, and so on). Instead, either use an existing generic group email address or make up a new service specific email address (often these will just be mail aliases that distribute the email to all of the relevant parties). There are two reasons for this.
The first reason is that it keeps a single person from being the critical path for things to do with the service. If things like password resets or approvals for some action go only to me because I used my own email address and you need to do one of these things when I'm sick or on vacation or very busy or whatever, well, we have a problem. Or at least an annoyance. Using a generic address that multiple people see avoids that problem; no one needs to wait for the single magic person before they can deal with whatever they need to do.
The second reason is that, well, to put it bluntly: people leave eventually. If person X leaves and there are things tied to their email address, using their customary personal login, and so on, life is at least a bit awkward. You can make it work, but take it from personal experience, it still feels weird and not entirely right to log in somewhere as ex-co-worker X because that's just how it was set up.
(I imagine you can have lots of fun if there have been several generations of turnover. 'Why do we have to log in to this site as 'jane'? Who's Jane? Oh, she was here ten years ago.')
Consistently registering everything with a generic email address, a suitable generic login or username, and so on avoids all of that. When someone leaves, nothing needs to change, and there are no uncomfortable feelings or awkward 'who is Jane?' explanations in a few years.
(There are exceptions to this, of course. Sometimes a service has been built with these issues in mind, so it has groups and supports multiple accounts that you manage and so on. Sometimes a registration is genuinely personal and will only ever be used by you and it's okay for it to go away if you leave. Sometimes it's just in the nature of the service that everyone needs an individual login in order for things to really work. And so on.)
PS: The flipside of this is that if you're a service provider who has people register accounts with you, this is yet another reason that you really want to support changing logins.
No new web templating languages; use an existing one
Suppose, hypothetically, that you are creating a web application. Let's even suppose that it's a very small and simple one, almost an embarrassingly small one. As part of this app, you need a very little bit of something like a templating system. Not much, just a bit more than printing formatted strings. Clearly you have such a trivial situation that you can just bang together a tiny and simple mini-templating language, right?
Let me save you some time and effort: no. Don't do it. The reality is that we've reached a point in time where writing your own (web) templating language or system is basically guaranteed to be a mistake. I know, you have a trivial application and you don't want to take an external dependency, you hardly need anything, all of the existing templating systems are wrong or too heavyweight, there's a whole list of excuses. Don't accept them. Suck it up, take an external dependency, and use an existing templating system even if it's vast overkill for your problem. Your future self will thank you in a few years.
(I could almost go further than this and maybe I should, but that's another entry.)
All of this especially applies if you have an application that needs more than a trivial templating system; I picked an extreme case because it's where the temptation can be strongest. Writing your own non-tiny templating system today is an especially masochistic exercise because even a basic one is a bunch of work and raises a moderate ton of questions that you're ill-equipped to answer (or even recognize) unless this is not your first templating system.
In hindsight, writing my own 'simple' templating system was one of the mistakes I made when I wrote DWiki (the code that powers Wandering Thoughts). It's been a very educational mistake, but unless you really want to do things the hard way for the experience I can't recommend it.
(Note that rolling your own is not a great learning experience unless you live with the result for a number of years, so that you have plenty of time to run into the lurking problems. Almost anything can look good if you write it, use it briefly, and then abandon it. Many of my painful lessons took years to smack me in the face.)
PS: This assumes that you aren't working in a new language where no one's written a decent templating system. If you are, I think that you should at least steal one of the battle-tested designs from good templating systems in other languages.
(Also, yes, a very few people have very special needs and have to write their own systems. They know who they are.)
Why I spent a lot of time agonizing over an error message recently
I recently spent an inordinate amount of time not so much writing a local script as repeatedly writing, rewriting, and modifying its error messages (the rest of the script is mostly simple). Now, I'll admit up front that I have a general habit of obsessing over small details of program output, and maybe some of the fidgeting with the error messages was due to this. But I actually maintain that I had a completely sensible reason for caring so much about the script's error messages. You see, the script isn't supposed to fail.
More exactly, it's not supposed to fail but we think that it might someday do so because every so often something weird is going on with the operation the script is doing. In fact the script exists to automate certain workarounds we were doing when we did this particular operation 'by hand' (it's actually buried inside another script). So almost all of the time the script is supposed to work, and we certainly hope it works all the time, but there's a rare possibility of failure lurking in the underbrush.
What this means for the script is that by the time we get an error, we'll probably have long since forgotten exactly what's going on. It's likely that the script will work reliably for weeks and months, during which our knowledge of the entire problem will have been displaced by other things. This means it's important for the error message we get to be clear, so we don't have to try to remember all of the surrounding context from scratch. A cryptic error message would make perfect sense for us right now, when the context is clear in our minds, but it won't in six months.
When I was revising the error message, one part of what I did was to look for things that might be mis-remembered or misinterpreted by people who'd forgotten the context. A surprisingly large amount of my initial language was at least partially ambiguous when I took a step back and tried my best to read it without context. Things that were obvious or only had one meaning inside the context suddenly took on an uncomfortable new life outside it. The resulting error messages are significantly more verbose now, but at least I can hope that they'll still make sense in six months.
(This is of course a version of the problem of context in programming.)
What sysadmins want out of logging means that it can't be too simple
Dave Cheney recently wrote Let's talk about logging about (Go) logging packages, where he advocates, well, I'm going to quote him directly:
I believe that there are only two things you should log:
- Things that developers care about when they are developing or debugging software.
- Things that users care about when using your software.
Obviously these are debug and info levels, respectively.
log.Info should simply write that line to the log output. There should not be an option to turn it off as the user should only be told things which are useful for them. [...]
My reaction is that this is too simple for real use. Ignoring things like (web) activity logs (which Dave Cheney agrees are a different case), there are clear divisions between what sysadmins need at different times and in different situations.
First, let's agree that programs should always be able to log their basic actions. If you're a web server, this is HTTP requests; if you're a mail server, this is email traffic; and so on. This tells sysadmins whether or not the system is doing anything, and if it's doing something what it's doing and how fast. Sysadmins will use this to do monitoring, to check if something happened (such as an email arriving or a request being processed), and so on.
Systems not infrequently encounter internal issues that are not fatal errors. They may experience timeouts, request errors, and so on. If we say that errors are fatal, these are all 'warnings' (even if some terminate the processing of the current whatever that the system is handling). They mark odd things that should not normally happen. Sysadmins like to have a record of these for obvious reasons.
Finally, when sysadmins are working to diagnose problems with services we want to be able to get detailed activity traces of exactly how the system processed requests. What did it look at? What did it find or not find as it stepped through things? Here we're looking for a description of why the system is acting as it is. This level of information is too voluminous to be logged routinely, and often it needs to be segmented up so that we can look only at certain aspects (because otherwise we'll drown in probably irrelevant information).
It's tempting to say that this level of information is the same as developer debug information, but it's my view that it's not. Developer debug information is internally focused and aimed at people who know the code and are making code changes. Sysadmin activity traces are externally focused and aimed at people who do not know the code and are not changing it. As a sysadmin, I don't care about internal state in the code; I'm going to assume there's no code bug and instead that I have either a misconfiguration or a malfunction somewhere in the overall system environment. I want to find that.
You can in theory run all of this through a simple
interface. But if you do so there are two problems. First, you need
to create internal standards in your program for formatting messages
so that sysadmins can tell the different sorts of messages apart
from each other. Second, you are spewing massive amounts of information
out all the time (since you're always dumping all activity traces),
which is not very friendly. My view is that a good logging package
should be able to do this for you. A too-simple logging package
throws both program authors and sysadmins to the wolves of ad-hoc
logging and log filtering.
This is why real programs grow features to control what gets logged and to log different sorts of things in different places. Apache does not have separate request logs and error logs for arbitrary reasons, for example; real people wanted that separation because they find it quite useful.
The great Delete versus Backspace split
In the history of Unix, one of the great quiet divisions has been the split of what your erase character should be, with the choice being DEL (Ctrl-?) or Ctrl-H. In turn, this goes back to the physical serial terminals that for years were how you logged on to most Unix systems. In the days of real terminals, you wanted your Unix erase character to be set as whatever your terminal generated when you hit the big, prominent 'backspace' key (which was, of course, not at all programmable). Setting DEL when your key generated Ctrl-H or vice versa was a recipe for frustration and extra work.
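For reference, the two candidate erase characters are just two different ASCII control codes. A small sketch in Go of the usual caret-notation convention, where the control character is the printed character with bit 0x40 flipped (the ctrl helper is a name made up here):

```go
package main

import "fmt"

// ctrl maps a caret-notation character to its control code by
// flipping bit 0x40, so ^H is 0x08 and ^? is 0x7f.
func ctrl(c byte) byte {
	return c ^ 0x40
}

func main() {
	fmt.Printf("Ctrl-H (backspace, BS) = 0x%02x\n", ctrl('H'))
	fmt.Printf("Ctrl-? (delete, DEL)   = 0x%02x\n", ctrl('?'))
}
```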
If life was great, everyone would have agreed that the terminal backspace key generated one thing and we'd be done. Of course that didn't happen. As I remember it, most of the world's serial terminal makers decided that the backspace key would generate Ctrl-H, but Digital (aka DEC) decided to be special and have their backspace keys generate DEL. The Digital VT-52, VT-100, and later VT-series serial terminals were very popular, so Digital's decision had an outsized effect on Unix users. Adding to the fun for Unix users was the GNU Emacs decision to bind Ctrl-H to 'get help' and officially bless DEL as the (sole) delete character.
(It didn't hurt the (Unix) popularity of the VT series that for years you bought Digital computers to run Unix, first PDP-11s and later Vaxes.)
Although my history of this is somewhat complicated, I wound up in
the Digital 'erase is DEL' camp, partly because it's what at least
some of the hardware I used wanted to do and partly because it made
life easier in GNU Emacs. Given this, you can probably guess the
original cause of my swapping Backspace and Delete in X until
recently; I started doing this when my office
workstation changed from a DECStation to an SGI Indy. The DECStation's backspace key generated DEL
(and all of my environment was set up to deal with that), while the
SGI Indy's backspace key wanted to generate Ctrl-H. At the time I
made the rational decision that it was simpler to use xmodmap to
switch Backspace and Delete than to change my entire environment.
(I don't know and can't remember what GNU Emacs's state was at the time as far as distinguishing the Backspace key from a typed Ctrl-H. These days they're definitely different under X.)
PS: I don't remember how the original vi behaved with Ctrl-H and
DEL, but I think it was friendlier than GNU Emacs as far as dealing
with a backspace key that sent Ctrl-H (at least if you had your
stty erase character set properly, which you probably did).
Why I (still) care about SELinux and its flaws
A perfectly sensible reaction to my series of disgruntlements is to ask why I still care enough to write about it. There is all sorts of ill-considered software out there in the world, and disabling SELinux is simple enough. I don't gripe about Ubuntu's AppArmor, for example (which we disable too). As it happens, there are two major reasons that I continue to care about SELinux.
First, the continued existence and popularity of SELinux drains the time and attention of people away from doing other, more usable security work. Linux needs security work of all sorts, including defenses against normal programs being compromised. In fact, the existence and theoretical purity and power of SELinux (and it being integrated into the kernel and major distributions) serves to block most explorations of more usable but more messy solutions. If you propose doing something, especially if you touch user-level programs, I expect that you'll get told 'SELinux already solves that (and better)'.
(If you want an idea of what such solutions might look like, look
at the work OpenBSD is doing here with eg the
system call and other related things.)
Or in short, SELinux is effectively a high stakes gamble with Linux security. People are betting on what is very close to mathematical security, which would be great if it worked but instead often leads to the total failure of SELinux's toxic mistake.
Second, increasingly SELinux is being advocated as a default thing for everyone to use as part of hardening Linux, not just as an extra add-on for the paranoid. This is not exactly a new development (it's why SELinux is the default in Red Hat Enterprise and Fedora), but my strong impression is that it's been ramping up these days (more and more people will loudly tell you that you're doing it wrong if you disable SELinux, for example). When SELinux is supposed to be for everyone, well, it affects me more and more; it's increasingly present and increasingly mandatory.
Also, as part of caring about the direction of Linux in general I care about something that is theoretically supposed to be the Linux answer for (user-level) security issues for everyone. If SELinux is Linux's security solution and I think it's a bad idea, every so often my irritation boils over and I write another blog entry here.
(Real, usable security is one of my hot buttons in general, as you may have either noticed or guessed.)
Revisiting a bit of my X keymapping history
I started using X what is now a very long time ago, and unlike some people I've never had a complete break with my original X environment, one where I restarted my setup from scratch and threw away all of my old customizations. The natural result of this is that I have been carrying forward some historical decisions without actually ever really looking at them in a modern environment.
At some point in my life with X, one of my customizations became
to swap the Backspace and Delete keys via
xmodmap. My reason for
doing this at the time was straightforward; Backspace was more
convenient but generated ^H in various things, while I had set my
Unix delete character to ^? (aka DEL) a very long time ago for
reasons that seemed to make sense at the time. So things rumbled
forwards, and when X programs gave me the choice I set them so that
both the Backspace and the Delete keys would generate a DEL or
otherwise act the same.
(It's possible that I was actually mistaken about this and my swapping Backspace and Delete was due to a misunderstanding. At any rate, I did it and this is what I can vaguely remember as my reasoning.)
Of course, not all things treat the two keys the same. Many editing fields in X programs use Backspace for 'delete-left' and Delete for 'delete-right', so I got thoroughly acclimatized to reaching off to the far key in order to actually backspace over things in those programs. And there were always a few other anomalies here and there that I just reflexively dealt with.
Recently, for reasons beyond the scope of this entry, I wound
up in a situation without my usual Backspace/Delete swap. Much to
my surprise, I didn't notice this in
xterm, where everything
continued just as before (most of how I noticed was realizing that
I was deleting characters in a few X programs with the much more
convenient Backspace key). While I wasn't paying attention, xterm
had quietly decided to start turning both the Backspace and Delete
keys into DEL (or at least into whatever your
stty erase character
is set to; I haven't investigated). Since I have
urxvt and Gnome
Terminal set up the same way, my
xmodmap key swapping turns out
to now be both unnecessary and actually a little bit inconvenient.
(Konsole isn't set up to do this, but then I never use it anyways. Sorry, KDE.)
So now I've removed that little bit of
xmodmap work from my X
dotfiles, and I'm taking a bit of a look at the other keymapping
things I'm doing. I can tell you that making CapsLock into an
additional Control key is definitely staying, though.
The whole exercise has been interesting and a little bit spooky.
I haven't thought about my
xmodmap stuff for quite a while
now; after all, it worked, right? Yet either it quietly became
unnecessary at some point or was never necessary in the first
place. I'm sure there's other parts of my X environment that are
the same way and I just haven't stumbled over them yet.
(My collection of X resources settings is a good candidate for this. I'm sure there's settings in there for programs that don't even exist any more.)
PS: The widespread use of GNU readline and similar line editing things in programs can make it a little bit hard to see just what characters your Backspace and Delete keys are generating, since by default I believe that readline et al do the same thing with ^H and DEL.