Wandering Thoughts archives


I should remember that I can cast things in Go

I think of Go as a strongly typed language. My broad and somewhat reflexive view of strongly typed languages is that they mostly don't allow you to cast things around because most casts will require expensive data conversion and the language wants you to do that explicitly, with your own code. Go even sticks to this view; you can cast numbers around (because it's too useful) and you can go between string and []byte (because it's a core operation), and that's mostly it.

(Then there's interfaces, which expose some tricks. Interface casting involves a bunch of potentially expensive magic, but it's a core feature of Go so it's an exception to the 'no expensive operations via casts' rule of thumb.)

However, there is an important practical exception to this, which comes about because of another thing that Go encourages: lots of named types that you derive from fundamental types. Rather than using, say, int, for all sorts of different things in your code, everyone knows that you should instead create various specific types:

type Phase int
type Action int

type OneMap map[string]string
type TwoMap map[string]string

This way you can never accidentally use a Phase when the actual function, field, or whatever is supposed to be an Action, or pass a OneMap function a TwoMap, and so on. Go's strong typing will force them to be separate (even if this is sometimes irritating, for example if you're dealing with cgo).

These derived types can be cast to each other and to their underlying type. This is not just if they're numbers; any derived type can be cast around like this, provided that the underlying 'real' types are the same (per the Conversions section of the language spec).

(At a mechanical level it's easy to see why this is okay; since the two derived types have exactly the same memory layout, you don't have to do expensive conversion to generate a type-safe result.)

Now, ordinarily you still don't want to cast a OneMap to a TwoMap (or to a map[string]string). But there is one special case that matters to me, and that's if I want to do the same operation on both sorts of maps. Since I actually can cast them around, I don't need to write two duplicated blocks of (type-specific) code to do the same operation. Instead I can write one, perhaps one that's generic to the map[string]string type, and simply call it for both cases through casts. This is not the only way to create common code for a generic operation but it's probably the easiest one to add on the fly without a bunch of code refactoring.
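A minimal sketch of this, reusing the OneMap and TwoMap types from above. The sortedKeys helper is made up for illustration; it stands in for whatever common operation you want to share.

```go
package main

import (
	"fmt"
	"sort"
)

// OneMap and TwoMap are distinct named types with the same underlying
// type, map[string]string.
type OneMap map[string]string
type TwoMap map[string]string

// sortedKeys is a hypothetical shared helper, written once against the
// underlying map[string]string type.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	one := OneMap{"b": "2", "a": "1"}
	two := TwoMap{"z": "26"}

	// Explicit conversions to the underlying type. The map data is not
	// copied; the same map is simply seen as a different type.
	fmt.Println(sortedKeys(map[string]string(one))) // [a b]
	fmt.Println(sortedKeys(map[string]string(two))) // [z]

	// Converting between the two named types is also allowed, and here
	// the conversion is mandatory: 'var t TwoMap = one' does not compile.
	t := TwoMap(one)
	fmt.Println(len(t)) // 2
}
```

One detail worth knowing: because map[string]string is an unnamed type, Go's assignability rules would also accept one and two directly as arguments to sortedKeys. The explicit conversion becomes necessary when going between OneMap and TwoMap themselves, or when the target is a named type such as int.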

So this is why I need to remember that casting types, even complex types, is something that I can do in Go. It's been kind of a reflexive blind spot in my Go code in the past, but hopefully writing this will avoid it in the future.

programming/GoHasCasts written at 01:41:01


ZFS pool import needs much better error messages

One of the frustrating things about dealing with sufficiently damaged ZFS pools is that 'zpool import' and friends do not generate very detailed error messages. There are a lot of things that can go wrong with a ZFS pool that will make it not importable, but 'zpool import' has clear explanations for only some of them. For many others all you get is a generic error in 'zpool import' status reporting of, say:

The pool cannot be imported due to damaged devices or data.

(Here I'm talking about the results of just running 'zpool import' to see available pools and their states and configuration, not trying to actually import a pool. Here zpool has lots of room to write explicit and detailed messages about what seems to be wrong with your pool's configuration.)

This isn't just an issue of annoying and frustrating people with opaque, generic error messages. Given that the error messages are generic, it's quite easy for people to focus only on the obvious problems that zpool import reports, even if those problems may not be the reason the pool can't be imported. As it happens I have a great example of this in action, in this SuperUser question. When you read this question, can you figure out what's wrong? Both the SuperUser ZFS community and the ZFS on Linux mailing list couldn't.

(I believe that everything you need to figure out what's going on is actually in the information in the question and the code behind 'zpool import' actually knows what the problem is. This assumes that my diagnosis is correct, of course.)

Perhaps zpool import should not be fully verbose by default, as there's a certain amount of information that may only make sense to people who know a fair bit about how ZFS works. But it certainly should be possible to get this information with, eg, a verbose switch instead of having to reverse engineer it from zdb output. If nothing else, this means that you can get a verbose report and show it to ZFS experts in the hope that they can tell you what's wrong.

On a purely pragmatic level I think that zpool import should be really verbose and detailed when a pool can't be imported. 'My pool won't import' is one of the most stressful experiences you can have with ZFS; to get unclear, generic errors at this point is extremely frustrating and does not help one's mood in the least. This is exactly the time when large amounts of detail are really, really appreciated, even if they're telling you exactly how far up the creek you are.

(This means that I would very much like a 'zpool import -v <pool>' option that describes exactly what the import is doing or trying to do and then covers all of the problems that it detected with the pool configuration, all the things the kernel said to it, and so on. A report of 'I am asking the kernel to import a pool made up of the following devices in the following vdev structure' is not too verbose.)

PS: while this example is from ZFS on Linux and FreeBSD, I've looked at the current Illumos code for zpool and libzfs, and as far as I can see it would have exactly the same problem here.

(Part of the issue is that zpool import and libzfs have what you could call less than ideal reporting if a pool is marked as active on some other system and also has configuration problems. But even if it reported multiple errors I think that the real problem here would remain obscure; the current 'zpool import' code appears to deliberately suppress printing out parts of the information necessary.)

solaris/ZFSImportBetterErrors written at 00:35:51


We killed off our SunSolve email contact address yesterday

Back in the days when Sun was Sun, Sun's patch access and support system was imaginatively called SunSolve. If you had a support contract with Sun (which often was only about the ability to get patches and file bug reports), you had a SunSolve account. We had one, of course (we have been using Solaris for longer than it's been Solaris). In the very beginning we made a classical mistake and had it in the name and email of a specific sysadmin (who then moved on), but in the early days of our Solaris 10 fileservers we switched this to a generic email address, cleverly named sunsolve.

Yesterday, we removed that address.

Our Solaris machines have all been out of commission for a while now, but we left the address in place mostly because of inertia. What pushed me to remove it is the usual reason; we just couldn't get Oracle to stop mailing things to it. I don't think Oracle spammed it (unlike some people), but they did keep sending us information about patch clusters and quarterly updates and this and that, all of which is irrelevant to us these days.

(I managed to get Oracle to mostly knock it off, but the other day they decided that they had an update that was so urgent that they just had to mail it to us. Never mind that we don't have any of the software at issue; that Oracle had our email address was good enough for them.)

At one level this is an unimportant little bit of cleanup that we should have done long ago. With our Solaris machines gone and our grandfathered support contract let run down, the email address had no point; it was just another lingering bit of clutter, and we should get rid of that kind of thing while we remember what it is and why we can remove it.

(If you wait long enough on this sort of thing, you can easily forget whether or not there's some special, inobvious reason that you're keeping these old oddities around. So it's best to strike while everything is fresh in your mind.)

At another level, the sunsolve email address was one of the last lingering traces of what was (after all) a very long association with Sun and Solaris. Just as with other things, letting it go is yet another line drawn under all of that history, even if SunSolve itself stopped existing years ago.

(Oracle decommissioned SunSolve and folded the functionality into their own support system not long after they bought Sun. The conversion was not entirely pleasant for support customers.)

PS: Since I just looked, it warms my heart a little bit that PCA is still trucking along. Oracle may have killed some very useful customer-done things but at least they left PCA alone. If we still had to deal with the mess that is Solaris patches, we'd be very thankful for that.

solaris/SunSolveEnding written at 00:27:47


Don't have support registrations in the name of a specific sysadmin

Every so often at work you will buy something, sign up for a service, arrange a support contract, register for monitoring, or whatever with an outside company or organization. Not infrequently these things will ask you for an email address that will be both your organization's key to the service and the person who gets notifications about it (sometimes you have to pick a username or login too). Here is something that we have learned the awkward way: when you do this, don't just use your email address (and name, and so on). Instead, either use an existing generic group email address or make up a new service specific email address (often these will just be mail aliases that distribute the email to all of the relevant parties). There are two reasons for this.

The first reason is that it keeps a single person from being the critical path for things to do with the service. If things like password resets or approvals for some action go only to me because I used my own email address and you need to do one of these things when I'm sick or on vacation or very busy or whatever, well, we have a problem. Or at least an annoyance. Using a generic address that multiple people see avoids that problem; I don't need to wait for the single magic person to be able to deal with whatever they need to do.

The second reason is that, well, to put it bluntly: people leave eventually. If person X leaves and there are things tied to their email address, using their customary personal login, and so on, life is at least a bit awkward. You can make it work, but take it from personal experience, it still feels weird and not entirely right to log in somewhere as ex-co-worker X because that's just how it was set up.

(I imagine you can have lots of fun if there have been several generations of turnover. 'Why do we have to log in to this site as 'jane'? Who's Jane? Oh, she was here ten years ago.')

Consistently registering everything with a generic email address, a suitable generic login or username, and so on avoids all of that. When someone leaves nothing needs to change and there's no uncomfortable feel or awkward 'who is jane?' explanations in a few years.

(There are exceptions to this, of course. Sometimes a service has been built with these issues in mind, so it has groups and supports multiple accounts that you manage and so on. Sometimes a registration is genuinely personal and will only ever be used by you and it's okay for it to go away if you leave. Sometimes it's just in the nature of the service that everyone needs an individual login in order for things to really work. And so on.)

PS: The flipside of this is that if you're a service provider who has people register accounts with you, this is yet another reason that you really want to support changing logins.

sysadmin/RegisterGenericAddresses written at 00:15:42


No new web templating languages; use an existing one

Suppose, hypothetically, that you are creating a web application. Let's even suppose that it's a very small and simple one, almost an embarrassingly small one. As part of this app, you need a very little bit of something like a templating system. Not much, just a bit more than printing formatted strings. Clearly you have such a trivial situation that you can just bang together a tiny and simple mini-templating language, right?

Let me save you some time and effort: no. Don't do it. The reality is that we've reached a point in time where writing your own (web) templating language or system is basically guaranteed to be a mistake. I know, you have a trivial application and you don't want to take an external dependency, you hardly need anything, all of the existing templating systems are wrong or too heavyweight, there's a whole list of excuses. Don't accept them. Suck it up, take an external dependency, and use an existing templating system even if it's vast overkill for your problem. Your future self will thank you in a few years.

(I could almost go further than this and maybe I should, but that's another entry.)
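To give a sense of how little it takes to use an existing system for the 'just a bit more than printing formatted strings' case, here is a minimal sketch with Go's standard html/template package. The page type, the template text, and renderPage are all made up for illustration; the point is only that escaping, loops, and field access come for free.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// page is a hypothetical view model for a very small application.
type page struct {
	Title string
	Items []string
}

// pageTmpl is parsed once at startup. html/template escapes values
// according to the HTML context they are placed in.
var pageTmpl = template.Must(template.New("page").Parse(
	`<h1>{{.Title}}</h1><ul>{{range .Items}}<li>{{.}}</li>{{end}}</ul>`))

// renderPage executes the template into a string.
func renderPage(p page) (string, error) {
	var b strings.Builder
	if err := pageTmpl.Execute(&b, p); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	out, err := renderPage(page{Title: "Notes <draft>", Items: []string{"one", "two"}})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	// The angle brackets in the title come out as &lt; and &gt;.
	fmt.Println(out)
}
```

Correct escaping is exactly the sort of thing a hand-rolled mini-templating language tends to get wrong or leave out.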

All of this especially applies if you have an application that needs more than a trivial templating system; I picked an extreme case because it's where the temptation can be strongest. Writing your own non-tiny templating system today is an especially masochistic exercise because even a basic one is a bunch of work and raises a moderate ton of questions that you're ill-equipped to answer (or even recognize) unless this is not your first templating system.

In hindsight, writing my own 'simple' templating system was one of the mistakes I made when I wrote DWiki (the code that powers Wandering Thoughts). It's been a very educational mistake, but unless you really want to do things the hard way for the experience I can't recommend it.

(Note that rolling your own is not a great learning experience unless you live with the result for a number of years, so that you have plenty of time to run into the lurking problems. Almost anything can look good if you write it, use it briefly, and then abandon it. Many of my painful lessons took years to smack me in the face.)

PS: This assumes that you aren't working in a new language where no one's written a decent templating system. If you are, I think that you should at least steal one of the battle-tested designs from good templating systems in other languages.

(Also, yes, a very few people have very special needs and have to write their own systems. They know who they are.)

web/NoNewTemplateLanguages written at 01:13:33


Why I spent a lot of time agonizing over an error message recently

I recently spent an inordinate amount of time not so much writing a local script as repeatedly writing, rewriting, and modifying its error messages (the rest of the script is mostly simple). Now, I'll admit up front that I have a general habit of obsessing over small details of program output, and maybe some of the fidgeting with the error messages was for this. But I actually maintain that I had a completely sensible reason for caring so much about the script's error messages. You see, the script isn't supposed to fail.

More exactly, it's not supposed to fail but we think that it might someday do so because every so often something weird is going on with the operation the script is doing. In fact the script exists to automate certain workarounds we were doing when we did this particular operation 'by hand' (it's actually buried inside another script). So almost all of the time the script is supposed to work, and we certainly hope it works all the time, but there's a rare possibility of failure lurking in the underbrush.

What this means for the script is that by the time we get an error, we'll probably have long since forgotten exactly what's going on. It's likely that the script will work reliably for weeks and months, during which our knowledge of the entire problem will have been displaced by other things. This means it's important for the error message we get to be clear, so we don't have to try to remember all of the surrounding context from scratch. A cryptic error message would make perfect sense for us right now, when the context is clear in our minds, but it won't in six months.

When I was revising the error message, one part of what I did was to look for things that might be mis-remembered or misinterpreted by people who'd forgotten the context. A surprisingly large amount of my initial language was at least partially ambiguous when I took a step back and tried my best to read it without context. Things that were obvious or only had one meaning inside the context suddenly took on an uncomfortable new life outside it. The resulting error messages are significantly more verbose now, but at least I can hope that they'll still make sense in six months.

(This is of course a version of the problem of context in programming.)

sysadmin/ContextInErrorMessages written at 01:33:02


What sysadmins want out of logging means that it can't be too simple

Dave Cheney recently wrote Let's talk about logging, an article about (Go) logging packages in which he advocates, well, I'm going to quote him directly:

I believe that there are only two things you should log:

  1. Things that developers care about when they are developing or debugging software.
  2. Things that users care about when using your software.

Obviously these are debug and info levels, respectively.

log.Info should simply write that line to the log output. There should not be an option to turn it off as the user should only be told things which are useful for them. [...]

My reaction is that this is too simple for real use. Ignoring things like (web) activity logs (which Dave Cheney agrees are a different case), there are clear divisions between what sysadmins need at different times and in different situations.

First, let's agree that programs should always be able to log their basic actions. If you're a web server, this is HTTP requests; if you're a mail server, this is email traffic; and so on. This tells sysadmins whether or not the system is doing anything, and if it's doing something what it's doing and how fast. Sysadmins will use this to do monitoring, to check if something happened (such as an email arriving or a request being processed), and so on.

Systems not infrequently encounter internal issues that are not fatal errors. They may experience timeouts, request errors, and so on. If we say that errors are fatal, these are all 'warnings' (even if some terminate the processing of the current whatever that the system is handling). They mark odd things that should not normally happen. Sysadmins like to have a record of these for obvious reasons.

Finally, when sysadmins are working to diagnose problems with services we want to be able to get detailed activity traces of exactly how the system processed requests. What did it look at? What did it find or not find as it stepped through things? Here we're looking for a description of why the system is acting as it is. This level of information is too voluminous to be logged routinely, and often it needs to be segmented up so that we can look only at certain aspects (because otherwise we'll drown in probably irrelevant information).

It's tempting to say that this level of information is the same as developer debug information, but it's my view that it's not. Developer debug information is internally focused and aimed at people who know the code and are making code changes. Sysadmin activity traces are externally focused and aimed at people who do not know the code and are not changing it. As a sysadmin, I don't care about internal state in the code; I'm going to assume there's no code bug and instead that I have either a misconfiguration or a malfunction somewhere in the overall system environment. I want to find that.

You can in theory run all of this through a simple log.Info interface. But if you do so there are two problems. First, you need to create internal standards in your program for formatting messages so that sysadmins can tell the different sorts of messages apart from each other. Second, you are spewing massive amounts of information out all the time (since you're always dumping all activity traces), which is not very friendly. My view is that a good logging package should be able to do this for you. A too-simple logging package throws both program authors and sysadmins to the wolves of ad-hoc logging and log filtering.
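As a rough sketch of what 'doing this for you' could look like, here is a tiny logger that tags each message with its kind and only emits activity traces for areas that have been switched on. Everything here (Kind, Logger, NewLogger, the area names, the output format) is invented for illustration; it is not the API of any real Go logging package.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// Kind labels the sorts of messages discussed above.
type Kind string

const (
	Activity Kind = "activity" // basic actions, always logged
	Warning  Kind = "warning"  // non-fatal internal issues, always logged
	Trace    Kind = "trace"    // detailed activity traces, opt-in per area
)

// Logger is a hypothetical minimal logger with per-area trace control.
type Logger struct {
	out    io.Writer
	traced map[string]bool // areas with activity tracing switched on
}

// NewLogger returns a Logger that traces only the listed areas.
func NewLogger(out io.Writer, tracedAreas ...string) *Logger {
	l := &Logger{out: out, traced: make(map[string]bool)}
	for _, a := range tracedAreas {
		l.traced[a] = true
	}
	return l
}

// line formats a message and reports whether it should be emitted.
// Trace messages are dropped unless their area is being traced.
func (l *Logger) line(k Kind, area, msg string) (string, bool) {
	if k == Trace && !l.traced[area] {
		return "", false
	}
	return fmt.Sprintf("%s %s: %s", k, area, msg), true
}

// Log writes the message to the output if it passes the filter.
func (l *Logger) Log(k Kind, area, format string, args ...interface{}) {
	if s, ok := l.line(k, area, fmt.Sprintf(format, args...)); ok {
		fmt.Fprintln(l.out, s)
	}
}

func main() {
	log := NewLogger(os.Stderr, "dns") // trace only the "dns" area
	log.Log(Activity, "smtp", "accepted message from %s", "user@example.org")
	log.Log(Warning, "smtp", "timeout talking to %s", "mx.example.org")
	log.Log(Trace, "dns", "looked up %s, found %d records", "example.org", 2)
	log.Log(Trace, "smtp", "this is suppressed because smtp tracing is off")
}
```

The consistent kind and area prefix is what lets sysadmins tell the sorts of messages apart, and the per-area switch is what keeps the traces from drowning everything else.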

This is why real programs grow features to control what gets logged and to log different sorts of things in different places. Apache does not have separate request logs and error logs for arbitrary reasons, for example; real people wanted that separation because they find it quite useful.

sysadmin/SysadminLoggingNotSimple written at 01:50:26


The great Delete versus Backspace split

In the history of Unix, one of the great quiet divisions has been the split of what your erase character should be, with the choice being DEL (Ctrl-?) or Ctrl-H. In turn, this goes back to the physical serial terminals that for years were how you logged on to most Unix systems. In the days of real terminals, you wanted your Unix erase character to be set as whatever your terminal generated when you hit the big, prominent 'backspace' key (which was, of course, not at all programmable). Setting DEL when your key generated Ctrl-H or vice versa was a recipe for frustration and extra work.
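For reference, the two candidate erase characters are ordinary ASCII control characters, and the 'Ctrl-?' spelling of DEL follows the usual convention of flipping bit 0x40 of the printable character. A small sketch (the ctrl helper is just an illustrative name):

```go
package main

import "fmt"

// ctrl returns the ASCII control character conventionally written as
// Ctrl-<c> or ^c, which is the printable character with bit 0x40 flipped.
func ctrl(c byte) byte {
	return c ^ 0x40
}

func main() {
	fmt.Printf("Ctrl-H = 0x%02x (BS, backspace)\n", ctrl('H')) // 0x08
	fmt.Printf("Ctrl-? = 0x%02x (DEL, delete)\n", ctrl('?'))   // 0x7f
}
```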

If life was great, everyone would have agreed that the terminal backspace key generated one thing and we'd be done. Of course that didn't happen. As I remember it, most of the world's serial terminal makers decided that the backspace key would generate Ctrl-H, but Digital (aka DEC) decided to be special and have their backspace keys generate DEL. The Digital VT-52, VT-100, and later VT-series serial terminals were very popular, so Digital's decision had an outsized effect on Unix users. Adding to the fun for Unix users was the GNU Emacs decision to bind Ctrl-H to 'get help' and officially bless DEL as the (sole) delete character.

(It didn't hurt the (Unix) popularity of the VT series that for years you bought Digital computers to run Unix, first PDP-11s and later Vaxes.)

Although my history of this is somewhat complicated, I wound up in the Digital 'erase is DEL' camp, partly because it's what at least some of the hardware I used wanted to do and partly because it made life easier in GNU Emacs. Given this, you can probably guess the original cause of my swapping Backspace and Delete in X until recently; I started doing this when my office workstation changed from a DECStation to an SGI Indy. The DECStation's backspace key generated DEL (and all of my environment was set up to deal with that), while the SGI Indy's backspace key wanted to generate Ctrl-H. At the time I made the rational decision that it was simpler to use xmodmap to switch Backspace and Delete than to change my entire environment around.

(I don't know and can't remember what GNU Emacs's state was at the time as far as distinguishing the Backspace key from a typed Ctrl-H. These days they're definitely different under X.)

PS: I don't remember how the original vi behaved with Ctrl-H and DEL, but I think it was friendlier than GNU Emacs as far as dealing with a backspace key that sent Ctrl-H (at least if you had your stty erase character set properly, which you probably did).

unix/DeleteBackspaceSplit written at 02:26:48


Why I (still) care about SELinux and its flaws

A perfectly sensible reaction to my series of disgruntlements with SELinux is to ask why I still care enough to write about it. There is all sorts of ill-considered software out there in the world, and disabling SELinux is simple enough. I don't gripe about Ubuntu's AppArmor, for example (which we disable too). As it happens, there are two major reasons that I continue to care about SELinux.

First, the continued existence and popularity of SELinux drains the time and attention of people away from doing other, more usable security work. Linux needs security work of all sorts, including defenses against normal programs being compromised. In fact, the existence and theoretical purity and power of SELinux (and it being integrated into the kernel and major distributions) serves to block most explorations of more usable but more messy solutions. If you propose doing something, especially if you touch user-level programs, I expect that you'll get told 'SELinux already solves that (and better)'.

(If you want an idea of what such solutions might look like, look at the work OpenBSD is doing here with eg the tame()/pledge() system call and other related things.)

Or in short, SELinux is effectively a high stakes gamble with Linux security. People are betting on what is very close to mathematical security, which would be great if it worked but instead often leads to total failure (see SELinux's toxic mistake).

Second, increasingly SELinux is being advocated as a default thing for everyone to use as part of hardening Linux, not just as an extra add-on for the paranoid. This is not exactly a new development (it's why SELinux is the default in Red Hat Enterprise and Fedora), but my strong impression is that it's been ramping up these days (more and more people will loudly tell you that you're doing it wrong if you disable SELinux, for example). When SELinux is supposed to be for everyone, well, it affects me more and more; it's increasingly present and increasingly mandatory.

Also, as part of caring about the direction of Linux in general I care about something that is theoretically supposed to be the Linux answer for (user-level) security issues for everyone. If SELinux is Linux's security solution and I think it's a bad idea, every so often my irritation boils over and I write another blog entry here.

(Real, usable security is one of my hot buttons in general, as you may have either noticed or guessed.)

linux/SELinuxWhyICare written at 00:33:02


Revisiting a bit of my X keymapping history

I started using X a very long time ago now, and unlike some people I've never had a complete break with my original X environment, one where I restarted my setup from scratch and threw away all of my old customizations. The natural result of this is that I have been carrying forward some historical decisions without actually ever really looking at them in a modern environment.

At some point in my life with X, one of my customizations became to swap the Backspace and Delete keys via xmodmap. My reason for doing this at the time was straightforward; Backspace was more convenient but generated ^H in various things, while I had set my Unix delete character to ^? (aka DEL) a very long time ago for reasons that seemed to make sense at the time. So things rumbled forwards, and when X programs gave me the choice I set them so that both the Backspace and the Delete keys would generate a DEL or otherwise act the same.

(It's possible that I was actually mistaken about this and my swapping Backspace and Delete was due to a misunderstanding. At any rate, I did it and this is what I can vaguely remember as my reasoning.)

Of course, not all things treat the two keys the same. Many editing fields in X programs use Backspace for 'delete-left' and Delete for 'delete-right', so I got thoroughly acclimatized to reaching off to the far key in order to actually backspace over things in those programs. And there were always a few other anomalies here and there that I just reflexively dealt with.

Recently, for reasons beyond the scope of this entry, I wound up in a situation without my usual Backspace/Delete swap. Much to my surprise, I didn't notice this in xterm, where everything continued just as before (most of how I noticed was realizing that I was deleting characters in a few X programs with the much more convenient Backspace key). While I wasn't paying attention, xterm had quietly decided to start turning both the Backspace and Delete keys into DEL (or at least into whatever your stty erase character is set to; I haven't investigated). Since I have urxvt and Gnome Terminal set up the same way, my xmodmap key swapping turns out to now be both unnecessary and actually a little bit inconvenient.

(Konsole isn't set up to do this, but then I never use it anyways. Sorry, KDE.)

So now I've removed that little bit of xmodmap work from my X dotfiles, and I'm taking a bit of a look at the other keymapping things I'm doing. I can tell you that making CapsLock into an additional Control key is definitely staying, though.

The whole exercise has been interesting and a little bit spooky. I haven't thought about my xmodmap stuff for quite a while now; after all, it worked, right? Yet either it quietly became unnecessary at some point or was never necessary in the first place. I'm sure there's other parts of my X environment that are the same way and I just haven't stumbled over them yet.

(My collection of X resources settings is a good candidate for this. I'm sure there's settings in there for programs that don't even exist any more.)

PS: The widespread use of GNU readline and similar line editing things in programs can make it a little bit hard to see just what characters your Backspace and Delete keys are generating, since by default I believe that readline et al do the same thing with ^H and DEL.

unix/XBackspaceShift written at 01:47:57


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.