Give your personal scripts good error messages

May 24, 2010

Sysadmins have a habit of accumulating little personal scripts to make our lives more convenient, often things that make perfect sense to us but are too peculiar or specific to be worth installing generally. Since these are personal hacks, they're often bashed together in very casual ways; if it mostly works for us, it's good enough.

From personal experience, I now have a suggestion for these scripts: make sure that they have clear or at least comprehensible error messages, not little cryptic ones. If you do not, there will come a day when you get that cryptic error message, generally quite a long time after having written that script, and you will sit there going 'what the heck does this mean?' and scratching your head. And then you will wind up retracing and reverse-engineering your script, and that is just plain embarrassing.

This is especially important if the error message is about a can't-happen situation, or at least one that your script doesn't handle, because these are just the sort of things that come up a year or two after you've written the script. By the way, I strongly advocate making any personal scripts that summarize and amalgamate information explicitly check for situations that they don't handle. This saves them from silently producing very wrong answers, and possibly saves you from acting on those answers.

(It's common for me to take shortcuts and make assumptions in such scripts, since after all they only have to work in our very specific environment. Of course, sometimes these aspects of our environment change and my assumptions blow up.)
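As a rough illustration of explicitly checking for situations the script doesn't handle, here is a minimal Python sketch. Everything in it is made up for the example: the 'name state' input format, the two known states, and the names KNOWN_STATES, UnhandledInput and count_states are all assumptions, not anything from a real script.

```python
# Hypothetical input: one "diskname state" pair per line.
# The set of known states is an assumption for this sketch.
KNOWN_STATES = {"used", "free"}


class UnhandledInput(Exception):
    """Raised when the input contains something this script was never written for."""


def count_states(lines):
    """Count disks per state, refusing to guess about anything unexpected."""
    counts = {state: 0 for state in KNOWN_STATES}
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # blank lines are harmless, skip them
        if len(fields) != 2:
            raise UnhandledInput(
                f"line {lineno} has {len(fields)} fields, but this script only "
                f"understands 'diskname state' lines. The line was: {line!r}. "
                "The report format has probably changed since this was written."
            )
        name, state = fields
        if state not in KNOWN_STATES:
            raise UnhandledInput(
                f"line {lineno}: disk {name!r} is in state {state!r}, which this "
                f"script does not handle (it only knows {sorted(KNOWN_STATES)}). "
                "Any totals it printed would be wrong, so it is stopping instead."
            )
        counts[state] += 1
    return counts
```

The point of the design is that an unknown state stops the script with an explanation, instead of being quietly dropped from the totals.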

A good error message is one that is clear and complete enough that you know what went wrong without remembering how the script works. Especially if it's for an error you never expect to happen, err on the side of verbosity and over-explaining things; terse error messages make sense only when you're going to see them reasonably often.
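One way to push yourself towards this kind of verbosity is a small helper that makes you say what the script was doing, what it expected, what it found, and what that probably means. The following is only a sketch under my own assumptions; format_error and die are hypothetical names, and the four-part layout is just one possible convention.

```python
import sys


def format_error(prog, doing, expected, found, hint):
    """Build a multi-line message that explains itself without the source."""
    return "\n".join([
        f"{prog}: cannot continue.",
        f"  What I was doing: {doing}",
        f"  What I expected:  {expected}",
        f"  What I found:     {found}",
        f"  Probably means:   {hint}",
    ])


def die(prog, doing, expected, found, hint):
    """Print the verbose message to standard error and exit with a failure status."""
    print(format_error(prog, doing, expected, found, hint), file=sys.stderr)
    sys.exit(1)
```

A call such as die("diskreport", "pairing up unused disks", "an even number of free disks per group", "3 free disks in one group", "a disk was added or retired without its partner") is a lot more typing than 'bad pair: 3', but it is the version you can still understand two years later. (The program name and the situation here are invented for the example.)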

Sadly, doing this well can be harder than it looks. When you're writing a script you're immersed in what it does and how it works, so even if you think in these terms it's quite easy to over-estimate how much you'll remember six months or a year from now when you actually get that error message. Speaking from more personal experience, even rewriting the error message after you've had to retrace the script a year later doesn't entirely help, since of course now you understand the script again.

(And on that note, I should go revise an error message or two.)


Comments on this page:

From 69.113.211.148 at 2010-05-24 23:13:33:

A few years ago, I went to my boss and asked him a simple question: whether, from intellectual property and productivity perspectives, it would be okay to open-source certain applications and scripts I was writing. I figured there might be a slim chance that I would get a bugfix or two from the community, but I was mostly just looking to contribute something back, considering the huge amount of open-source software I use on a day-to-day basis.

But a funny thing happened: the quality of my code went up tremendously when I started writing it as though a thousand eyes were upon it. Instead of one user, I started writing like I had a hundred, and I didn't want anyone to have to email me to clarify something about how my application worked. Hardcoded variables and constants suddenly ended up as command-line switches with sensible defaults. Quick-and-dirty hacks disappeared. Comments started going where they belonged. And yes, of all things, RPM specfiles started showing up in the source.

It's said that character is how you behave when no one else is looking, but sometimes, a couple of extra eyes can supply all the discipline in the world.

--Jeff

By cks at 2010-05-25 00:01:44:

I'm talking about the kind of ultra-personal scripts that make no sense outside of your own environment. For example: we keep data on what disks are used in our SAN environment, and I have a personal script that eats this data and tells me what natural pairs of SAN disks a given fileserver still has unused (we normally use certain SAN disks only on certain fileservers). This script is hopelessly specific to our environment; there is no possible way to generalize it and make it useful for anyone else.

By cks at 2010-05-25 00:06:09:

The note I forgot to add is that this script is also a quick hack. One could work out exactly the same information by reading the 'disks in use' report (which in fully verbose form also reports on free disks), knowing our conventions, and pairing up unused disks as appropriate. But I'm the kind of lazy person who doesn't like doing things by hand more than once, so I put something together to just tell me the same information.

(This is also not something that we need to know very often or that drives automated scripts; we need to know unused natural mirror pairs only when we are expanding the storage in use, and that happens maybe once every few months.)

Written on 24 May 2010.

Last modified: Mon May 24 22:11:36 2010