Fixing low command error distances

August 25, 2008

Suppose that you have a command with an unnervingly low error distance, either because a vendor stuck you with it or because it's the natural way to structure the command's arguments. The way to fix this is to change the sort of error required to make a mistake, so that you move from a likely change to an unlikely one.

(If you are working with a vendor command, you will need to do this with some sort of a cover script or program. If you are working with a local command, you can just change the arguments directly.)

For a concrete example, let's look at the ZFS zpool command to add a spare disk to a ZFS pool: 'zpool add POOL spare DEVICE'. Much like adding mirrors to a ZFS pool, this is one omitted word away from a potential disaster. The simple fix in a cover script is to change it to a separate command, making it something like 'sanpool spare POOL DEVICE'; this changes the error distance from an omitted word to a changed word, a less likely mistake (especially because the word you'd have to change is in a sense the focus of what you're doing).

To make that mistake even less likely, also change the command that the cover script uses to expand a ZFS pool; instead of using 'add' (which is general but raises the question of 'add what?'), use 'grow'. Contrast:

sanpool grow POOL DEVICE
sanpool spare POOL DEVICE

Now the commands are fairly strongly distinct and harder to substitute for each other, because it is a much bigger mental distance from 'add a spare' to 'grow the pool' than from 'add a spare' to 'add a device'.
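
As a rough illustration of how such a cover script might map the two verbs onto the underlying zpool command, here is a minimal Python sketch. The function name build_zpool_args is made up for this example, and it only builds the argument list rather than running anything; a real cover script would need argument checking and error handling beyond this.

```python
# Hypothetical sketch of the argument translation inside a 'sanpool'
# cover script. It only builds the zpool command line; it runs nothing.

def build_zpool_args(argv):
    """Translate ['VERB', 'POOL', 'DEVICE', ...] into a zpool argument list."""
    if len(argv) < 3:
        raise ValueError("usage: sanpool grow|spare POOL DEVICE...")
    verb, pool, devices = argv[0], argv[1], list(argv[2:])
    if verb == "spare":
        # The script supplies the 'spare' keyword itself, so the person
        # typing the command cannot leave it out by accident.
        return ["zpool", "add", pool, "spare"] + devices
    if verb == "grow":
        # Expanding the pool is a different verb, not 'spare' minus a word.
        return ["zpool", "add", pool] + devices
    # Deliberately no 'add' verb: it is rejected like any unknown word.
    raise ValueError("unknown sanpool command: %r" % verb)
```

With this, 'sanpool spare POOL DEVICE' would turn into 'zpool add POOL spare DEVICE', and the only way to get the pool-expanding form is to type a different verb.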

(When trying to prevent errors, it is useful to approach the commands from a high level view of what people are trying to do rather than look for low-level similarities in how it gets done. In a sense the way to avoid errors is to avoid similarities in things that are actually different.)

Another way for cover scripts to help you avoid errors is to just not allow them in the first place. System commands may have to be general and thus allow even the questionable, but your scripts can be more restrictive; for example, if you know you should never have non-redundant ZFS pool devices, you can just make 'sanpool grow' require that.
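
A restriction like that could look something like the following Python sketch. The function name build_grow_args and the exact policy (the device list must start with a redundant vdev type and name at least two devices) are assumptions for illustration; your local rules may well differ, and nothing here actually runs zpool.

```python
# Hypothetical policy check for a 'sanpool grow' cover command.
# It only builds the zpool command line; it runs nothing.

# zpool vdev keywords that give redundancy (assumed local policy).
REDUNDANT_VDEV_TYPES = {"mirror", "raidz", "raidz1", "raidz2"}

def build_grow_args(pool, vdev_spec):
    """Build a 'zpool add' argument list, refusing non-redundant vdevs."""
    if not vdev_spec or vdev_spec[0] not in REDUNDANT_VDEV_TYPES:
        # A bare device here would become a non-redundant pool device.
        raise ValueError("sanpool grow: refusing to add a non-redundant device")
    if len(vdev_spec) < 3:
        raise ValueError("sanpool grow: %s needs at least two devices" % vdev_spec[0])
    return ["zpool", "add", pool] + list(vdev_spec)
```

Under this policy, growing a pool with 'mirror' and two devices is passed through to zpool, while a single bare device is rejected before zpool is ever involved.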


