sysadmin/FixingErrorDistances written at 22:44:27
Fixing low command error distances
Suppose that you have a command with an unnervingly low error distance, either because a vendor stuck you with it or because it's the natural way to structure the command's arguments. The way to fix this is to change the sort of error required to make a mistake, so that you move from a likely change to an unlikely one.
(If you are working with a vendor command, you will need to do this with some sort of a cover script or program. If you are working with a local command, you can just change the arguments directly.)
For a concrete example, let's look at the ZFS
To make the change less likely, modify the command that the cover script uses to expand a ZFS pool; instead of using 'add' (which is general but raises the question of 'add what?'), use 'grow'. Contrast:
    sanpool grow POOL DEVICE
    sanpool spare POOL DEVICE
Now the commands are fairly strongly distinct and harder to substitute for each other, because it is a much bigger mental distance from 'add a spare' to 'grow the pool' than from 'add a spare' to 'add a device'.
(When trying to prevent errors, it is useful to approach the commands from a high level view of what people are trying to do rather than look for low-level similarities in how it gets done. In a sense the way to avoid errors is to avoid similarities in things that are actually different.)
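As a minimal sketch of what such a cover script could look like, here is a Python version that turns the two 'sanpool' commands into 'zpool' arguments. The function name build_zpool_command is made up for this illustration, and I am assuming that the cover script simply maps 'grow' to a plain 'zpool add' and 'spare' to 'zpool add ... spare'; a real script would also have to run the command and do its own checking.

```python
# Hypothetical cover script logic for the 'sanpool' commands shown above.
# It only builds the zpool argument list; actually running it is left out.

def build_zpool_command(args):
    """Translate ['grow'|'spare', POOL, DEVICE...] into a zpool argument list."""
    if len(args) < 3:
        raise ValueError("usage: sanpool {grow|spare} POOL DEVICE...")
    verb, pool, devices = args[0], args[1], list(args[2:])
    if verb == "grow":
        # Expanding the pool: a plain 'zpool add'.
        return ["zpool", "add", pool] + devices
    if verb == "spare":
        # Adding a hot spare: 'zpool add POOL spare DEVICE'.
        return ["zpool", "add", pool, "spare"] + devices
    # Deliberately no generic 'add' verb; you have to say what you mean.
    raise ValueError("unknown sanpool command: %r" % verb)
```

The point of the design is that there is no 'add' verb at all, so turning one operation into the other takes typing a different word instead of leaving one out.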
Another way for cover scripts to help you avoid errors is to just not allow them to start with. System commands may have to be general and thus allow even the questionable, but your scripts can be more restrictive; for example, if you know you should never have non-redundant ZFS pool devices, you can just make '
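A restriction of this sort might be sketched as follows. This is only an illustration under my own assumptions: the function name check_redundant is hypothetical, and I am assuming that the cover script is handed the device arguments in zpool's own vdev syntax, where redundant groups start with 'mirror' or one of the 'raidz' keywords.

```python
# Hypothetical check a cover script could run before expanding a pool.
# Other zpool keywords (spare, log, cache) are not handled in this sketch.

REDUNDANT = {"mirror", "raidz", "raidz1", "raidz2", "raidz3"}

def check_redundant(devices):
    """Return devices unchanged if every group is redundant, else raise ValueError."""
    if not devices or devices[0] not in REDUNDANT:
        raise ValueError("refusing to add non-redundant devices")
    members = 0
    for word in devices[1:]:
        if word in REDUNDANT:
            # A new group starts; the previous one needs at least two members.
            if members < 2:
                raise ValueError("redundant group needs at least two devices")
            members = 0
        else:
            members += 1
    if members < 2:
        raise ValueError("redundant group needs at least two devices")
    return devices
```

With a check like this in place, forgetting the 'mirror' is an error message instead of a pool with non-redundant devices in it.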
sysadmin/CommandErrorDistance written at 00:36:04
The concept of error distance in sysadmin commands
I have recently started thinking about the concept of what I will call the 'error distance' of sysadmin commands: how much do you have to change a perfectly normal command in order to do something undesirable or disastrous (instead of just failing with an error)?
(As an example, consider the ZFS command to expand a ZFS pool with
a new pair of mirrored disks, which is '
You want the error distance for commands to be as large as possible, because this avoids accidents when people make their inevitable errors. Low error distance is also more dangerous in commonly used commands than uncommonly used ones, because you are less likely to carefully check a command that you use routinely (especially if you don't consider it inherently dangerous).
When considering the error distance, my belief is that certain sorts of changes are more likely than others (and thus make the error distance smaller). My gut says:
(I suspect that this has been studied formally at some point, probably by the HCI/Human Factors people.)
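As a rough illustration (and only that), you could count how many whitespace-separated words have to be inserted, deleted, or substituted to turn one command line into another. This is a crude proxy that I am assuming purely for the sake of example; it treats every sort of change as equally likely, which is exactly what I don't believe, and the name token_distance is made up for this sketch.

```python
# Crude, hypothetical proxy for 'error distance': word-level edit distance
# between two command lines. Every insert, delete, or substitute costs 1.

def token_distance(cmd_a, cmd_b):
    """Return the Levenshtein distance between the word lists of two commands."""
    a, b = cmd_a.split(), cmd_b.split()
    prev = list(range(len(b) + 1))  # distance from an empty prefix of a
    for i, word_a in enumerate(a, 1):
        cur = [i]
        for j, word_b in enumerate(b, 1):
            cost = 0 if word_a == word_b else 1
            cur.append(min(prev[j] + 1,          # delete word_a
                           cur[j - 1] + 1,       # insert word_b
                           prev[j - 1] + cost))  # keep or substitute
        prev = cur
    return prev[-1]
```

Under this count, 'zpool add POOL spare DEVICE' versus 'zpool add POOL DEVICE' and 'sanpool grow POOL DEVICE' versus 'sanpool spare POOL DEVICE' both come out as 1. So a raw count does not distinguish between leaving a word out and typing a different verb, which is why I think the sort of change matters and not just the number of changes.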