Wandering Thoughts archives


Shell scripts should be written to be clear first

When writing shell scripts, there's a general tendency towards minimalism and what could be called 'code golfing'. After all, one of the important things for shell script performance is to run as few extra commands as possible. However, shell scripts have another problem, which is that for various reasons the shell is not a great programming language (see parts of my entry on why large shell scripts have problems). In particular, shell scripts often lack clarity because they have to do a lot of things indirectly.

In the past I've said that the normal problem with configuration systems is that they lack clarity more than they lack power. Similarly, shell scripts more often lack clarity than they lack performance, so I'm strongly of the feeling that you should be biased toward clarity in your shell scripting. One aspect of this is being careful of where you're being clever. Every clever, efficient idiom chosen over the obvious but slightly less efficient version is a potential future tripping hazard (for other people and your future self).

Some of what you consider (too) clever will depend on the degree of knowledge of shell scripting you want to assume. For example, there are a lot of options for manipulating variable expansion to do useful things that save you from running external programs. Should you use them? Maybe. I'm no longer so sure of that here, because I'm just not that immersed in shell scripting (and neither are my co-workers). But shell arithmetic is pleasant and obvious enough that I'd definitely keep it rather than resorting to other options.
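As a rough sketch of the trade-off (the variable names and values here are made up for illustration, and this is only one way to write each version), here is the same work done with shell built-in features and with external programs:

```shell
#!/bin/sh
# Hypothetical values, purely for illustration.
count=7
file="report.txt.gz"

# Shell arithmetic expansion: no external process, and readable to most people.
next=$((count + 1))

# The external-command version of the same arithmetic, which runs expr.
next_expr=$(expr "$count" + 1)

# Parameter expansion: strip a trailing ".gz" without running anything.
# Efficient, but readers may have to look up what '%' does.
stem_expansion=${file%.gz}

# The external-command version: slower, but perhaps more familiar.
stem_sed=$(printf '%s\n' "$file" | sed 's/\.gz$//')

echo "$next $next_expr"
echo "$stem_expansion $stem_sed"
```

Both versions in each pair give the same result here. Which one counts as "clear" depends on who you expect to read the script.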

So, as one example, I would rather write 'prog="$(basename "$0")"' than the more efficient 'prog="${0##*/}"'. Everyone understands what the former does, including my future self in six months, while many people will have to refresh their memory about the latter. The time for the latter is when I'm doing an operation like this a very large number of times, enough to have a clear performance impact.
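To make the comparison concrete, here is a small sketch that runs both forms on a sample path. It uses a hypothetical 'path' variable instead of "$0" so that the result doesn't depend on how the script is invoked:

```shell
#!/bin/sh
# Hypothetical path standing in for "$0".
path="/usr/local/bin/myscript"

# The obvious version: runs the external basename program.
prog_basename="$(basename "$path")"

# The efficient version: strips the longest prefix matching '*/'.
prog_expansion="${path##*/}"

echo "$prog_basename $prog_expansion"

# The two are not exact equivalents. With a trailing slash, basename
# still reports the last path component, while the expansion leaves
# an empty string.
dir="/usr/local/bin/"
dir_basename="$(basename "$dir")"
dir_expansion="${dir##*/}"
```

For an ordinary path the two agree. The trailing slash case is a small illustration of the kind of detail that the clever version makes you remember.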

So far in this entry I've carefully skirted around a large bear, namely what gets called "unnecessary use of cat" (sometimes 'useless use', which is a very strong opinion). I used to be on the 'anti-cat' side, but more and more I find myself feeling that using cat is clearer than the other options (and it's in a situation where the extra process doesn't matter). For a recent example, my use of cat with here documents is perhaps not strictly necessary (depending on how much you trust various bits of magic with read), but I believe that it will be clearer and more obviously good to most people than the alternatives.
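For readers who haven't run into the argument, here is a minimal sketch of the classic case. The temporary file and its contents are invented for the demonstration, and it assumes mktemp is available, which it usually is but which POSIX doesn't promise:

```shell
#!/bin/sh
# Hypothetical sample data, written to a temporary file for the demonstration.
tmpfile=$(mktemp)
trap 'rm -f "$tmpfile"' EXIT
printf '%s\n' 'alpha' 'beta' 'alphabet' > "$tmpfile"

# The "unnecessary cat" form: the data visibly flows left to right,
# at the cost of one extra process.
with_cat=$(cat "$tmpfile" | grep -c 'alpha')

# The cat-free forms: one process fewer, but the input source is
# tucked away at the end of the command.
with_redirect=$(grep -c 'alpha' < "$tmpfile")
with_arg=$(grep -c 'alpha' "$tmpfile")

echo "$with_cat $with_redirect $with_arg"
```

All three count the same two matching lines. The difference is only in how the pipeline reads and in one extra process.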

(A great deal of "unnecessary use of cat" mythology, folklore, and history dates from an era of Unix where hardware was much slower and processes much more expensive than they are now. Using or not using cat is generally almost invisible today unless you're already hitting shell script performance issues.)

Here documents make a good example in general, because they're one of the more obscure shell features. If you're already using one obscure shell feature (one with subtle issues, like the very important effects of apparently unnecessary quoting), my view is that you want everything else involved to be as clear and obvious as possible. Keep your clarity budget as healthy as possible, rather than piling puzzle on puzzle.
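As an illustration of how much the quoting matters, here is a small sketch (the 'name' variable and the text are made up) that captures the same here document body twice with cat, once with an unquoted delimiter and once with a quoted one:

```shell
#!/bin/sh
name="world"   # hypothetical variable, for illustration only

# Unquoted delimiter: the shell expands $name (and command
# substitutions) inside the here document body.
expanded=$(cat <<EOF
hello $name
EOF
)

# Quoted delimiter: the body is taken literally, so $name is left alone.
literal=$(cat <<'EOF'
hello $name
EOF
)

echo "$expanded"
echo "$literal"
```

The only difference between the two is the pair of quotes around EOF, which is easy to miss when reading and easy to forget when writing.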

(So, for example, while there are rules on combining other things with here documents, my view is that the meta-rule is "don't". Unless you have no choice, keep it simple. The shell allows you to do all sorts of things that are not actually good ideas.)

programming/ShellScriptsBeClearFirst written at 21:45:29
