2009-04-26
A Bourne shell irritation: piping just stderr
I generally like the Bourne shell, but I will readily admit that it has a number of things that are less than ideal. One of them is a peculiar omission: you cannot easily redirect just standard error into a pipeline.
This sounds like a peculiar thing to want, but there are situations where you do need to process stderr separately; for example, you might need to timestamp all stderr output and log it. In a hypothetical shell with this feature you could have a 'timestamper' program and just write:
process |[2] timestamper >>error-log
In the Bourne shell, not so much. You can do it, but it is what you would call intricate:
exec 3>&1
process 2>&1 >&3 3>&- | timestamper >>error-log
What's going on here is that first we save a copy of the real stdout on file descriptor 3, and then in the command itself we:
- send stdout to the timestamper process (remember, pipelines are set up before redirection)
- redirect stderr to stdout, which is the pipeline
- redirect stdout to fd 3, which is our original stdout
- close fd 3 so that process does not inherit a surprise file descriptor.
(We must save the original stdout separately via exec because
of the pipeline vs redirection timing issue; if we wrote 'process
3>&1 ...' we would be capturing the pipe'd stdout, not the
original stdout.)
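To check the steps above, here is a small self-contained sketch. The 'process' and 'timestamper' shell functions are hypothetical stand-ins for the real programs, and the temporary files exist only so you can see where each stream ends up; everything runs in a subshell so that fd 3 does not linger in your real shell afterwards.

```shell
# Hypothetical stand-in for the real program: one line to stdout, one to stderr.
process() {
    echo "normal output"
    echo "something went wrong" >&2
}

# Hypothetical stand-in for a real timestamper: prefix each input line
# with the current date and time.
timestamper() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

errlog=$(mktemp)
outfile=$(mktemp)

# The subshell's stdout is $outfile, so that is the "real stdout" saved on fd 3.
(
    exec 3>&1
    process 2>&1 >&3 3>&- | timestamper >>"$errlog"
) >"$outfile"

echo "stdout went to:"; cat "$outfile"
echo "stderr went to:"; cat "$errlog"
```

If this works the way I think it does, 'normal output' shows up only in the first file and the timestamped error line only in the second.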
(This incantation is not original to me; I've seen it written up elsewhere on the Internet, but I never really understood it until I wrote out everything here.)
2009-04-03
The (or a) problem with Unix manpages
Unix has three sorts of manpages: excellent ones that clearly answer your questions, good ones that answer your questions if you read them carefully, and bad ones that aren't worth reading because they don't answer your questions (at least in a way that you can understand).
The first sort are rare and obvious when encountered, which means that when you read a manpage you are generally trying to guess whether you are dealing with the second sort or the third sort. The problem is that there are significantly more of the third sort than there are of the second sort, so people (myself included) are trained into aggressively skimming manpages instead of reading them carefully, because usually reading carefully doesn't actually help and just wastes your time.
And this is a problem because then you run into a manpage that actually does answer your questions, except you didn't bother to read it carefully so you didn't notice (if you are lucky, you notice later). This is usually at least a bit embarrassing.
(This actually generalizes to other documentation, but I think that Unix manpages are a large source of this sort of thing, partly because they aggregate together in one spot a lot of documentation from a lot of different people.)
How to use Vixie cron to schedule at regular odd times
The traditional way of specifying that cron should run a command every ten minutes is to write out a list of every minute that the command is to run on:
0,10,20,30,40,50 * * * * /some/command
Vixie cron introduced a shorthand notation for this: '*/10'. We use
it quite a lot, since it is very convenient; it is both shorter and
more explicit about what is really going on, which means that we are
less likely to make mistakes.
(Quick, spot the error in '10,20,30,40,50'. Or is there even an error? Without additional information on what was actually intended, you can't know for sure.)
But suppose you want to run a command every ten minutes but offset
by two minutes (so that it runs at :02, :12, and so on) because,
for example, you already have another command that runs every ten
minutes and you don't want the two to clash. It turns out that Vixie
cron also has a short form for this, one that I find somewhat less
obvious: '2-59/10'.
(Per the documentation, this means 'every ten minutes starting at :02 and going to :59'; the '2-59' is the range and the '/10' is the skip interval. Cron starts at the start of the range and then skips forward by the skip interval each time until it runs out of the range.)
You have to start the range at the offset that you want, in this case two minutes, but the end time doesn't have to be aligned with the skip interval. I end my ranges at 59 minutes by convention, because it's always good enough regardless of what the skip interval is.
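As a sanity check on the range and skip logic, here is a rough sketch in shell of how I understand the expansion to work. The 'expand_step' function is something I made up for illustration; it is not part of cron, it just imitates the rule described above for a single field.

```shell
# expand_step FIRST LAST STEP
# Hypothetical helper: start at FIRST, step forward by STEP, and stop
# once we would go past LAST. Prints the result as a comma-separated list.
expand_step() {
    m=$1
    out=
    while [ "$m" -le "$2" ]; do
        out="${out:+$out,}$m"
        m=$((m + $3))
    done
    echo "$out"
}

expand_step 0 59 10    # what '*/10' amounts to for minutes: 0,10,20,30,40,50
expand_step 2 59 10    # '2-59/10': 2,12,22,32,42,52
expand_step 2 52 10    # '2-52/10' gives the same list; the end need not align
```

So a crontab line of '2-59/10 * * * * /some/command' should run at the same minutes as the written-out '2,12,22,32,42,52' version.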
(This is one of those entries that I write so that maybe I'll remember the logic the next time around, and if not I can look it up here. Although the manpage actually is quite clear if read carefully, now that I look at it closely.)