2006-08-22
fork(), wait(), and threads
Something I've discovered through (painful) experience is that threaded
programs cannot portably fork() in one thread and then wait() for
the forked process in another one.
This will work in some environments but fail explosively in others; the one I stubbed my toe on was a Linux 2.4 kernel based system (without NPTL, which is the usual state of affairs for 2.4-based Linux distributions).
Admittedly, mixing threads and fork() is a bit perverse, but
sometimes it's what you need to do.
(I saw this in a Python program, but Python isn't doing anything special in the POSIX threads department so I have to expect that it's completely generic.)
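A conservative arrangement is to have the thread that calls fork() also do the waiting, and hand the result to whoever needs it. This is an assumption about what ports well, not something tested on the 2.4 system above. A minimal Python sketch follows; the helper name fork_and_wait and the exit code 7 are made up for illustration:

```python
import os
import queue
import threading

def fork_and_wait(exit_code, results):
    """Fork and wait in the *same* thread, then report the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without running any Python cleanup.
        os._exit(exit_code)
    # Parent: wait in the thread that forked, not in some other thread.
    _, status = os.waitpid(pid, 0)
    results.put(os.WEXITSTATUS(status))

results = queue.Queue()
worker = threading.Thread(target=fork_and_wait, args=(7, results))
worker.start()
worker.join()
child_exit_code = results.get()
print(child_exit_code)  # 7
```

Other threads only ever see the value on the queue; none of them call wait() on a process they did not fork themselves.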
2006-08-16
The fun of 32-bit bugs
As computers (and disk space, and memory, and etc etc) get larger, the quantities that we deal with routinely get bigger too. And when they get bigger, fun things start happening.
Today's fun thing was that I was doing some measurements of disk IO speed on a machine with 2 gigabytes of memory. My usual rule of thumb is to work on at least twice the amount of main memory to crush cache effects, which meant I was telling my benchmarking program to read and write 4 GB.
Which turned out to be kind of a problem, because I had declared a
variable as int instead of long (or better yet, off_t). At 4GB
it rolled over and various interesting things happened. I count myself
fortunate that it was instantly obvious that something was bad;
it could have just resulted in quietly wrong numbers that I might not
have noticed.
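To make the rollover concrete, here is a small Python sketch that models what a signed 32-bit int does to byte counts. The helper name as_int32 is made up for illustration, and it assumes the usual two's-complement wraparound:

```python
GIB = 1024 ** 3

def as_int32(n):
    """Model storing n in a signed 32-bit int with two's-complement wraparound."""
    n &= 0xFFFFFFFF                      # keep only the low 32 bits
    return n - 2 ** 32 if n >= 2 ** 31 else n

print(as_int32(1 * GIB))  # 1073741824   (still fine)
print(as_int32(2 * GIB))  # -2147483648  (goes negative at 2 GB)
print(as_int32(4 * GIB))  # 0            (wraps all the way around at 4 GB)
```

A byte count that has turned into zero or a negative number tends to fail loudly, which is the lucky case.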
As our systems and what we do with them get bigger, I imagine I can look
forward to more and more incidents like this. Already, 2GB files are
becoming pretty common and it is more and more irritating when tools
don't deal with them, or don't deal well with them (some versions of
less will eat large files but mangle the percentages and jumping to
percentages, as I found out recently).
One benefit of 64-bit computing is that many of these problems can be
papered over on 64-bit platforms by a recompile, since that makes long
big enough again. (Some people will decry that as a quick fix.)
(Yes, technically this wasn't a 32-bit bug, it was a 31-bit bug. Close enough, says I.)
Sidebar: why not just use bonnie++ or the like?
Two reasons. First, bonnie++ benchmarks too much; at the moment I'm only interested in streaming read and write speeds. Second, bonnie++ was giving me odd results and I wanted to crosscheck them with something else.
2006-08-13
The real Bourne shell problem
The Bourne shell's real problem for writing shell scripts is that it doesn't have a real data structure for lists. This leads to putting lists in strings, which leads to reparsing strings, which leads to hideous doom and all of those problems with filenames that have whitespace in them.
(When lists are represented as strings, every string might be a list, so every time the Bourne shell expands a variable it must try to turn it into a list unless the expansion is specifically marked as not doing that.)
This matters twice over. First, because lists of things come up all the time in shell scripts (lists of arguments, lists of files, etc). Second, because of the reparsing, everything that is not carefully guarded may be reinterpreted as a list and explode.
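The reparsing problem can be modelled outside the shell. In this Python sketch (the filenames are made up, and str.split() is only a rough stand-in for the shell's default word splitting), a real list survives, while a list flattened into a string does not:

```python
files = ["my report.txt", "notes.txt"]   # a real list: two filenames

# What the Bourne shell effectively does: store the list as one string...
flattened = " ".join(files)

# ...and then reparse it on whitespace every time it is expanded unguarded.
reparsed = flattened.split()

print(len(files), files)        # 2 ['my report.txt', 'notes.txt']
print(len(reparsed), reparsed)  # 3 ['my', 'report.txt', 'notes.txt']
```

Once the list is a string, the information about where one element ends and the next begins is gone, and no amount of careful splitting brings it back.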
(The other consequence is that people create all sorts of conventions
for how to represent lists of things, like the colon-separated $PATH
and imitators. In a sane shell you would be able to write 'for d
in $PATH'; in the Bourne shell, you can't.)
The issue of exporting real lists into the environment need not have blocked implementing them. Bourne could have defined a canonical exported form, or just forbidden exporting real lists into the environment. Even unexportable real lists would make Bourne shell scripting much less irritatingly explosive.
(This insight is not original to me; I got it from Tom Duff's paper on rc, the Plan 9 shell, where he lays out a lucid explanation of the Bourne shell's flaws.)
Sidebar: a nitpick
Technically, the Bourne shell has one real list: the list of arguments.
Sufficiently determined shell scripts (that have saved everything
important from their arguments) can reuse this reliable storage via
'set --', and some do.
2006-08-10
A Bourne shell gotcha: redirection order
I was reminded of this by seeing it in a shell script recently: the order that you do redirection in on a command line can be important. This means that the following two lines are not equivalent:
foobar 2>&1 >/some/where
foobar >/some/where 2>&1
The first one sends standard error to wherever the current standard
output is, and then sends standard output to /some/where; the latter
sends both standard output and standard error to /some/where.
This (only) happens when you are redirecting to file descriptors, because the redirections are applied left to right and use the current 'value' of that file descriptor, even if the file descriptor will later be sent somewhere else by a later redirection.
This is unfortunately an easy mistake to make and a pernicious one to
boot, since you may not notice it for a while (for example, if foobar
produces errors only rarely).
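The left to right rule can be modelled with a tiny file descriptor table. This Python sketch is only a model, not how any shell is implemented; the function name and the "open"/"dup" operation names are made up for illustration:

```python
def apply_redirections(redirections, fd_table):
    """Apply shell-style redirections left to right to a copy of fd_table."""
    table = dict(fd_table)
    for op, fd, arg in redirections:
        if op == "open":
            # fd>path: point fd at a newly opened file.
            table[fd] = arg
        elif op == "dup":
            # fd>&arg: fd becomes a copy of whatever arg points to *right now*.
            table[fd] = table[arg]
    return table

start = {1: "terminal", 2: "terminal"}

# foobar 2>&1 >/some/where
first = apply_redirections([("dup", 2, 1), ("open", 1, "/some/where")], start)

# foobar >/some/where 2>&1
second = apply_redirections([("open", 1, "/some/where"), ("dup", 2, 1)], start)

print(first)   # {1: '/some/where', 2: 'terminal'}
print(second)  # {1: '/some/where', 2: '/some/where'}
```

In the first case standard error was copied from standard output before standard output moved, so it stays on the terminal.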
Sidebar: more redirection idioms
While I'm writing this stuff down, there are some common Bourne shell redirection idioms that are worth remembering:
echo error message 1>&2
This redirects error messages from shell scripts to standard error, where they should go.
exec >/some/where 2>&1
This redirects all further output from the script and things it runs to /some/where. (You can obviously use it to redirect just stdout or stderr alone, as desired.)
You can play rather obscure tricks by using high file descriptors in creative ways. For example, the following ugly incantation swaps standard output and standard error:
foobar 5>&2 2>&1 1>&5 5>&-
This has the defect that it destroys file descriptor 5, which hopefully wasn't already being used for anything important.
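Why the spare descriptor is needed can be seen by tracing the swap by hand. The Python sketch below is a simplified model of a file descriptor table (the function name run and the labels are made up), where each step is either "fd becomes a copy of source" or "close fd":

```python
def run(redirections, fd_table):
    """Apply (fd, source) steps left to right; source None means 'close fd'."""
    table = dict(fd_table)
    for fd, source in redirections:
        if source is None:
            del table[fd]               # fd>&-
        else:
            table[fd] = table[source]   # fd>&source
    return table

start = {1: "original stdout", 2: "original stderr"}

# foobar 5>&2 2>&1 1>&5 5>&-
swapped = run([(5, 2), (2, 1), (1, 5), (5, None)], start)

# Naive attempt without a spare descriptor: foobar 2>&1 1>&2
naive = run([(2, 1), (1, 2)], start)

print(swapped)  # {1: 'original stderr', 2: 'original stdout'}
print(naive)    # {1: 'original stdout', 2: 'original stdout'}
```

The naive version overwrites standard error before anything has saved it, which is the same reason swapping two variables needs a temporary.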
2006-08-06
A fun little regular expression bug
The problem with regular expressions is the same problem as computer programming in general: the computer will faithfully do exactly what you told it to do, regardless of whether or not this was what you actually wanted.
(The other problem with regular expressions is that they are a crappy programming language, not in power (they've got plenty of power) but in terms of being able to read and write them. (Okay, one of the other problems with regexps. There are more.))
DWiki had an interesting bug recently that perfectly illustrates this. As part of parsing DWikiText, I need to recognize things that look like this:
- foo bar: more text
The simple regular expression to match this and to group the 'foo bar' and the 'more text' bits is:
^-\s+(.+):\s+(.+)
However, this matches too much if you have a 'blah: more' in the 'more text' bit, because regular expressions are greedy. Fine, no problem:
^-\s+([^:]+):\s+(.+)
And that's what DWiki had for a long time. But the other day I discovered that this has a bug; it doesn't match the following:
- [[foo http://bar]]: baz.
This is because the regular expression as written requires that the 'foo bar' portion not merely have no ': ' bits in it, but that it have no colons in it at all. (If it does have a ':' but no following space, the group ends but the following ': ' required by the regexp isn't there.)
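Both failure modes are easy to reproduce with Python's re module. In this sketch the two patterns and the link line come from the text above, while the sample line with a second ': ' in it is made up for illustration:

```python
import re

greedy = re.compile(r"^-\s+(.+):\s+(.+)")
no_colons = re.compile(r"^-\s+([^:]+):\s+(.+)")

nested = "- foo bar: blah: more"        # hypothetical line with a second ': '
link = "- [[foo http://bar]]: baz."     # the line that exposed the bug

# The greedy version splits at the *last* ': ' instead of the first.
print(greedy.match(nested).groups())     # ('foo bar: blah', 'more')

# Excluding colons fixes that case...
print(no_colons.match(nested).groups())  # ('foo bar', 'blah: more')

# ...but now any colon at all in the first part kills the match.
print(no_colons.match(link))             # None
```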
My quick on the spot solution was to club in an exception:
^-\s+(([^:]|:(?!\s))+):\s+(.+)
(This says that the first group is allowed to contain colons, as long as they are not followed by a space. Yes, I could probably write that as ':\S' instead. I was in a hurry; I count it lucky that I wrote '\s' instead of just ' '.)
I did it this way because at the time I was very much in a 'quick fix, patch the existing regexp' mode. In retrospect, the whole thing might be better written using a non-greedy regexp match:
^-\s+(.+?):\s+(.+)
Of course, it would be nice if I could be confident that this performs as well as the other version, but as I've found out before I can't be unless I measure it.
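A rough way to check both things at once is sketched below in Python: it confirms that the lookahead version and the non-greedy version pull out the same pieces, then times them with timeit. The sample lines (other than the link line), the helper name time_pattern, and the iteration count are assumptions for illustration, and the timings will vary from machine to machine:

```python
import re
import timeit

lookahead = re.compile(r"^-\s+(([^:]|:(?!\s))+):\s+(.+)")
non_greedy = re.compile(r"^-\s+(.+?):\s+(.+)")

lines = [
    "- foo bar: more text",
    "- foo bar: blah: more",
    "- [[foo http://bar]]: baz.",
]

for line in lines:
    m1 = lookahead.match(line)
    m2 = non_greedy.match(line)
    # In the lookahead version group 2 is the inner repetition group,
    # so the interesting pieces are groups 1 and 3.
    assert (m1.group(1), m1.group(3)) == (m2.group(1), m2.group(2))

def time_pattern(pattern, number=10000):
    """Return the seconds taken to match every sample line `number` times."""
    return timeit.timeit(lambda: [pattern.match(l) for l in lines], number=number)

lookahead_seconds = time_pattern(lookahead)
non_greedy_seconds = time_pattern(non_greedy)
print(f"lookahead: {lookahead_seconds:.4f}s  non-greedy: {non_greedy_seconds:.4f}s")
```

Three short lines are not a real workload, so this only shows how to measure; real measurements would need real DWikiText.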
(Disclaimer: the regexps are slightly simplified from the real ones that DWiki uses, because the real ones are encrusted with some internal concerns that obfuscate them a bit.)