2006-11-15
A little regexp thing to remember about \b (and \w)
A lot of documentation of Perl-style regular expressions describes \b as 'matching a word boundary' or similar phrases. Life would be simpler if the documentation used the phrase 'identifier boundary' instead, because \b's idea of word characters includes digits and underscores as well as letters. Thus \b's and \w's idea of word characters makes a lot of sense for picking out identifiers in languages like C, but not necessarily as much sense for picking out words in written text.
(The same thing applies to GNU grep's --word-regexp option.)
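As a small illustration of the underscore behaviour, here is a sketch using grep's -w (the short form of --word-regexp). It assumes a grep that supports -w, such as GNU grep; the function name count_word_matches and the sample strings are made up for the example.

```shell
# Count how many of the given lines grep -w considers to contain the word "foo".
count_word_matches() {
    printf '%s\n' "$@" | grep -c -w 'foo'
}

# 'foo bar' and 'foo-bar' match: space and hyphen are not word characters.
# 'foo_bar' and 'foo2' do not: underscore and digits count as word characters,
# so "foo" is not a whole word there.
matches=$(count_word_matches 'foo bar' 'foo-bar' 'foo_bar' 'foo2')
echo "$matches"
```

Someone skimming for 'word' would probably expect foo_bar to match as well.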
Saying that this is documented if you read the full description of \b is no excuse. The problem is that 'word' is a dangerously loaded term to use, because it invites people to think that they know what it means and not read carefully (especially if they are skimming to refresh their memory). If the documentation used 'identifier' instead, people would not be led astray by their intuition about what a word is.
(This is a general problem with giving any technical definition a name that's a common term; people have to know, remember, or even realize that the common term doesn't mean what they think it means. For example, the X Window System made a lot of people grumpy by inverting the way people thought about clients and servers, so in X the 'server' is on your desk and the 'client' is that big compute server over in the machine room.)
Why the Bourne shell is not my favorite language
The difference between
    for i in "a b"; do
        mv -f $i $i-UBUNTU
        ...
    done
and
    FOO="a b"
    for i in $FOO; do
        mv -f $i $i-UBUNTU
        ...
    done
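is subtle (in visual appearance) and easy to accidentally forget, but important: the quoted string is a single word, so the first loop runs once, while the unquoted $FOO is split on whitespace, so the second loop runs twice.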
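A minimal sketch of the two cases, counting loop iterations instead of running mv. The function names are invented for the illustration, and it assumes the default IFS (space, tab, newline).

```shell
# Loop over a quoted literal: the list is the single word "a b".
count_literal() {
    n=0
    for i in "a b"; do
        n=$((n + 1))
    done
    echo "$n"
}

# Loop over an unquoted variable: $FOO undergoes field splitting,
# so the list is the two words "a" and "b".
count_variable() {
    FOO="a b"
    n=0
    for i in $FOO; do
        n=$((n + 1))
    done
    echo "$n"
}

literal_count=$(count_literal)      # 1 iteration
variable_count=$(count_variable)    # 2 iterations
echo "literal: $literal_count, variable: $variable_count"
```

Writing the second loop as for i in "$FOO" would bring it back to a single iteration.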
(Fortunately I am doing test installations in VMware these days, so a mistake is less tedious than it used to be.)