Wandering Thoughts archives

2013-11-20

test is surprisingly smart

Via Hacker News I wound up reading Common shell script mistakes. When I read this, I initially thought that it contained well-intentioned but mistaken advice about test (aka '[ ... ]'). Then I actually checked what test's behavior is and got a bunch of surprises. It turns out that test is really quite smart, sometimes disturbingly so.

Here are two different versions of a test expression:

[ x"$var" = x"find" ] && echo yes
[ "$var" = "find" ] && echo yes

In theory, the reason the first version has an 'x' in front of both sides is to deal with the case where someone sets $var to something that is a valid test operator, like '-a' or '-x' or even '('; after all, '[ -a = find ]' doesn't look like a valid test expression. But if you actually check, it turns out that the second version works perfectly well too.
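You can check this for yourself with a small sketch like the following. The check function and the list of values are only illustrative (they are not from the original article), and this assumes a POSIX-style test such as the ones in bash or dash:

```shell
# check is a hypothetical helper: it runs the unprefixed comparison
# against whatever value it is given.
check() {
    var=$1
    if [ "$var" = "find" ]; then
        echo yes
    else
        echo no
    fi
}

# Try a normal value, several things that look like test operators,
# and the empty string.
for value in find -a -x '(' '!' ''; do
    printf '%s -> %s\n' "$value" "$(check "$value")"
done
```

Only 'find' should report yes; the operator-like values are simply compared as strings, with no syntax errors.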

What's going on is that test is much smarter than you might think. Rather than simply processing its arguments left to right, it uses a much more complicated process of actually parsing its command line. When I started writing this entry I thought it was just modern versions that behaved this way, but in fact the behavior is much older than that; it goes all the way back to the V7 version of test, which actually implements a little recursive descent parser (in quite readable code). This behavior is even specified in the Single Unix Specification page for test where you can read the gory details for yourself (well, most of them).
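In the specification, much of this comes down to how many arguments test is given. Here is a rough sketch of the one, two, and three argument cases; the show helper is made up for illustration, and the results in the comments assume a test that follows the specification:

```shell
# show is a hypothetical helper: it hands its arguments to [ unchanged
# and reports the result.
show() {
    if [ "$@" ]; then
        echo true
    else
        echo false
    fi
}

show -r          # one argument: true, because '-r' is a non-empty string
show ""          # one argument: false, because it is the empty string
show -n ""       # two arguments: '-n' is a unary operator here, so false
show ! ""        # two arguments: negation of the one-argument case, so true
show -n = -n     # three arguments: '=' is a binary operator, so true
```

The same string '-n' is an operator in one call and plain data in another, purely because of where it sits and how many arguments surround it.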

(The exception is that the SuS version of test doesn't include -a for 'and' or -o for 'or'. This is an interesting exclusion, since it turns out they were actually in the V7 version of test, per e.g. the manpage.)

Note that this cleverness can break down in extreme situations. For example, '[ "$var1" -a "$var2" -a "$var3" ]' is potentially dangerous; consider what happens if $var2 is '-r'. And of course you still really want to use "..." to force things to be explicit empty arguments, because an outright missing argument can easily completely change the meaning of a test expression. Consider what happens to '[ -r $var ]' if $var is empty.
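One way to sidestep the '-a' problem (my suggestion here, not something from the article that prompted this) is to run several small tests and join them with the shell's own &&, so that each test only ever sees a couple of arguments. A minimal sketch, with a hypothetical all_set function:

```shell
# all_set is a hypothetical helper: true only if all three of its
# arguments are non-empty strings.
all_set() {
    [ -n "$1" ] && [ -n "$2" ] && [ -n "$3" ]
}

var1=a
var2=-r      # looks like an operator, but is just data to each small test
var3=c

if all_set "$var1" "$var2" "$var3"; then
    echo "all set"
else
    echo "something is empty"
fi
```

Each '[ -n "..." ]' is a two-argument test, so a value like '-r' is unambiguously the operand of -n.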

(It reduces to '[ -r ]', which is true because -r is not the empty string. You probably intended it to be false because a zero-length file name is considered unreadable.)
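A short sketch of the difference, assuming $var is empty and a POSIX-style test (the unquoted and quoted variable names are just for illustration):

```shell
var=""

# Unquoted: the empty expansion vanishes, leaving '[ -r ]', a
# one-argument test of the non-empty string '-r'.
if [ -r $var ]; then unquoted=true; else unquoted=false; fi

# Quoted: test sees '-r' plus an explicit empty file name.
if [ -r "$var" ]; then quoted=true; else quoted=false; fi

echo "unquoted: $unquoted"   # unquoted: true
echo "quoted: $quoted"       # quoted: false
```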

unix/TestIsQuiteSmart written at 23:05:59

The difference between no argument and an empty argument

Here is a little Bourne shell quiz. Supposing that $VAR is not defined, are the following two lines equivalent?

./acnt $VAR
./acnt "$VAR"

The answer is no. If the acnt script is basically 'echo "$#"', then the first one will print 0 and the second one will print 1; in other words, the first line calls acnt with no arguments and the second one calls acnt with one argument (that happens to be an empty string).
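If you want to try this without creating a script file, a shell function works as a stand-in. This is only a sketch; it defines acnt as a function rather than the ./acnt script, and assumes 'set -u' is not in effect:

```shell
# Stand-in for the acnt script: print the number of arguments received.
acnt() {
    echo "$#"
}

unset VAR

acnt $VAR      # prints 0: the unquoted expansion produces no argument
acnt "$VAR"    # prints 1: the quotes force a single, empty argument
```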

Unix shells almost universally draw some sort of a distinction between a variable expansion that results in no argument and an empty argument (although they can vary in how you force an empty argument). This is what we're seeing here; in the Bourne shell, using "..." forces there to always be a single argument regardless of what $VAR expands to or doesn't. Sometimes this is useful behavior, for example when it means that a program is invoked with exactly a specific number of arguments (and with certain things in certain argument positions) even if some things aren't there. Sometimes this is inconvenient, if what you really wanted was to quote $VAR but not pass acnt an empty argument if $VAR wound up unset or empty. If you want this latter behavior, you need to use the more awkward form:

./acnt ${VAR:+"$VAR"}

(Support for this is required by the Single Unix Specification and is present in Solaris 10, so I think you're very likely to find it everywhere.)
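Here is a small sketch of how this form behaves across the unset, empty, and set cases. As before, acnt is defined as a shell function standing in for the script, and the sample value is made up:

```shell
# Stand-in for the acnt script: print the number of arguments received.
acnt() {
    echo "$#"
}

unset VAR
acnt ${VAR:+"$VAR"}    # prints 0: unset, so nothing is passed

VAR=""
acnt ${VAR:+"$VAR"}    # prints 0: empty, so still nothing is passed

VAR="two words"
acnt ${VAR:+"$VAR"}    # prints 1: the inner quotes keep it as one argument
acnt $VAR              # prints 2: unquoted, it is split into two words
```

The last line is there for contrast; it shows what the inner quotes are protecting against.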

Note that it can be possible to indirectly notice the presence of empty arguments in situations where they don't show directly. For example:

$ echo a "$VAR" b
a  b

If you look carefully there is an extra space printed between a and b here; that is because echo is actually printing 'a', separator space, an empty string, another separator space, and then 'b'. Of course some programs are more obvious, even if the error message is a bit more mysterious:

$ cat "$VAR"
cat: : No such file or directory

(This entry is brought to you in the process of me discovering something interesting about modern versions of test, but that's another entry.)

unix/EmptyArgumentVsNone written at 00:01:22


