2013-11-20
test is surprisingly smart
Via Hacker News I wound up reading Common shell script mistakes. When
I read this, I initially thought that it contained well-intentioned
but mistaken advice about test (aka '[ ... ]'). Then I actually
checked what test's behavior is and got a bunch of surprises. It
turns out that test is really quite smart, sometimes disturbingly so.
Here are two different versions of a test expression:
    [ x"$var" = x"find" ] && echo yes
    [ "$var" = "find" ] && echo yes
In theory, the reason the first version has an 'x' in front of both
sides is to deal with the case where someone sets $var to something
that is a valid test operator, like '-a' or '-x' or even '('; after
all, '[ -a = find ]' doesn't look like a valid test expression. But if
you actually check, it turns out that the second version works
perfectly well too.
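To check this for yourself, here is a small sketch that runs both forms against a few operator-like values. The helper name check_forms is made up for illustration, and the output assumes a test that follows the three-argument rule in the Single Unix Specification.

```shell
# check_forms is a hypothetical helper: it evaluates the 'x'-prefixed form
# and the plain form for one value and reports each result.
check_forms() {
    var=$1
    if [ x"$var" = x"find" ]; then prefixed=yes; else prefixed=no; fi
    if [ "$var" = "find" ]; then plain=yes; else plain=no; fi
    echo "$prefixed $plain"
}

# Values that look like test operators, plus the empty string.
for value in find -a -x '(' '!' ''; do
    printf '%-6s -> %s\n' "'$value'" "$(check_forms "$value")"
done
```

Both columns should agree on every line: 'yes yes' for find and 'no no' for everything else, with no error messages from test.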
What's going on is that test is much smarter than you might think.
Rather than simply processing its arguments left to right, it uses
a much more complicated process of actually parsing its command
line. When I started writing this entry I thought it was just modern
versions that behaved this way, but in fact the behavior is much older
than that; it goes all the way back to the V7 version of test, which
actually implements a little recursive descent parser (in quite
readable code). This behavior is even specified in the Single Unix
Specification page for test, where you can read the gory details for
yourself (well, most of them).
(The exception is that the SuS version of test doesn't include -a for
'and' or -o for 'or'. This is an interesting exclusion since it turns
out they were actually in the V7 version of test, per e.g. the
manpage.)
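As a rough illustration of what that parsing means in practice, the sketch below exercises the one-, two-, and three-argument cases. The comments are my paraphrase of the argument-count rules, not a quotation of the specification, and describe is a hypothetical helper name.

```shell
# describe is a hypothetical helper: it hands its arguments to test
# unchanged and reports the result as a word.
describe() {
    if [ "$@" ]; then echo true; else echo false; fi
}

describe -n          # true:  one argument, and '-n' is a non-empty string
describe ''          # false: one argument, and it is the empty string
describe -n ''       # false: two arguments, unary -n applied to ''
describe '!' ''      # true:  two arguments, negation of the one-argument test
describe -n = -n     # true:  three arguments, '=' in the middle makes this
                     #        a string comparison of '-n' with '-n'
describe '(' = ')'   # false: still a string comparison, not grouping
```

The number of arguments, not just their spelling, decides how each one is interpreted, which is why operator-like strings on either side of '=' are harmless.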
Note that this cleverness can break down in extreme situations. For
example, '[ "$var1" -a "$var2" -a "$var3" ]' is potentially dangerous;
consider what happens if $var2 is '-r'. And of course you still really
want to use "..." to force things to be explicit empty arguments,
because an outright missing argument can easily completely change the
meaning of a test expression. Consider what happens to '[ -r $var ]'
if $var is empty.
(It reduces to '[ -r ]', which is true because -r is not the empty
string. You probably intended it to be false because a zero-length file
name is considered unreadable.)
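Here is a minimal sketch of both hazards. The first half shows the quoted and unquoted forms of the -r test with an empty variable. The second half shows one common way to sidestep the long -a chain, which is to run separate test commands joined with the shell's own &&; all_nonempty is a hypothetical helper name, and this is an alternative I am suggesting rather than something the entry prescribes.

```shell
var=''

# Unquoted: $var vanishes, leaving '[ -r ]', a one-argument test that is
# true because '-r' is a non-empty string.
if [ -r $var ]; then unquoted=true; else unquoted=false; fi

# Quoted: test sees an explicit empty file name, which is not readable.
if [ -r "$var" ]; then quoted=true; else quoted=false; fi

echo "unquoted=$unquoted quoted=$quoted"   # unquoted=true quoted=false

# all_nonempty (hypothetical) checks three values with three separate
# one-argument tests, so a value such as '-r' is only ever a string.
all_nonempty() {
    [ "$1" ] && [ "$2" ] && [ "$3" ]
}

if all_nonempty a -r c; then echo "all three are non-empty"; fi
```

Each bracket pair in all_nonempty gets exactly one argument, so test never has a chance to treat a value as an operator.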
The difference between no argument and an empty argument
Here is a little Bourne shell quiz. Supposing that $VAR is not
defined, are the following two lines equivalent?
    ./acnt $VAR
    ./acnt "$VAR"
The answer is no. If the acnt script is basically 'echo "$#"', then
the first one will print 0 and the second one will print 1; in other
words, the first line called acnt with no argument and the second one
called acnt with one argument (that happens to be an empty string).
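The quiz is easy to reproduce. In this sketch acnt is a shell function standing in for the acnt script, which is an assumption made so the example is self-contained.

```shell
# acnt here is a shell function standing in for the entry's acnt script;
# like the script, it just reports how many arguments it was given.
acnt() {
    echo "$#"
}

unset VAR
acnt $VAR      # prints 0: the expansion produced no argument at all
acnt "$VAR"    # prints 1: the quotes force one (empty) argument
```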
Unix shells almost universally draw some sort of a distinction between
a variable expansion that results in no argument and an empty argument
(although they can vary in how you force an empty argument). This is
what we're seeing here; in the Bourne shell, using "..." forces there
to always be a single argument regardless of what $VAR expands to (or
whether it expands to anything at all).
Sometimes this is useful behavior, for example when it means that a
program is invoked with exactly a specific number of arguments (and with
certain things in certain argument positions) even if some things aren't
there. Sometimes this is inconvenient, if what you really wanted was to
quote $VAR but not necessarily pass acnt an empty argument if $VAR
wound up unset. If you want this latter behavior, you need to use the
more awkward form:
./acnt ${VAR:+"$VAR"}
(Support for this is required by the Single Unix Specification and is present in Solaris 10, so I think you're very likely to find it everywhere.)
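A short sketch of how this form behaves for an unset, an empty, and a multi-word value. The helper count_args is hypothetical and plays the same role as acnt.

```shell
# count_args is a hypothetical helper that reports its argument count.
count_args() {
    echo "$#"
}

unset VAR
count_args ${VAR:+"$VAR"}    # 0: VAR is unset, so nothing is passed

VAR=''
count_args ${VAR:+"$VAR"}    # 0: with the colon, empty counts like unset

VAR='two words'
count_args ${VAR:+"$VAR"}    # 1: the inner quotes keep the value together
count_args $VAR              # 2: unquoted, the value is split on the space
```

The inner "$VAR" is what protects the value from word splitting; the outer ${VAR:+...} only decides whether anything is passed at all.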
Note that it can be possible to indirectly notice the presence of empty arguments in situations where they don't show directly. For example:
    $ echo a "$VAR" b
    a  b
If you look carefully there is an extra space printed between a and b
here; that is because echo is actually printing 'a', separator space,
an empty string, another separator space, and then 'b'. Of course some
programs are more obvious, even if the error message is a bit more
mysterious:
    $ cat "$VAR"
    cat: : No such file or directory
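If squinting at spaces is not appealing, one way to make empty arguments visible is to print each argument inside delimiters. This is a debugging sketch of my own, not something from the entry, and show_args is a hypothetical helper name.

```shell
# show_args (hypothetical) prints each argument on its own line inside
# angle brackets, so an empty argument shows up as '<>'.
show_args() {
    # printf with no arguments would still print one '<>', so guard it.
    if [ "$#" -gt 0 ]; then
        printf '<%s>\n' "$@"
    fi
}

unset VAR
show_args a "$VAR" b    # three lines: <a>, <>, <b>
show_args a $VAR b      # two lines:   <a>, <b>
echo a "$VAR" b         # 'a', two spaces, 'b'
```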
(This entry is brought to you in the process of me discovering something
interesting about modern versions of test, but that's another entry.)