Understanding a tricky case of Bourne shell redirection and command parsing
I recently saw this tweet by Dan Bornstein (via) that poses a Bourne shell puzzle, or more specifically a Bash puzzle:
Shell fun time! Do you know what the following script does? Did you have to run it?
#!/bin/bash
function boop {
    echo "out $1"
    echo "err $1" 1>&2
}

x='2'; boop A >& $x
x='2'; boop B >& "$x"
x=' 2 '; boop C >& $x
x=' 2 '; boop D >& "$x"
There are a number of interesting things going on here, including a
Bash feature I didn't know about and a surprising (to me) difference
between Bourne shells, and even between Bash run as bash and run as
/bin/sh. So let's talk about what's going on and where Bourne shells
differ in their interpretations of this (and why).
(If you want a version of this tweet that you can test in other
Bourne shells, change the 'function boop' to 'boop()'.)
To start with, we can mostly ignore the specifics of boop; its job is
to produce some output to standard output and to standard error. The
'1>&2' just says 'redirect file descriptor 1 to file descriptor 2'
(technically it means 'make file descriptor 1 be a duplicate of file
descriptor 2'). For standard output (fd 1), the number is optional,
so this can also be written as '>&2', which should look sort of
familiar given the last four lines.
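As a minimal sketch of the two spellings, the following defines two hypothetical helpers (to_err and to_err_short are names made up for this illustration) and checks where their output ends up. It should run in any Bourne-style shell.

```shell
#!/bin/sh
# to_err is a hypothetical helper: for the duration of the echo, fd 1 is
# made a duplicate of fd 2, so the text goes to standard error.
to_err() {
    echo "$1" 1>&2
}

# The short form: '>&2' means the same thing because fd 1 is the default.
to_err_short() {
    echo "$1" >&2
}

# Nothing reaches stdout, so the command substitution captures nothing.
captured_out=$(to_err hello 2>/dev/null)

# Folding stderr back into stdout shows where the text actually went.
captured_err=$(to_err_short hello 2>&1)

echo "stdout saw: '$captured_out'"
echo "stderr saw: '$captured_err'"
```

The first capture comes back empty and the second contains the text, which is all that '1>&2' and '>&2' are doing inside boop.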
Now we need to break down each of the last four lines, one by one, starting with the first.
x='2'; boop A >& $x
After expanding $x, this is simply 'boop A >& 2', and so all Bourne
shells redirect boop's standard output to standard error. The only
mild surprise here is that Bash processes redirections after doing
variable expansions, which isn't really a surprise if you read the
very large Bash manpage (because it documents the order).
I believe that this order is in fact the historical Bourne shell
behavior and it also seems to be the POSIX-required order.
x='2'; boop B >& "$x"
After variable expansion, this is 'boop B >& "2"'. In some languages,
putting the 2 in quotes would cause the language to see it purely as
a string, blocking its interpretation as a file descriptor number. In
the Bourne shell, this is not what quotes do; they mostly just block
word splitting, which there isn't any of here anyway. So this is the
same as 'boop B >& 2', which is the same as the first line and has
the same effect.
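One way to check the first two lines without watching a terminal is to capture each stream separately. This is only a sketch: the variable names are made up here, and boop is rewritten in the portable 'boop()' form.

```shell
#!/bin/sh
boop() {
    echo "out $1"
    echo "err $1" 1>&2
}

x='2'

# Capture what is left on stdout; stderr is thrown away.
a_stdout=$( { boop A >& $x; } 2>/dev/null )
b_stdout=$( { boop B >& "$x"; } 2>/dev/null )

# Capture what arrives on stderr; stdout is thrown away.
a_stderr=$( { boop A >& $x; } 2>&1 >/dev/null )
b_stderr=$( { boop B >& "$x"; } 2>&1 >/dev/null )

echo "A stdout: '$a_stdout'"
echo "B stdout: '$b_stdout'"
echo "A stderr:"; echo "$a_stderr"
echo "B stderr:"; echo "$b_stderr"
```

In both cases stdout comes back empty and both lines from boop show up on stderr, whether or not $x is quoted.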
Now things start getting both interesting and different between shells.
x=' 2 '; boop C >& $x
In Bash, invoked in full Bash mode (and not as /bin/sh), this expands
$x without any particular protection from the usual word splitting on
whitespace. The application of word splitting trims off the spaces at
the start and the end, leaving this as basically 'boop C >& 2', which
is simply the same as the first two lines.
Dash behaves the same
way as Bash does.
(The value of $x has spaces in it, but in general when $x is used
unquoted those spaces get removed by word splitting after the
expansion happens. This reinterpretation of $x's value is required
partly because the Bourne shell doesn't have lists.)
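The effect of word splitting on this value is easiest to see in ordinary argument position, away from redirections. In this sketch, show is a hypothetical helper that brackets each argument it receives, and the default IFS is assumed.

```shell
#!/bin/sh
# show is a hypothetical helper: print every argument wrapped in brackets.
show() {
    printf '[%s]' "$@"
    echo
}

x=' 2 '

# Unquoted: the expansion is word-split, so the surrounding spaces vanish.
unquoted=$(show $x)

# Quoted: the expansion stays one word with its spaces intact.
quoted=$(show "$x")

echo "unquoted: $unquoted"   # [2]
echo "quoted:   $quoted"     # [ 2 ]
```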
If you run Bash as /bin/sh (or in POSIX mode in general), it doesn't
do word splitting on the expanded value of $x in this context. This
leaves it with what is effectively boop C >& ' 2 '. Bash then
interprets this as meaning to write standard output and standard
error to a file called ' 2 ', for reasons I'll cover later.
(This mode difference is documented in Chet Ramey's page on Bash's POSIX mode; see number 11.)
Most other shells (apart from Bash and Dash) interpret this as an error, reporting some variant of 'illegal file descriptor name'. These shells don't do word splitting here either, and without word splitting and whitespace trimming we don't have a digit, we have a digit surrounded by spaces, which is not a valid file descriptor number.
(Not doing word splitting here appears to be the correct POSIX behavior, based on a careful reading of portions of 2.7 Redirection. See the third-last paragraph.)
x=' 2 '; boop D >& "$x"
At this point, the expansion of $x is fully protected from having its
whitespace stripped; we wind up with effectively boop D >& ' 2 '.
Bash in both normal and /bin/sh mode writes both standard output and
standard error to a file called ' 2 ', which is probably a good part
of the surprise that Dan Bornstein had in mind. Bash interprets
things this way because it has a specific feature of its own for
redirecting standard output and standard error.
Bash recommends writing this as '&>word' but accepts
'>&word' as a synonym, with the dry note that if word is a
number, 'other redirection operators apply'. Here word is no
longer a number (it's a number with spaces on either side), so
Bash's special >& feature takes over.
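Here is a small sketch of that Bash feature in isolation. It assumes a bash binary on PATH (it is invoked explicitly, so the outer script can be any Bourne shell) and a mktemp that supports -d; the file name both.txt and the variable names are made up for the illustration.

```shell
#!/bin/sh
# Assumption: 'bash' is installed and 'mktemp -d' is available.
workdir=$(mktemp -d)

# '&> word' sends both stdout and stderr to the file named by word.
( cd "$workdir" && bash -c '{ echo out; echo err 1>&2; } &> both.txt' )

# '>& word' is accepted as a synonym when word is not a number.
# Here word is " 2 " (a 2 with a space on each side), as in the fourth line.
( cd "$workdir" && bash -c 'x=" 2 "; { echo out; echo err 1>&2; } >& "$x"' )

# Read the results back before cleaning up.
both_contents=$(cat "$workdir/both.txt")
spaced_contents=$(cat "$workdir/ 2 ")
rm -rf "$workdir"

echo "both.txt contains:"; echo "$both_contents"
echo "' 2 ' contains:";    echo "$spaced_contents"
```

If Bash behaves as described above, both files end up holding the 'out' and 'err' lines, the second one under a name that consists of a 2 with a space on either side.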
(This interpretation of >&word is permitted by the Single Unix Specification, which says that the behavior in this case is unspecified. Since it's unspecified, Bash is free to do whatever it wants.)
Every other Bourne shell that I have handy to test with (Dash,
FreeBSD /bin/sh, OmniOS /bin/sh, FreeBSD pdksh, and official versions
of ksh) reports the same 'illegal file descriptor name' error as most
of them did for the third line (and for the same reason; ' 2 ' is not
a file descriptor number). This too is allowed by the Single Unix
Specification; since the behavior is unspecified, we're allowed to
make it an error.
Sidebar: The oddity of Dash's behavior
I was going to say that Dash is not POSIX compliant here, but that's
wrong. POSIX specifically says that '>& word' where the word is not a
plain number is unspecified, so basically anything goes in both the
third and the fourth line. However, Dash is inconsistent and behaves
oddly. The clearest oddity is the result of the following:
x=' 2 1 '; echo hi >& $x
This produces a 'hi' on standard error and nothing else; the trailing
'1' in $x has simply vanished.
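For comparison, this sketch shows what ordinary word splitting makes of that value in argument position. It says nothing about what Dash does internally with the redirection; it only illustrates that the value splits into two words, so a redirection that uses a single target has to drop or mishandle one of them. split_words is a hypothetical helper and the default IFS is assumed.

```shell
#!/bin/sh
# split_words is a hypothetical helper: word-split its one argument and
# report how many words came out and what the first two were.
split_words() {
    # Deliberately unquoted so the shell splits on IFS whitespace.
    set -- $1
    echo "$# words: first='$1' second='$2'"
}

x=' 2 1 '
result=$(split_words "$x")
echo "$result"   # 2 words: first='2' second='1'
```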