Understanding a tricky case of Bourne shell redirection and command parsing

November 23, 2017

I recently saw this tweet by Dan Bornstein (via) that poses a Bourne shell puzzle, or more specifically a Bash puzzle:

Shell fun time! Do you know what the following script does? Did you have to run it?

#!/bin/bash
function boop {
  echo "out $1"
  echo "err $1" 1>&2
}

 x='2';   boop A >& $x
 x='2';   boop B >& "$x"
 x=' 2 '; boop C >& $x
 x=' 2 '; boop D >& "$x"

There's a number of interesting things going on here, including a Bash feature I didn't know about and surprising (to me) difference between Bourne shells, and even between Bash run as bash and run as /bin/sh. So let's talk about what's going on and where Bourne shells differ in their interpretations of this (and why).

(If you want a version of this tweet that you can test in other Bourne shells, change the 'function boop' to 'boop()'.)

To start with, we can mostly ignore the specifics of boop; its job is to produce some output to standard output and to standard error. The '1>&2' just says 'redirect file descriptor 1 to file descriptor 2' (technically it means 'make file descriptor 1 be a duplicate of file descriptor 2'). For standard output (fd 1), the number is optional, so this can also be written as '>&2', which should look sort of familiar given the last four lines.

Now we need to break down each of the last four lines, one by one, starting with the first.

x='2';   boop A >& $x

After expanding $x, this is simply 'boop A >& 2', and so all Bourne shells redirect boop's standard output to standard error. The only mild surprise here is that Bash processes redirections after doing variable expansions, which isn't really a surprise if you read the very large Bash manpage (because it documents the order; see here). I believe that this order is in fact the historical Bourne shell behavior and it also seems to be the POSIX-required order.

x='2';   boop B >& "$x"

After variable expansion, this is 'boop B >& "2"'. In some languages, putting the 2 in quotes would cause the language to see it purely as a string, blocking its interpretation as a file descriptor number. In the Bourne shell, this is not what quotes do; they mostly just block word splitting, which there isn't any of here anyway. So this is the same as 'boop B >& 2', which is the same as the first line and has the same effect.

Now things start getting both interesting and different between shells.

x=' 2 '; boop C >& $x

In Bash, invoked in full Bash mode (and not as /bin/sh), this expands $x without any particular protection from the usual word splitting on whitespace. The application of word splitting trims off the spaces at the start and the end, leaving this as basically 'boop C >& 2 ', which is simply the same as the first two lines. Dash behaves the same way as Bash does.

(The value of $x has spaces in it, but in general when $x is used unquoted those spaces get removed by word splitting after the expansion happens. This reinterpretation of $x's value is required partly because the Bourne shell doesn't have lists.)

If you run Bash as /bin/sh (or in POSIX mode in general), it doesn't do word splitting on the expanded value of $x in this context. This leaves it with what is effectively boop C >& ' 2 '. Bash then interprets this as meaning to write standard output and standard error to a file called ' 2 ', for reasons I'll cover later.

(This mode difference is documented in Chet Ramey's page on Bash's POSIX mode; see number 11.)

Most other shells (apart from Bash and Dash) interpret this as an error, reporting some variant of 'illegal file descriptor name'. These shells don't do word splitting here either, and without word splitting and whitespace trimming we don't have a digit, we have a digit surrounded by spaces, which is not a valid file descriptor number.

(Not doing word splitting here appears to be the correct POSIX behavior, based on a careful reading of portions of 2.7 Redirection. See the third-last paragraph.)

x=' 2 '; boop D >& "$x"

At this point, the expansion of $x is fully protected from having its whitespace stripped; we wind up with effectively boop D >& ' 2 '. Bash in both normal and /bin/sh mode writes both standard output and standard error to a file called ' 2 ', which is probably a good part of the surprise that Dan Bornstein had in mind. Bash interprets things this way because it has a specific feature of its own for redirecting standard output and standard error. Bash recommends writing this as '&>word' but accepts '>&word' as a synonym, with the dry note that if word is a number, 'other redirection operators apply'. Here word is no longer a number (it's a number with spaces on either side), so Bash's special >& feature takes over.

(This interpretation of >&word is permitted by the Single Unix Standard, which says that the behavior in this case is unspecified. Since it's unspecified, Bash is free to do whatever it wants.)

Every other Bourne shell that I have handy to test with (Dash, FreeBSD /bin/sh, OmniOS /bin/sh, FreeBSD pdksh, and official versions of ksh) report the same 'illegal file descriptor name' as most of them did for the third line (and for the same reason; ' 2 ' is not a file descriptor number). This too is allowed by the Single Unix Standard; since the behavior is unspecified, we're allowed to make it an error.

Sidebar: The oddity of Dash's behavior

I was going to say that Dash is not POSIX compliant here, but that's wrong. POSIX specifically says that '>& word' where the word is not a plain number is unspecified, so basically anything goes in both the third and the fourth line. However, Dash is inconsistent and behaves oddly. The most clear oddity is the results of the following:

x=' 2 1 '; echo hi >& $x

This produces a 'hi' on standard error and nothing else; the trailing '1' in $x has simply vanished.

Written on 23 November 2017.
« Delays on SMTP replies discourage apparently SMTP AUTH scanners
Shooting myself in the foot by using exec in a shell script »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Thu Nov 23 00:40:34 2017
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.