A sensible surprise (to me) in the Bourne shell's expansion of "$@"

March 9, 2016

I generally like to think that I'm pretty well up on the odd corners of the Bourne shell due to having been around Unix for a fair while. Every so often I stumble over something that shows me that I'm wrong.

So let's start with the following, taken from something Jed Davis discovered about Bash:

$ set -- one two three
$ for i in "front $@ back"; do echo $i; done
front one
two
three back
$

When I saw this, my first reaction was basically 'what?', because it didn't seem to make any sense. After I mumbled a bit on Twitter, Jed Davis found the explanation in the Single Unix Specification's description of the special parameter @:

When the expansion occurs within double-quotes, and where field splitting [...] is performed, each positional parameter shall expand as a separate field, with the provision that the expansion of the first parameter shall still be joined with the beginning part of the original word (assuming that the expanded parameter was embedded within a word), and the expansion of the last parameter shall still be joined with the last part of the original word.

The purpose of "$@" is to preserve arguments that originally have spaces in them as single arguments. So, for example:

$ set -- "one argument" "two argument"
$ for i in "$@"; do echo $i; done
one argument
two argument
$ for i in "$*"; do echo $i; done
one argument two argument
$
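One way to see the difference without relying on how echo prints things is to count the arguments that result. This is only a small sketch; count_args is a made-up helper name, not anything standard, and the unquoted case assumes the default IFS.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it reports how many
# arguments it was called with.
count_args() {
    echo "$#"
}

set -- "one argument" "two argument"

count_args "$@"   # 2: each positional parameter stays a single field
count_args "$*"   # 1: everything is joined into one field
count_args $@     # 4: unquoted, so each parameter is split on whitespace
```

The last line is there only for contrast; it is the unquoted form that loses the original argument boundaries.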

This is what the first part of the SuS specification describes (up to 'shall expand as a separate field'). But this definition raises a question: what is the result of the expansion if you have not a simple "$@" but instead something with additional text inside the double quotes? One answer would be to completely turn off the special splitting and argument-preserving behavior of "$@" (making it identical to "$*" here), but that probably wouldn't be very satisfying. Traditional Unix and thus SuS instead says that you should continue field splitting but pretend that any front text is attached to the first argument and any back text is attached to the last one.

(Since it's still text inside a "...", the front and rear text is not subject to any word splitting; it's attached untouched as a single unit.)
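To make the field boundaries visible, here is a minimal sketch that prints each resulting field in brackets. The show_fields helper is a hypothetical name of my own, and the front and back text deliberately contain spaces to show that they are not split.

```shell
#!/bin/sh
# show_fields is a hypothetical helper: it prints every argument it
# receives on its own line, wrapped in brackets so that the field
# boundaries are easy to see.
show_fields() {
    printf '[%s]\n' "$@"
}

set -- one two three

# The spaces in the front and back text stay inside the first and
# last fields instead of creating extra fields.
show_fields "front text $@ back text"
```

In the shells I would expect this to be run in (bash, dash), this prints three lines: [front text one], then [two], then [three back text].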

When I saw this, my first and not well thought out expectation was that any leading and trailing text would be subject to regular word splitting and thus be taken as separate, additional arguments. Of course this doesn't actually make sense if I think about it for real, because there is normally no word splitting inside double quotes. Thus, the traditional Unix and SuS behavior is perfectly reasonable here and makes sense from an algorithmic perspective.

Given all this, the result of the following is not really surprising:

$ set -- one two three
$ for i in "$@ $@"; do echo $i; done
one
two
three one
two
three
$
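The same result can be checked by counting and bracketing the fields. This is a sketch under the same assumptions as before; show_fields and count_args are hypothetical helpers, not standard commands.

```shell
#!/bin/sh
# Hypothetical helpers: show_fields prints each argument in brackets,
# count_args prints how many arguments it received.
show_fields() {
    printf '[%s]\n' "$@"
}
count_args() {
    echo "$#"
}

set -- one two three

# The last field of the first "$@" and the first field of the second
# "$@" are joined by the literal space between them, so three
# parameters twice over give five fields, not six.
count_args "$@ $@"
show_fields "$@ $@"
```

The count comes out as 5, and the middle field is [three one], matching the loop output above.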

(Writing this entry has been useful in forcing me to confront some of my own fuzzy thinking around the whole area of "$@", as you can tell from the story of my first reaction to this.)

Last modified: Wed Mar 9 23:33:06 2016