Wandering Thoughts archives

2016-03-09

A sensible surprise (to me) in the Bourne shell's expansion of "$@"

I generally like to think that I'm pretty well up on the odd corners of the Bourne shell due to having been around Unix for a fair while. Every so often I stumble over something that shows me that I'm wrong.

So let's start with the following, taken from something Jed Davis discovered about Bash:

$ set -- one two three
$ for i in "front $@ back"; do echo $i; done
front one
two
three back
$

When I saw this, my first reaction was basically 'what?', because it didn't seem to make any sense. After I mumbled a bit on Twitter, Jed Davis found the explanation in the Single Unix Specification here:

When the expansion occurs within double-quotes, and where field splitting [...] is performed, each positional parameter shall expand as a separate field, with the provision that the expansion of the first parameter shall still be joined with the beginning part of the original word (assuming that the expanded parameter was embedded within a word), and the expansion of the last parameter shall still be joined with the last part of the original word.

The purpose of "$@" is to preserve arguments that originally have spaces in them as single arguments. So, for example:

$ set -- "one argument" "two argument"
$ for i in "$@"; do echo $i; done
one argument
two argument
$ for i in "$*"; do echo $i; done
one argument two argument
$

This is what the first part of the SuS specification describes (up to 'shall expand as a separate field'). But this definition opens up a question: what is the result of the expansion if you have not a simple "$@" but instead something with additional text inside the double quotes? One answer would be to completely turn off the special splitting and argument preserving behavior of "$@" (making it identical to "$*" here), but that probably wouldn't be very satisfying. Traditional Unix and thus SuS instead say that you should continue field splitting but pretend that any front text is attached to the first argument and any back text is attached to the last one.

(Since it's still text inside a "...", the front and rear text is not subject to any word splitting; it's attached untouched as a single unit.)
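To make the field boundaries visible, here is a minimal sketch that prints each resulting field wrapped in angle brackets. The helper names show_fields and demo_front_back are made up for this illustration, and the arguments deliberately contain spaces so you can see that neither they nor the surrounding text get split.

```shell
#!/bin/sh
# Print each argument on its own line, wrapped in <...>, so field
# boundaries stay visible even when an argument contains spaces.
show_fields() {
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

# Hypothetical helper: set some positional parameters, then expand
# "$@" with extra text on both sides inside the same double quotes.
demo_front_back() {
    set -- "one argument" two "three  spaces"
    show_fields "front text $@ back text"
}

demo_front_back
# Prints:
# <front text one argument>
# <two>
# <three  spaces back text>
```

The front text rides along with the first parameter and the back text with the last one, and the middle parameter comes through on its own.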

When I saw this, my first and not well thought out expectation was that any leading and trailing text would be subject to regular word splitting and thus be taken as separate, additional arguments. Of course this doesn't actually make sense if I think about it for real, because there is normally no word splitting inside double quotes. Thus, the traditional Unix and SuS behavior is perfectly reasonable here and makes sense from an algorithmic perspective.

Given all this, the result of the following is not really surprising:

$ set -- one two three
$ for i in "$@ $@"; do echo $i; done
one
two
three one
two
three
$

(Writing this entry has been useful in forcing me to confront some of my own fuzzy thinking around the whole area of "$@", as you can tell from the story of my first reaction to this.)

BourneDollarAtExpansionSurprise written at 23:33:06

2016-03-07

Why it makes sense for true and false to ignore their arguments

It's standard when writing Unix command line programs to make them check their arguments and complain if the usage is incorrect. It's reasonably common to do this even for programs that don't take options or positional arguments. After all, if your command is supposed to take no arguments, it's really an error if someone runs it and gives it arguments.

(Not all scripts, programs, and so on actually check this, because you usually have to go at least a little bit out of your way to look at the argument count. But it's the kind of minor nit you might get code review comments about, or an issue report.)

true and false are an exception to this, in that they more or less completely ignore any arguments given to them. Part of this behavior is historical; the V7 /bin/true and /bin/false were extremely minimal, and when you're being minimal it's easiest to not even look at the arguments. But beyond the history, I think that this is perfectly sensible behavior for true and false because it makes them universal substitutes for other commands, for when you want to null out a command so that it does nothing.
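A quick way to see this for yourself is a sketch like the following. The helper name status_of and the junk arguments are invented for illustration. Note that in most shells true and false are builtins, so this exercises the shell's versions; use a full path such as /bin/true if you want to try the external programs (the exact path is system dependent).

```shell
#!/bin/sh
# Run a command with whatever arguments were given and report only
# its exit status; any output it produces is discarded.
status_of() {
    "$@" >/dev/null 2>&1
    echo $?
}

status_of true                             # prints 0
status_of true -x --no-such-option file    # prints 0
status_of false                            # prints 1
status_of false -x --no-such-option file   # prints 1
```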

Want to make a command do nothing but always succeed? Simple: 'mv command command.real; ln -s /bin/true command'. Want to do the same thing but have the command always fail? Use false instead of true. Sure, you can do the same thing with shell scripts that deliberately ignore the arguments and just do 'exit 0' or 'exit 1', but this is a little bit simpler and matches the historical behavior.
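Here is a hedged sketch of that recipe, done entirely in a scratch directory so nothing real is touched. The command name 'mycommand' and the function name null_out_demo are made up, and the sketch assumes that true exists as a standalone binary in /bin or /usr/bin. On systems where true is part of a multi-call binary that dispatches on the name it was invoked under (BusyBox is the usual example), a symlink with a different name may well not behave like true.

```shell
#!/bin/sh
# Sketch: replace a command with a symlink to true so that it does
# nothing but still succeeds. 'mycommand' is a made-up name.
null_out_demo() {
    # Locate an external true binary (the path varies between systems).
    truebin=
    for candidate in /bin/true /usr/bin/true; do
        if [ -x "$candidate" ]; then
            truebin=$candidate
            break
        fi
    done
    [ -n "$truebin" ] || { echo "no external true found" >&2; return 2; }

    dir=$(mktemp -d) || return 2

    # A stand-in command that prints something and then fails.
    printf '#!/bin/sh\necho "doing real work with: $*"\nexit 3\n' > "$dir/mycommand"
    chmod +x "$dir/mycommand"

    # The recipe from the text: keep the original, symlink true in its place.
    mv "$dir/mycommand" "$dir/mycommand.real"
    ln -s "$truebin" "$dir/mycommand"

    # The arguments are ignored, so the exit status should be 0.
    "$dir/mycommand" --some-option some-file
    status=$?

    rm -rf "$dir"
    return "$status"
}

null_out_demo && echo "nulled-out command succeeded"
```

Swapping false in for true in the candidate list gives the 'always fails' version.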

(You can also do this in shell scripts as a way of creating a 'don't actually do anything' mode, but there are probably better patterns there.)
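As a rough sketch of what I mean, assuming a made-up cleanup function and a made-up RM variable, a script can pick between the real command and true at run time. Because true ignores its arguments, the rest of the script doesn't have to change.

```shell
#!/bin/sh
# Hypothetical cleanup routine with a do-nothing mode. When the first
# argument is "yes", RM holds 'true', so the "removal" quietly
# succeeds without touching anything.
cleanup() {
    dry_run=$1
    target=$2
    if [ "$dry_run" = yes ]; then
        RM=true
    else
        RM=rm
    fi
    # $RM is deliberately unquoted so that it is run as a command name.
    $RM -f "$target"
}

tmp=$(mktemp) || exit 1
cleanup yes "$tmp"
[ -e "$tmp" ] && echo "dry run: file kept"
cleanup no "$tmp"
[ -e "$tmp" ] || echo "real run: file removed"
```

The weakness of this pattern is that the dry run is completely silent, which is part of why other approaches (such as echoing the command instead) are often nicer.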

On that note, it's interesting that although GNU true and false have command line options that will cause them to produce output, there is no way to get them to return the wrong exit status. And while they respond to --help and --version, they silently ignore other options (as opposed to, say, reporting a syntax error).

(This entry was sparked by Zev Weiss's mention of true in his comment on this entry.)

Sidebar: true and false in V7

In V7 Unix, true is an empty file and false is a file that is literally just 'exit 255'. Neither has a #! line at the start of the file, because that came in later. That true is empty instead of 'exit 0' saves V7 a disk block, which probably mattered back then.
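You can recreate the idea with a modern shell. This is only a sketch in a scratch directory; the function names are invented, the scripts are run through sh explicitly since they have no #! line, and the exit value of the one-line script is a parameter of the sketch.

```shell
#!/bin/sh
# Sketch: an empty file run as a shell script exits with status 0,
# no matter what arguments it is given.
empty_script_status() {
    dir=$(mktemp -d) || return 2
    : > "$dir/true"                 # zero-length file
    sh "$dir/true" ignored args
    status=$?
    rm -rf "$dir"
    echo "$status"
}

# Same idea for a one-line script consisting solely of 'exit N'.
one_line_exit_status() {
    dir=$(mktemp -d) || return 2
    echo "exit $1" > "$dir/false"
    sh "$dir/false" ignored args
    status=$?
    rm -rf "$dir"
    echo "$status"
}

empty_script_status        # prints 0
one_line_exit_status 1     # prints 1
one_line_exit_status 255   # prints 255
```

Any non-zero value in the one-line script makes it work as a false.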

TrueFalseAndArguments written at 23:13:13

