How SuS probably requires the 'run at least once' xargs behavior

April 25, 2013

A commenter left a long comment on my entry about how xargs behaves with no input, arguing that the Single Unix Specification for xargs actually requires it to not run if standard input is empty. I think it's more likely to be the other way around, so today I want to run down why I think the SuS probably requires this annoying behavior.

There are two important sections of the SuS xargs specification here and I'm going to quote both, bolding important bits:

The xargs utility shall construct a command line consisting of the utility and argument operands specified followed by as many arguments read in sequence from standard input as fit in length and number constraints specified by the options. The xargs utility shall then invoke the constructed command line and wait for its completion. This sequence shall be repeated until one of the following occurs:

  • An end-of-file condition is detected on standard input.

[... other conditions elided ...]

[...] The utility named by utility shall be executed one or more times until the end-of-file is reached or the logical end-of-file string is found. [...]

Now we get to play the fun game of interpreting standards. The easiest place to play this game is the last sentence I quoted, which says both that the utility shall be executed at least once and that this happens until end-of-file is reached. If end-of-file is reached immediately, which takes precedence? In the style of reading standards that I've absorbed, explicit statements generally trump implications; that would mean that the explicit promise that the utility shall be executed at least once trumps the potential implication of not running it on immediate EOF.

The first paragraph as a whole offers a similar conflict. It is easy to read it as a series of steps: first read in as many arguments as you can that fit, then run the command, and only then check for exit conditions and repeat if they are not met. You don't check for exit conditions before you run the command once because that's not what the series of steps tells you to do, and 'zero arguments' is not ruled out as a valid number of arguments to read from standard input; ergo, xargs runs the command line once even on immediate EOF. You can also read it as a general description instead of a series of steps, with the 'this sequence shall be repeated until ...' forming the framing procedure around the specific two steps used to form and run each command line; in this reading it's correct to run zero times if there is an immediate end of file on standard input since the framing loop's exit condition has been met.
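To make the difference between the two readings concrete, here is a minimal sketch in Python. It is only a model of the argument, not how any real xargs is implemented: the function names, the `per_call` batch size (standing in for the spec's length and number constraints), and the `run` callback are all hypothetical. The 'series of steps' reading behaves like a do-while loop and the 'framing procedure' reading behaves like an ordinary while loop.

```python
from typing import Callable, List


def xargs_steps_reading(args: List[str], per_call: int,
                        run: Callable[[List[str]], None]) -> int:
    """'Series of steps' reading: build a command line, run it, and only
    then check for end-of-file (a do-while loop). Returns the run count."""
    pos = 0
    runs = 0
    while True:
        batch = args[pos:pos + per_call]  # may be empty on immediate EOF
        pos += len(batch)
        run(batch)                        # the command is always run first
        runs += 1
        if pos >= len(args):              # EOF is only noticed afterwards
            break
    return runs


def xargs_framing_reading(args: List[str], per_call: int,
                          run: Callable[[List[str]], None]) -> int:
    """'Framing procedure' reading: the exit condition guards every
    iteration, including the first (a while loop). Returns the run count."""
    pos = 0
    runs = 0
    while pos < len(args):                # immediate EOF means zero runs
        batch = args[pos:pos + per_call]
        pos += len(batch)
        run(batch)
        runs += 1
    return runs
```

In this model the two readings agree whenever there is at least one argument on standard input. They differ only for empty input, where the first runs the command once with no extra arguments and the second never runs it.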

If we read the first paragraph using an 'explicit trumps implicit' rule, then I think that we have to conclude that the paragraph is the set of steps that xargs is intended to follow as it executes, because this is exactly how the paragraph is written. This interpretation is reinforced by the 'one or more times' language in the later paragraph.

None of this is unambiguous; the SuS specification never comes out and says outright 'xargs runs once even if it reads no arguments'. But given how much the usual extremely legalistic, 'every word and phrase and ordering decision counts' approach to reading standards pushes us towards the 'xargs runs once on EOF' interpretation, I think it's probably what SuS actually requires.

(Note that none of this matters in practice. As covered in the first entry, existing systems have no common behavior. The closest you can get is to always specify -r so that xargs does not run the command when it reads no arguments, which works on GNU findutils, sufficiently recent FreeBSD, and OpenBSD.)
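Since behavior differs between systems, one way to find out what a particular machine does is to probe it. The following is a rough sketch in Python, assuming that `xargs` and an `echo` binary are on the PATH; the function name `runs_on_empty_input` and the marker string `ran` are made up for illustration.

```python
import shutil
import subprocess
from typing import Optional, Tuple


def runs_on_empty_input(extra_flags: Tuple[str, ...] = ()) -> Optional[bool]:
    """Probe the local xargs with empty standard input.

    Returns True if the command was run once anyway, False if it was not
    run, and None if xargs is missing or exits with an error (for example
    because it does not recognize one of extra_flags).
    """
    xargs = shutil.which("xargs")
    if xargs is None:
        return None
    result = subprocess.run(
        [xargs, *extra_flags, "echo", "ran"],
        input="",               # immediate end-of-file on standard input
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None             # cannot tell; do not guess
    return "ran" in result.stdout


if __name__ == "__main__":
    print("plain xargs:", runs_on_empty_input())
    print("xargs -r:   ", runs_on_empty_input(("-r",)))
```

What this prints depends entirely on the local xargs, which is rather the point.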

PS: this is not the most crazy thing in the SuS xargs specification. If you care about xargs portability and want to be horrified, read the description of -E carefully.

(Also, these crazy things are almost certainly not the fault of the SuS authors.)

Written on 25 April 2013.