Two xargs gotchas that you may not know about

May 1, 2013

I know, I've been harping on xargs a bit lately. But this stuff is important because most people's vague intuitions about how xargs behaves are actually wrong.

If you're like most people, you probably vaguely think that xargs operates on lines of input and the purpose of the GNU -0 extension to xargs (and find et al) is so that some joker putting a newline in a file name doesn't cause the world to blow up. Actually it's much worse than that.

The simple way to put this is that xargs doesn't operate on lines; it operates on words. Words are the same as lines only if your lines don't have any whitespace, backslashes, single quotes (') or double quotes ("), all of which xargs interprets in various ways. Oh, and blank lines are neither errors nor empty arguments under normal circumstances; they are simply word-separating whitespace. In short, newlines are only the beginning of the things that nasty people can put in their filenames to give you heartburn.

(Normally you don't see any of this because your input to xargs is well formed and simple.)
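
To see the word splitting for yourself, here is a small sketch. It assumes a POSIX-ish shell and an xargs that follows the usual quoting rules; show_words is just a made-up helper name for the demonstration, and the brackets in the output are there so you can see exactly where each argument starts and ends.

```shell
# show_words is a hypothetical helper: it runs printf once per argument
# (-n 1), so each word that xargs found is printed in its own brackets.
show_words() {
    xargs -n 1 printf '[%s]\n'
}

# A space splits one "line" into two arguments.
printf '%s\n' 'my file.txt' | show_words
# [my]
# [file.txt]

# Double quotes group words and are then removed.
printf '%s\n' '"one two" three' | show_words
# [one two]
# [three]

# A backslash escapes the next character and disappears.
printf '%s\n' 'a\b' | show_words
# [ab]

# Blank lines are just separators, not empty arguments.
printf 'a\n\n\nb\n' | show_words
# [a]
# [b]
```

An unmatched single or double quote is typically treated as an error rather than as a literal character, which is its own kind of surprise.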

The other trap (as I alluded to) is what xargs portably does if you don't give an explicit -E argument. Without one, some versions of xargs will assume that a line with only an underscore (_) actually means the (logical) end of file and won't read any further input; POSIX leaves the choice up to the implementation. It will probably surprise no one that Solaris 10 update 8 (that bastion of old times) behaves this way. Fortunately Linux, FreeBSD, and OpenBSD don't appear to do so.
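
If you want the same behavior everywhere, the defensive move is to always pass -E yourself. What follows is a hedged sketch, assuming an xargs that implements the POSIX -E option (where an empty string is documented as turning the end-of-file string off). The STOP marker is an arbitrary string chosen for illustration.

```shell
# Explicitly disable the logical end-of-file string, so a lone "_" line
# is passed through as an ordinary argument.
printf 'a\n_\nb\n' | xargs -E '' echo
# a _ b

# With an explicit end-of-file string, input after that line is ignored.
printf 'a\nSTOP\nb\n' | xargs -E STOP echo
# a
```

The first form is the one to reach for in scripts that have to run on old or unusual systems.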

(One of the morals here is that sometimes GNU programs make important innovations, as I believe that xargs -0 and find ... -print0 came from GNU.)
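
For contrast, here is roughly what the null-delimited mode looks like. This is a sketch that assumes an xargs with -0 (GNU and the BSDs have it) and a printf that understands \0 in its format string; the sample names are invented.

```shell
# Each NUL-terminated string becomes exactly one argument; spaces,
# quotes and backslashes are passed through untouched.
printf '%s\0' 'my file.txt' "it's here" 'back\slash' |
    xargs -0 -n 1 printf '[%s]\n'
# [my file.txt]
# [it's here]
# [back\slash]
```

With real files, the producing side would be something like find . -type f -print0 in place of the printf.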

Written on 01 May 2013.

Last modified: Wed May 1 23:48:39 2013