Two xargs gotchas that you may not know about
I know, I've been harping on xargs a bit lately. But this stuff is
important because most people's vague intuitions about how xargs
behaves are actually wrong.
If you're like most people, you probably vaguely think that xargs
operates on lines of input and the purpose of the GNU -0 extension to
xargs (and find et al) is so that some joker putting a newline in a
file name doesn't cause the world to blow up. Actually it's much worse
than that.
The simple way to put this is that xargs doesn't operate on lines, it
operates on words. Words are the same as lines only if your lines
don't have any whitespace, backslashes, single quotes (') or double
quotes ("), all of which xargs will interpret in various ways. Oh,
and blank lines are neither errors nor empty arguments under normal
circumstances, they are simply word-separating whitespace. In short,
newlines are only the beginning of the things that nasty people can put
in their filenames to give you heartburn.
(Normally you don't see any of this because your input to xargs is
well formed and simple.)
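
As a rough illustration of the word splitting described above, here is
a small sketch. The sample strings and the variable names (words,
quoted, nul) are made up for this example, and it assumes an xargs that
supports -0, such as the GNU or BSD versions.

```shell
#!/bin/sh
# Default mode: xargs splits on any whitespace, so the first line turns
# into two arguments and the blank line simply disappears.
words=$(printf 'one two\n\nthree\n' | xargs -n 1 echo)
echo "$words"
# one
# two
# three

# Quotes and backslashes are interpreted rather than passed through:
# the input line here is   "a b" c\ d
quoted=$(printf '"a b" c\\ d\n' | xargs -n 1 echo)
echo "$quoted"
# a b
# c d

# With -0, only a NUL byte separates arguments; spaces and quotes are
# taken literally.
nul=$(printf 'one two\0a "b"\0' | xargs -0 -n 1 echo)
echo "$nul"
# one two
# a "b"
```

The -n 1 is only there to make each argument visible on its own line;
it has no effect on how xargs splits its input.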
The other trap (as I alluded to) is the portable behavior of xargs
if you don't give an explicit -E argument. If you don't, some versions
of xargs will assume that a line with only an underscore (_) actually
means the (logical) end of file and won't read any further input. It
will probably surprise no one that Solaris 10 update 8 (that bastion of
old times) behaves this way. Fortunately Linux, FreeBSD, and OpenBSD
don't appear to do so.
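
A minimal sketch of pinning the end-of-file behavior down with -E,
instead of relying on whatever the local default happens to be. The
input and the variable names (cut, full) are invented for illustration.
Passing an empty string to -E is how POSIX describes turning the
feature off; this assumes your xargs follows that.

```shell
#!/bin/sh
# Explicitly ask for the old behavior: a lone _ ends the input, so the
# "b" after it is never read.
cut=$(printf 'a\n_\nb\n' | xargs -E _ echo)
echo "$cut"
# a

# Explicitly disable the logical EOF string: the underscore is now just
# another argument.
full=$(printf 'a\n_\nb\n' | xargs -E '' echo)
echo "$full"
# a _ b
```

If a script may have to run on a system where the underscore default is
still in effect, the second form is the one to reach for.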
(One of the morals here is that sometimes GNU programs make important
innovations, as I believe that xargs -0 and find ... -print0 came
from GNU.)