find mostly doesn't need xargs today on modern Unixes

January 27, 2021

I've been using Unix for long enough that 'find | xargs' is a reflex. When I started and for a long time afterward, xargs was your only choice for efficiently executing a command over a bunch of find results. If you didn't want to run one grep or rm or whatever per file (which was generally reasonably slow in those days), you reached for 'find ... -print | xargs ...'. There were some gotchas in traditional xargs usage, and one of them was why GNU xargs, GNU find, and various other things started growing options to use the null byte as an argument terminator instead of the usual (and surprising) definition, where xargs splits its input on whitespace and treats quotes and backslashes specially. Over time I adapted to these and was soon mostly using 'find ... -print0 | xargs -0 ...'.
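As a small illustration of the gotcha, here is a sketch that runs both pipelines over a scratch directory holding one file with a space in its name. The directory, the file name, and the 'needle' search string are all made up for the demonstration, and it assumes a find and xargs that support -print0 and -0.

```shell
# Scratch directory with one awkwardly named file (hypothetical names).
demo=$(mktemp -d)
printf 'needle\n' > "$demo/two words.txt"

# Traditional pipeline: xargs splits "two words.txt" at the space, so grep
# is handed two names that don't exist and finds nothing.
plain=$(find "$demo" -type f -print | xargs grep -l needle 2>/dev/null) || true

# Null-terminated pipeline: the file name reaches grep in one piece.
nul=$(find "$demo" -type f -print0 | xargs -0 grep -l needle)

printf 'plain: [%s]\nnul:   [%s]\n' "$plain" "$nul"
rm -rf "$demo"
```

The first result should come back empty while the second names the file, which is the whole reason the null-terminated options exist.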

For usage with find, all of this is unnecessary on a modern Unix and has been for some time, because find folded this into itself. Modern versions of find don't have just the traditional '-exec', which runs one command per file, but also an augmented version of it which aggregates the arguments together like xargs does. This augmented version is used by ending the '-exec' with '+' instead of ';', like so:

find . ... -exec grep -H whatever '{}' +

(I'm giving grep the -H argument for reasons covered here.)
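To see the difference between the two forms, one rough approach is to count how many times the command gets run. This sketch uses echo as a stand-in command over three made-up empty files; each run of echo prints one line, so the line count is the number of invocations.

```shell
# Scratch directory with three files (hypothetical names).
demo=$(mktemp -d)
touch "$demo/a" "$demo/b" "$demo/c"

# ';' form: one echo per file, so three lines of output.
per_file=$(find "$demo" -type f -exec echo '{}' ';' | wc -l | tr -d ' ')

# '+' form: the three names are aggregated into a single echo, so one line.
batched=$(find "$demo" -type f -exec echo '{}' + | wc -l | tr -d ' ')

echo "per-file runs: $per_file, batched runs: $batched"
rm -rf "$demo"
```

With only three short names everything fits in a single batch. With enough files, find splits the arguments across several runs, much as xargs does.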

Although I sometimes still reflexively use 'find | xargs', more and more I'm trying to use the simple form of just find with this augmented -exec. My reflexes can learn new tricks, eventually.

This augmented form of -exec is in the Single Unix Specification for find, so unsurprisingly it's not just in GNU find but also in the find of OpenBSD, FreeBSD, NetBSD, and Illumos. I haven't tried to look up a find manpage in whatever commercial Unixes are left (probably at least macOS and AIX). Based on the rationale section of the SUS find, this very convenient find feature was introduced in System V R4. The Single Unix Specification also explains why they didn't adopt the arguably more Unixy option of '-print0' for null-terminated output.

(In practice everyone has adopted -print0 as well, even OpenBSD and Illumos. I assume without checking that they also all have 'xargs -0', because it doesn't make much sense to adopt one without the other.)

PS: Unfortunately this feature is not quite as flexible as it looks. Both the specification and actual find implementations require the '{}' to be at the end of the command, instead of anywhere in it. This means you can't do something like 'find ... -exec mv {} /some/dir +'. This makes life slightly simpler for find's code and probably only rarely matters for actual usage.
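One common way around this, if you need it, is to have find run a small inline shell script that rearranges the arguments. The following is only a sketch under assumptions: the scratch directories and the '*.log' pattern are invented for the example, and the destination is passed in as the script's first argument.

```shell
# Scratch source and destination directories (hypothetical names).
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/one.log" "$src/two.log"

# '{}' is still last as far as find is concerned. The inline sh script takes
# the destination as $1, shifts it away, and then puts the aggregated file
# names ("$@") in front of it. The bare 'sh' after the script fills in $0.
find "$src" -type f -name '*.log' \
    -exec sh -c 'dest=$1; shift; mv "$@" "$dest"' sh "$dest" '{}' +

moved=$(ls "$dest" | wc -l | tr -d ' ')
left=$(ls "$src" | wc -l | tr -d ' ')
echo "moved: $moved, left behind: $left"
rm -rf "$src" "$dest"
```

This costs one extra sh process per batch, which is still far cheaper than one mv per file.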

Written on 27 January 2021.