find mostly doesn't need xargs today on modern Unixes

January 27, 2021

I've been using Unix for long enough that 'find | xargs' is a reflex. When I started and for a long time afterward, xargs was your only choice for efficiently executing a command over a bunch of find results. If you didn't want to run one grep or rm or whatever per file (which was generally reasonably slow in those days), you reached for 'find ... -print | xargs ...'. There were some gotchas in traditional xargs usage, and one of them was why GNU xargs, GNU find, and various other things started growing options to use the null byte as an argument terminator instead of xargs's usual (and surprising) definition, which splits on whitespace and treats quotes and backslashes specially. Over time I adapted to these and soon was mostly using 'find ... -print0 | xargs -0 ...'.
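As a rough illustration of the gotcha, here is a small sketch that counts how many arguments xargs hands to a command in each mode. The scratch directory and file names are made up for the example, and it assumes mktemp's directory path itself contains no whitespace.

```shell
# Scratch directory with one awkward file name (all names are hypothetical).
tmpdir=$(mktemp -d)
touch "$tmpdir/plain.txt" "$tmpdir/has space.txt"

# Traditional form: xargs splits its input on whitespace, so
# "has space.txt" arrives as two separate arguments.
naive_count=$(find "$tmpdir" -type f -print | xargs sh -c 'echo "$#"' sh)

# Null-terminated form: each file name arrives as exactly one argument.
safe_count=$(find "$tmpdir" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

echo "naive: $naive_count, null-terminated: $safe_count"
rm -rf "$tmpdir"
```

With these two files, the naive pipeline should report three arguments and the null-terminated one two.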

For usage with find, all of this is unnecessary on a modern Unix and has been for some time, because find folded this into itself. Modern versions of find don't have just the traditional '-exec', which runs one command per file, but also an augmented version of it which aggregates the arguments together like xargs does. This augmented version is used by ending the '-exec' with '+' instead of ';', like so:

find . ... -exec grep -H whatever '{}' +

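(I'm giving grep the -H argument so that it always prints the file name, even if a batch winds up containing only a single file; the reasons are covered here.)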
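To see the difference between the two terminators, this sketch counts how many times the command gets run in each form. The directory and file names are invented for the example.

```shell
# Three scratch files (hypothetical names).
tmpdir=$(mktemp -d)
touch "$tmpdir/a" "$tmpdir/b" "$tmpdir/c"

# ';' form: the command runs once per file, so it prints one line per file.
per_file_runs=$(find "$tmpdir" -type f -exec sh -c 'echo run' sh '{}' \; | grep -c run)

# '+' form: the file names are batched into as few commands as possible,
# which for three short names is a single command.
batched_runs=$(find "$tmpdir" -type f -exec sh -c 'echo run' sh '{}' + | grep -c run)

echo "with ';': $per_file_runs runs, with '+': $batched_runs run"
rm -rf "$tmpdir"
```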

Although I sometimes still reflexively use 'find | xargs', more and more I'm trying to use the simple form of just find with this augmented -exec. My reflexes can learn new tricks, eventually.

This augmented form of -exec is in the Single Unix Specification for find, so unsurprisingly it's not just in GNU find but also OpenBSD, FreeBSD, NetBSD, and Illumos. I haven't tried to look up a find manpage in whatever commercial Unixes are left (probably at least macOS and AIX). Based on the rationale section of the SUS find, this very convenient find feature was introduced in System V R4. The Single Unix Specification also explains why they didn't adopt the arguably more Unixy option of '-print0' for null-terminated output.

(In practice everyone has adopted -print0 as well, even OpenBSD and Illumos. I assume without checking that they also all have 'xargs -0', because it doesn't make much sense to adopt one without the other.)

PS: Unfortunately this feature is not quite as flexible as it looks. Both the specification and actual find implementations require the '{}' to be at the end of the command, instead of anywhere in it. This means you can't do something like 'find ... -exec mv {} /some/dir +'. This makes life slightly simpler for find's code and probably only rarely matters for actual usage.

Comments on this page:

By Stephen Kitt at 2021-01-27 01:01:01:

GNU mv has a useful "mv -t" variant which works around the "-exec ... {} +" limitation (and avoids various sources of ambiguity); same goes for cp.
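A minimal sketch of the "mv -t" approach, assuming GNU mv (the -t option is not in POSIX, so other mv implementations may lack it). The directory layout and file names are hypothetical.

```shell
# Scratch source and destination directories (hypothetical names).
tmpdir=$(mktemp -d)
mkdir "$tmpdir/src" "$tmpdir/dest"
touch "$tmpdir/src/one.log" "$tmpdir/src/two.log"

# GNU mv only: -t names the target directory up front, which leaves
# '{}' free to be the last thing before the '+'.
find "$tmpdir/src" -type f -name '*.log' -exec mv -t "$tmpdir/dest" -- '{}' +

moved_count=$(ls "$tmpdir/dest" | wc -l | tr -d ' ')
echo "moved $moved_count files"
rm -rf "$tmpdir"
```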

You might like "-execdir" on find implementations which have it; it avoids some race conditions on the paths involved by running commands from the directory containing the files to be processed.
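As a small sketch of what "-execdir" does, assuming a find that has it (GNU find and the BSD finds do; it isn't in POSIX), this runs pwd from the directory that contains the matched file. The names are made up. GNU find refuses to use -execdir if $PATH contains the current directory or a relative entry, so this also assumes a sane $PATH.

```shell
# One file in a subdirectory (hypothetical names).
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub"
touch "$tmpdir/sub/target.txt"

# -execdir runs the command with its working directory set to the
# directory holding the match, not the directory find was started in.
ran_in=$(find "$tmpdir" -type f -name target.txt -execdir pwd \;)
ran_in_name=$(basename "$ran_in")

echo "command ran in: $ran_in"
rm -rf "$tmpdir"
```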

I can confirm that macOS Catalina supports the find … -exec … + form. I didn't really think about it; I've been assuming find will complain if it doesn't support it.

No complaints so far :)

Additionally, xargs has the -P argument, which allows parallelization; that's a gain on modern systems with more than one core. As far as I know, find itself doesn't offer anything similar.
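A rough sketch of the parallel form, assuming an xargs with -P (GNU and the BSDs have it, but it isn't in POSIX). The file names and the choice of four jobs are arbitrary.

```shell
# Four scratch files (hypothetical names).
tmpdir=$(mktemp -d)
touch "$tmpdir/a.txt" "$tmpdir/b.txt" "$tmpdir/c.txt" "$tmpdir/d.txt"

# -n 1 gives each command a single file; -P 4 lets up to four of those
# commands run at once. xargs waits for all of them before it exits.
find "$tmpdir" -type f -name '*.txt' -print0 |
    xargs -0 -n 1 -P 4 sh -c 'cp "$1" "$1.bak"' sh

backup_count=$(find "$tmpdir" -type f -name '*.bak' | wc -l | tr -d ' ')
echo "made $backup_count backups"
rm -rf "$tmpdir"
```

Output from parallel jobs can interleave, so this pattern is most comfortable when each job writes to its own file, as here.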

By Barry at 2021-01-27 21:32:13:

It seems to work on Solaris 9 and AIX 7.1, neither of which is exactly new.

I wish I'd found out about this feature years ago!

Note too that modern find also has -delete, which removes the need to spawn rm (whether via xargs or even just -exec … +).

This isn’t just a cleaner shortcut, it also takes away all the overhead that even find | xargs still has, which means the basic Unix toolbox can get you out of something-created-50-million-files-in-one-directory messes. Previously this level of (non)overhead was only accessible via something like a perl -e 'unlink while readdir'-type oneliner.

(This exists in find in the libre BSDs and thus also on macOS as well.)
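A minimal sketch of -delete, assuming a find that has it (it isn't in POSIX). The file names are hypothetical. The tests go before -delete; putting -delete first would delete everything find visits.

```shell
# Some junk to remove and one file to keep (hypothetical names).
tmpdir=$(mktemp -d)
touch "$tmpdir/keep.txt" "$tmpdir/junk1.tmp" "$tmpdir/junk2.tmp"

# find unlinks the matching files itself; no rm process is spawned.
find "$tmpdir" -type f -name '*.tmp' -delete

remaining=$(ls "$tmpdir")
echo "left behind: $remaining"
rm -rf "$tmpdir"
```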

By Anonymous at 2021-01-31 12:45:05:

"Both the specification and actual find implementations require the '{}' to be at the end of the command, instead of anywhere in it. This means you can't do something like 'find ... -exec mv {} /some/dir +'."

Odd. On at least Fedora 33 / find 4.7.0, this (not having '{}' at the end) works as expected when I use -exec with ';'. Only when I use '+' do I get the error 'find: missing argument to `-exec''. If I recall correctly, not having '{}' at the end in combination with ';' also worked on AIX as far back as 4.3.3.

In other words, this works for me :

find . -type f -exec mv {} /somedir \;

By Icarus Sparry at 2021-01-31 16:29:03:

The "find ... -exec ... {} +" was introduced by David Korn 30 or so years ago.

By Anonymous at 2021-02-08 11:57:47:

You can get around the limitation of {} needing to be at the end of the command by using sh: find . -exec sh -c 'exec mv -- "$@" destination/' sh {} +
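Spelled out as a runnable sketch, the sh workaround looks something like this. Unlike the one-liner above, this variant passes the destination in as the first argument so it needn't be written into the quoted script; the directory and file names are hypothetical.

```shell
# Scratch source and destination directories (hypothetical names).
tmpdir=$(mktemp -d)
mkdir "$tmpdir/src" "$tmpdir/destination"
touch "$tmpdir/src/a" "$tmpdir/src/b"

# The trailing 'sh' becomes $0. The destination is $1; after the shift,
# "$@" holds only the batch of file names that find appended.
find "$tmpdir/src" -type f \
    -exec sh -c 'dest=$1; shift; exec mv -- "$@" "$dest"' sh "$tmpdir/destination" '{}' +

moved_count=$(ls "$tmpdir/destination" | wc -l | tr -d ' ')
echo "moved $moved_count files"
rm -rf "$tmpdir"
```

The '{}' is still the last thing before the '+', which keeps find happy, while the shell is free to put the file names wherever mv wants them.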

By Reuben Thomas at 2021-02-20 06:10:08:

GNU find's manual has documented these tricks since 2007, with the workaround for argument position being fixed, I see, in 2010. The `+` option was implemented in 2005.

And I only found out about it thanks to this article; many thanks!

Written on 27 January 2021.

Last modified: Wed Jan 27 00:09:21 2021