How to accidentally get yourself with 'find ... -name something*'

January 27, 2025

Suppose that you're in some subdirectory /a/b/c, and you want to search all of /a for the presence of files for any version of some program:

u@h:/a/b/c$ find /a -name program* -print

This reports '/a/b/c/program-1.2.tar' and '/a/b/f/program-1.2.tar', but you happen to know that there are other versions of the program under /a. What happened to a command that normally works fine?

As you may have already spotted, what happened is the shell's wildcard expansion. Because you ran your find in a directory that contained exactly one match for 'program*', the shell expanded the pattern before find ever ran, and what you actually ran was:

find /a -name program-1.2.tar -print

This reported the two instances of program-1.2.tar in the /a tree, but not the program-1.4.1.tar that was also in the /a tree.

If you'd run your find command in a directory without a shell match for the -name wildcard, the shell would (normally) pass the unexpanded wildcard through to find, which would do what you want. And if there had been only one instance of 'program-1.2.tar' in the tree, in your current directory, it might have been more obvious what went wrong; instead, the find returning more than one result made it look like it was working normally apart from inexplicably not finding and reporting 'program-1.4.1.tar'.
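
Here's a minimal sketch that reproduces this in a scratch directory, so you can see both behaviors side by side. The temporary tree and its b/c, b/f and b/g subdirectories are made up for illustration and stand in for /a; it assumes a POSIX-ish shell with mktemp available and no failglob-style option turned on.

```shell
# Hypothetical scratch tree standing in for /a; all names are illustrative.
top=$(mktemp -d)
mkdir -p "$top/b/c" "$top/b/f" "$top/b/g"
touch "$top/b/c/program-1.2.tar" "$top/b/f/program-1.2.tar" \
      "$top/b/g/program-1.4.1.tar"
cd "$top/b/c" || exit 1

# Unquoted: the shell expands program* to program-1.2.tar before find runs,
# so find only ever looks for that exact name.
unquoted=$(find "$top" -name program* -print | sort)

# Quoted: find receives the literal pattern and does the matching itself.
quoted=$(find "$top" -name 'program*' -print | sort)

echo "unquoted:"; echo "$unquoted"
echo "quoted:";   echo "$quoted"

# Clean up the scratch tree.
cd / && rm -rf "$top"
```

The unquoted run should list only the two copies of program-1.2.tar, while the quoted run should also pick up program-1.4.1.tar.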

(If there were multiple matches for the wildcard in the current directory, 'find' would probably have complained and you'd have realized what was going on.)
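
As a rough illustration of the multiple-match case, this sketch puts two matching files in the current directory and captures what find says. The scratch directory and file names are hypothetical, and the exact error text depends on which find you have (GNU, BSD and BusyBox all word it differently), so the sketch only looks at the exit status.

```shell
# Hypothetical scratch directory with two matches for program*.
top=$(mktemp -d)
cd "$top" || exit 1
touch program-1.2.tar program-1.4.1.tar

# The shell expands program* into two words, so find sees a stray
# operand right after the argument to -name.
if multi_err=$(find "$top" -name program* -print 2>&1 >/dev/null); then
    multi_status=0
else
    multi_status=$?
fi

echo "find exit status: $multi_status"
echo "find said: $multi_err"

# Clean up the scratch directory.
cd / && rm -rf "$top"
```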

Some shells have options to cause failed wildcard expansions to be considered an error; Bash has the 'failglob' shopt, for example. People who turn these options on are probably not going to stumble into this because they've already been conditioned to quote wildcards for 'find -name' and other similar tools. Possibly this Bash option or its equivalent in other shells should be the default for new Unix accounts, just so everyone gets used to quoting wildcards that are supposed to be passed through to programs.
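
To make the difference concrete, here is a small sketch comparing Bash's default behavior with 'failglob' turned on. It assumes bash is on your PATH, and it runs in an empty, made-up scratch directory so that 'program*' has nothing to match; each case is run in a separate bash process so the option doesn't leak into your own shell.

```shell
# Empty scratch directory, so program* matches nothing.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Default Bash behaviour: an unmatched pattern is passed through as a literal word.
default_out=$(bash -c 'echo program*')
echo "default: $default_out"

# With failglob set, Bash reports a "no match" error and never runs the command.
if failglob_out=$(bash -O failglob -c 'echo program*' 2>/dev/null); then
    failglob_status=0
else
    failglob_status=$?
fi
echo "failglob: exit status $failglob_status, output '$failglob_out'"

# Quoting still works, because a quoted pattern is never a candidate for expansion.
quoted_out=$(bash -O failglob -c "echo 'program*'")
echo "failglob, quoted: $quoted_out"

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```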

(Although I don't use a shell that makes failed wildcard expansions an error, I somehow long ago internalized the idea that I should quote all wildcards I want to pass to programs.)


Comments on this page:

By Miksa at 2025-01-28 04:11:29:

I too have at some point internalized that advice; I wouldn't even think about not quoting a wildcard with find. I think I would automatically quote the wildcard with any program that can use a wildcard by itself. I think I originally learned this lesson with "unzip -tq '*.zip'" decades ago.

By holman at 2025-01-28 10:12:23:

Git is kind of weird here, because it will sometimes expand wildcards itself based on what's in the repository. Like, if I only have files "foo", "bar", "baz" in my working directory, and run

git rm b*

it will delete just "bar" and "baz". But if I run

git rm 'b*'

it will delete all repository contents starting with 'b', even files that aren't on the filesystem.
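
A non-destructive way to see the difference is sketched below. It uses 'git ls-files' instead of 'git rm' so nothing gets deleted, on the assumption that both match a quoted pattern against the same index entries. The scratch repository and the extra 'bgone' file are made up for illustration, and it assumes git is installed.

```shell
# Hypothetical throwaway repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
touch foo bar baz bgone
git add foo bar baz bgone
rm bgone    # still tracked in the index, but gone from the filesystem

# What the shell hands to git for an unquoted b*: only files that exist on disk.
shell_sees=$(echo b*)

# What git matches for the quoted pattern 'b*': everything tracked that starts with b.
git_sees=$(git ls-files 'b*')

echo "shell expansion: $shell_sees"
echo "git pathspec:";  echo "$git_sees"

# Clean up the scratch repository.
cd / && rm -rf "$repo"
```

The shell expansion should contain only bar and baz, while the quoted pattern should also match bgone.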

I tried the 'failglob' option and right away got warnings from my mail-checking PROMPT_COMMAND:

bash: no match: /home/…/inbox/new/*

I can't think of any better way to check without spawning subshells (which are annoyingly slow). I guess I could read BASHOPTS and temporarily disable the option if necessary.

By Miksa at 2025-01-30 06:05:15:

@holman

The man page for git-rm does explain this, maybe even clearly enough. "File globbing matches across directory boundaries. Thus, given two directories d and d2, there is a difference between using git rm 'd*' and git rm 'd/*', as the former will also remove all of directory d2."

It's necessary to remember how globbing and wildcards work. In the first case git doesn't even see the wildcard; the shell expands it, and the command becomes 'git rm bar baz'.

But there be dragons with commands that have built-in support for wildcards. What if you had typoed 'git rm c*' instead? If the directory didn't have any files starting with "c", the shell's globbing would not happen, git-rm would receive the parameter 'c*', and it would do wide-scale damage in the repo. In cases like this, tab completion is a must: after typing 'git rm b', press Tab twice so you can see which files the wildcard will match.

Written on 27 January 2025.

Last modified: Mon Jan 27 22:43:50 2025