2018-07-28
Word-boundary regexp searches are what I usually want
I'm a person of relatively fixed and slow to change habits as far as
Unix commands go. Once I've gotten used to doing something in one way,
that's generally it, and worse, many of my habits fossilized many years
ago. All of this is a long-winded lead-in to explaining why I have only
recently gotten around to really using the '\b' regular expression
escape sequence. This is a real pity, because now that I have, my big
reaction is 'what took me so long?'
Perhaps unsurprisingly, it turns out that I almost always want to search for full words, not parts of words. This is true whether I'm looking for words in text, words in my email, or for functions, variables, and the like in code. In the past I adopted various hacks to deal with this, or just dealt with the irritation of excessive matches, but now I've converted over to using word-boundary searches and the improvement in getting what I really want is really great. It removes another little invisible point of friction and, like things before it, has had an outsized impact on how I feel about things.
(In retrospect, this is part of what our convention for writing logins in documentation was doing. Searching for '<LOGIN>' instead of 'LOGIN' vastly reduced the chance that you'd run into the login embedded in another word.)
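As a small illustration of the difference between matching parts of words and matching full words, here is a sketch in Python, whose re module also understands '\b'. The sample lines and the search word are made up for the example.

```python
import re

# Made-up sample text; only two of these lines use 'cat' as a full word.
lines = [
    "restart the cat daemon",
    "concatenate the two files",
    "cat: no such file",
]

# Plain search: also matches 'cat' embedded in 'concatenate'.
substring_hits = [line for line in lines if re.search(r"cat", line)]

# Word-boundary search: '\b' only matches at a transition between a
# word character and a non-word character (or the edge of the line).
word_hits = [line for line in lines if re.search(r"\bcat\b", line)]

print(len(substring_hits), "lines match 'cat'")
print(len(word_hits), "lines match '\\bcat\\b'")
```

The plain search picks up all three lines, while the '\b' version skips the 'concatenate' line, which is the kind of excess match that word-boundary searches remove.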
There are a couple of ways of doing word-boundary searches (somewhat
depending on the program). The advantage of '\b' is that it works
pretty universally; it's supported by at least (GNU) grep, ripgrep, and
less, and it's at least worth trying in almost anything that supports
modern (or 'PCRE') regular expressions, which is a lot of things. Grep
and ripgrep also support the -w option for doing this, which is
especially useful because it works with fgrep.
(I reflexively default to fgrep, partly so I don't have to think
about special characters in my search string.)
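To make the idea of a fixed-string whole-word search concrete, here is a rough Python sketch of what combining fgrep with -w does. This is an approximation under stated assumptions, not grep's actual implementation: the helper name whole_word_fixed is hypothetical, and Python's idea of a word character (\w) is not guaranteed to agree with grep's in every locale.

```python
import re

def whole_word_fixed(needle: str, line: str) -> bool:
    """Approximate a fixed-string, whole-word search (hypothetical helper).

    The needle is escaped so that characters like '.' are literal, and the
    match must not be directly preceded or followed by a word character.
    """
    pattern = r"(?<!\w)" + re.escape(needle) + r"(?!\w)"
    return re.search(pattern, line) is not None

# The '.' in the search string is treated literally, not as 'any character'.
print(whole_word_fixed("a.b", "x = a.b + 1"))
print(whole_word_fixed("a.b", "x = aXb + 1"))
print(whole_word_fixed("a.b", "y = data.bar"))
```

Only the first call should report a match. The second fails because the '.' is literal, and the third fails because the text is embedded in larger words. The lookarounds are used instead of a bare '\b' so that a search string that starts or ends with punctuation still behaves sensibly.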
Per this SO question and its answers,
in vim I'd need to use '\<' and '\>' for the beginning and end
of words. I'm sure vim has a reason for having two of them. Emacs
supports '\b', although I don't actually do regular expression
searches in Emacs regularly enough to remember how to invoke them
(since I just looked it up, the documentation
tells me it's C-M-s and C-M-r, which ought to be reasonably memorable
given that plain searches are C-s and C-r).
PS: Before I started writing this entry, I didn't know about -w
in grep and ripgrep, or how to do this in vim (and I would have
only been guessing about Emacs). Once again, doing some research
has proven beneficial.
PPS: I care about less because less is often my default way of scanning through pretty much anything, whether it's a big text file or code. Grep and company may tell me what files have some content and a bit of its context, but less is what lets me poke around, jump back and forth, and so on. Perhaps someday I will get a better program for this purpose, but probably not soon.