My current views on Shellcheck

July 6, 2017

As part of the reaction to my last entry, I've been asked how I feel about shellcheck (also). As it happens, I have some opinions here.

Shellcheck is what I'd classify as a linter. I've dealt with a lot of linters and I've wound up with a general approach to looking at them, which I summarize as a linter is interesting to the extent that it gives me useful new information. To do this, a linter needs to do two things; it must find some genuine issues that matter (or at least that I care about), and it can't complain too much about stuff I don't care about (or where it's wrong). Or in other words a useful linter must have some signal (the more the better) and not too much noise.

What is signal is somewhat in the eye of the beholder. It is not enough for a linter to be accurate; it has to be interesting as well. For an example of this, I will turn to Go. The Go community has a whole bunch of high-signal low-noise linters, starting with 'go vet' itself. If gosimple or unparam report something, it is real and you should probably care about it; if you fix something reported by either tool, a project maintainer would probably take your patch. Then there is errcheck. Errcheck is accurate, but it turns out that silently ignoring errors is extremely common in production Go code and probably very few maintainers would accept patches to litter their code with various ways to make errcheck shut up. In practice errcheck is too persnickety and so I never bother running it against my Go code, unlike other Go linters.

(The Go world has settled on a style of linters where there are a bunch, each of which is narrowly focused on a single area. This has spawned the obvious meta-linter project.)

Unfortunately, when I gave Shellcheck a spin it fell much more towards the errcheck end of the linter spectrum than the go vet end. Here is an example that shows one issue I have with it:

; cat testit
#!/bin/sh
echo Error: $1
; shellcheck testit
[...]
echo Error: $1
            ^-- SC2086: Double quote to prevent globbing and word splitting.

This is technically correct yet almost useless in practice. It's not an actual problem; I can't think of a situation where the lack of double quotes will cause an issue here, given that we are just running echo (and it has an initial argument). Perhaps it would be good style to write '"$1"' instead, but then it might be good style to litter your Go code with markers that you're explicitly ignoring errors (and thus make errcheck shut up). Neither are going to happen in this reality.

Linting shell scripts is a hard problem and I applaud Shellcheck for trying. If I was writing very high assurance shell scripts that absolute had to work, even in a hostile environment, I might use Shellcheck and make sure all my scripts passed all its checks. But for finding real issues in my casual-usage shell scripts, it is too much noise and too little signal in its current state.

(Also, according to a test with the version available online through shellcheck.net, it doesn't currently spot my variable reuse bug.)

Another issue or hazard with shellcheck is illustrated by the following warning it spat out when I tried it on a random administration script we have:

In errmail line 22:
  trap "rm -rf $TMPDIR; exit" 0 1 2 13 15
       ^-- SC2064: Use single quotes, otherwise this expands now rather than when signalled.

Actually, having this expand immediately is deliberate. But shellcheck can't tell that because it requires contextual knowledge, and shellcheck has decided to tilt strongly towards having opinions on your style instead of just reporting on things that are almost certainly errors or problems. A linter having style opinions is a completely fair thing and can be valuable, but it basically always raises the linter's noise level (and only really works if you agree with the linter's opinions).

(This also shows the limitations of Shellcheck's abilities, because $TMPDIR never changes in the rest of the script so the actual behavior of the trap would be the same either way. But Shellcheck either can't see that or doesn't care, so it emits this warning for an issue that doesn't actually exist in the real script.)

PS: To be absolutely clear, I don't think this makes Shellcheck a bad idea or a bad implementation. It's just not a tool that I'm likely to find very useful in practice, and thus not one I'm likely to use (despite initial hopes for it, because of course I'd love a good way to find lurking errors in my Bourne shell scripts).

Sidebar: The signal to noise ratio in practice

Some people will say that a linter that produces some signal makes it worth wading through whatever noise in order to get said signal. This might be correct in theory, for extremely high-assurance programming, but it is wrong in practice. Besides other issues, it's generally very hard for humans to find that small amount of signal in a lot of noise.

For less critical programming, it is unquestionably a bad ratio of work to reward. There are almost always going to be more productive things you can do with X amount of time than to pick through lots of linter noise in search of some signal that may or may not be there for your current code.


Comments on this page:

From 67.81.140.22 at 2017-07-06 08:07:53:

In this case, your first example will change all whitespace to one space. Additionally, some shells will interpret escape characters in the string and some will not. This seems to be, in fact, an example of how shellcheck is conservative with warnings, assuming you are familiar with how echo works in your particular shell.

In the second example, the exact warning given may not be useful, but it points at a separate problem: if $TMPDIR contains any whitespace, the removal will not proceed correctly. moreover, it is questionable whether this trap is written well semantically, as the system-wide temporary directory may be accidentally deleted if the variable was not set. In my opinion, this is not a "style" issue but instead falls into the second category of shellcheck's goals: "To point out and clarify typical intermediate level semantic problems that cause a shell to behave strangely and counter-intuitively.".

   ./testit '*'

I'd say it found a bug. ;)

By cks at 2017-07-06 11:51:20:

Wow, you're right. For some reason I had completely forgotten that glob characters in shell variables are processed when the shell variable is expanded in an unquoted context. Shellcheck even mentioned 'globbing' in its warning message and I passed right over that without pausing to wonder.

(This sort of multi-level expansion doesn't happen in my usual shell, so I don't ever trip over it when I write things on the command line and so on. And in practice I almost never need to have my shell scripts process things with globbing characters in them.)

The point about all whitespace being condensed down is also a good one. In some contexts this might matter (and of course in some contexts you actively want the whitespace to be made uniform).

By John Wiersba at 2017-07-06 14:14:47:

Even better than a fix with just "$1" would be using printf 'Error: %s\n' "$1".

Hi, I'm the ShellCheck author.

Your comments are fair. You're in the underserved group of users who are fully capable of writing complex, working shell scripts, and mostly just want to sanity check them after they're done.

ShellCheck (today at v0.4.6) is pretty good at actively guiding novice level scripters through the many weird sh caveats that make no sense in any other language, but it's not as good as a passive safety valve for advanced scripters.

This is a known issue and I want to fix it. The primary plan to achieve this is with a verbosity knob. Hope you'll try it again when this is in place!

Regards, Vidar

PS: As for your script missing quoting, here are some additional issues not yet mentioned:

   ./testit '/*/*/*/*/*/*/*'  # Local DoS (may churn for 20+ minutes or gobble RAM)
   ./testit '-e \x1b[33mTest' # Ascii input results in binary output (when sh is bash)
   ./testit '-n foo'          # -n disappears

Meanwhile, tr -s ' ' or fmt -s will let you squeeze spaces with far fewer caveats than word splitting.

You may be running such a script in a context where quoting doesn't matter, but I think you'll agree that the fix is much shorter and easier than that justification.

Perhaps it would be good style to write '"$1"' instead, but then it might be good style to litter your Go code with markers that you're explicitly ignoring errors (and thus make errcheck shut up). Neither are going to happen in this reality.

For me personally, it is – it has, in fact. I do reflexively quote every single use of a variable in my shell scripts, and omit the quotes only when I positively actively want word splitting.

I’m less consistent about the quotes when I’m typing commands into the prompt because in that situation I know with some certainly what values the code will encounter, but even then I still quote some fraction of stuff just out of reflex.

I’m just too aware that uninvited control and meta characters happen and that word splitting is very complex semantically. I won’t refuse it when it fits the bill, but I do make sure to only invoke it deliberately.

Living in rc may have allowed you to drop your guard in those respects.

(When it comes to errcheck, OTOH, I suspect my attitude would match yours. I just don’t use Go enough to know that.)

Some people will say that a linter that produces some signal makes it worth wading through whatever noise in order to get said signal.

That is effectively virtue signalling, I think. Even these people actually have a threshold. Ask them if it’s worth going through 1,000 messages to find a single valuable one. 10,000? 100,000? Some may even continue to insist that it’s worthwhile in the face of increasingly long odds, but even they will not act accordingly in practice. A scarce few might even do it once or twice out of conviction or to demonstrate their commitment… but they will soon stop. The reward is simply too low.

In reality, someone claiming that just unconsciously considers their own threshold to be “infinite for all purposes and intents”, i.e. they implicitly take it as an inarguably reasonable cut-off point.

There is no productive discussion to be had there.

Written on 06 July 2017.
« How I shot my foot because the Bourne shell is different
Link: Survey of [floating point] Rounding Implementations in Go »

Page tools: View Source, View Normal, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Thu Jul 6 02:04:04 2017
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.