Sometimes the simple answers are the right ones (a lesson from bash)

April 20, 2012

A co-worker recently had a cronjob report the following error message, which he asked us for help with:

bash: /a/local/script: /bin/sh: bad interpreter: No such file or directory

At the time that this happened, other messages got logged suggesting that the machine was also apparently under memory pressure.

When I saw this, my mind immediately jumped to ELF interpreters, better known as the dynamic loader. I promptly suggested that maybe the kernel wasn't able to load the dynamic loader for sh, perhaps because of memory pressure or something. However, I was wrong, as some web searches for the error message showed me when I bothered to do them. What's going on is much simpler (although maybe odder) than something complicated about some part of the dynamic loader not working right.

In fact, what's going on is right there in the error message if I had bothered to read it. Here, let me show you with a little test:

$ bash
$ cat /tmp/exmpl
#!/bin/shx
echo hi there
$ /tmp/exmpl
bash: /tmp/exmpl: /bin/shx: bad interpreter: No such file or directory

As odd as it sounds, this error message was almost certainly generated because (temporarily) there was no /bin/sh. Since /bin/sh is a magically maintained symlink on many current Linuxes, this is slightly less odd and peculiar than it seems (it's possible that some package manipulation made the symlink disappear temporarily), but it's still pretty odd.
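One way to act on the message directly is to look at the script's #! line and check whether the interpreter it names is actually there. What follows is only a minimal sketch of that idea, not anything bash itself does; the function name check_interp is made up for illustration, and it assumes a POSIX-style shell with [[:blank:]] support in patterns.

```shell
# check_interp is a hypothetical helper (not a standard tool): it reports the
# interpreter named on a script's "#!" line and whether that path currently
# exists and is executable.
check_interp() {
    first=
    # Read just the first line; read can fail at EOF yet still fill $first.
    IFS= read -r first < "$1" || [ -n "$first" ] || {
        echo "$1: empty or unreadable"
        return 2
    }
    case $first in
        '#!'*) ;;
        *) echo "$1: no #! line"; return 2 ;;
    esac
    # Drop the "#!", then any leading blanks, then any arguments after the path.
    interp=${first#??}
    interp=${interp#"${interp%%[![:blank:]]*}"}
    interp=${interp%%[[:blank:]]*}
    if [ -x "$interp" ]; then
        echo "$1: interpreter $interp exists"
    else
        echo "$1: interpreter $interp is missing or not executable"
        return 1
    fi
}
```

Run against the /tmp/exmpl test script as check_interp /tmp/exmpl, this should report that /bin/shx is missing. Because the sketch splits only on spaces and tabs, a stray carriage return at the end of the #! line stays part of the interpreter name and is reported as missing too.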

To sum up: bash really was telling my co-worker what was wrong, and the error report was not some peculiar coded message in a bottle that needed a complex, obscure interpretation. The simple answer was the right one. Sometimes that's just how it goes; not everything in system administration is a complex puzzle.

(As it happens I feel that if I'd paid attention to the error message and how it was structured, I would have seen that my complex theory was pretty sure to be wrong. But that's sufficiently tangled to need another entry.)

Comments on this page:

From at 2012-04-20 03:03:03:

I have encountered this kind of problem several times, when someone edits their scripts on a Windows machine. Then the first line really reads "#!/bin/sh^M", and this interpreter of course does not exist. The message usually prints the CR as well, so this differs from this problem description.

From at 2012-04-20 11:32:30:

I've encountered errors like this when the script is on a filesystem which is mounted with the noexec option.

Written on 20 April 2012.

Last modified: Fri Apr 20 01:32:47 2012