Wandering Thoughts archives

2012-04-22

I may be wrong about my simple answer being the right one

In a recent entry I wrote about how I had misinterpreted an error message from bash about a script failing, and I mentioned in passing that if I had paid attention to the structure of the error message I would have known that I was wrong. I take that back. Detailed investigation has left me more confused than I was before and less confident of what exactly my co-worker's problem was (and absolutely sure that paying attention to the structure of the error message does not really help). The problem is that bash is too smart for its own good in its error messages; because it is smart but not smart enough, we cannot tell what the actual error is.

As a reminder, here's bash's error message:

bash: /a/local/script: /bin/sh: bad interpreter: No such file or directory

You would think that this means that /bin/sh is not present; after all, it is the straightforward interpretation of the error, plus bash has actually gone out of its way to give you a more detailed error message. Unfortunately, that is the wrong interpretation of the error message. What bash is really reporting is two separate facts:

  • /bin/sh is the listed interpreter for /a/local/script
  • when bash attempted to exec() the script, the kernel told it ENOENT, 'No such file or directory'.
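
To make those two facts concrete, here is a rough Python sketch of how a message like this could be put together. This is an illustration and not bash's actual code; the function names (read_shebang_interpreter, run_like_a_shell, demo) and the /nonexistent/interp path are made up, and it assumes a Unix system where the temporary directory allows executing files.

```python
import errno
import os
import subprocess
import tempfile


def read_shebang_interpreter(path):
    """Return the interpreter named on the file's #! line, or None."""
    try:
        with open(path, "rb") as f:
            first = f.readline(256)
    except OSError:
        return None
    if not first.startswith(b"#!"):
        return None
    words = first[2:].split()
    return words[0].decode() if words else None


def run_like_a_shell(path):
    """Try to execute path; return None if the exec worked, otherwise an
    error string built from the two facts listed above."""
    try:
        subprocess.run([path], check=False)
        return None
    except OSError as e:
        interp = None
        if e.errno == errno.ENOENT:
            # Fact 1: what the script's #! line says its interpreter is.
            interp = read_shebang_interpreter(path)
        if interp is not None:
            # Fact 2: the errno from the failed exec. Nothing here checks
            # whether interp itself exists.
            return "%s: %s: bad interpreter: %s" % (
                path, interp, os.strerror(e.errno))
        return "%s: %s" % (path, os.strerror(e.errno))


def demo(interpreter="/nonexistent/interp"):
    """Create a script with the given (hypothetical) interpreter, run it,
    and return the resulting error string (None if it ran)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        script = os.path.join(tmpdir, "script")
        with open(script, "w") as f:
            f.write("#!%s\n" % interpreter)
        os.chmod(script, 0o755)
        return run_like_a_shell(script)
```

Calling demo() should produce a 'bad interpreter' message that names /nonexistent/interp, but the name comes purely from reading the #! line after the fact. The errno is the only thing the kernel actually contributed.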

Bash does not mean that /bin/sh is missing; it never bothers to check that (and arguably it can't do so reliably). This matters because, as we saw in my previous entry, the kernel will also report ENOENT if the ELF interpreter for a binary is missing. Now, you guessed it, if your script has a #! line that points to a binary which has a missing ELF interpreter, you get exactly the same sort of message:

bash: /tmp/exmpl: /tmp/a.out: bad interpreter: No such file or directory

(/tmp/a.out exists and is nominally executable, but I binary edited it to have a nonexistent ELF interpreter.)
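
For anyone who wants to try this, here is a rough sketch of one way to reproduce it in Python. It is not necessarily how the /tmp/a.out above was produced. It assumes Linux, a dynamically linked 64-bit little-endian ELF binary to copy (the running Python interpreter by default), and a temporary directory that allows executing files; the function names are made up for illustration.

```python
import errno
import os
import struct
import subprocess
import sys
import tempfile

PT_INTERP = 3  # program header type that names the ELF interpreter


def find_interp(data):
    """Return (offset, size) of the PT_INTERP string in a 64-bit
    little-endian ELF image, or None if there isn't one."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        return None
    (phoff,) = struct.unpack_from("<Q", data, 0x20)
    phentsize, phnum = struct.unpack_from("<HH", data, 0x36)
    for i in range(phnum):
        base = phoff + i * phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            (p_offset,) = struct.unpack_from("<Q", data, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            return p_offset, p_filesz
    return None


def errno_for_script_with_broken_elf(source_binary=sys.executable):
    """Copy source_binary, point the copy's ELF interpreter at a path that
    should not exist, use the copy as a script's #! interpreter, and return
    the errno from trying to run the script (0 if it somehow ran).
    Returns None if source_binary has no ELF interpreter we can find."""
    with open(source_binary, "rb") as f:
        data = bytearray(f.read())
    found = find_interp(data)
    if found is None:
        return None
    offset, size = found
    # Overwrite in place with a same-length path, keeping the trailing NUL.
    data[offset:offset + size] = b"/" + b"X" * (size - 2) + b"\0"
    with tempfile.TemporaryDirectory() as tmpdir:
        broken = os.path.join(tmpdir, "a.out")
        script = os.path.join(tmpdir, "exmpl")
        with open(broken, "wb") as f:
            f.write(data)
        with open(script, "w") as f:
            f.write("#!" + broken + "\n")
        os.chmod(broken, 0o755)
        os.chmod(script, 0o755)
        try:
            subprocess.run([script], check=False)
        except OSError as e:
            return e.errno
        return 0
```

On a system that matches those assumptions, errno_for_script_with_broken_elf() should return errno.ENOENT even though both the script and the binary named on its #! line exist on disk when the exec is attempted.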

So in my co-worker's case, we can't definitively conclude that /bin/sh was temporarily missing. All we know is that for some reason the exec() returned ENOENT, and that there are at least two potential reasons for it. A /bin/sh symlink being missing is still probably the most likely explanation, but on a system that's under unusual stresses things start getting rather uncertain here.

(I am far from certain that I could predict all of the reasons that the Linux kernel would return ENOENT on exec() without actually tracing the kernel code. And even then I'm not sure, since there's a lot of deep bits involved and thus a lot of code to really understand.)

linux/BashNoInterpreterMsgII written at 03:12:22

