A fun Bash buffering bug (apparently on Linux only)

January 6, 2016

I'll present this in the traditional illustrated form:

$ cat repro
function cleanup() {
   r1=$(/bin/echo one)
   r2=$(echo two)
   #echo $r1 '!' $r2 >>/tmp/logfile
   echo $r1 '!' $r2 1>&2
trap cleanup EXIT
sleep 1
echo final

$ ./repro | false
one final ! final two

Wait, what? This should print merely 'one ! two'. The 'echo final' should be written to stdout, which is a pipe to false (and anyways, it's closed by the time the echo runs, since false will already have exited by the time 'sleep 1' finishes). Then the cleanup function should run on termination, with $r1 winding up being "one" and $r2 winding up being "two" from the command substitutions, and they should then get echoed out to standard error as 'one ! two'.

(Indeed you can get exactly this output with a command line like './repro >/dev/null'.)

What is actually happening here (or at least appears to be happening) is an interaction between IO buffering, our old friend SIGPIPE, and forking children while you have unclean state. Based on strace, the sequence of events is:

  1. Bash gets to 'echo final' and attempts to write it to the now-closed standard output pipe by calling 'write(1, "final\n", 6)'. This gets an immediate SIGPIPE.
  2. Since we've set a general EXIT trap, bash immediately runs our cleanup function (since the script is now exiting due to the SIGPIPE).

  3. Bash forks to run the $(/bin/echo one) command substitution. In the child, it runs /bin/echo and then, just before the child exits, does a 'write(1, "final\n", 6)'. This succeeds, since the child's stdout is connected to the parent's pipe. In the main bash process, it reads back one\n and then final\n from the child process, and turns this into "one final" as the value assigned to $r1.

  4. For '$(echo two)', the child process just winds up calling write(1, "final\ntwo\n", 10). This becomes "final two" for $r2's value.

    (This child doesn't fork to run /bin/echo because we're using the Bash builtin echo instead.)

  5. At last, the main process temporarily duplicates standard error to standard output and calls 'write(1, ....)' to produce all of the output we see here.

What appears to be going on is that when the initial 'echo final' in the main Bash process was interrupted by a SIGPIPE, it left "final\n" in stdout's IO buffer as unflushed output. When Bash forked for each $(...) command substitution, the child inherited this unflushed buffer. In the $r1 case, the child noticed this unflushed buffer as it was exiting and wrote it out at that point; in the $r2 case, the child appended the output of its builtin echo command to the unflushed stdout buffer and then wrote the whole thing out. Then, finally, when the parent ran the echo at the end of cleanup(), it too appended its echo output to the stdout buffer and wrote everything out.

There are two Bash bugs here. First, the output from the initial failed 'echo final' should have been discarded from stdout's IO buffer, not retained to show up later on. Second, the children forked for each $(...) should not have inherited this unflushed IO buffer, because allowing unflushed buffers to make it into children is a well known recipe for having exactly this sort of multiple-flush happen.

(Well, allowing any unflushed or un-cleaned-up resource into children will do this, at least if your children are allowed to then notice it and attempt to clean it up.)

I don't know why this bug seems to be Linux specific. Perhaps Bash is using stdio, and Linux's stdio is the only version that behaves this way (either on the initial write that gets SIGPIPE, or in allowing the unflushed buffer state to propagate into forked children). If this is Linux stdio at work, I don't know if the semantics are legal according to POSIX/SUS and in a way it doesn't matter, as stdio libraries with this behavior are on a lot of deployed machines so your code had better be prepared for it.

(Regardless of what POSIX and SUS say, in practice 'standard' Unix is mostly defined this way. Code that you want to be portable has to cope with existing realities, however non-compliant they may be.)

By the way, finding this Bash issue from the initial, rather drastic symptoms that manifested in a complex environment was what they call a lot of fun (and it was Didier Spezia who did the vital work of identifying where the failure was; I just ground away from there).

PS: If you want to see some extra fun, switch which version of the final echo is run in cleanup() and then watch the main Bash process with strace.

Written on 06 January 2016.
« Illumos's problem with its VCS commit messages
The format of strings in early (pre-C) Unix »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Wed Jan 6 01:37:13 2016
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.