A surprisingly arcane little Unix shell pipeline example
In "The output of Linux pipes can be indeterministic" (via), Marek Gibney noticed that the following shell command has indeterminate output:
(echo red; echo green 1>&2) | echo blue
This can output any of "blue green" (with a newline between them), "green blue", or "blue"; the usual case is "blue green". Fully explaining this requires surprisingly arcane Unix knowledge.
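One way to see the spread of outputs for yourself is to run the pipeline many times and tally the results. This is only a rough sketch: the function name 'tally_outputs' and the run count are my own choices for illustration, and piping through 'tr' to capture the output may itself shift the timing a little, so the proportions you get will vary with your shell, kernel, and machine.

```shell
#!/bin/sh
# tally_outputs N: run the pipeline N times and count each distinct result.
# (Hypothetical helper name, not from the original article.)
tally_outputs() {
    runs=$1
    i=0
    while [ "$i" -lt "$runs" ]; do
        # 2>&1 merges the left side's stderr ("green") with the right
        # side's stdout ("blue"); tr joins the lines so that each run
        # ends up as a single line of output.
        { (echo red; echo green 1>&2) | echo blue; } 2>&1 | tr '\n' ' '
        echo
        i=$((i + 1))
    done | sort | uniq -c
}

tally_outputs 200
```

Each line of the result is a count followed by one of the observed outputs. The rare "blue" alone case may not show up at all in a short run.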
The "blue green" and "green blue" outputs are simply a scheduling race. The 'echo green' and 'echo blue' are being run in separate processes, and which one of them gets executed first is up to the whims of the Unix scheduler. Because the left side of the pipeline has two things to do instead of one, often it will be the 'echo blue' process that wins the race.
The mysterious case is when the output is "blue" alone, and to explain this we need to know two pieces of Unix arcana. The first is our old friend SIGPIPE, where if a process writes to a closed pipe it normally receives a SIGPIPE signal and dies. The second is that 'echo' is a builtin command in shells today, and so the left side's 'echo red; echo green 1>&2' is actually all being handled by one process instead of the 'echo red' being its own separate process.
We get "blue" as the sole output when the 'echo blue' runs so soon that it exits, closing the pipe, before the left side can finish 'echo red'. When this happens the left side gets a SIGPIPE and exits without running 'echo green' at all. This wouldn't happen if 'echo' wasn't a specially handled builtin; if it was a separate command (or even if the shell forked to execute it internally), only the 'echo red' process would die from the SIGPIPE instead of the entire left side of the pipeline.
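The difference between a builtin and a separate command can be made visible by forcing the "blue" case with a 'sleep' and reporting the left side's exit status. This is a sketch under some assumptions: the function names are made up for illustration, 'env echo' is just one way to force an external 'echo' process, and SIGPIPE must have its default disposition in the shell you run this from. On bash or dash on Linux I would expect the builtin variant to report a status of 141 (128 plus SIGPIPE's signal number of 13) with no "green", and the external variant to print "green" and report 0; other shells may encode a signal death differently.

```shell
#!/bin/sh
# Both variants delay the left side by one second so that 'echo blue'
# has certainly exited (closing the pipe) before 'red' is written.
# The outer subshell reports how the inner left-side subshell ended.

# Hypothetical name: the builtin echo runs inside the subshell itself,
# so SIGPIPE kills the whole subshell before 'echo green' can run.
builtin_variant() {
    { ( (sleep 1; echo red; echo green 1>&2)
        echo "left side status: $?" 1>&2 ) | echo blue; } 2>&1
}

# Hypothetical name: 'env echo' runs echo as a separate process, so
# only that process gets SIGPIPE and the subshell carries on.
external_variant() {
    { ( (sleep 1; env echo red; echo green 1>&2)
        echo "left side status: $?" 1>&2 ) | echo blue; } 2>&1
}

echo "with builtin echo:"
builtin_variant
echo "with external echo:"
external_variant
```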
So we have three orders of execution:
- The shell on the left side gets through both of its 'echo's before the 'echo blue' runs at all. The output is "green blue".
- The 'echo red' happens before 'echo blue' exits, so the left side doesn't get SIGPIPE, but 'echo green' happens afterwards. The output is "blue green".
- The 'echo blue' runs and exits, closing the pipe, before the 'echo red' finishes. The shell on the left side of the pipeline writes output into a closed pipe, gets SIGPIPE, and exits without going on to do the 'echo green'. The output is "blue".
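The three orders above can be forced, rather than left to the scheduler, by adding 'sleep' calls to one side or the other. This is an illustrative sketch, not the original command: the function names and the sleep durations are arbitrary choices of mine, and the third case assumes SIGPIPE has its default disposition.

```shell
#!/bin/sh
# Each function merges stderr into stdout so both colors are captured.

# Order 1: the right side is delayed, so the left side finishes both
# of its echos first ("red" just sits unread in the pipe).
order_green_blue() {
    { (echo red; echo green 1>&2) | (sleep 1; echo blue); } 2>&1
}

# Order 2: 'red' is written while the right side is still alive, but
# 'green' is held back until after 'blue' has been printed.
order_blue_green() {
    { (echo red; sleep 2; echo green 1>&2) | (sleep 1; echo blue); } 2>&1
}

# Order 3: the left side is delayed until 'echo blue' has exited, so
# 'echo red' writes into a closed pipe and the subshell gets SIGPIPE.
order_blue_only() {
    { (sleep 1; echo red; echo green 1>&2) | echo blue; } 2>&1
}

echo "--- order 1"; order_green_blue
echo "--- order 2"; order_blue_green
echo "--- order 3"; order_blue_only
```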
The second order seems to be the most frequent in practice, although I'm sure it depends on a lot of things (including whether or not you're on an SMP system). One thing that may contribute to this is that I believe many shells start pipelines left to right, i.e. if you have a pipeline that looks like 'a | b | c | d', the main shell will fork the 'a' process first, then the 'b' process, and so on. All else being equal, this will give 'a' an edge in running before the later commands.
(This entry is adapted from my comment on lobste.rs, because why not.)