A surprisingly arcane little Unix shell pipeline example

March 4, 2019

In "The output of Linux pipes can be indeterministic" (via), Marek Gibney noticed that the following shell command has indeterminate output:

(echo red; echo green 1>&2) | echo blue

This can output any of "blue green" (with a newline between them), "green blue", or "blue"; the usual case is "blue green". Fully explaining this requires surprisingly arcane Unix knowledge.
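If you want to see the distribution on your own machine, a rough sketch like the following runs the pipeline many times and counts each distinct result. The `tally` function name and the run count are my own choices for illustration, and capturing the output through another pipe changes the timing somewhat, so the proportions may not match what you see at a terminal.

```shell
#!/bin/sh
# Run the pipeline repeatedly and count how often each output occurs.
# The 2>&1 folds the stderr "green" into the captured output, and tr joins
# the lines of one run so that each run becomes a single line for uniq.
tally() {
    runs=$1
    i=0
    while [ "$i" -lt "$runs" ]; do
        { (echo red; echo green 1>&2) | echo blue; } 2>&1 | tr '\n' ' '
        echo
        i=$((i + 1))
    done | sort | uniq -c
}

tally 100
```

Each line of the result is a count followed by one of the observed outputs, such as "blue green" or "blue".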

The "blue green" and "green blue" outputs are simply a scheduling race. The 'echo green' and 'echo blue' are being run in separate processes, and which one of them gets executed first is up to the whims of the Unix scheduler. Because the left side of the pipeline has two things to do instead of one, often it will be the 'echo blue' process that wins the race.

The mysterious case is when the output is "blue" alone, and to explain this we need to know two pieces of Unix arcana. The first is our old friend SIGPIPE, where if a process writes to a closed pipe it normally receives a SIGPIPE signal and dies. The second is that 'echo' is a builtin command in shells today, and so the left side's 'echo red; echo green 1>&2' is actually all being handled by one process instead of the 'echo red' being its own separate process.
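A quick way to check how your own shell treats echo is to ask it with `type`. This is only a sketch; the exact wording of the report varies from shell to shell, and the /bin/echo path is an assumption about where your system keeps the external version.

```shell
#!/bin/sh
# Ask the shell how it will run "echo"; most current shells report a builtin.
type echo

# An external echo usually exists too (the path here is an assumption).
# It is only used when you name it by its full path.
if [ -x /bin/echo ]; then
    echo "/bin/echo also exists as a separate program"
fi
```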

We get "blue" as the sole output when the 'echo blue' runs so soon that it exits, closing the pipeline, before the left side can finish 'echo red'. When this happens the left side gets a SIGPIPE and exits without running 'echo green' at all. This wouldn't happen if echo wasn't a specially handled builtin; if it was a separate command (or even if the shell forked to execute it internally), only the 'echo red' process would die from the SIGPIPE instead of the entire left side of the pipeline.
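The difference a separate process makes can be sketched by delaying the left side with a sleep, so that 'echo blue' has certainly exited before "red" is written, and then doing the write from a child process. The function names are made up for illustration, /bin/echo is an assumed path, and this assumes a shell such as bash or dash that runs the last pipeline element in its own process, started with SIGPIPE at its default disposition.

```shell
#!/bin/sh
# External echo: /bin/echo is the process that writes to the closed pipe and
# takes the SIGPIPE. The subshell survives and goes on to 'echo green'.
external_echo() {
    (sleep 1; /bin/echo red; echo green 1>&2) | echo blue
}

# Forked builtin: the nested ( ) makes the shell fork before running the
# builtin echo, so again only that child dies and "green" still appears.
forked_echo() {
    (sleep 1; (echo red); echo green 1>&2) | echo blue
}

external_echo
forked_echo
```

Both functions should print "blue" and then, about a second later, "green", even though the write of "red" fails in each.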

So we have three orders of execution:

  1. The shell on the left side gets through both of its echos before the 'echo blue' runs at all. The output is "green blue".

  2. The 'echo red' happens before 'echo blue' exits, so the left side doesn't get SIGPIPE, but 'echo green' happens afterwards. The output is "blue green".

  3. The 'echo blue' runs and exits, closing the pipe, before the 'echo red' finishes. The shell on the left side of the pipeline writes output into a closed pipe, gets SIGPIPE, and exits without going on to do the 'echo green'. The output is "blue".
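Each of these orders can be forced instead of left to the scheduler by adding sleeps. This is a sketch rather than a faithful reproduction of the race: the function names are invented, the sleep durations are arbitrary, and the third case assumes a shell such as bash or dash where echo is a builtin and SIGPIPE has its default disposition.

```shell
#!/bin/sh
# Order 1: delay the right side, so both left-side echos finish first.
order1() {
    (echo red; echo green 1>&2) | (sleep 1; echo blue)
}

# Order 2: "red" is written while the right side still holds the pipe open,
# "blue" appears after one second, and "green" after two.
order2() {
    (echo red; sleep 2; echo green 1>&2) | (sleep 1; echo blue)
}

# Order 3: delay the left side, so 'echo blue' has exited and closed the pipe
# before "red" is written. The left-side shell dies of SIGPIPE before "green".
order3() {
    (sleep 1; echo red; echo green 1>&2) | echo blue
}

order1
order2
order3
```

The three calls should print "green blue", "blue green", and "blue" respectively, each with newlines rather than spaces between the words.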

The second order seems to be the most frequent in practice, although I'm sure it depends on a lot of things (including whether or not you're on an SMP system). One thing that may contribute to this is that I believe many shells start pipelines left to right, ie if you have a pipeline that looks like 'a | b | c | d', the main shell will fork the a process first, then the b process, and so on. All else being equal, this will give a an edge in running before d.

(This entry is adapted from my comment on lobste.rs, because why not.)


Comments on this page:

We get "blue" as the sole output when the 'echo blue' runs so soon that it exits, closing the pipeline, before the right side can finish 'echo red'. When this happens the right side gets a SIGPIPE and exits without running 'echo green' at all.

should read

We get "blue" as the sole output when the 'echo blue' runs so soon that it exits, closing the pipeline, before the left side can finish 'echo red'. When this happens the left side gets a SIGPIPE and exits without running 'echo green' at all.

By cks at 2019-03-06 11:46:39:

Oh, what an embarrassing mistake for me to have made there. Thank you for noticing and I've corrected it now.

Bizarrely, on my system (heavily customized zsh) I get blue way more often than either of the other ones

On my system (Ubuntu 16.04), echo isn't a builtin; it's /bin/echo. I get the blue <newline> green output consistently, as far as I can tell.

Actually, after more testing, I do see the green <newline> blue output occasionally.

By Joscho at 2019-03-10 13:21:43:

@Peter Donis: Which shell are you using in Ubuntu 16.04? If it is bash, I think you are wrong about echo being external. It has been a builtin in sh for decades and bash is a superset of sh. In particular "which echo" is not a good way to check for builtins. You should use "type echo" to do so.

Of course you also(!) have an external echo as you verified with "which echo". But as long as you do not specify its full path the builtin will be used.

By John Lauro at 2019-03-10 19:27:58:

If you use ksh you will never miss the green, even if you add a sleep, which makes bash miss it as well:

( echo red ; sleep 1 ; echo green 1>&2 ) | echo blue

Written on 04 March 2019.

Last modified: Mon Mar 4 23:55:34 2019