Sorting out what exec does in Bourne shell pipelines
Today, I was revising a Bourne shell script. The original shell script
ended by running rsync with an exec like this:
exec rsync ...
(I don't think the exec was there for any good reason.)
I was adding some filtering of errors from rsync, so I fed its
standard error to egrep and in the process I removed the exec,
so it became:
rsync ... 2>&1 | egrep -v '^(...|...)'
Then I stopped to think about this, and realized that I was working
on superstition. I 'knew' that combining exec with anything else
didn't work, and in fact I had a memory that it caused things to
malfunction. So I decided to investigate a bit to find out the truth.
To start with, let's talk about what we could think that exec does
here (and what I hoped it did when I started digging). Suppose that
you end a shell script like this:
#!/bin/sh
[...]
rsync ... 2>&1 | egrep -v '...'
When you run this shell script, you'll wind up with a hierarchy of
three processes; the shell is the parent process, and then generally
the rsync and the egrep are siblings. Linux's pstree will represent
this as 'sh───2*[sleep]', and my favorite tool shows it like so:
pts/10 |  17346 /bin/sh thescript
pts/10 |    17347 rsync ...
pts/10 |    17348 egrep ...
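
If you don't have a process tree tool handy, the sibling relationship can be checked from inside a script. This is only a rough sketch: it assumes mktemp is available, and the variable names (left_parent, right_parent) are made up for the illustration. Each pipeline step is a 'sh -c' that reports its own parent PID, with the left step passing its answer down the pipe to the right step.

```sh
#!/bin/sh
# Sketch: both steps of a pipeline should report the same parent process.
# Assumes mktemp exists; all variable names here are illustrative only.
tmp=$(mktemp)

# Left step prints its parent PID into the pipe; right step reads that
# and appends its own parent PID, writing both to the temporary file.
sh -c 'echo $PPID' | sh -c 'read left; echo "$left $PPID"' >"$tmp"

read left_parent right_parent <"$tmp"
rm -f "$tmp"

echo "script shell: $$"
echo "parent of left step:  $left_parent"
echo "parent of right step: $right_parent"
```

When this is run as a standalone script, I'd expect all three numbers to match, since the script's shell is the direct parent of both steps.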
If exec worked here the way I was sort of hoping it would, you'd
get two processes instead of three, with whatever you exec'd (either
the rsync or the egrep) taking over from the parent shell process.
Now that I think about it, there are some reasonably decent reasons
to not do this, but let's set that aside for now.
What I had a vague superstition of exec doing in a pipeline was
that it might abruptly truncate the pipeline. When it got to the
exec, the shell just did what you told it to, ie exec the process,
and since it had turned itself into that process it didn't go on to
set up the rest of the pipeline. That would make
'exec rsync ... | egrep' be the same as just 'exec rsync ...', with
the egrep effectively ignored. Obviously you wouldn't want that,
hence me automatically taking the exec out.
Fortunately this is not what happens. What actually does happen is
not quite that the exec is ignored, although that's what it looks
like in simple cases. To understand what's going on, I had to start
by paying careful attention to how exec is described, for example
in Dash's manpage:
Unless command is omitted, the shell process is replaced with the specified program [...]
The important bit is 'the shell process'. The magic trick is what 'the shell process' is in a pipeline. Suppose we write:
exec rsync ... | egrep -v ...
When the shell gets to processing the exec, what it considers
'the shell process' is actually the subshell running one step of
the pipeline, here the subshell that exists to run rsync. This
subshell is normally invisible here because for simple commands
like this, the (sub)shell will immediately exec rsync anyway. Using
exec just instructs this subshell to do what it was already
going to do.
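
A quick way to convince yourself of this is a harmless stand-in for the rsync pipeline. The following is a minimal sketch, with echo and tr substituted for rsync and egrep; it assumes mktemp is available and the variable names are invented for the example.

```sh
#!/bin/sh
# Sketch: exec in the first step of a pipeline only replaces that step's
# subshell, so the rest of the pipeline and the script still run.
# Assumes mktemp exists; 'filtered' and 'survived' are illustrative names.
tmp=$(mktemp)

exec echo hi | tr a-z A-Z >"$tmp"

filtered=$(cat "$tmp")
rm -f "$tmp"

# We only get here because the exec did not replace the script's shell.
survived=yes
echo "pipeline output: $filtered"
echo "script still running: $survived"
```

If the exec had truncated the pipeline, tr would never have run and neither of the final echo commands would have been reached.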
We can cause the shell to actually materialize a subshell by putting multiple commands here:
(/bin/echo hi; sleep 120) | cat
If you look at the process tree for this, you'll probably get:
pts/9 |  7481 sh
pts/9 |    7806 sh
pts/9 |      7808 sleep 120
pts/9 |    7807 cat
The subshell making up the first step of the pipeline could end by
exec'ing sleep, but it doesn't (at least in Dash and Bash); once the
shell has decided to have a real subshell here, it stays a real
subshell.
If you use exec in the context of such an actual subshell, it
will indeed replace 'the shell process' of the subshell with the
command you exec:
$ (exec echo hi; echo ho) | cat
hi
$
The exec replaced the entire subshell with the first echo, so it
never went on to run the second echo.
(Effectively you've arranged for an early termination of the subshell.
There are probably times when this is useful behavior as part of a
pipeline step, but I think you can generally use exit and what you're
actually doing will be clearer.)
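
To make the comparison concrete, here is a small sketch of the two spellings side by side. The variable names are made up for the illustration.

```sh
#!/bin/sh
# Sketch: two ways to stop a subshell pipeline step early.
# 'with_exec' and 'with_exit' are illustrative names only.

# exec replaces the subshell with echo, so 'echo ho' is never run.
with_exec=$( (exec echo hi; echo ho) | cat )

# An explicit exit ends the subshell at the same point, more visibly.
with_exit=$( (echo hi; exit 0; echo ho) | cat )

echo "with exec: $with_exec"
echo "with exit: $with_exit"
```

Both versions should print only 'hi'. One difference is that with exit you pick the subshell's exit status yourself, while with exec it is whatever the exec'd command returns.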
(I'm sure that I once knew all of this, but it fell out of my mind until I carefully worked it out again just now. Perhaps this time around it will stick.)
Sidebar: some of this behavior can vary by shell
Let's go back to '(/bin/echo hi; sleep 120) | cat'. In Dash
and Bash, the first step's subshell sticks around to be the parent
of sleep, as mentioned. Somewhat to my surprise, both the
Fedora Linux version of official ksh93 and FreeBSD 10.4's sh
optimize away the subshell in this situation. They directly
exec sleep, as if you wrote:
(/bin/echo hi; exec sleep 120) | cat
There's probably a reason that Bash skips this little optimization.
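
If you want to see which way a particular shell goes without watching the process tree, one possible approach is to have the last command in the subshell report its own parent. This is a sketch under some assumptions: mktemp is available, the script is run standalone (so that $$ really is the PID of the shell running the pipeline), and the names 'last_parent' and 'verdict' are invented for the example. The last command writes the file itself instead of using a shell redirection, in case a redirection changes what the shell decides to do.

```sh
#!/bin/sh
# Sketch: detect whether this shell keeps a real subshell for a
# multi-command pipeline step or exec's the last command directly.
# Assumes mktemp exists and that this runs as a standalone script.
tmp=$(mktemp)

# The final 'sh -c' stands in for 'sleep 120'; it records its parent PID.
(/bin/echo hi; sh -c 'echo $PPID >"$1"' sh "$tmp") | cat >/dev/null

read last_parent <"$tmp"
rm -f "$tmp"

if [ "$last_parent" = "$$" ]; then
    # The last command's parent is the script shell itself, so the
    # subshell was replaced by the command.
    verdict="subshell optimized away"
else
    # Some other process (the subshell) sits in between.
    verdict="subshell kept as a real process"
fi
echo "$verdict"
```

Run it under each shell you care about (for example by invoking the shell with the script as its argument) and compare the verdicts.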