A Bourne shell gotcha with ( ... ) command grouping

July 15, 2009

Here is a mistake that I spent part of today discovering that I'd made.

Consider the following Bourne shell script fragment:

(
for i in $SOMETHING; do
  if ! some-command "$i"; then
    echo "$0: failed on $i" 1>&2
    exit 1
  fi
done
) | sort | ....

Tragically, this shell script fragment is broken. The exit is not doing what you think it is doing.

(If it actually is doing what you think it is doing, you need to stop being so clever in your Bourne shell scripts. Use 'break' instead, so that people can understand you later.)

When I wrote this shell script, I clearly thought that this exit would exit from the entire shell script, aborting it with a failure status so that various other things could notice that something had gone wrong. But this is incorrect; commands in a ( ... ) command group run in a subshell, so the exit only terminated that subshell. This stopped the for loop, exactly as if it were a break statement, and nothing more. The overall script continued to run and indeed exited with a success status, despite things having blown up.
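To see the behavior in isolation, here is a minimal sketch, assuming a POSIX-style sh with no options such as pipefail turned on. The variable names sub_status and pipe_status are only for illustration.

```shell
#!/bin/sh
# exit inside ( ... ) ends only the subshell; the script keeps going.
(
  echo "in subshell"
  exit 1
  echo "not reached"
)
sub_status=$?
echo "subshell status: $sub_status"

# In a pipeline, $? is the status of the last command (sort here),
# so the failure inside the subshell is hidden from the script.
( echo b; echo a; exit 1 ) | sort
pipe_status=$?
echo "pipeline status: $pipe_status"
echo "script is still running"
```

Without the pipeline, the failure is at least visible in $? right after the group; with the pipeline, even that is lost.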

(Since this involved a pipeline, the same thing would have happened if I had written the for loop without the ( ... ) around it, because the shell runs the earlier parts of a pipeline in subshells anyway. Although a bare for loop is legal here, I habitually add the parentheses for clarity.)
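A small sketch of the no-parentheses case, again assuming a plain POSIX-style sh. The item list and the failing item are made up for the demonstration.

```shell
#!/bin/sh
# A bare for loop as the first stage of a pipeline, with no parentheses.
for i in one two three; do
  if [ "$i" = two ]; then
    echo "$0: failed on $i" 1>&2
    exit 1            # leaves only the loop's subshell
  fi
  echo "$i"
done | sort
status=$?
reached=yes
echo "pipeline status: $status; still running: $reached"
```

The loop stops at the failing item, but the script carries on past the pipeline and sees sort's success status.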

For this particular script, I got around the problem by having the failure case echo a magic marker into the for loop's output, and then having the main portion of the script look for the magic marker. You could also do something like capture standard error in a file and check in the main portion to make sure that the file was empty.
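The magic marker approach might look roughly like the following. This is a sketch, not the actual script: the marker string, the some_command stand-in, and the item list are all hypothetical, and a real script would pick a marker that cannot occur in genuine output.

```shell
#!/bin/sh
# Hypothetical marker; it must never appear in real output.
MARKER='@@FAILED@@'

# Stand-in for the real per-item command; it fails on the item "bad".
some_command() {
  [ "$1" != bad ] && echo "processed $1"
}

# Collect the pipeline's output so the main script can inspect it.
output=$(
  for i in good bad other; do
    if ! some_command "$i"; then
      echo "$0: failed on $i" 1>&2
      echo "$MARKER"      # travels down the pipe with the normal output
      exit 1              # still only leaves the loop's subshell
    fi
  done | sort
)

# Back in the main script, look for the marker as a whole line.
if printf '%s\n' "$output" | grep -qxF "$MARKER"; then
  failed=yes
else
  failed=no
fi
echo "failed: $failed"
```

In a real script the yes branch would be the place to exit 1 from the main shell, which is where the exit actually takes effect.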

(I don't like capturing stderr in scripts if I can help it, so I go out of my way to avoid it.)


Comments on this page:

From 78.35.25.22 at 2009-07-15 06:35:09:
trap 'exit 1' USR1
( kill -USR1 $$ )
echo 'never runs'

Aristotle Pagaltzis

From 71.250.234.178 at 2009-07-15 09:48:54:

I knew what it would do, but probably because I learned $() about 2 years after I learned backticks (`), so I always see parens as separate processes now.

I'm sure you know, but for your other readers' sakes, if you don't care about stderr, you can just send it to null:

somecommand 2> /dev/null

The stuff that throws me is when you've got multiple redirections, like in a short Oracle script:

sqlplus /nolog << EOF > output.log
connect internal;
select whatever;
exit;
EOF

I don't know why it took me so long to figure out where to put that output.log part.

Matt Simmons
http://www.standalone-sysadmin.com

By Dan.Astoorian at 2009-07-15 10:39:34:

I sometimes use {...} instead of (...) to group commands for this purpose; e.g.:

   run_command || { echo "Error..." >&2; exit 1; }

An if-then block would probably be clearer, but more verbose.

This doesn't help for pipelines, though, where the subshell is implicit.

Note that this distinction applies to shell functions as well; compare:

   fn() {
       echo "$*"
       exit 1
   }
   fn Hello ; echo World

versus:

   fn() (
       echo "$*"
       exit 1
   )
   fn Hello; echo World

--Dan

Written on 15 July 2009.

Last modified: Wed Jul 15 01:04:53 2009