== Unix shell pipelines have two usage patterns

I've seen a variety of recommendations for safer shell scripting that use Bash and set its 'pipefail' option (for example, [[this one from 2015 https://vaneyckt.io/posts/safer_bash_scripts_with_set_euxo_pipefail/]]). This is a good recommendation in one sense, but it exposes a conflict; this option works great for one usage pattern for pipes, and potentially terribly for another one.

To understand the problem, let's start with what Bash's pipefail does. To quote the [[Bash manual https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html]]:

> The exit status of a pipeline is the exit status of the last command
> in the pipeline, unless the _pipefail_ option is enabled. If
> _pipefail_ is enabled, the pipeline’s return status is the value of
> the last (rightmost) command to exit with a non-zero status, or zero
> if all commands exit successfully. [...]

The reason to use _pipefail_ is that if you don't, a command failing unexpectedly in the middle of a pipeline won't normally be detected by you, and won't abort your script if you used '_set -e_'. You can go out of your way to carefully check everything with _$PIPESTATUS_, but that's a lot of extra work (there's a sketch of what this looks like later in this entry).

Unfortunately, this is where [[our old friend _SIGPIPE_ ../linux/BashPipes]] comes into the picture. What _SIGPIPE_ does in pipelines is force processes to exit if they write to a closed pipe. This happens if a later process in a pipeline doesn't consume all of its input, for example if you only want to process the first thousand lines of output of something:

.pn prewrap on

> generate --thing | sed 1000q | gronkulate

The _sed_ exits after a thousand lines and closes the pipe that _generate_ is writing to, _generate_ gets _SIGPIPE_ and by default dies, and suddenly its exit status is non-zero, which means that with _pipefail_ the entire pipeline 'fails' (and with '_set -e_', your script will normally exit).

(Under some circumstances, [[what happens can vary from run to run due to process scheduling ShellPipelineIndeterminate]]. It can also depend on how much output early processes are producing compared to what later processes are filtering; if _generate_ produces 1000 lines or fewer, _sed_ will consume all of them.)

This leads to two shell pipeline usage patterns. In one usage pattern, all processes in the pipeline consume their entire input unless something goes wrong. Since all processes do this, no process should ever be writing to a closed pipe and _SIGPIPE_ will never happen. In the other usage pattern, at least one process will stop processing its input early; often such processes are in the pipeline specifically to stop at some point (as _sed_ is in my example above). These pipelines will sometimes or always generate _SIGPIPE_s and have some processes exiting with non-zero statuses.

Of course, you can deal with this in an environment where you're using _pipefail_, even with '_set -e_'. For instance, you can force one pipeline step to always exit successfully:

> (generate --thing || true) | sed 1000q | gronkulate

However, you have to remember this issue and keep track of which commands can exit early without reading all of their input. If you miss some, your reward is probably errors from your script. If you're lucky, they'll be regular errors; if you're unlucky, they'll be sporadic errors that happen when one command produces an unusually large amount of output or another command does its work unusually soon or fast.
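As an illustration of why checking everything with _$PIPESTATUS_ is a lot of extra work, here is a minimal sketch of what it takes for a single pipeline. It uses the made-up commands from my example, and it has to turn '_set -e_' off around the pipeline so the script survives long enough to inspect the statuses:

> set +e
> generate --thing | sed 1000q | gronkulate
> # $PIPESTATUS must be captured immediately; running any
> # other command (even 'true') overwrites it.
> st=("${PIPESTATUS[@]}")
> set -e
> for i in "${!st[@]}"; do
>     if [ "${st[$i]}" -ne 0 ]; then
>         echo "stage $i exited with status ${st[$i]}" 1>&2
>     fi
> done

Multiply this by every pipeline in a script and you can see why people reach for _pipefail_ instead.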
(Also, it would be nice to only ignore _SIGPIPE_-based failures, not other failures. If _generate_ fails for other reasons, we'd like the whole pipeline to be seen as having failed. One way to approximate this is sketched at the end of this entry.)

My informal sense is that the 'consume everything' pipeline pattern is far more common than the 'early exit' pipeline pattern, although I haven't attempted to inventory my scripts. It's certainly the natural pattern when you're filtering, transforming, and examining all of something (for example, to count or summarize it).
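As for ignoring only _SIGPIPE_-based failures: Bash reports a command killed by a signal as exiting with status 128 plus the signal number, and _SIGPIPE_ is signal 13, so death from _SIGPIPE_ shows up as exit status 141. Here is a sketch of a wrapper function that exploits this (the function name is my invention, and the commands are the made-up ones from my example):

> # Run a command, treating death from SIGPIPE as success.
> # Bash reports signal deaths as 128 + signal number, and
> # SIGPIPE is signal 13, so we look for exit status 141.
> run_ignoring_sigpipe() {
>     local rc=0
>     "$@" || rc=$?
>     if [ "$rc" -eq 141 ]; then
>         return 0
>     fi
>     return "$rc"
> }
>
> run_ignoring_sigpipe generate --thing | sed 1000q | gronkulate

This is only an approximation, because a command could in theory exit with status 141 on its own rather than from a signal, but in practice it gets you pipelines that fail only for non-_SIGPIPE_ reasons.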