Why it's sensible for large writes to pipes to block
In an earlier entry I said that having large writes to pipes block, instead of immediately returning with a short write, was a sensible API decision. Today let's talk about why, by looking at how deciding the other way would make for a bad API.
Let's start with a question: in a typical Unix pipeline program like
grep, what would be a sensible reaction if an attempt to write a large
amount of data returned a short write indicator? This is clearly not
an error that should cause the program to abort (or even to print a
warning); instead it's a perfectly normal thing if you're producing
output faster than the other side of the pipe can consume it. For most
programs, that means the only thing you can really do is pause until you
can write more to the pipe. The conclusion is pretty straightforward;
in a hypothetical world where such too-large pipe writes returned short
write indicators instead of blocking, almost all programs would either
wrap their writes in code that paused and retried them or arrange to set
a special flag on the file descriptor to say 'block me until everything
is written'. Either or both would probably wind up being part of stdio.
If almost every program is going to need code to work around or deal with something, this suggests that you are picking the wrong default. Thus having large writes to pipes block by default is the right API decision, because it means everyone can write simpler and less error-prone code at the user level.
(There are a number of reasons this is less error-prone. Some programs
don't usually expect to write to pipes, but wind up doing so when you
tell them to write to /dev/stdout. Others usually do small writes that
never block and so don't handle short writes, which results in them
silently not writing some amount of their output some of the time.)
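On current Unixes you can get roughly this hypothetical behavior by setting O_NONBLOCK on the write end of a pipe, which makes the failure mode easy to see. The sketch below is only a demonstration, not code from any real program: the function name naive_big_write and the 4 MiB size are choices made for this example, and the size is assumed to be larger than the pipe's capacity.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BIG (4 * 1024 * 1024)   /* assumed to exceed the pipe's capacity */

/* Hypothetical demo: make a single large write to an empty nonblocking
 * pipe that nobody is reading from, and return what write() reported. */
ssize_t naive_big_write(void)
{
    static char buf[BIG];       /* zero-filled payload */
    int fds[2];
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;
    if (fcntl(fds[1], F_SETFL, O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Only as much as fits in the pipe is accepted; a program that
     * ignores this return value silently loses the rest. */
    n = write(fds[1], buf, sizeof buf);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

The return value should be positive but smaller than BIG. A program that never checks it would carry on as if all of its output had been written.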
There's actually a reason why this is not merely a sensible API but a good one, but that's going to require an additional entry rather than wedging it in here.
Sidebar: This story does not represent actual history
The description I've written above more or less requires that there is
some way to wait for a file descriptor to become ready for IO, so that
when your write is short you can find out when you can usefully write
more. However there was no such mechanism in early Unixes; select()
only appeared in UCB BSD (and poll() and friends are even later).
This means that having nonblocking pipe writes in V7 Unix would have
required an entire set of mechanisms that only appeared later, instead
of just a 'little' behavior change.
(However I do suspect that the Bell Labs Unix people actively felt
that pipe writes should block just like file writes blocked until
complete, barring some error. Had they felt otherwise, the Unix API
would likely have been set up somewhat differently and V7 might
have had some equivalent of select().)
If you're wondering how V7 could possibly not have something like
select(), note that V7 didn't have any networking (partly because
networks were extremely new and experimental at the time). Without
networking and the problems it brings, there's much less need (or use)