2013-05-13
The Unix philosophy is not an end in itself
Today I feel like opening a can of worms that I've alluded to before.
Here is something very important about the Unix philosophy (regardless of what exactly that is): the Unix philosophy was not conceived as an empty philosophy that was an end in itself. Instead it is above all a theory about how to make computers easy, powerful, and useful. This philosophy (or at least the things built by people following it at Bell Labs and elsewhere) has been extraordinarily successful, and I'm not just talking about Unix; concepts first pioneered in Unix and C now form core pieces of pretty much every computer system in the world.
But it's possible to take this too far. To put it one way, it's my strong view that the core goal of Unix is to be useful, not to be philosophically pure. The underlying purpose comes first and fitting how to be useful into 'the Unix way of doing things' comes second. If Unix has to be non-Unixy for a while (or even permanently) in order to be useful, then, well, I pick usefulness. Excessive minimalism and 'Unixness' for the sake of minimalism and Unixness is a kind of masochism.
(Of course the devil is in the details, as it always is. It's certainly possible to ruin Unix without getting anything worth it in exchange.)
What this biases me towards is an environment where one solves the
problem first and then tries to make it fit into the traditional 'Unix
way' second. This is why part of me thinks that GNU sort's -h option
(which sorts human-readable sizes such as 2K and 1G) is perfectly fine:
it solves a real problem (and solves it now).
(The counterargument is that Unix cannot be all things to all people. As with all systems, at some point you have to draw a line and say 'this doesn't fit, you need to go elsewhere'. I don't know how to balance this. I do know that a certain amount of griping about 'the one true Unix way' and how (some) modern Unixes are ruining it reminds me an awful lot of the griping of Lisp adherents at the rise of Unix, and for that matter the griping of Unix people (myself sometimes included) at the rise of Windows and Macs.)
2013-05-05
Unix is not necessarily Unixy
As I've written about before, in some quarters there is a habit of saying that everything added to Unix needs to be 'Unixy'. One of the many problems with this is that a number of aspects of Unix itself are not 'Unixy'. I don't mean that in a theoretical way, where we debate about whether a particular API or approach is really 'Unixy'. I mean that in a concrete sense, in that Bell Labs, generally regarded as the home of Unix and the people who understand its essential nature best, built various things differently than mainline Unix. In some cases they did this after mainline Unix had established something, which is a clear sign that they felt that other Unix developers had gotten it wrong.
(In the end their vision of the right way to do things was so extreme that they started over from scratch so they didn't have to worry about backwards compatibility. The result of that was Plan 9.)
The easiest place to see this is in the approach that Bell Labs took to
networking. Unfortunately I don't believe that manual pages from post-V7
Research Unix are online, but the next best thing is the networking
manual pages for Plan 9 (which has essentially the same interface from
what I understand). Plan 9 networking is completely different from the
BSD sockets API that is now the Unix standard; for the most part it is
much higher level. You can read about it in the Plan 9 dial(2) manpage, and a version of
this interface without the Plan 9 bits has resurfaced in the Go net
package's Dial() and Listen() APIs.
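To give a feel for what 'higher level' means here, the following is a rough Python sketch of a dial-style call. Plan 9's dial() takes a single string of the form 'net!host!service', and the sketch mimics that shape on top of the ordinary sockets library. The names parse_dial_string() and dial() are my own hypothetical helpers, not part of Plan 9, Go, or Python, and the sketch assumes TCP only and always requires all three fields (the real Plan 9 code is more flexible about defaults).

```python
import socket


def parse_dial_string(addr):
    """Split a Plan 9 style 'net!host!service' string into its three parts.

    Hypothetical helper for illustration; it insists on exactly three fields.
    """
    parts = addr.split("!")
    if len(parts) != 3:
        raise ValueError("expected net!host!service, got %r" % addr)
    net, host, service = parts
    return net, host, service


def dial(addr, timeout=10.0):
    """Connect to a 'tcp!host!service' address and return a connected socket.

    The caller never touches address families, sockaddr structures, or
    separate resolve/create/connect steps; one string goes in and one
    connection comes out.
    """
    net, host, service = parse_dial_string(addr)
    if net != "tcp":
        raise ValueError("this sketch only handles tcp, not %r" % net)
    # A service may be a port number or a name such as 'http'.
    if service.isdigit():
        port = int(service)
    else:
        port = socket.getservbyname(service, "tcp")
    return socket.create_connection((host, port), timeout=timeout)
```

With this, a program would say something like dial("tcp!example.com!80") and get back a connection, instead of walking through the individual socket calls itself.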
You can certainly argue that these APIs are fundamentally not comparable to the BSD sockets API because they're on a different level (the BSD sockets API is a kernel API, while most of the Plan 9 API is implemented in library code). But in a sense this is beside the point, which is that the Plan 9 API is how Bell Labs thought programs should do networking.
(You can also argue that the Plan 9 API is insufficient in practice and that programs need and want more control over networking than it offers. I'm sympathetic to this argument but it does open up a can of worms about when one should discount the Bell Labs view on 'what is Unix' and what can replace it.)
2013-05-01
Two xargs gotchas that you may not know about
I know, I've been harping on xargs a bit lately. But this stuff is
important because most people's vague intuitions about how xargs
behaves are actually wrong.
If you're like most people, you probably vaguely think that xargs
operates on lines of input and the purpose of the GNU -0 extension to
xargs (and find et al) is so that some joker putting a newline in a
file name doesn't cause the world to blow up. Actually it's much worse
than that.
The simple way to put this is that xargs doesn't operate on lines, it
operates on words. Words are the same as lines only if your lines
don't have any whitespace, backslashes, single quotes (') or double
quotes ("), all of which xargs will interpret in various ways. Oh,
and blank lines are neither errors nor empty arguments under normal
circumstances, they are simply word-separating whitespace. In short,
newlines are only the beginning of the things that nasty people can put
in their filenames to give you heartburn.
(Normally you don't see any of this because your input to xargs is
well formed and simple.)
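As a rough illustration of the word-splitting rules above, here is a small Python model of how xargs (without -0) carves its input into arguments. This is a simplified sketch written for this entry, not the code of any real xargs; the function name xargs_words() is made up, and real implementations may differ in corner cases.

```python
def xargs_words(text):
    """Split text into arguments roughly the way plain xargs does.

    Unquoted blanks and newlines separate words, a backslash makes the
    next character literal, and '...' or "..." quotes everything up to
    the matching quote (which must be on the same line).
    """
    words = []
    current = []
    in_word = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # Backslash: take the following character literally, if any.
            i += 1
            if i < len(text):
                current.append(text[i])
                in_word = True
        elif ch in ("'", '"'):
            # Quote: everything up to the matching quote is literal.
            end = text.find(ch, i + 1)
            if end == -1 or "\n" in text[i + 1:end]:
                raise ValueError("unmatched quote in input")
            current.append(text[i + 1:end])
            in_word = True
            i = end
        elif ch in " \t\n":
            # Whitespace (including blank lines) only ends the current word.
            if in_word:
                words.append("".join(current))
                current = []
                in_word = False
        else:
            current.append(ch)
            in_word = True
        i += 1
    if in_word:
        words.append("".join(current))
    return words


# One file name with a space becomes two arguments, and the blank
# line simply disappears:
print(xargs_words("my file.txt\n\nother\n"))  # ['my', 'file.txt', 'other']
```

In this model a file name like "it's" with a lone apostrophe raises an error rather than quietly passing through, which is the sort of surprise that -0 exists to avoid.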
The other trap (as I alluded to) is the
portable behavior of xargs if you don't give an explicit -E
argument. If you don't, some versions of xargs will assume that a
line with only an underscore (_) actually means the (logical) end
of file and won't read any further input. It will probably surprise no
one that Solaris 10 update 8 (that bastion of old times) behaves this
way. Fortunately Linux, FreeBSD, and OpenBSD don't appear to do so.
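To make the logical end-of-file behavior concrete, here is a tiny Python model of it. It is only a sketch of the behavior described above (a line holding nothing but the end-of-file string stops all further reading), not how any particular xargs is implemented; the function name and the use of None to mean 'feature disabled' are my own assumptions.

```python
def lines_before_logical_eof(lines, eof_str="_"):
    """Return the input lines that would be read before a logical EOF.

    eof_str models the end-of-file string (an underscore by default on
    the affected versions); passing None models an xargs where the
    feature is turned off.
    """
    kept = []
    for line in lines:
        if eof_str is not None and line.rstrip("\n") == eof_str:
            # Everything from this line onward is silently ignored.
            break
        kept.append(line)
    return kept


# A file that happens to be named '_' hides everything listed after it:
print(lines_before_logical_eof(["a.txt\n", "_\n", "b.txt\n"]))  # ['a.txt\n']
```

The nasty part is that nothing reports an error; the arguments after the underscore are just never processed.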
(One of the morals here is that sometimes GNU programs make important
innovations, as I believe that xargs -0 and find ... -print0 came
from GNU.)