Wandering Thoughts archives

2008-12-28

Discovering things while researching Unix history

One of the nice things about researching things like the history of dump is that it teaches me things about Unix's history that I didn't know. For example, until I decided to write up dump's history, I had no idea that it had actually originated as far back as V6; for some reason I had it in my mind as a UCB invention.

(Perhaps because the idea of dump always struck me as the kind of baroque thing that UCB would come up with, instead of the sort of nice clean solution that I like to think of Bell Labs creating. Although, if we are being honest, V7 had its share of hacks too.)

Unix history is a bit arcane and I didn't start using Unix early enough to be fully familiar with all of it, so I can be fuzzy (or outright mistaken) about the exact details. Fortunately there are places like tuhs.org where I can actually check the primary sources. (Why I care about this stuff is another question entirely, one that I don't have a satisfactory answer to.)

On a side note, I put an unjustified slam on cpio into my original entry. According to Wikipedia, cpio seems to have been invented more or less at the same time as tar, just by a different group inside of AT&T (see the history of PWB/UNIX).

UnixHistoryDiscoveries written at 23:35:00

2008-12-17

Some reasons why I like vi

I will reluctantly admit it: I sort of like vi. It is a grudging like, for vi is not my favorite editor and it has its weaknesses, but still, there are things about vi that I keep finding cool and nice and attractive. One significant reason is vi's regularity and what I will call 'composability'.

(Let me note that by 'vi' I mean the entire vi family.)

To make up terminology, vi more or less has two sorts of commands: commands that change the text (call them 'action commands'), and commands that move the cursor (call them 'movement commands'). Vi's regularity and composability come from the fact that pretty much every action command can be used with pretty much any movement command to specify what to operate on. For example, d deletes and w moves forward a word, so dw deletes a word; swap in } and d} deletes to the end of the paragraph.

Once you learn this and it really sinks in (which took a while in my case), what you get is an immense amount of leverage for your vi knowledge. If you know a navigation trick, you can extend it to select text for action commands. When you master a new movement command, all of your action commands also get more powerful; when you master a new action command, all of your movement commands get more useful. When you learn something new, it doesn't just add to your abilities, it multiplies them in a virtuous feedback loop.

(I think that this is a great way of growing people's expertise and rewarding them for it. Unfortunately it is hard to copy in other contexts.)

This interacts nicely with another reason I like vi: its support for pipes, specifically piping text through commands. Pragmatically, this means that I can extend vi (in some ways) by writing shell scripts and hooking it to specialized superintelligent programs like par (a paragraph reformatter). Since piping is just another action command, it takes all of the usual movement commands for text selection.
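
As a rough sketch of the kind of small filter this makes possible, here is a hypothetical helper called quotetext (the name and its behaviour are made up for illustration, not something from my own setup). It reads text on standard input and writes it back out with each line prefixed by '> ':

```shell
#!/bin/sh
# quotetext is a hypothetical name used only for this illustration.
# It is a plain filter: text comes in on stdin and goes out on stdout
# with "> " added to the front of every line.
quotetext() {
    sed 's/^/> /'
}

# Try it outside of vi:
printf 'one\ntwo\n' | quotetext
```

Assuming the body were saved as an executable script named quotetext somewhere on $PATH (vi runs the filter through a shell, so a function defined only in your interactive shell would not be visible), !}quotetext would quote from the cursor to the end of the paragraph and !Gquotetext would quote to the end of the file. The ! is the action command and } or G is the movement.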

As a side note, this composability also gives you little things to figure out and learn all the time. Even if you're not learning new movement and action commands, you can always see if you can come up with anything useful to do with a combination you haven't tried before.

(Since this experimentation gives you a stream of little rewards for playing around with vi, it is perhaps no wonder that people wind up loving it.)

LikingVi written at 01:30:02

2008-12-05

A little gotcha when implementing shell read

Reading a line from standard input, as the shell's read builtin does, certainly seems like it should be easy to implement or reimplement (if one's shell is sufficiently primitive). However, it turns out that there is a subtle problem that forces a pretty inefficient implementation, and makes it basically impossible to duplicate read with any common Unix program.

I can neatly illustrate the problem with a little script:

(echo a; echo b) |
   (echo `sed 1q`; echo `sed 1q`)

This attempts to duplicate the effects of read with 'sed 1q' (one could substitute 'head -1' if desired). What you'd expect and like this to produce as output is two lines, one with 'a' and one with 'b'. However, if you try it you'll discover it produces something else: it produces 'a' and then a blank line.

What is going on is that sed, like basically all conventional Unix programs, is reading from standard input in buffered mode. Thus, when the first sed does its read it reads both lines (and then prints one and exits), leaving nothing for the second sed to read and print.
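
One way to see the over-read directly is to count how many bytes are left in the pipe after each kind of reader has taken its "one line". This is only a sketch; the variable names are made up, and it uses a single printf instead of two echos so that both lines are written to the pipe at once and timing does not affect the result.

```shell
#!/bin/sh
# Count the bytes still unread in the pipe after each reader is done.
# 'tr -d " "' strips the padding that some versions of wc add.

# sed reads a whole buffer, prints the first line, and quits.
after_sed=$(printf 'a\nb\n' | (sed 1q >/dev/null; wc -c) | tr -d ' ')

# The shell's read builtin consumes only the first line.
after_read=$(printf 'a\nb\n' | (read first; wc -c) | tr -d ' ')

echo "left after sed 1q: $after_sed"
echo "left after read:   $after_read"
```

With a typical sed and shell I would expect the first count to be 0 and the second to be 2 (the 'b' and its newline), since sed swallows both lines while read stops at the first newline.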

In order for read to behave correctly, it cannot over-read; it can never read more than one line, because it can't guarantee that it can put back the excess. Unfortunately, on Unix the only way to be sure that you read exactly one line is to read() character by character until you read the terminating newline, which is inefficient (since you are making a system call for each character).
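
To make the character-by-character approach concrete, here is a minimal sketch of a read substitute written in plain sh. The function name read_line is made up, and using dd with bs=1 count=1 to fetch a single byte is just one assumed way to do it; each byte costs a whole dd process, so this is even slower than the one system call per character a real shell pays.

```shell
#!/bin/sh
# read_line (hypothetical helper): print exactly one line from stdin
# without consuming anything past its terminating newline.
# Note: 'line' and 'c' are ordinary global variables, since POSIX sh
# has no local variables.
nl='
'
read_line() {
    line=
    while :; do
        # Fetch exactly one byte. The trailing 'x' keeps a newline from
        # being stripped by command substitution; it is removed again
        # on the next line.
        c=$(dd bs=1 count=1 2>/dev/null; echo x)
        c=${c%x}
        if [ -z "$c" ]; then
            # End of input: print any partial last line and report EOF.
            [ -n "$line" ] && printf '%s\n' "$line"
            return 1
        fi
        # Stop at the newline without reading any further.
        [ "$c" = "$nl" ] && break
        line=$line$c
    done
    printf '%s\n' "$line"
}

# Unlike the sed version, this prints 'a' and then 'b'.
(echo a; echo b) | (read_line; read_line)
```

Because each dd takes only one byte from the pipe, the second read_line still finds the 'b' line waiting for it.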

(Note that the sed-based version would work if you didn't feed it from a pipe and instead ran it interactively, because the kernel line-buffers tty input for you and sed will quit immediately after reading and printing the first line. This can make the problem hard to see or to debug.)

ReadBufferingIssue written at 00:00:28


