V7 Unix programs are often not written the way you would expect

April 19, 2019

Yesterday I wrote that V7 ed read its terminal input in cooked mode a line at a time, which was an efficient, low-CPU design that was important on V7's small and low-power hardware. Then in comments, frankg pointed out that I was wrong about part of that, namely about how ed read its input. Here, straight from the V7 ed source code, is how ed read input from the terminal:

	if (read(0, &c, 1) <= 0)
		return(lastc = EOF);
	lastc = c&0177;

	while ((c = getchr()) != '\n') {

(gettty() reads characters from getchr() into a linebuf array until end of line, EOF, or it runs out of space.)

In one way, this is surprising; it's very definitely not how we'd write this today, and if you did, many Unix programmers would immediately tell you that you're being inefficient by making so many calls to read() and you should instead use a buffer, for example through stdio's fgets(). Very few modern Unix programs do character at a time reads from the kernel, partly because on modern machines it's not very efficient.

(It may have been comparatively less inefficient on V7 on the PDP-11, if for example the relative cost of making a system call was lower than it is today. My impression is that this may have been the case.)
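To make the difference concrete, here is a minimal sketch of the two styles in modern C. The first function follows the shape of the V7 excerpt above; the second is a hypothetical buffered equivalent. The names getchr_unbuffered(), getchr_buffered(), and struct rbuf, the 512-byte buffer size, and the ncalls counters are all my own illustration and are not taken from the ed source.

```c
#include <assert.h>
#include <unistd.h>

/* One read() system call per character, in the style of V7 ed's getchr().
 * *ncalls counts how many times read() is called. */
int getchr_unbuffered(int fd, long *ncalls)
{
	unsigned char c;

	assert(ncalls != NULL);
	(*ncalls)++;
	if (read(fd, &c, 1) <= 0)
		return -1;		/* EOF or error */
	return c & 0177;		/* keep the low 7 bits, as V7 ed does */
}

/* Hypothetical buffered reader: one read() refills a whole block and
 * later characters are handed out from memory. */
struct rbuf {
	int fd;
	ssize_t len;			/* bytes currently in data[] */
	ssize_t pos;			/* next byte to hand out */
	unsigned char data[512];	/* buffer size is an arbitrary choice */
};

int getchr_buffered(struct rbuf *b, long *ncalls)
{
	assert(b != NULL && ncalls != NULL);
	if (b->pos >= b->len) {
		(*ncalls)++;
		b->len = read(b->fd, b->data, sizeof b->data);
		b->pos = 0;
		if (b->len <= 0)
			return -1;	/* EOF or error */
	}
	return b->data[b->pos++] & 0177;
}
```

Draining the same input through both and comparing the counters shows the gap in system calls. On a terminal in cooked mode a single read() returns at most one line, so the buffered version would make roughly one call per line of input instead of one per character.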

V7 had stdio in more or less its modern form, complete with fgets(). V6 had a precursor version of stdio and buffered IO (see eg the manpage for getc()). However, many V7 and V6 programs didn't necessarily use them; instead they used more basic system calls. This is one of the things that often gives the code for early Unix programs (V7 and before) an unusual feel, along with the short variable names and the lack of comments.

The situation with ed is especially interesting, because in V5 Unix, ed appears to have still been written in assembly; see ed1.s, ed2.s, and ed3.s here in 's1' of the V5 sources. In V6, ed was rewritten in C to create ed.c (still in a part of the source tree called 's1'), but it still used the same read() based approach that I think it used in the assembly version.

(I haven't looked forward from V7 to see if later versions were revised to use some form of buffering for terminal input.)

Sidebar: An interesting undocumented ed feature

Reading this section of the source code for ed taught me that it has an interesting, undocumented, and entirely characteristic little behavior. Officially, ed commands that have you enter new text have that new text terminated by a . on a line by itself:

$ ed newfile
a
this is new text that we're adding.
.

This is how the V7 ed manual documents it and how everyone talks about it. But how the actual ed source code implements this on input is, from that gettty() function:

if (linebuf[0]=='.' && linebuf[1]==0)

In other words, it turns a single line with '.' into an EOF. The consequence of this is that if you type a real EOF at the start of a line, you get the same result, thus saving you one character (you use Control-D instead of '.' plus newline). This is very V7 Unix behavior, including the lack of documentation.

This is also a natural behavior in one sense. A proper program has to react to EOF here in some way, and it might as well do so by ending the input mode. It's also natural to go on to try reading from the terminal again for subsequent commands; if this was a real and persistent EOF, for example because the pty closed, you'll just get EOF again and eventually quit. V7 ed is slightly unusual here in that it deliberately converts '.' by itself to EOF, instead of signaling this in a different way, but in a way that's also the simplest approach; if you have to have some signal for each case and you're going to treat them the same, you might as well have the same signal for both cases.
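As a rough illustration of the "same signal for both cases" idea, here is a small sketch of a line reader that reports a lone '.' and a real end of file in the same way. It is not the V7 code: the name get_text_line() and its parameters are hypothetical, it uses stdio's getc() for brevity where ed uses its own getchr(), and it silently truncates overlong lines, which is my simplification.

```c
#include <assert.h>
#include <stdio.h>

/* Read one line of input text into linebuf, without the newline.
 * Returns 0 for an ordinary line and EOF when text input should end,
 * which is either a real end of file or a line holding only ".". */
int get_text_line(FILE *in, char *linebuf, size_t size)
{
	size_t n = 0;
	int c;

	assert(in != NULL && linebuf != NULL && size > 1);
	while ((c = getc(in)) != '\n') {
		if (c == EOF)
			return EOF;		/* real end of file */
		if (n < size - 1)
			linebuf[n++] = (char)c;	/* extra characters are dropped */
	}
	linebuf[n] = '\0';
	if (linebuf[0] == '.' && linebuf[1] == '\0')
		return EOF;			/* "." alone looks like EOF too */
	return 0;
}
```

The caller needs only one check to leave input mode, and it can't tell which of the two cases happened, which is exactly why Control-D at the start of a line works as a substitute for '.'.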

Modern versions of ed appear to faithfully reimplement this convenient behavior, although they don't appear to document it. I haven't checked OpenBSD, but both FreeBSD ed and GNU ed work like this in a quick test. I haven't checked their source code to see if they implement it the same way.

Last modified: Fri Apr 19 23:49:59 2019