Wandering Thoughts

2019-02-12

Using grep with /dev/null, an old Unix trick

Every so often I will find myself writing a grep invocation like this:

find .... -exec grep <something> /dev/null '{}' '+'

The peculiar presence of /dev/null here is an old Unix trick designed to force grep to always print out file names, even if your find matches only one file, by ensuring that grep always has at least two file arguments. You can wind up wanting to do the same thing with a direct use of grep if you're not certain how many files your wildcard may match. For example:

grep <something> /dev/null */*AThing*

This particular trick is functionally obsolete because pretty much all modern mainstream versions of grep support a -H argument to do the same thing (as the inverse of the -h argument that always turns off file names). This is supported in GNU grep and the versions of grep found in FreeBSD, OpenBSD, NetBSD, and Illumos. To my surprise, -H is not in the latest Single Unix Specification grep, so if you care about strict POSIX portability, you still need to use the /dev/null trick.

(I am biased, but I am not sure why you would care about strict POSIX portability here. POSIX-only environments are increasingly perverse in practice (arguably they always were).)
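For the record, with a grep that supports -H, the wildcard example above becomes simply:

grep -H <something> */*AThing*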

If you stick to POSIX grep you also get to live without -h. My usual solution to that was cat:

cat <whatever> | grep <something>

This is not quite a pointless use of cat, but it is an irritating one.

For whatever reason I remember -h better than I do -H, so I still use the /dev/null trick every so often out of reflex. I may know that grep has a command line flag to do what I want, but it's easier to throw in a /dev/null than to pause to reread the manpage if I've once again forgotten the exact option.

GrepDevNull written at 23:40:09

2019-02-02

A little appreciation for Vim's 'g' command

Although I've used vim for what is now a long time, I'm still somewhat of a lightweight user and there are vast oceans of useful vim (and vi) commands that I either don't know at all or don't really remember and use only rarely. A while back I wrote down some new motion commands that I wanted to remember, and now I have a new command that I want to remember and use more of. That is vim's g command (and with it, its less common cousin, v), or if you prefer, ':g'.

Put simply, g (and v) are filters; you apply them to a range of lines, and for lines that they match (or don't match), they then run whatever additional commands you want. For instance, my recent use of g was that I had a file that listed a bunch of checks to do to a bunch of machines, one per line, and I wanted to comment out all lines that referred to a test machine. With g, this is straightforward:

:g/cksvm3/ s/^/#/

(There's a whole list of additional things and tricks you can do with g here.)

I've just tested this, and you can stack g and v commands together, so you can comment out all mentions of a machine except for one check with:

:g/cksvm3/ v/staying/ s/^/#/

This works because the commands run by g and v are basically passed the matching line numbers, so the v command is restricted to checking the line(s) that g matched.
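(The inverse works too: to comment out every line that doesn't mention the machine, you'd use ':v/cksvm3/ s/^/#/'.)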

There are probably clever uses of g and v in programming and in writing text, but I expect to mostly use them when editing configuration files, since configuration files are things where lines are often important in isolation instead of as part of a block.

Vim (and vi before it) inherited g and v from ed, where they appear as far back as V7 ed. However, at least vim has expanded them beyond V7 ed, because in V7 ed you can't stack g and v commands (a limitation that was carried forward to 4.x BSD's ed).

(Amusingly, what prompted me about the existence of g and v in Vim was writing my entry on the differences between various versions of ed. Since they were in ed, I was pretty sure they were also in Vim, and then recently I had a use for g and actually remembered it.)

VimGCommandPraise written at 18:59:19

2019-01-07

Daemons and the pragmatics of unexpected error values from system calls

When I wrote about the danger of being overly specific in the errno values you look for years ago, I used the example of a SMTP server daemon that died when it got an unexpected error from accept(). Recently, John Wiersba asked in a comment:

I'm not clear what you're suggesting here. Isn't logging the error code and aborting the right thing to do with unexpected errors? [...]

In practice, there are two situations in Unix programs, especially in daemons. The first situation is where a system call is more or less done once, is not expected to fail at all, and cannot really be fixed if it does fail. Here you generally want to fail out on any error. The second situation is where the system call may fail for transient reasons. One case is certainly accept(), since accept() is trying to return two sorts of errors, but there are plenty of other cases where a system call may fail temporarily and then work later (as dozzie mentioned in comments to yesterday's entry on accept()).

In the second situation, you cannot tell transient errors from persistent ones, not in general, because Unixes add both transient and persistent errno values to system calls over time. In a program run by hand you can often punt; you assume that all errno values you don't specifically recognize mean persistent errors, exit on them, and leave it up to the user to run you again and hope that this time around it will work. In a daemon you don't have this luxury, so the pragmatic question is whether it's more likely that your daemon has hit a new transient errno value or a new persistent one.

My view is that in most environments, the more likely, better, and safer answer for a daemon is that the unrecognized new errno value is a transient error. You already know that transient errors are possible for this system call and you're handling some of them, and you know that over sufficiently large amounts of time your list of transient errno values will be incomplete. Often you don't really expect the system call to ever fail with a persistent error, because your program is not supposed to do things like close the wrong file descriptor. In the unlikely event that you hit an unrecognized persistent error and keep retrying futilely, you'll burn extra CPU and perhaps spam logs. If you exit instead, in the much more likely event that you hit an unrecognized transient error, you'll take down the daemon (as happened for our SMTP server).

(If you do expect a certain amount of persistent errors even in the normal operation of your daemon, you may want a different answer.)
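To make this concrete, here is a minimal sketch in C of an accept() loop that takes this view. The short list of 'the call itself is broken' errno values and the handle_connection() stub are my own illustrative assumptions, not something from the original daemon:

#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>

/* Hypothetical per-connection handler; a real daemon does work here. */
static void handle_connection(int fd)
{
    close(fd);
}

static void accept_loop(int listenfd)
{
    for (;;) {
        int fd = accept(listenfd, NULL, NULL);
        if (fd >= 0) {
            handle_connection(fd);
            continue;
        }
        /* These errnos mean the accept() call itself is broken;
           retrying cannot fix a bad file descriptor, so give up. */
        if (errno == EBADF || errno == ENOTSOCK || errno == EINVAL)
            break;
        /* Everything else, including errno values we have never
           heard of, is assumed to be a transient per-connection
           error, so we just go around the loop again. */
    }
}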

PS: Even for non-daemon programs, 'exit and let the user try again' is not necessarily the best or the most usable answer. As a hypothetical example, if your program first tries to make an IPv6 connection and then falls back to trying an IPv4 one if it gets one of a limited set of errnos, a new or just unexpected 'this IPv6 connection will never work' errno will probably make your program unusable.

(For instance, you might be running on one of the uncommon Linux machines that has IPv6 dual binding turned off, giving you some new errno values you hadn't seen before.)

DaemonsAndUnexpectedErrors written at 21:21:30

2019-01-06

accept(2)'s problem of trying to return two different sorts of errors

A long time ago, I wrote about the dangers of being overly specific in the errno values you looked for, with the specific case being a daemon that exited because an accept() system call got an ECONNRESET that it didn't expect. Recently, John Wiersba left a comment on that entry asking what else the original programmer should have done, given an unexpected error from accept(). In thinking about the issues, I realized that part of the problem is that accept() is actually returning two different sorts of errors and the Unix API doesn't provide it any good way to let people tell the two different sorts apart.

(These days accept() is standardized to return ECONNABORTED instead of ECONNRESET in these circumstances, although this may not be universal.)

The two sorts of errors that accept() is trying to return are errors in the accept() call, such as a bad file descriptor (EBADF, ENOTSOCK) or a bad parameter (EFAULT), and errors in the new connection that accept() may or may not be returning (EAGAIN, ECONNABORTED, etc). One of the differences between the two is that the first sort of errors are probably permanent unless fixed by the program somehow and generally indicate an internal program error, while the second sort of errors will go away if you correctly loop through your accept() sequence again.
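As a sketch of the split, a classification function might look like this in C; the exact lists are illustrative, and as discussed below you can't actually make them complete:

#include <errno.h>

/* Returns 1 for errors in the accept() call itself, 0 for errors
   in the new connection. Both lists are necessarily incomplete. */
static int is_accept_call_error(int err)
{
    switch (err) {
    case EBADF:
    case ENOTSOCK:
    case EFAULT:
    case EINVAL:
        return 1;
    default:
        /* EAGAIN, ECONNABORTED, and so on, plus anything
           unrecognized. */
        return 0;
    }
}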

A sensibly behaving network daemon should definitely not exit when it gets the second sort of error; it should instead just continue on with its processing loop. However, it's perfectly sensible and probably broadly correct to exit if you get the first sort of error, especially if it's an unknown error and you have no idea how to correct it in your code. If someone has closed a file descriptor on you or it's become a non-socket somehow, continuing will generally just get you an un-ending stream of the same error over and over (and burn CPU, and perhaps flood logs). Exiting is a perfectly sensible way out and often really the only thing you can do.

However, you can't reliably distinguish between these two types of errors unless you believe you can know all of the possible errnos for one or the other of them. Given the general habit of Unixes of adding more errno returns for system calls over time, the practical reality is that you can't. This unfortunately leaves authors of Unix network daemons sort of up in the air; they have to pick one way or the other, and either way might give the wrong answer in some circumstances.

(Perhaps accept() should never have returned the second sort of errors, leaving them all to be discovered on a subsequent use of the file descriptor it returned. But that ship sailed a very long time ago; accept() returning these sorts of errors is even in the Single UNIX Specification for accept().)

I suspect that accept() is not the only system call with this sort of split in types of errors (although I can't think of any others off the top of my head). But thankfully I don't think there are too many others, because accept()'s pattern of operation is an unusual one.

PS: The Linux accept() manpage actually has a warning about Linux's behavior here, in the RETURN VALUE section. Linux opts to immediately return a lot of errors detected on the new socket, while other Unixes generally postpone some of them. But note that any Unix can return ECONNABORTED.

AcceptErrnoProblem written at 23:11:46

2018-12-27

The many return values of read() (plus some other cases)

In the Hacker News discussion on my entry on finding a bug in GNU Tar, there grew a sub-thread about the many different cases of read()'s return value. This is a good example of the potential complexity of the Unix API in practice, and to illustrate it I'm going to run down as many of the cases as I can remember. In all cases, we'll start with 'n = read(fd, buf, bufsize)'.

The simplest and most boring case is a full read, where n is bufsize. This is the usual case when reading from files, except at the end of file. However, you can get a full read in various other cases if there is enough buffered input waiting for you. If you get a full read from a TTY while in line buffered mode, the final line in your input buffer may not be newline terminated. In some cases this may even be the only line in your input buffer (if you have a relatively small input buffer and someone stuffed some giant input into it).

A partial read is where n is larger than zero but less than bufsize. There are many causes of a partial read; you may have hit end of file, you may be reading from a TTY in either regular line buffered mode or raw mode, you may be reading from the network and that's all of the network input that's currently available, or the read() may have been interrupted by a signal after it transferred some data into your buffer. There are probably other cases, especially since it's not necessarily standardized what conditions do and don't produce partial reads instead of complete failures.

(The standard says that read() definitely will return a partial read if it's interrupted by a signal after it's already read some data, but who knows if all actual Unixes behave that way for all types of file descriptors.)

If you're reading from a TTY in line buffered mode, a partial read doesn't mean that you have a full line (someone could have typed EOF on a partial line), or that you have only one line in your input buffer, not several (if a lot of input has built up since you last read() from standard input).

A zero read is where n is zero. The most common case is that you are at the end of file; alternately, you may have deliberately performed a zero-sized read() to see if you get any errors but not gotten any. However, implementations are not obliged to return available errors on a zero sized read; the standard merely says that they 'may' do so. On both TTYs and regular files, end of file is not necessarily a persistent condition; one read() may return 0 bytes read and then a subsequent read() can return a non-zero result. End of file is guaranteed to be persistent on some other types of file descriptors, such as TCP sockets and pipes.

What I'll call a signalling error read is where n is -1 and errno is used to signal a number of temporary conditions. These include that the read() was interrupted by a signal, including SIGALRM, where you get EINTR, and that read() was used on a file descriptor marked non-blocking and would have blocked (EAGAIN and EWOULDBLOCK). It's very possible to get signalling error reads in situations where you don't expect them, for instance if someone passed you a file descriptor that is in non-blocking mode (this can happen with TTYs and is often good for both comedy and an immediate exit by many shells). If you don't specifically recognize and handle them, they wind up being treated as ordinary error reads.

An ordinary error read is where n is -1 and errno is telling you about various other error conditions. Some of these error conditions may vanish if you read() again, either at the same offset (in a seekable file) or at a different offset, and some are effectively permanent. It is possible to get ordinary error reads on many sorts of file descriptors, including on TTYs (where you may get EIO under some circumstances). In practice, there is no limit to what errnos may be returned by read() under various circumstances; any attempt to be exhaustive is futile, especially if you want to do so in portable code. Official documentation on possible errno values is no more than a hint.

(Because people need to specifically recognize signalling error read errno values, they are much better documented and much more adhered to. You can be pretty confident of what EAGAIN means if you get it as an errno value on a read(), although whether or not you expected it is another matter.)
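As an illustration of handling these return values, here is a small C wrapper, my own sketch rather than anything standard; it assumes that all you want to do with EINTR is retry:

#include <errno.h>
#include <unistd.h>

/* Retry read() on EINTR; returns bytes read (possibly a partial
   read), 0 for a zero read, or -1 with errno set otherwise. */
static ssize_t read_retry(int fd, void *buf, size_t bufsize)
{
    for (;;) {
        ssize_t n = read(fd, buf, bufsize);
        if (n >= 0)
            return n;   /* a full, partial, or zero read */
        if (errno == EINTR)
            continue;   /* a signalling error read: just retry */
        return -1;      /* EAGAIN/EWOULDBLOCK on a non-blocking fd,
                           or an ordinary error read */
    }
}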

In addition to actual return values from read(), at least two additional things can happen when you perform a read(). The first is that your read() can stall for an arbitrary amount of time, up to and including days, even in situations where you expect it to complete rapidly and it normally does (for example, reading from files instead of the network). Second, in some circumstances trying to read() from a TTY will result in your program being abruptly suspended entirely (in fact, your entire process group). This happens if you're in an environment with shell job control and you aren't the foreground process group.

All of this adds up to a relatively complex API, and a significant amount of it is implicit. There are some issues that are pretty well known, such as partial reads on network input, but there are others that people may not run into outside of unusual situations.

ReadManyReturnValues written at 00:05:53

2018-12-21

Working around an irritating recent XTerm change in behavior

Suppose, not hypothetically, that you have just performed an operating system upgrade that upgraded the version of xterm. On the surface, this version of xterm looks just like the old one; for example, your default xterm looks like this, which is normal and how it used to look.

Ordinary xterm

However, you also have 'root xterms' for your su sessions, with a red foreground to make them stand out as special (cf). When you call one of these up for the first time, it looks like this.

An xterm with a red foreground

That obtrusive, ugly black border doesn't belong there, and it didn't used to be there in older versions of xterm (and this is not a window manager decoration; xterm draws it itself, inside any window manager framing). This border appears because of a change in xterm patch #334:

  • color the inner border using the same borderColor as the outer border, rather than filling with the VT100's default background.

What this innocent sounding change appears to do is colour this inner border in your default foreground colour, whatever that is (black, in my case, since my default xterm colours are black on white), unless you are using entirely default colours. The moment you set any or all of foreground, background, or cursor colour, you get the border drawn in. In xterm patch #334 (which is the version in Fedora 29), the only way to avoid this is to explicitly set the border colour to your background colour, whatever that is for any particular xterm.

(The border colour may be set with either the borderColor X resource or the -bd command line argument. Which one you use depends on how you're changing the other colours; for my sins, I have a mixture, so I get to make changes all over.)
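For example, for a red-on-white root xterm, something like this in your X resources restores the old look under patch #334 (a sketch; the exact resource patterns you need depend on how your xterms are set up):

XTerm*foreground: red
XTerm*background: white
XTerm*borderColor: white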

In xterm patch #335, this behavior was effectively reversed (as far as I can tell), although the patch notes will not tell you that. Instead you need to consult the xterm manpage, which now says that by default xterm is back to filling the inner border with the background color and you can set the colorInnerBorder resource to change that. Of course, the default xterm resources that your Unix ships may be altered to change this; if they are, you will need to change things back to restore the old default behavior.

The good news is that patch #335 came out only a few days after patch #334. The bad news is that xterm #334 made it into at least one Unix (specifically, Fedora 29). Hopefully your Unix will skip past #334, and also hopefully it won't pick a version of xterm that has other surprises.

As a look at the xterm changelog will show you, xterm undergoes a lot more change than you might expect in such an old and venerable program (certainly more change than I expect, at least). Some of these changes are fixes for bugs and issues, and some of them are useful improvements, but every so often there are things like this change that make me less than happy.

(There was the xterm patch #301 change to $SHELL handling in 2015, for example, which was later recognized as flawed and augmented with the validShells X resource in xterm patch #334. Yes, the same patch that added this new behavior. The irony is thick.)

PS: It is worth looking at the XTerm default X resources file that your Unix provides along with its xterm, because historically various packagers and distributors have been unable to keep their hands off it. The upstream XTerm default resources have been quite stable over the years, a stability that's appreciated by long-term xterm users like me; distribution ones, not so much.

XTermBorderIssue written at 00:25:58

2018-12-13

Some new-to-me features in POSIX (or Single Unix Specification) Bourne shells

As I mentioned when I found Bourne shell arithmetic to be pretty pleasant, I haven't really paid attention to what things are now standard POSIX Bourne shell features. In fact, it's more than that; I don't think I've really appreciated that POSIX and then the Single Unix Specification actually added very much to the venerable Bourne shell. I knew that shell functions were standard, and then there was POSIX command substitution, but then I sort of stopped. In light of discovering that shell arithmetic is now POSIX standard, this view is clearly out of date, so I've decided to actually skim through the POSIX/SUS shell specification and see what new things I want to remember to use in the future. In the process I found an addition that surprises me.

First (and perhaps obviously), the various character classes such as [:alnum:] are officially supported in shell wildcard expansion and matching. I'm not a fan of writing, say, '[[:upper:]]' instead of '[A-Z]', but the latter has some dangerous traps in some shells, including shells that are commonly found as /bin/sh in some environments.

The big new feature that I should probably plan to make use of is the various prefix and suffix pattern substitutions, such as '${var%%word}'. To a fair extent these let you do in the shell things that you previously had to turn to programs like basename and dirname for. For instance, in a recent script I wanted the bare program name without any full path, so I used:

prog="${0##*/}"

This feels one part clever and half a part perhaps too clever, but I hope it's an idiom. Another use of this is to perform inline pattern matching in an if statement, for example to check if a parameter is a decimal number:

if [ -z "${OPTARG##*[!0-9]*}" ]; then
  echo "$prog: argument not a number" 1>&2
  exit 1
fi

I previously would have turned to case statements for this, which is more awkward. Again, hopefully this is not too clever.

(I learned this trick from Stackoverflow answers, perhaps this one or this one.)

The Single Unix Specification actually has some useful and interesting examples for the prefix and suffix pattern substitutions, along with some of the other substitutions.

Next, as pointed out in a comment back in 2011 here, POSIX arithmetic supports hex numbers with a leading 0x, which means that it can be used as a quick hex to decimal converter in addition to hex math calculations. I don't know if there's any way to do decimal to hex output with builtins alone; I suspect that the best way is with printf. The arithmetic operators available are actually pretty extensive, including 'a ? b : c' for straightforward conditionals.
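For instance (my examples, not from the specification):

echo $((0x7f))        # hex to decimal: prints 127
printf '%x\n' 127     # decimal to hex via printf: prints 7f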

Unfortunately, while POSIX sh has string length (with '${#var}'), it doesn't seem to have either a way to count the number of $IFS separated words in a variable or a way to trim off an arbitrary number of leading or trailing spaces from one. You can get both through brute force with simple shell functions, but I'm probably better off avoiding situations where I need either.
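For example, a brute force word counting function can exploit the fact that an unquoted expansion is split on $IFS (a sketch; the name is made up, and a stray wildcard in the input will get glob-expanded):

nwords() {
    set -- $1
    echo $#
}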

The one feature in POSIX sh that genuinely surprises me is tilde username expansion. I knew this was popular for interactive use in shells but I would have expected POSIX to not care and to primarily focus on shell scripts, where I at least have the impression it's not very common. But there it is, and the description doesn't restrict it to interactive sessions either; you can use '~<someuser>' in your shell scripts if you want to. I probably won't, though, especially since we have our own local solution for this.

(The version of the Single Unix Specification that I'm looking at here is the 2017 version, which will no doubt go out of date just as the 2008, 2013, and 2016 versions did.)

PosixShellNewFeatures written at 00:12:16

2018-11-21

What I really miss when I don't have X across the network

For reasons beyond the scope of this entry, I spent a couple of days last week working from home. One of the big differences when I do this is that I don't have remote X; instead I wind up doing everything over SSH. At a nominal level the experience is much the same, partly because I've deliberately arranged it that way; using sshterm to start an SSH session to a host is very similar to using rxterm to start an xterm on it, for example. But at a deeper level there are two things I wound up really missing.

The obvious thing I missed was exmh, which is the core of how I efficiently deal with email at work. Exmh is text based so it works well within the limitations of modern X network transparency; at work I run it on one of our login servers, with direct access to my email, and it displays on my desktop. In theory the modern replacement for exmh and this style of working would be a local IMAP mail client, if I could find a Linux one that I liked.

(I mean, apart from the whole thing where I'm extremely attached to (N)MH and don't want to move to IMAP any sooner than I have to. An alternate approach would be to find and set up some good text-mode MH visual client, probably GNU Emacs' MH-E, which I used to use years ago.)

But the surprising, subtle thing that I wound up missing was the ability to open up a new xterm on the remote machine from within my current session. While starting an xterm this way obviously skips logging in, the really great advantage of doing this is that the new xterm completely inherits my current context, both my current directory and my current privileges (if I'm su'd to root, for example, which is when this is especially handy). It is in a way the Unix shell session equivalent of a browser's 'Open in New Tab/Window', and it's useful for much the same reasons; it gives you an additional view on what you're currently doing or about to do.
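(Concretely, this is nothing more than running 'xterm &' from the shell in my current xterm; with remote X, the new xterm runs on the remote machine but displays on my desktop, and it naturally inherits the shell's current directory and UID.)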

There is no good replacement for this that I can see outside of remote X or something very similar to it. You can't get it with job control and you can't really get it with screen or tmux, and a remote windowing protocol that deals with entire desktops instead of individual windows would create a completely different environment in general. It makes me sad that in the brave future world of Wayland there still doesn't seem to be much prospect of remote windows.

(This entry is sort of prompted by reading The X Network Transparency Myth.)

PS: If you want, you can consider this the flipside of my entry X's network transparency has wound up mostly being a failure. X's network transparency is not anywhere near complete, but within the domain of mostly text-focused programs running over 1G LANs it can still deliver very nice benefits. I take advantage of them every day that I'm at work, and miss them when I'm not.

RemoteXWhatIMiss written at 00:14:25

2018-11-10

Character by character TTY input in Unix, then and now

In Unix, normally doing a read() from a terminal returns full lines, with the kernel taking care of things like people erasing characters and words (and typing control-D); if you run 'cat' by itself, for example, you get this line at a time input mode. However Unix has an additional input mode, raw mode, where you read() every character as it's typed (or at least as it becomes available to the kernel). Programs that support readline-style line editing operate in this mode, such as shells like Bash and zsh, as do editors like vi (and emacs if it's in non-windowed mode).

(Not all things that you might think operate in raw mode actually do; for example, passwd and sudo don't use raw mode when you enter your password, they just turn off echoing characters back to you.)
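For the record, a modern program switches into raw mode through the termios interface. A minimal sketch, with error handling omitted; note that cfmakeraw() is a widespread BSD/GNU extension rather than strict POSIX:

#include <termios.h>

/* Save the current terminal settings in *saved and switch fd into
   raw mode, so that read() returns characters as they arrive. */
static void enter_raw_mode(int fd, struct termios *saved)
{
    struct termios t;

    tcgetattr(fd, saved);
    t = *saved;
    cfmakeraw(&t);              /* no echo, no line editing, no ^C */
    tcsetattr(fd, TCSANOW, &t);
}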

Unix has pretty much always had these two terminal input modes (kernel support for both goes back to at least Research Unix V4, which seems to be the oldest one that we have good kernel source for through tuhs.org). However, over time the impact on the system of using raw mode has changed significantly, and not just because CPUs have gotten faster. In practice, modern cooked (line at a time) terminal input is much closer to raw mode than it was in the days of V7, because over time we've moved from an environment where input came from real terminals over serial lines to one where input takes much more complicated and expensive paths into Unix.

In the early days of Unix, what you had was real, physical terminals (sometimes hardcopy ones, such as in famous photos of Bell Labs people working on Unix in machine rooms, and sometimes 'glass ttys' with CRT displays). These terminals were connected to Unix machines by serial lines. In cooked, line at a time mode, what happened when you hit a character on the terminal was that the character was sent over the serial line, the serial port hardware on the Unix machine read the character and raised an interrupt, and the low level Unix interrupt handler read the character from the hardware, perhaps echoed it back out, and immediately handled a few special characters like ^C and CR (which made it wake up the rest of the kernel) and perhaps the basic line editing characters. When you finally typed CR, the interrupt handler would wake up the kernel side of your process, which was waiting in the tty read() handler. This higher level would eventually get scheduled, process the input buffer to assemble the actual line, copy it to your user-space memory, and return from the read() to user space, at which point your program would actually wake up to handle the new line it got.

(Versions of Research Unix through V7 actually didn't really handle your erase or line-kill characters at interrupt level. Instead they pushed everything into a 'raw buffer', and only once a CR was typed was this buffer canonicalized by applying the effects of the erase and line-kill characters to determine the final line that was returned to user level.)

The important thing here is that in line at a time tty input in V7, the only code that had to run for each character was the low level kernel interrupt handler, and it deliberately did very little work. However, if you turned on raw mode all of this changed and suddenly you had to run a lot more code. In raw mode, the interrupt handler had to wake the higher level kernel at each character, and the higher level kernel had to return to user level, and your user level code had to run. On the comparatively small and slow machines that early Unixes ran on, going all the way to user-level code for every character would have been and probably was a visible performance hit, especially if a bunch of people were doing it at the same time.

Things started changing in BSD Unix with the introduction of pseudo-ttys (ptys). BSD Unix needed ptys in order to support network logins over Telnet and rlogin, but network logins and ptys fundamentally change what the character input path looks like in practice. Programs reading from ptys still ran basically the same sort of code in the kernel as before, with a distinction between low level character processing and high level line processing, but now getting characters to the pty wasn't just a matter of a hardware interrupt. For a telnet or rlogin login, the path looked something like this:

  • the network card gets a packet and raises an interrupt.
  • the kernel interrupt handler reads the packet and passes it to the kernel's TCP state machine, which may not run entirely at interrupt level and is in any case a bunch of code.
  • the TCP state machine eventually hands the packet data to the user-level telnet or rlogin daemon, which must wake up to handle it.
  • the woken-up telnetd or rlogind injects the new character into the master side of the pty with a write() system call, which percolates down through various levels of kernel code.

In other words, with logins over the network, a bunch of code, including user-level code, had to run for every character even for line at a time input.

In this new world, having the shell or program that's reading input from the pty operate in line at a time mode remained somewhat more efficient than raw mode but it wasn't anywhere near the difference in the amount of code that it was (and is) for terminals connected over serial lines. You weren't moving from no user level wakeups to one; you were moving from one to two, and the additional wakeup was on a relatively simple code path (compared to TCP packet and state handling).

(It's a good thing Vaxes were more powerful than PDP-11s; they needed to be.)

Things in Unix have only gotten worse for the character input path since then. Modern input over the network is through SSH, which requires user-level decryption and de-multiplexing before you end up with characters that can be written to the master side of the pseudo-tty; the network input may also involve kernel level firewall checks or even another level of decryption from a VPN (either at kernel level or at user level, depending on the VPN technology). Windowing systems such as X or Wayland add at least two processes to the stack, as generally the window server has to read and process the keystroke and then pass it to the terminal window process (as a generalized event). Sometimes there are more processes, and keyboard event handling is complicated in general (which means that there's a lot of code that has to run).

I won't say that character at a time input has no extra overhead in Unix today, because that's not quite true. What is true is that the extra overhead it adds is now only a small percentage of the total cost (in time and CPU instructions) of getting a typed character from the keyboard to the program. And since readline-style line editing and other features that require character at a time input add real value, they've become more and more common as the relative expense of providing them has fallen, to the point where it's now a bit unusual to find a program that doesn't have readline editing.

The mirror image of this modern state is that back in the old days, avoiding raw mode as much as possible mattered a lot (to the point where it seems that almost nothing in V7 actually uses its raw mode). This persisted even into the days of 4.x BSD on Vaxes, if you wanted to support a lot of people connected to them (especially over serial terminals, which people used for a surprisingly long time). This very likely had a real influence on what sort of programs people developed for early Unix, especially Research Unix on PDP-11s.

PS: In V7, the only uses of RAW mode I could spot were in some UUCP and modem related programs, like the V7 version of cu.

PPS: Even when directly connected serial terminals started going out of style for Unix systems, with sysadmins and other local users switching to workstations, people often still cared about dial-in serial connections over modems. And generally people liked to put all of the dial-in users on one machine, rather than try to somehow distribute them over a bunch.

RawTtyInputThenAndNow written at 19:20:34

2018-10-20

The original Unix ed(1) didn't load files being edited into memory

These days almost all editors work by loading the entire file (or files) that you're editing into memory, either into a flat buffer or two or into some more sophisticated data structure, and then working on them there. This approach to editing files is simple, straightforward, and fast, but it has an obvious limitation; the file you want to edit has to fit into memory. These days this is generally not much of an issue.

V7 was routinely used on what are relatively small machines by modern standards, and those machines were shared by a fairly large number of people, so system memory was a limited resource. Earlier versions of Research Unix had to run on even smaller machines, too. On one of those machines, loading the entire file you wanted to edit into memory was somewhere between profligate and impossible, depending on the size of the file and the machine you were working on. As a result, the V7 ed does not edit files in memory.

The V7 ed manpage says this explicitly, although it's tempting to regard this as hand waving. Here's the quote:

Ed operates on a copy of any file it is editing; [...]. The copy of the text being edited resides in a temporary file called the buffer.

The manual page is being completely literal. If you started up V7 in a PDP-11 emulator and edited a file with ed, you would find a magic file in /tmp (called /tmp/e<something>, the name being created by the V7 mktemp()). That file is the buffer file, and you will find much or all of the file you're editing in it (in some internal format that seems to have null bytes sprinkled around through it).

(V6 is substantially the same, so you can explore this interactively here. I was surprised to discover that V6 doesn't seem to have sed.)

I've poked through the V7 ed.c to see if I could figure out what ed is doing here, but I admit the complete lack of comments has mostly defeated me. What I think it's doing is only allocating and storing some kind of index to where every line is located, then moving line text in and out of a few small 512-byte buffers as you work on them. As you add text or move things around, I believe that ed writes new copies of the line(s) you've touched to new locations in the buffer file, rather than try to overwrite the old versions in place. The buffer file has a limit of 256 512-byte blocks, so if you do enough editing of a large enough file I believe you can run into problems there.

(This agrees with the manpage's section on size limitations, where it says that ed has a 128 KB limit on the temporary file and the limit on the number of lines you can have is the amount of core, with each line taking up one PDP-11 'word' (in the code this is an int).)

Exploring the code also led me to discover how ed handled errors internally, which is by using longjmp() to jump back to main() and re-enter the main command loop from there. This is really sort of what I should have expected from a V7 program; it's straight, minimal, and it works. Perhaps it's not how we'd do that today, but V7 was a different and smaller place.

PS: If you're reading the V7 ed source and want to see where this is, I think it runs through getline(), putline(), getblock(), and blkio(). I believe that the tline variable is the offset that the next line will be written to by putline(), and it gets stored in the dynamically allocated line buffer array that is pointed to by zero. The more I look at it, the more the whole thing seems pretty clever in an odd way.

(My looking into this was sparked by Leah Neukirchen's comment on my entry on why you should be willing to believe that ed is a good editor. Note that even if you don't hold files in memory, editing multiple files at once requires more memory. In ed's scheme, you would need to have multiple line-mapping arrays, one for each file, and probably you'd want multiple buffer files and some structures to keep track of them. You might also be more inclined to do more editing operations in a single session and so be more likely to run into the size limit of a buffer file, which I assume exists for a good reason.)

EdFileNotInMemory written at 01:17:05
