Wandering Thoughts archives

2012-06-25

Why sophisticated line editing is not in the kernel

At the end of my previous entry on TTY line disciplines I wrote:

Given all of this, you might wonder why sophisticated readline style editing took off as user level code (both in programs and as various libraries) instead of being integrated into the kernel. [...]

Well, let us start out with what we could call the pragmatic reason but should more honestly call the political one: it is a lot easier to add something to your user-level program than to get it included in any Unix kernel. Kernel developers are usually very conservative about adding features; getting them to accept sophisticated and thus complex line editing in the kernel would have been an uphill struggle (one involving a lot of arguing) even with working code. Then if you got this added to the new version of some Unix's kernel, it might take years more for it to filter out to systems in the real world, to be reluctantly adopted by other Unix variants that you want your program to work well on, and so on.

(And just imagine the ferocious arguments over whether the kernel should support vi style editing, Emacs style editing, or both.)

By contrast you can add readline line editing to your own program as a small matter of programming and you don't have to argue with anyone about it. You can ship immediately and it works on all Unixes that you support. Apart from any other issues, it should be unsurprising that people opted to implement sophisticated line editing themselves at the user-level.

But there's a technical reason too, and that's filename completion (and more sophisticated completion as well). A not insignificant amount of filename completion needs to know things like someone's home directory or the current value of an environment variable (or a shell variable), so that it can complete things like '~/...', '~user/...', '$THING/...', and so on. All of these are more or less inaccessible inside the kernel, and so completion using them can't be supported in a kernel implementation of sophisticated line editing. And this goes even more so for more sophisticated completion such as completing command line arguments, hostnames, and so on, where the information needed is pretty much totally beyond the kernel (at least in practice).
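To make the dependency concrete, here is a minimal sketch of user-level filename completion in Python. The function name complete_path is my own invention for illustration, not something from readline or any shell. The point is that the very first step needs the process's environment and the password database, which is exactly the information the kernel doesn't have.

```python
import glob
import os


def complete_path(prefix):
    """Return candidate completions for a partially typed path.

    Hypothetical helper, not a real readline or shell API.
    """
    # '~' and '~user' need $HOME or the password database; '$THING'
    # needs this process's environment. Both live in user space.
    expanded = os.path.expandvars(os.path.expanduser(prefix))
    # Escape glob metacharacters in what the user typed, then match
    # everything that starts with it.
    return sorted(glob.glob(glob.escape(expanded) + "*"))
```

Something like complete_path('$THING/al') only gives a sensible answer because the code runs inside the process that knows what $THING currently is.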

(Even something as apparently simple as completing program names is subject to this twice over. First, what programs are available depends on what $PATH is. Second, in something like gdb you don't complete program names; you complete the names of internal commands, and what commands are available is not something that you know from outside the program.)
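Both halves of this can be sketched in a few lines of Python. The names programs_on_path and CommandCompleter are hypothetical, made up for this illustration; the first depends on the caller's $PATH and the second on a command table that only the program itself has.

```python
import os


def programs_on_path(prefix, path=None):
    """List executable names starting with prefix in the given $PATH."""
    if path is None:
        path = os.environ.get("PATH", "")
    found = set()
    for directory in path.split(os.pathsep):
        directory = directory or "."  # an empty entry means the current directory
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # skip missing or unreadable $PATH entries
        for name in names:
            full = os.path.join(directory, name)
            if (name.startswith(prefix) and os.path.isfile(full)
                    and os.access(full, os.X_OK)):
                found.add(name)
    return sorted(found)


class CommandCompleter:
    """Complete against a table of internal commands (gdb style)."""

    def __init__(self, commands):
        self.commands = sorted(commands)

    def complete(self, text, state):
        # Same calling convention as a Python readline completer:
        # return the state-th match, or None when there are no more.
        matches = [c for c in self.commands if c.startswith(text)]
        return matches[state] if state < len(matches) else None
```

The complete method is shaped so that it could be handed to Python's readline.set_completer(), but nothing here requires readline to be present.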

So if advanced line editing was supplied by the kernel instead of user-level programs, it wouldn't be anywhere near as sophisticated as it is today; at most it could do relatively simple filename completion.

WhyNotKernelLineEditing written at 02:18:56

2012-06-04

Why the TTY line discipline exists in the kernel

Via Hacker News, I was just reading The TTY demystified, which is a decent introduction to Unix TTYs and a number of their peculiarities. One of the things that it talks about in passing is the line discipline; to summarize, the line discipline is the part of the kernel tty driver that turns tty input into something that you get a line at a time, and that handles things like letting you edit the line as you type (by handling, eg, backspace) and also generates various signals when you type special characters like ^C.

(To go back to a past entry, the line discipline is the part of the tty driver that handles typing a ^D.)
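You can watch the line discipline do this editing without a real terminal by using a pty. The following is a sketch that I would expect to work on Linux and similar systems; the function name kernel_edited_line is made up, and it explicitly turns on canonical mode and sets the erase character rather than trusting whatever the pty defaults are.

```python
import os
import pty
import termios


def kernel_edited_line(keystrokes):
    """Feed raw keystrokes into a pty and return what a tty reader gets.

    keystrokes must end with a newline, or the read will block forever.
    """
    master, slave = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave)
        attrs[3] |= termios.ICANON          # lflag: line-at-a-time input
        attrs[6][termios.VERASE] = b"\x7f"  # cc: DEL erases one character
        termios.tcsetattr(slave, termios.TCSANOW, attrs)
        os.write(master, keystrokes)        # the "typing" side
        return os.read(slave, 1024)         # what the program sees
    finally:
        os.close(master)
        os.close(slave)
```

The reading side never sees the mistyped character or the DEL; the kernel has already applied the edit by the time read() returns a complete line.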

In an environment where it seems that everyone gets sophisticated line editing via readline or similar libraries, you might wonder why the line discipline exists at all. There are two pragmatic reasons (which are partly historical) and a convincing general one for this, and the upshot is that having line editing in the kernel is the right approach for Unix.

(I'm going to assume that the kernel tty driver would still handle generating signals when you hit special keys like ^C and ^\; we're only concerned about the complex dance of line editing.)

First, Unix started on what were relatively small and constrained machines. The line discipline does input processing in the kernel; in fact, historically line disciplines have done a significant amount of processing during the actual serial interrupt handling. This is significantly more CPU-efficient than having to wake up a user level process on every keystroke, and it's also more responsive (much more responsive when the line discipline does work at interrupt time). Pushing basic line handling into the kernel kept basic serial input reasonably responsive on old Unix machines even when they were under significant load.

(It didn't help once you were logging in over the network or using anything that requires ptys, because then user level code is intrinsically involved. But for a long time early Unix machines were usually used with non-networked serial terminals. Today basically everything uses ptys, so this reason is less convincing.)

Second, Unix spent a long time without shared libraries. This means that without line disciplines, every program that wanted to have any sort of convenient line editing would have had to duplicate the code for it, using up extra memory in addition to the extra CPU cycles. You might also get inconsistencies in what line editing various programs supported (although this could be reduced by having a common library for it that everyone used). An in-kernel line discipline is effectively factoring out the common code that everyone would have had to include anyways.

Finally, the general reason is that giving programs the responsibility for line editing complicates many programs that only read from ttys some of the time (or only read from ttys rarely). Consider cat, for example; often it's not reading from a tty but every so often people use it that way. But because it's sometimes reading from a tty, cat would not only have to have code to handle line editing, it would need to carefully use that code only if standard input was a tty. This sort of conditional checking would have to be spread over basically any program that read from standard input, because any of them can be used on a tty sometime. Inevitably some programs would decide that reading from a tty was so rare that they weren't going to implement line editing and this check, and some other programs would decide that reading from a tty was so common that they weren't going to handle the case where standard input wasn't one. What you would get is an inconsistent mess. Centralizing all of this in the kernel makes everything consistent and creates an important simplification in Unix IO; at least in theory, ordinary programs don't have to care whether they're doing IO with a terminal or with a pipe or file.
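As a rough sketch of the burden this would put on every program, here is what a cat-like input loop would have to look like in Python if line editing were the program's job. Both read_lines and the edit_line callback are hypothetical stand-ins; edit_line represents whatever user-level line editor the program would have had to carry around.

```python
def read_lines(stream, edit_line):
    """Yield input lines, doing our own line editing only on a terminal.

    edit_line is a hypothetical callback that reads one edited line
    from the terminal and returns None at end of input.
    """
    if stream.isatty():
        # Terminal: the program itself must handle backspace and friends.
        while True:
            line = edit_line(stream)
            if line is None:
                break
            yield line
    else:
        # Pipe or file: just read the data as it comes.
        for line in stream:
            yield line
```

With the line discipline in the kernel, the whole function collapses to the plain loop in the else branch, and that loop works unchanged on a terminal too.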

Given all of this, you might wonder why sophisticated readline style editing took off as user level code (both in programs and as various libraries) instead of being integrated into the kernel. I think that there are several reasons for it but covering them will take another entry.

TTYLineDisciplineWhy written at 23:00:32
