How Unix erases things when you type a backspace while entering text

January 29, 2017

Yesterday I mentioned in passing that printing a DEL character doesn't actually erase anything. This raises an interesting question, because when you're typing something into a Unix system and hit your backspace key, Unix sure erases the last character that you entered. So how is it doing that?

The answer turns out to be basically what you'd expect, although the actual implementation rapidly gets complex. When you hit backspace, the kernel tty line discipline rubs out your previous character by printing (in the simple case) Ctrl-H, a space, and then another Ctrl-H.

(In Unix kernel source code you'll generally see this written not as the raw byte value but as the C escape sequence for Ctrl-H, \b. Printing Ctrl-H is not the only use for \b, but it's certainly one reason that Ctrl-H is one of the few control characters that has a C escape sequence at all.)
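To make the simple case concrete, here is a minimal Python sketch. Nothing in it is kernel code; `rubout_one` and `render` are hypothetical names, and `render` is a toy one-line terminal that only understands \b and ordinary printing characters. It shows why a lone Ctrl-H moves the cursor without erasing anything, while Ctrl-H, space, Ctrl-H blanks the cell.

```python
BS = "\b"  # Ctrl-H, ASCII 0x08


def rubout_one():
    """Return what gets echoed to erase one single-width character."""
    return BS + " " + BS


def render(output):
    """Toy one-line terminal (an assumption, not a real emulator).

    A backspace only moves the cursor left; any other character
    overwrites the cell under the cursor and moves the cursor right.
    Returns (screen contents, cursor column).
    """
    cells = []
    col = 0
    for ch in output:
        if ch == BS:
            col = max(0, col - 1)
        else:
            if col < len(cells):
                cells[col] = ch
            else:
                cells.append(ch)
            col += 1
    return "".join(cells), col


if __name__ == "__main__":
    print(render("abc\b"))                # the 'c' is still on screen
    print(render("abc" + rubout_one()))   # the 'c' is now a blank
```

In the second call the cursor ends up in column 2, sitting on the blank that used to be the 'c', which is where the next typed character should go.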

Of course just backing up one character is not always the correct way of erasing input, and that's when it gets complicated for the kernel. To start with we have tabs, because when you (the user) backspace over a tab you want the cursor to jump all the way back, not just move back one space. The kernel has a certain amount of code to work out what column it thinks you're on and then back up an appropriate number of spaces with Ctrl-Hs.

(By the way, the kernel assumes that tabstops are every 8 characters. I'm not sure any Unix version lets you change this with stty or the equivalent.)
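The column bookkeeping for tabs can be sketched roughly like this in Python. This is an illustration and not any kernel's actual code; `column_after` and `erase_trailing_tab` are hypothetical names, and the sketch assumes fixed tab stops every 8 columns and that every other character is one column wide.

```python
TAB_WIDTH = 8  # assumed fixed tab stops, as described above


def column_after(text, start_col=0):
    """Return the cursor column after echoing text, starting at start_col."""
    col = start_col
    for ch in text:
        if ch == "\t":
            # A tab advances to the next multiple of TAB_WIDTH.
            col += TAB_WIDTH - (col % TAB_WIDTH)
        else:
            # Simplification: everything else occupies exactly one column.
            col += 1
    return col


def erase_trailing_tab(line, start_col=0):
    """Return the backspaces needed to undo a tab at the end of line."""
    if not line.endswith("\t"):
        raise ValueError("line does not end with a tab")
    before = column_after(line[:-1], start_col)
    after = column_after(line, start_col)
    # Only backspaces are needed: the tab moved the cursor but printed nothing.
    return "\b" * (after - before)


if __name__ == "__main__":
    print(len(erase_trailing_tab("abc\t")))       # columns 3 to 8
    print(len(erase_trailing_tab("1234567\t")))   # columns 7 to 8
```

The `start_col` parameter is there because the line may not have started in column 0, for example if a prompt was printed first. The same tab can need anywhere from one to eight backspaces depending on where it started.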

Then we have the case where you quoted a control character while entering it, e.g. by typing Ctrl-V Ctrl-H; this causes the kernel to print the Ctrl-H instead of acting on it, and it prints it as the two-character sequence ^H. When you hit backspace to erase that, of course you want both (printed) characters to be rubbed out, not just the 'H'. So the kernel needs to keep track of that and rub out two characters instead of just one.
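A rough Python sketch of the idea, assuming the usual caret notation where a control character is shown as '^' followed by the character 0x40 higher (and DEL as ^?). `echoed_form` and `rubout_for` are hypothetical names, and tabs are deliberately left out because they need the column tracking described earlier.

```python
def echoed_form(ch):
    """Return how a typed character appears on screen in caret notation.

    Simplification: tab is treated like any other control character
    here, which is not how a real tty echoes it.
    """
    code = ord(ch)
    if code < 0x20:
        return "^" + chr(code + 0x40)   # Ctrl-H (0x08) becomes "^H"
    if code == 0x7F:
        return "^?"                     # DEL
    return ch


def rubout_for(ch):
    """Return the echo needed to erase ch, one '\\b \\b' per printed cell."""
    return "\b \b" * len(echoed_form(ch))


if __name__ == "__main__":
    print(repr(echoed_form("\x08")))   # a quoted Ctrl-H
    print(repr(rubout_for("\x08")))    # two cells to rub out
    print(repr(rubout_for("a")))       # one cell to rub out
```

The point is that the amount to erase depends on what was printed for the character, not on the one byte that is stored in the input buffer.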

A final complication for some kernels is multibyte characters with a display width bigger than one (yes, really, some kernels try to handle this). These kernels get to go through interesting gyrations; you can see an example in Illumos's ldterm.c in the ldterm_csi_erase function.
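To give a feel for the problem (and only that; this is not how Illumos or any other kernel does it), here is a small Python sketch that uses the standard `unicodedata` module to guess a character's display width and then rubs out that many cells. `display_width` and `rubout_wide` are hypothetical names, and treating only East Asian wide and fullwidth characters as two columns is an assumption that ignores combining characters and other zero-width cases.

```python
import unicodedata


def display_width(ch):
    """Guess how many terminal columns ch occupies (1 or 2).

    Assumption: East Asian 'W' (wide) and 'F' (fullwidth) characters
    take two columns and everything else takes one.
    """
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def rubout_wide(ch):
    """Return an echo that blanks every column ch occupied."""
    return "\b \b" * display_width(ch)


if __name__ == "__main__":
    for ch in ("a", "\u3042"):   # 'a' and HIRAGANA LETTER A
        print(repr(ch), display_width(ch), repr(rubout_wide(ch)))
```

Even this toy version has to know something about Unicode, and a real implementation also has to work out where one multibyte character ends and the previous one begins in the input buffer.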

(FreeBSD also handles backspacing a space specially, because you don't need to actually rub that out with a '\b \b' sequence; you can just print a plain \b. Other kernels don't seem to bother with this optimization. The FreeBSD code for this is in sys/kern/tty_ttydisc.c in the ttydisc_rubchar function.)
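The optimization is easy to sketch in Python. This is an illustration of the idea and not a translation of ttydisc_rubchar; `erase_last` is a hypothetical name, and the sketch assumes every character is a single-width printing character.

```python
def erase_last(line):
    """Drop the last character of line and return (new line, echo).

    If the last character was a space, the cell on screen is already
    blank, so moving the cursor back with one '\\b' is enough.
    """
    if not line:
        return line, ""          # nothing to erase, nothing to echo
    last = line[-1]
    echo = "\b" if last == " " else "\b \b"
    return line[:-1], echo


if __name__ == "__main__":
    print([repr(x) for x in erase_last("ls ")])   # trailing space
    print([repr(x) for x in erase_last("ls")])    # ordinary character
```

This saves two bytes of output per erased space, which presumably mattered more on slow serial lines than it does today.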

PS: If you want to see the kernel's handling of backspace in action, you usually can't test it at your shell prompt, because you're almost certainly using a shell that supports command line editing and readline and so on. Command line editing requires taking over input processing from the kernel, and so such shells are handling everything themselves. My usual way to see what the kernel is doing is to run 'cat >/dev/null' and then type away.

Written on 29 January 2017.

Last modified: Sun Jan 29 02:13:44 2017