How Unix erases things when you type a backspace while entering text
Yesterday I mentioned in passing that printing a DEL character doesn't actually erase anything. This raises an interesting question, because when you're typing something into a Unix system and hit your backspace key, Unix sure erases the last character that you entered. So how is it doing that?
The answer turns out to be basically what you'd expect, although the actual implementation rapidly gets complex. When you hit backspace, the kernel tty line discipline rubs out your previous character by printing (in the simple case) Ctrl-H, a space, and then another Ctrl-H.
(In Unix kernel source code you'll generally see this written not as the
raw byte value but as the C escape sequence for Ctrl-H, \b. Printing
Ctrl-H is not the only use for \b, but it's certainly one reason it's
one of the few control characters with its own escape sequence (cf).)
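To make the simple case concrete, here is a toy Python model of a dumb one-line terminal. It is only a sketch: the render function and its behavior are my own illustration, not anything taken from a kernel, and it assumes the terminal treats \b as "move the cursor left one column" and overwrites cells for everything else.

```python
def render(stream: str) -> tuple[str, int]:
    """Toy one-line terminal: returns (visible cells, cursor column)."""
    cells: list[str] = []
    col = 0
    for ch in stream:
        if ch == "\b":
            # Backspace only moves the cursor; it erases nothing by itself.
            col = max(col - 1, 0)
        else:
            # Any other character overwrites the cell under the cursor.
            if col == len(cells):
                cells.append(ch)
            else:
                cells[col] = ch
            col += 1
    return "".join(cells), col


# A lone Ctrl-H leaves the 'c' on screen; the full rubout blanks it.
only_backspace = render("abc\b")
full_rubout = render("abc\b \b")
```

With a lone \b the 'c' is still visible and the cursor merely sits on top of it. The space is what overwrites the character, and the second \b puts the cursor back where the next typed character should go.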
Of course just backing up one character is not always the correct way of erasing input, and that's when it gets complicated for the kernel. To start with we have tabs, because when you (the user) backspace over a tab you want the cursor to jump all the way back, not just move back one space. The kernel has a certain amount of code to work out what column it thinks you're on and then back up an appropriate number of spaces with Ctrl-Hs.
(By the way, the kernel assumes that tabstops are every 8 characters.
I'm not sure any Unix version lets you change this with stty or the
equivalent.)
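The column bookkeeping for tabs can be sketched roughly as follows. This is a simplified Python illustration, not any kernel's actual code: the function names are hypothetical, the start_col parameter (for a prompt already on the line) is my own addition, and it ignores control characters and wide characters in the line.

```python
TABSTOP = 8  # the fixed tab stop interval the kernel assumes


def column_after(text: str, start_col: int = 0) -> int:
    """Column the cursor ends up in after echoing text from start_col."""
    col = start_col
    for ch in text:
        if ch == "\t":
            # A tab advances to the next multiple of TABSTOP.
            col += TABSTOP - (col % TABSTOP)
        else:
            col += 1
    return col


def erase_tab(line_before_tab: str, start_col: int = 0) -> str:
    """Ctrl-Hs needed to back over a tab typed after line_before_tab."""
    before = column_after(line_before_tab, start_col)
    after = column_after("\t", before)
    return "\b" * (after - before)
```

A tab typed after "abc" moved the cursor from column 3 to column 8, so erasing it takes five Ctrl-Hs. The same tab typed after seven characters takes only one.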
Then we have the case when you quoted a control character while
entering it, eg by typing Ctrl-V Ctrl-H; this causes the kernel to
print the Ctrl-H instead of acting on it, and it prints it as the
two character sequence ^H. When you hit backspace to erase that,
of course you want both (printed) characters to be rubbed out, not
just the 'H'. So the kernel needs to keep track of that and rub out
two characters instead of just one.
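A minimal sketch of that bookkeeping, in Python rather than kernel C. The function names are made up for illustration, and it assumes the usual caret notation for echoed control characters (tabs are a separate case and are not handled here).

```python
def echoed_form(ch: str) -> str:
    """What gets printed when a typed character is echoed."""
    code = ord(ch)
    if code < 0x20:
        # Control characters echo in caret notation: Ctrl-H becomes ^H.
        return "^" + chr(code + 0x40)
    if code == 0x7F:
        return "^?"
    return ch


def erase_sequence(ch: str) -> str:
    """One backspace-space-backspace per column the echo occupied."""
    return "\b \b" * len(echoed_form(ch))
```

So erasing a quoted Ctrl-H emits the rubout sequence twice, once for the 'H' and once for the '^', while an ordinary letter gets it once.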
A final complication for some kernels is multibyte characters with
a display width bigger than one (yes, really, some kernels try to
handle this). These kernels get to go through interesting gyrations;
you can see an example in Illumos's ldterm.c in the ldterm_csi_erase
function.
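As a rough idea of what is involved, here is a Python sketch that is not based on the Illumos code. It assumes that East Asian Wide and Fullwidth characters take two columns and everything else takes one, and the exact erase sequence it emits is my own guess at a reasonable approach. Both function names are hypothetical.

```python
import unicodedata


def display_width(ch: str) -> int:
    """Assumed column width: 2 for East Asian Wide/Fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def erase_sequence_wide(ch: str) -> str:
    """Back up over every column, blank them all, then back up again."""
    width = display_width(ch)
    return "\b" * width + " " * width + "\b" * width
```

This ignores zero-width combining characters and anything else where the width depends on context, which is part of why real implementations get complicated.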
(FreeBSD also handles backspacing a space specially, because you
don't need to actually rub that out with a '\b \b' sequence; you
can just print a plain \b. Other kernels don't seem to bother
with this optimization. The FreeBSD code for this is in
sys/kern/tty_ttydisc.c in the ttydisc_rubchar function.)
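The reason the shortcut is safe can be shown with a toy model. This is a Python sketch of the idea only, not a translation of ttydisc_rubchar. The rub and render functions are hypothetical names, and the terminal model assumes \b just moves the cursor left.

```python
def rub(ch: str) -> str:
    """Rubout bytes for one single-column character, with the shortcut."""
    # A space already looks blank, so moving the cursor back is enough.
    return "\b" if ch == " " else "\b \b"


def render(stream: str) -> tuple[str, int]:
    """Toy one-line terminal: returns (visible cells, cursor column)."""
    cells: list[str] = []
    col = 0
    for ch in stream:
        if ch == "\b":
            col = max(col - 1, 0)
        else:
            if col == len(cells):
                cells.append(ch)
            else:
                cells[col] = ch
            col += 1
    return "".join(cells), col


# Erasing the trailing space of "ab " either way gives the same screen.
with_shortcut = render("ab " + rub(" "))
without_shortcut = render("ab " + "\b \b")
```

Both variants leave the same visible cells and the same cursor position; the shortcut just sends one byte instead of three.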
PS: If you want to see the kernel's handling of backspace in action,
you usually can't test it at your shell prompt, because you're
almost certainly using a shell that supports command line editing
and readline and so on. Command line editing requires taking over
input processing from the kernel, and so such shells are handling
everything themselves. My usual way to see what the kernel is doing
is to run 'cat >/dev/null' and then type away.