Wandering Thoughts archives

2011-11-29

The alternate screen terminal emulator plague

There is one bit of behavior in modern X environments that drives me up the wall: the use of 'alternate screens' in terminal emulators (and in programs). That's probably obscure to most people, so let me put it the other way: what drives me up the wall is when I edit a file in vi or view it in a pager like less, quit, and all of the text that I was just looking at instantly disappears in favour of what was on the screen before I started whatever it was.

(Having this happen in less is especially infuriating, since less's entire purpose is showing me information. But the moment I quit, I can't have it.)

What is happening is not the fault of vi, less, and so on, or at least not exactly. Unix systems have terminfo, which is essentially a big database of terminals and the escape sequences that programs can send to make them do things; two of the defined terminfo capabilities (smcup and rmcup) are escape sequences that you are supposed to send when your full screen program starts and stops. Many terminal emulators (and a certain number of real terminals) support an alternate screen, and the people who wrote terminfo database entries for them decided that full screen programs should always use this alternate screen, so they put the escape sequences for 'switch to alternate screen' and 'switch to main screen' into those initialization and de-initialization sequences. When programs like vi, emacs, and less dutifully send the escape sequences that the terminfo database tells them to, they shoot your foot off.
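Concretely, in modern xterm-style entries the sequences embedded in smcup/rmcup look like the following (a sketch; older entries used `ESC [ ? 47 h` and `ESC [ ? 47 l` instead, and real entries often bundle in other setup too):

```python
# The alternate-screen switch sequences that xterm-style terminfo
# entries embed in smcup and rmcup:
SMCUP = "\x1b[?1049h"   # enter the alternate screen, saving the main one
RMCUP = "\x1b[?1049l"   # leave the alternate screen, restoring the main one

# A full screen program that follows terminfo's advice brackets all of
# its output with these -- which is exactly why that output vanishes
# the moment the program exits.
def run_fullscreen(body: str) -> str:
    return SMCUP + body + RMCUP
```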

My personal opinion is that this is an unfortunate example of the slow fossilization of X terminal emulators. xterm probably does this because some real terminal had its terminfo set up this way in the depths of time, and everyone else has probably slavishly copied how xterm works. No one has stopped to ask if the end result makes sense and is usable, because if they had a great many people would have told them that it doesn't and isn't.

(Xterm gets partial credit because it has a way to turn this off, so at least the xterm people recognize that it's a terrible idea even if they feel constrained by backwards compatibility to not fix the default.)

Unfortunately there is no general fix for this. Some programs can be told to not send the terminfo initialization and de-initialization strings; some terminal emulators can be told to ignore them. Sadly, sending the strings and paying attention to them is the default behavior; this leads to fixing a lot of programs on a lot of systems, one by one.

(For extra fun, some Unixes do things right to start with. For example, Solaris has never put the alternate screen escape sequences into their terminfo entries for xterm.)

Sidebar: the quick fixes

If you use xterm, set the XTerm*titeInhibit resource to true to make it ignore the alternate screen escape sequences. As usual, gnome-terminal has no way of controlling this.

For less, do 'export LESS=X' or otherwise give it the -X switch.

For vim, add 'set t_ti= t_te=' to your .vimrc.

(Some information taken from here. See also.)

AlternateScreenPlague written at 01:10:01

2011-11-27

Why processing things in inode order is a good idea

In a note on yesterday's entry on readdir()'s ordering, a commentator wrote (in part):

Note many utils use FTS which sorts directory entries by inode [...]

It may not be obvious why this is a good thing to do, so let me take a shot at it.

Suppose that you want to do something that either looks at or touches the inodes of the files in a directory; perhaps your ls or find needs to stat() them, or your chmod needs to change their permissions. What is the right order to process the files in?

As always, modern disks are seek limited. You can't do anything to change how many seeks it takes to read the directory (or in general how fast it happens), because you don't control anything about the order that you get directory entries in; as discussed in the last entry, the kernel returns directory entries in whatever order it wants to. But you can control what order the kernel reads inodes in. So we want to ask for inodes in whatever order minimizes seeks.

In general, you don't know exactly how a filesystem organizes where inodes go on the disk and they are usually not all in one contiguous area but scattered in various spots over the disk. However, filesystems have historically put inodes on the disk in increasing order of inode number; you can be pretty certain that inode X+1000 is in a block that is after the block that inode X is in (at least as far as logical block numbers go). Asking for inodes in increasing numerical order thus at least means that the disk only seeks forward (and probably the minimum distance possible) and it maximizes the chances that the kernel will be able to read several inodes of interest in one block. Asking for inodes in any other order increases the chances that the kernel will have to seek back and forth over the disk to give them to you.

(There are some filesystems where this is no longer true, primarily filesystems (such as ZFS) which never rewrite things in place. That means that every time an inode is modified it has to be written to a new place on disk, which means that the (fixed) inode number of a file has no bearing on where on disk the inode has wound up.)

InodeOrderReason written at 02:32:13

2011-11-26

About the order that readdir() returns entries in

In a mostly unrelated article (seen via Planet Sysadmin) I recently noticed the following:

readdir(3) just returns the directory entries in the order they are linked together, which is also not related to inode numbering but as best as I can tell is from outer leaf inwards (since the most recently created file is listed first).

On most systems, readdir(3) is a modestly warmed over version of the underlying 'read directory entries' system call, and returns directory entries in the same order that the system call does. In theory a Unix kernel can return directory entries in whatever order it wants (including, say, sorted alphabetically). In practice kernels almost always give you directory entries in what I will call 'traversal order', whatever the natural order is for entries in the on-disk data structures that represent a directory.
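You can see traversal order directly from any language that exposes readdir() without sorting; in Python, os.listdir() does. A small demonstration (the raw order is filesystem-dependent; it is ls that sorts for you):

```python
import os
import tempfile

# Create a few files whose creation order is deliberately not
# alphabetical, then read the directory back.
d = tempfile.mkdtemp()
for name in ("zebra", "apple", "middle"):
    open(os.path.join(d, name), "w").close()

raw = os.listdir(d)   # the kernel's traversal order, whatever that is
print(raw)            # filesystem-dependent; may or may not be sorted
print(sorted(raw))    # what ls shows you: ['apple', 'middle', 'zebra']
```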

For a very long time, Unix directories on disk were simple linear arrays (first with fixed-size entries and then with variable sized ones when BSD introduced long filenames). When a new entry was added, it was generally put in the first spot where there was room; at the end if none of the directory's entries had been deleted, and perhaps earlier if a filename of a suitable length had been deleted earlier. The kernel read directories in forward linear order, starting at the first block of the directory and going up, and so returned entries in this order.

(In the original simple Unix filesystems, inode numbers were also allocated in a straightforward 'first free number' order, so the order of directory entry creation could correspond quite well to inode order. The Berkeley FFS changed this somewhat by allocating inodes in a more scattered manner.)

Modern Unix systems commonly use some sort of non-linear directories under at least some circumstances (a linear data structure may still be more efficient for small directories); generally these are some variant of balanced trees. The natural traversal order is tree order, but what that is is very implementation dependent. I believe it's common to hash filenames and then insert entries into the tree in hash order, but hashes (and thus the hash order) vary tremendously between filesystems, and I'm sure that somewhere there is a filesystem that doesn't hash names and just inserts them straight in some order.
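To make 'hash order' concrete, here is a hypothetical sketch; the hash function is an arbitrary stand-in (real filesystems each use their own), and the point is only that traversal order follows the hash of the name, not creation order or alphabetical order:

```python
import hashlib

def name_hash(name: str) -> int:
    # Stand-in name hash; any real filesystem's hash differs, which is
    # exactly why the same entries traverse differently per filesystem.
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:4], "big")

created = ["alpha", "beta", "gamma", "delta"]   # creation order
traversal = sorted(created, key=name_hash)      # the order readdir() sees
```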

(Because this is a per-filesystem thing, it follows that the traversal order can be different for different directories on the same system even if they have the same entries created in the same order, either because the directories are using different filesystem types or just because some parameters were set differently on the different filesystems.)

ReaddirOrder written at 01:47:30

2011-11-17

The drawback of modern X font handling

In some ways, font handling in modern versions of X is quite nice. We have a decent number of good, high quality fonts in modern scalable formats like TrueType, and one can use fonts from Windows and Mac OS X if one wants to. Thanks to various reforms in font handling, specifying fonts is generally easier and more flexible (hands up everyone who ever tried to generate or read an XLFD string for a particular font), and you can install personal fonts without huge contortions. But it does have one drawback, at least for someone like me.

In the old days of X font handling, the X server did all of the work. X clients simply told the server to render some text in a particular font; it was the server itself that was responsible for generating the font bitmaps and drawing them (sometimes the X server delegated generating font bitmaps to a separate program, such as xfs, the X font server). This meant that you only had to tune fonts in one place and your tuning applied to every X client that you ran, no matter what they were or where they were running. Or to put it another way, I could carefully select an xterm font (and size) that I really liked and it would stick everywhere.

(The fly in this 'all in the server' ointment was default X application resources, but you could fix that with some more work.)

In the new world of X fonts, fonts are rendered separately by each X client (using various layers of font selection and rendering) and sent to the server as precomputed bitmaps. If all of your clients are running on the same machine and using the same set of font libraries, the result is the same as in the old world. But if some of your clients are running on different machines and displaying remotely (or some of your local clients have decided to use their own copies of libraries), they can render the same nominal font quite differently. This is especially so if you use generic font names like 'monospace' or 'serif', because what actual fonts those generic names map to is system-specific; one machine may very well map 'monospace' to 'DejaVu Sans Mono', while another maps it to 'Liberation Mono'.

(The corollary to this is that font availability is also a per-machine thing. If you install a new font you like onto your local workstation, an xterm or Firefox or whatever running from a remote server cannot use it.)

In the new world, what you see for something like 'DejaVu Sans Mono 10' depends on the specific version of the font each system has installed, what exact rendering library version each system is using, and what rendering settings each system is using for things like subpixel anti-aliasing. This drastically complicates efforts to, say, pick a single modern font for all of your terminal windows.

(I'm aware that the modern answer to this drawback is that I should run all of my X programs locally and just use ssh. This is what you could politely call a fail.)

Sidebar: a concrete example

Both of the following images are xterm using DejaVu Sans Mono 10, displaying on a Fedora 15 machine's X server. One of the xterms is running locally on the Fedora 15 machine; the other is running on a 32-bit Ubuntu 10.04 machine.

[screenshots: the Fedora 15 xterm and the Ubuntu 10.04 xterm]

One of these I rather like, one of these I can't stand.

(Part of the difference is clearly in different settings for subpixel anti-aliasing; the Ubuntu 10.04 version has colour fringes that the Fedora 15 version does not. But I don't think the difference in line width that makes the 10.04 version visibly blacker is due to that.)

ModernXFontDrawback written at 17:22:18

2011-11-08

Files and fundamental filesystem activities (on Unix)

Back in a discussion of filesystem deduplication I said that writing blocks is a fundamental filesystem activity while writing files is not. On the surface this sounds like a strange thing, so today I'm going to defend it.

At one level, it's clear how writing blocks is a fundamental filesystem activity. Filesystems allocate disk space in blocks and pretty much only write blocks; if you try to write less than a block, the filesystem usually does a 'read modify write' cycle. Although block-sized IO was once forced by physical disk constraints, that hasn't been true for most of the intervening time; until recently, disks used smaller physical sectors than the filesystem block size, so the filesystem could have done sub-block writes if it wanted to. Filesystems just don't, by and large.
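The arithmetic behind 'read modify write' is simple enough to sketch (the 4 KB block size is an assumption for illustration):

```python
BLOCK = 4096  # assumed filesystem block size

def blocks_touched(offset: int, length: int) -> list[int]:
    # Which filesystem blocks a byte-range write covers.  Any block
    # the write doesn't cover completely must be read, modified in
    # memory, and rewritten whole.
    first = offset // BLOCK
    last = (offset + length - 1) // BLOCK
    return list(range(first, last + 1))

blocks_touched(100, 50)      # [0]: a partial block, so read-modify-write
blocks_touched(4096, 4096)   # [1]: exactly one whole block, written outright
```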

What's not clear is why writing files is not. To see why, let's ask a question: what does it mean to write a file, and when are you done? In the simple case the answer is that you write all of the data in the file in sequential order, and then close the file descriptor. This probably describes a huge amount of the file writes done on a typical Unix system, and it's certainly what most people think of, since this describes things like saving a file in an editor or writing out an image in your image editor. But there's a lot of files on Unix that aren't 'written' this way. Databases (SQLite included) are the classic case, but there are other examples; even rsync may 'write' files in non-sequential chunks in some situations. Some of these cases may not close the file for days or weeks, although they may go idle for significant amounts of time.
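The database-style pattern is easy to demonstrate: scattered positional writes with no sequential order and a hole left in the middle, with no single moment at which the file is finished being 'written'. A minimal POSIX sketch using Python's os.pwrite():

```python
import os
import tempfile

# 'Write' a file the way a database does: jump around with positional
# writes instead of streaming it out start to finish.
fd, path = tempfile.mkstemp()
os.pwrite(fd, b"header", 0)       # start of the file
os.pwrite(fd, b"record", 8192)    # jump far ahead, leaving a hole
os.pwrite(fd, b"patch!", 100)     # then back near the start
os.close(fd)

print(os.path.getsize(path))      # 8198: the furthest write sets the size
```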

The result is that writing files is a diffuse activity while writing blocks is a very sharp one. You can clearly write to a file in a way that touches only a small portion of the file, and if the end of writing a file is when you close it you can write files very slowly, with huge gaps between your actual IO. And the system makes all of these patterns relatively efficient, unlike partial-block writes.

This causes problems for a number of things that want to react when a file is written. File level deduplication is one example; another is real time virus scanners, even with system support to hook events.

(The more I think about it, the more I think that this is not just a Unix thing. Although I may have blinkered vision due to Unix, it's hard to see a viable API that could make writing files a fundamental activity. There's many situations where you just can't pregenerate all of the file before writing it even if you're writing things sequentially, plus there's random write IO to consider unless you make that an entirely separate 'database' API.)

FundamentalFileOperation written at 01:14:58

