Wandering Thoughts archives

2010-06-22

Why feed readers are not good for skimming things

Here's a somewhat counterintuitive thing that I've come to believe: the conventional syndication feed reader design is the wrong thing for fast, casual skimming.

(I call this counterintuitive because feed readers were initially presented as a great way to deal with a lot of feeds and feed entries, as just the thing to deal with the river of news and so on.)

With fast skimming, the goal is essentially to throw things away. You want to drastically limit your time consumption and limit the mental load that the whole effort puts on you. But feed readers are by and large oriented to keeping things around and in your face, with mechanisms like unread entries (which are often given extra prominence so you can spot them easily), having a history of entries (often keeping even old, read entries visible), and so on.

From the perspective of conventional reading, all of these are good things. But from the perspective of rapid skimming, they are a drawback. If you want to limit the amount of time you spend reading things, you don't want them to stay around and nag you; you want them to go away if they aren't interesting enough to get you to look at them the first time you see them. When skimming, things do not improve with age.

(And you want your display to be very dense. Having full entry text right at hand is nice if you often read entries, but if you're skimming you're mostly not reading entries, so the text is usually pointless; the same screen space (and mental attention) could be used to summarize more entries for you to skim.)

From this perspective, the only thing wrong with the social web approach to feeds for skimming is that it doesn't necessarily fade old entries out fast enough. It might be interesting to do an interface where any old entries (well, entry headlines) that were still present were literally faded out, rendered in a muted and easily skimmed-over colour.
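(As a toy illustration of the fading idea, here's a little Python sketch that maps an entry's age to a progressively lighter grey. The 48-hour horizon and the colour levels are entirely made up for illustration; this isn't describing any real feed reader.)

    # Toy sketch: fade old entry headlines toward the background colour.
    # The two-day horizon and the grey levels are arbitrary choices.
    def headline_colour(age_hours):
        """Return an '#rrggbb' grey that gets lighter as the entry ages."""
        fade = min(age_hours / 48.0, 1.0)          # 0.0 = brand new, 1.0 = two days old or more
        level = int(0x20 + fade * (0xc0 - 0x20))   # from near-black to pale grey
        return "#%02x%02x%02x" % (level, level, level)

    for hours in (0, 6, 24, 48):
        print(hours, headline_colour(hours))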

Sidebar: how we got here

My theory of how we got here is that conventional feed reader design applies the basic approach of email reading to syndication feeds. You map syndication feed entries to email messages and feeds themselves to folders, and you come up with the usual interface I've seen. But the conventional assumption with email is that you basically want to read all of it. This might be right for some syndication feed reader usage, but it is definitely wrong for people who are skimming.

(In the process, feed readers inherited the conventional email approach of really only being single-context applications; they were designed to expect you to only be doing a single thing at once, i.e. reading a single entry in a single feed.)

SkimmingAndFeedReaders written at 01:27:38

2010-06-20

How my mail notifier avoids interrupting me

For years, I've used a program called xlbiff to handle notifying me about new mail, and I have it set to check quite frequently. On the surface, this ought to be a really disruptive thing; new mail notification is widely held to be one of the best ways to yank your attention away from what you're currently doing. But in practice xlbiff doesn't act this way for me; it's far less of an interruption than it sounds while still letting me deal very rapidly with important things.

(As a sysadmin, I can't outright ignore email for very long. Being interrupted all the time by sufficiently important things is part of my job.)

I believe there are two major reasons why this is the case. First, xlbiff is just informative enough to let me make a decision to ignore the new email; in practice, seeing the author and the subject is enough to let me filter out almost all of the email I don't need to read right now. Anything less informative would force me to immediately do more checking in order to decide if the mail was important; anything more informative would take up more screen real estate (and might show me enough of the message to make it tempting).
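(To make 'just informative enough' concrete, here's a minimal Python sketch of a notifier that prints one dense line of author and subject per message and nothing else. This is not xlbiff's actual code or configuration; the spool path is simply an assumed example.)

    # A minimal sketch in the spirit of xlbiff, not its actual implementation:
    # show just enough to decide whether new mail matters, and nothing more.
    import mailbox

    SPOOL = "/var/mail/yourlogin"   # hypothetical mail spool location

    def summarize(path=SPOOL):
        box = mailbox.mbox(path)
        try:
            for msg in box:
                # One line per message: author and subject only. There is
                # deliberately no way to read the message from here.
                print("%-30.30s  %s" % (msg.get("From", "(no sender)"),
                                        msg.get("Subject", "(no subject)")))
        finally:
            box.close()

    if __name__ == "__main__":
        summarize()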

Second, xlbiff is a dead end; it gives me no simple action inside it to look at the mail further. If I want to actually read the email, I have to open up my full mail reading environment and get to it. This means that it is not easy to get distracted into reading new mail. The distraction is not temptingly right at hand, just a mouse click or two (or a keyboard binding or two) away, the way it would be if the notification were integrated into my mail reader or if the notifier could helpfully open my mail reader at a mouse click.

(The lesson I draw from this is that convenience is in some ways the enemy of avoiding distraction. If you want to avoid distraction, you need to avoid conveniently fast and effortless ways of getting to it; instead, put some things in the way to slow yourself down and make you hold back.)

Oh, and the third reason is that xlbiff does not nag at me with any sort of persistent 'you have some unread messages' marker. Instead, it notifies me and then it goes away when I tell it to; it will only notify me again if new mail comes in. Persistent markers are a persistent nag, and sooner or later they will win the battle for your attention (or you will do something just to shut them up).

From this I can construct the platonic ideal of a terrible modern mail notifier: a persistent 'you have new mail' status icon in your notification area, where the only thing you can do with it is click on it to open up your regular (IMAP) mail reader. This combines minimal usefulness with maximal distraction potential when you try to find out anything more about that new mail you have. Giving it a tooltip containing the author and subject of your new mail would be only a moderate improvement.

(Sadly, I'm sure that there are any number of mail notifiers that work just like this.)

AvoidNotifierInterrupts written at 01:36:54

2010-06-04

How disk write caches can corrupt filesystem metadata

It's intuitively obvious how (volatile) disk write caches can result in you losing data in files if something goes wrong; you wrote data, the disk told your program it had been written, the disk lost power, and the data vanished. But it may be less obvious how this can result in corrupted or destroyed filesystems, and thus why you need (working) cache flush operations even just to keep your filesystems intact (never mind what user-level programs may want).

Consider a filesystem where you have two pieces of metadata, A and B, where A points to B; A might be a directory and B a file inode, or A might be a file inode and B a table of block pointers. Since filesystem metadata is often some sort of tree, this sort of pointing is common (nodes higher up the tree point to nodes lower down). Now suppose that you are creating a new B (say you are adding a file to a directory). In order to keep the metadata consistent, you want to write things from the bottom up; you want to write the new B and then the new version of A.

(It's common to have several layers of pointing; A points to B which points to C which points to D and so on. In such cases you usually don't have to write them one by one, pausing between each. Instead you just need everything else written, in some order, before you make the change visible by writing A.)

In theory, disks with volatile write caches don't upset this; your metadata is still consistent if the disk loses power and neither A nor B gets written. What breaks metadata consistency is that disks with write caches don't necessarily write things in order; it's entirely possible for a disk to cache both the B and A writes, then write A, then lose power with B unwritten. At this point you have A pointing to garbage. Boom. And disks with write caches are free to keep things unwritten for random but large amounts of time for their own inscrutable reasons (or very scrutable ones, such as 'A keeps getting written to').
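(Here's a toy Python model of that failure, with one dict standing in for stable storage and another for the volatile write cache. None of this is real filesystem or disk firmware code; it just makes the ordering argument concrete.)

    # Toy model of a volatile, reordering write cache corrupting A-points-to-B.
    disk = {"A": "points to old-B", "old-B": "valid data"}   # stable storage
    cache = {}                                               # volatile write cache

    def cached_write(block, data):
        cache[block] = data          # the disk acknowledges the write immediately

    def destage(blocks):
        for b in blocks:             # the disk destages cached writes in any order it likes
            disk[b] = cache.pop(b)

    # The filesystem does the 'right' thing: write the new B, then the new A.
    cached_write("new-B", "valid data")
    cached_write("A", "points to new-B")

    # The disk happens to destage A first, then loses power with new-B cached.
    destage(["A"])
    cache.clear()                    # power loss: the volatile cache's contents vanish

    print(disk["A"])                 # 'points to new-B'
    print(disk.get("new-B"))         # None -- A now points at garbage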

(Note that copy-on-write filesystems are especially exposed to this, because they almost never update things in place and so are writing a lot of new B's and changing where the A's point. And the A is generally the very root of the filesystem, so if it points to nowhere you have a very bad problem.)

In the simple case you can get away with just a disk write barrier for metadata integrity, so that you can tell the disk that it can't write A before it's written B out. However, this isn't sufficient when you're dealing with multi-disk filesystems, where A may be on a different disk entirely than B. There you really do need to be able to issue a cache flush to B's disk and know that B has been written out before you queue up A's write on its disk. (Otherwise you could again have A written but not B, because B's disk lost power but A's did not.)
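(As a sketch of the ordering the filesystem needs here, the following Python uses os.fsync as a stand-in for 'flush this device's write cache' and plain files as stand-ins for the two disks. The file names are made up, and whether fsync actually reaches the physical write cache depends on your OS and how it's configured, which is exactly the sort of thing this entry is worrying about.)

    import os

    def durable_write(path, data):
        # Write and then insist the data is on stable storage before returning.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            os.write(fd, data)
            os.fsync(fd)     # stand-in for a real disk cache flush
        finally:
            os.close(fd)

    # B must be durable on its disk before A's write is even queued on its disk.
    durable_write("diskB.img", b"new B: the new file inode")
    durable_write("diskA.img", b"new A: directory entry pointing at the new B")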

The multi-disk filesystem case is a specific example of the general case where write barriers aren't good enough: where you're interacting with the outside world, not just with things on the disk itself. Since all sorts of user-level programs interact with the outside world, user programs generally need real 'it is on the disk' cache flush support.

(This is the kind of entry that I write to make sure I understand the logic so that I can explain it to other people. As usual, it feels completely obvious once I've written it out.)

Sidebar: write cache exposure versus disk redundancy

I believe that in a well implemented redundant filesystem, the filesystem's metadata consistency should survive so long as the filesystem can find a single good copy of B. For example if you have an N-way mirror, you're still okay even if N-1 disks all lose the write (such as by losing power simultaneously); you're only in trouble if all of them do. This may give you some reassurance even if you have disks that ignore or don't support cache flushes (apparently this includes many common SSDs, much to people's displeasure).

(In disk-level redundancy instead of filesystem-level redundancy you may have problems recognizing what's a good copy of B. Let's assume that you have ZFS-like checksums and so on.)

Of course, power loss events can be highly correlated across multiple disks (to put it one way). Especially if they're all in the same server.

MetadataWriteCacheHole written at 00:22:42
