Wandering Thoughts archives

2018-08-20

Explicit manipulation versus indirect manipulation UIs

One of the broad splits in user interfaces in general is the spectrum between what I'll call explicit manipulation and indirect manipulation. Put simply, in an explicit manipulation interface you see what you're working on and you specify it directly, and in an indirect manipulation interface you don't; you specify it indirectly. The archetypal explicit manipulation interface is the classical GUI mouse-based text selection and operations on it; you directly select the text with the mouse cursor and you can directly see your selection.

(This directness starts to slip away once your selection is large enough that you can no longer see it all on the screen at once.)

An example of an indirect manipulation interface is the common interactive Unix shell feature of !-n, for repeating (or getting access to) the Nth previous command line. You aren't directly pointing to the command line and you may not even still have it visible on the screen; instead you're using it indirectly, through knowledge of what relative command number it is.
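To make the relative numbering concrete, here is a minimal Python sketch that models shell history as a plain list. The `recall` helper and the sample history are hypothetical illustrations, not how any shell actually implements !-n.

```python
def recall(history, n):
    """Return the command n entries back, in the spirit of the shell's !-n.

    history is a list of command lines, oldest first; n=1 is the most
    recent command.
    """
    if n < 1 or n > len(history):
        raise IndexError(f"!-{n}: event not found")
    return history[-n]


# Hypothetical history, oldest command first.
history = ["cd /srv", "ls -l", "make test", "git status"]
print(recall(history, 2))
```

Nothing in the call itself shows which command will come back. Being off by one in n silently returns a neighbouring command instead of an error.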

A common advantage of indirect manipulation is that it is compact and powerful, and often fast; you can do a lot very concisely with it. Typing '!-7 CR' is unquestionably a lot faster than scrolling back up through a bunch of output to select and then copy/paste a command line. Even the intermediate version of hitting cursor up a few times until the desired command appears and then CR is faster than the full scale GUI text selection.

(Unix shell command line editing features span the spectrum of strong indirect manipulation through strong explicit manipulation; there's the !-n notation, cursor up/down, interactive search, and once you have a command line you can edit it in basically an explicit manipulation interface where you move the cursor around in the line to delete or retype or alter various bits.)

Indirect manipulation also scales and automates well; it's generally clear how to logically extend it to some sort of bulk operation that doesn't require any particular interaction. You specify what you want to operate on and what you want to do, and there you go. Abstraction more or less requires the use of indirect manipulation at some level.

The downside of indirect manipulation is that it requires you to maintain context in order to use it, in contrast to explicit manipulation where it's visible right in front of you. You can't type '!-7' without the context that the command you want is that one, not -6 or -8 or some other number. You need to construct and maintain this context in order to really use indirect manipulation effectively, and if you get the context wrong, bad things happen. I have accidentally shut down a system by being confidently wrong about what shell command line a cursor-up would retrieve, for example, and mistakes about context are a frequent source of production accidents like 'oops we just mangled the live database, not the test one' (or 'oops we modified much more of the database than we thought this operation would apply to').

My guess is that in much the same way that custom interfaces can be a benefit for people who use them a lot, indirect manipulation interfaces work best for frequent and ongoing users, because these are the people who will have the most experience at maintaining the necessary context in their head. Conveniently, these are the people who can often gain the most from using the compact, rapid power of indirect manipulation, simply because they spend so much time doing things with the system. By corollary, people who only infrequently use a thing are not necessarily going to remember context or be good at constructing it in their head and keeping track of it as they work.

(The really great trick is to figure out some way to provide the power and compactness of indirect manipulation along with the low need for context of explicit manipulation. This is generally not easy to pull off, but in my view incremental search shows one path toward it.)
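As a rough illustration of why incremental search sits between the two styles, here is a minimal Python sketch that shows the current match after every typed character, so the candidate is visible before you commit to it. The function names and the sample history are hypothetical, and real shells do considerably more than this.

```python
def reverse_search(history, query):
    """Return the most recent history entry containing query, or None."""
    for line in reversed(history):
        if query in line:
            return line
    return None


def incremental_search(history, typed):
    """Yield (query so far, current match) after each typed character."""
    for i in range(1, len(typed) + 1):
        query = typed[:i]
        yield query, reverse_search(history, query)


# Hypothetical history, oldest command first.
history = [
    "sudo shutdown -h now",
    "ls -l /var/log",
    "less /var/log/syslog",
]
for query, match in incremental_search(history, "sh"):
    print(f"(reverse-i-search)`{query}': {match}")
```

The query stays short and typed, as in indirect manipulation, but the match is displayed at every step, so you are not relying on remembered context to know what you are about to run.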

PS: I'm using 'user interface' very broadly here, in a sense that goes well beyond graphical UIs. Unix shells have a UI, programs have a UI in their command line arguments, sed and awk have a UI in the form of their little languages, programming languages and APIs have and are UIs, and so on. If people use it, it's in some sense a user interface.

(I'd like to use the term 'direct manipulation' for what I'm calling 'explicit manipulation' here, but the term has an established, narrower definition. GUI direct manipulation interfaces are a subset of what I'm calling explicit manipulation interfaces.)

tech/ExplicitVsIndirectManipulation written at 22:12:16

It's worth testing that obvious things actually do work

We've reached the point in putting together our future ZFS on Linux NFS fileservers where we believe we have everything built and now we're testing it to make sure that it works and to do our best to verify that there are no hidden surprises. In addition to the expected barrage of NFS client load tests and so on, my co-worker decided to verify that NFS locks worked. I would not have bothered, because of course NFS locks work, they are a well solved problem, and it has been many years since NFS locks (on Linux or elsewhere) had any chance of not working. This goes to show that my co-worker is smarter than I am, because when he actually tried it (using a little lock testing program that I wrote years ago), well:

$ ./locktest plopfile
Press <RETURN> to try to get a flock shared lock on plopfile:
Trying to get lock...
  flock lock failure: No locks available

With some digging we were able to determine that this was caused by rpc.statd not being started on our (Linux) fileserver. We're using NFSv3, which requires some extra daemons to handle aspects of the (separate) locking protocol, and presumably NFSv3 is unfashionable enough these days that systems no longer bother to start them by default.

(Perhaps I'm making excuses for Ubuntu 18.04 here.)
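The lock testing program itself isn't shown here, but a minimal Python sketch of the same kind of check might look like the following. The `try_shared_lock` name and its messages are assumptions for illustration; only `fcntl.flock()` and its flags are real standard library calls.

```python
import fcntl


def try_shared_lock(path):
    """Try a non-blocking shared flock() on path; return (ok, message).

    The file is opened read/write and created if it does not exist.
    """
    with open(path, "a+") as f:
        try:
            fcntl.flock(f, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except OSError as e:
            # On Linux, ENOLCK is reported as "No locks available".
            return False, f"flock lock failure: {e.strerror}"
        fcntl.flock(f, fcntl.LOCK_UN)
        return True, "lock obtained"
```

To test NFS locking you would call `try_shared_lock()` on a file inside the NFS mount from a client; running it against a local filesystem only tells you that local locking works.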

Had we taken the fileserver into production without discovering this, the good news is that important things like our mail system would probably have failed safe by refusing to proceed without locks. But we would certainly have had a fun debugging experience, and under more stress than we did in testing. So I'm very glad that my co-worker was carefully thorough here.

The obvious moral I take from this is that it's worth testing that the obvious things do work. The obvious things are probably not broken in general (otherwise you would hopefully have heard about it during system research and design), but there's always the possibility of setup or configuration mistakes, or that you have a sufficiently odd system that you're falling into a corner case. You may not want to test truly everything, but it's certainly worth testing important but obvious things, such as NFS locking.

(There's also the unpleasant possibility that you've wound up with some fundamental misunderstanding about how the system is designed to work. This is going to force some big changes, but it's better to find this out before you try to take your mistake into production, rather than afterward as things are exploding.)

How much and how thoroughly you test in general depends on your resources and the importance of what you're doing. Some places might find and run a test suite that verified that their new NFS fileservers were delivering full POSIX compatibility (or as much as you can on NFS in general), for example. Making a point of testing the obvious is only an issue if you're only going to do partial tests, and so you might otherwise be tempted to skip the 'it's so obvious it must work' bits in the interests of time.

You may also want to skip explicitly testing the obvious in favour of doing end to end tests that will depend on the obvious working. For example, we might set up an end to end test of mail delivery and (IMAP) mail reading, and if we had, that would almost certainly have discovered the locking issue. There are trade-offs involved in each level of testing, of course.

(The short version is that end to end testing can tell you that it works but it can't tell you why, and it can be dangerous to infer that why yourself. If you actually want a low level functionality test, do the test directly.)

Sidebar: The smoking gun symptom

The fileserver's kernel logs had a bunch of messages reporting:

lockd: cannot monitor <host>

This comes from kernel code that attempts to make an upcall to rpc.statd, which led us to look at ps to make sure that rpc.statd was there before we went digging further.
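The same check that we did by eye with ps can be scripted. This is a minimal, Linux-only Python sketch that looks through /proc for a process with a given command name; `process_running` is a hypothetical helper, and the `proc_root` parameter exists only so the function can be pointed at a fake directory for testing.

```python
import os


def process_running(name, proc_root="/proc"):
    """Return True if some process under proc_root has this command name."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    return True
        except OSError:
            continue  # the process went away while we were looking
    return False
```

On the fileserver, `process_running("rpc.statd")` returning False would have pointed at the same problem as the kernel messages.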

sysadmin/TestTheObvious written at 01:17:56

