Wandering Thoughts archives

2011-02-13

A humbling experience with '#' characters in filenames

It's always a humbling experience to realize that you've made a terrible mistake when configuring a program, even when it only really affects you.

Once upon a time, I was building and thus configuring my first MH setup. MH puts each message in its own file and normally 'removes' messages by renaming them to put a prefix character on the front of the filename; this lets you relatively easily un-delete messages for some amount of time if you removed them by mistake. When you are configuring MH, you can choose what this prefix character is.

For some reason (perhaps because it was mentioned as an option in the configuration documentation), I picked the prefix character '#'. This seemed to work well enough, and so I carried this particular MH configuration choice forward to every version of MH I configured for the next, oh, fifteen years. Then in the fall of 2006 I moved to an environment where for the first time in many years I was not using a version of MH that I had compiled myself; instead I was using a prepackaged one. This version used the normal MH default character of ','.

Let me assure you that it is much easier to work with files that start with a ',' than it is to work with files that start with a '#'. Until my MH environment switched, I had not really been conscious of how subtly annoying the old way was. As usual, the issue wasn't that it was impossible or very hard to work with filenames that start with a '#'; it's that such filenames added friction to the process, enough friction that I avoided dealing with them. Friction matters more than we think it does.
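One likely source of this friction (an assumption here, since the entry doesn't spell out the cause) is that a POSIX shell treats a word that starts with '#' as the beginning of a comment, so such filenames have to be quoted or escaped every time. A minimal sketch using Python's shlex module to mimic shell word splitting; the message numbers 12 and 13 are made up for illustration:

```python
import shlex


def shell_words(command_line):
    """Split a command line roughly as a POSIX shell would, honouring '#' comments."""
    return shlex.split(command_line, comments=True)


def needs_quoting(filename):
    """Return True if shlex.quote() has to wrap the name in quotes to make it shell-safe."""
    return shlex.quote(filename) != filename


if __name__ == "__main__":
    # With ',' as the prefix, both 'removed' messages reach the command.
    print(shell_words("rm ,12 ,13"))
    # With '#' as the prefix, everything from the first '#' on is a comment.
    print(shell_words("rm #12 #13"))
    print(needs_quoting(",12"), needs_quoting("#12"))
```

With the '#' prefix you have to remember to write something like './#12' or '\#12' on every command line, which is exactly the sort of small, constant annoyance that adds up.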

Fortunately my bad configuration decision a very long time ago didn't affect very many people. In all of the environments I've worked in, only a few people ever used MH, and in several of them I was basically the sole user.

MHFilenameMistake written at 00:04:38

2011-02-04

Why people put NFS mounts in subdirectories

One of the little pieces of Unix wisdom is that you should put NFS mounts (well, their mount points) in their own subdirectory. You don't mount NFS filesystems directly in /, you don't mount them in a directory with local subdirectories that you care about, and ideally you don't mix filesystems from different servers in the same subdirectory.

(In other words, an ideal mount point is, say, '/nfs/<server>/<fs>'.)

What lies behind this is a combination of two things: Unix directory traversal, and the fact that if you stat() or otherwise attempt to touch an NFS mount point for a filesystem from a server that isn't responding, your program hangs. In a classical Unix system a surprising number of programs walk directories and stat() at least some of what they find, even programs that you might not expect to, like pwd. Some of them walk up the filesystem hierarchy, or at least wind up looking at the root directory.

(Even if an NFS server is responding it might be rather slow.)
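To make the hazard concrete, here is a minimal sketch of the kind of loop that many directory-walking programs contain in one form or another. The function name is made up for illustration, and the demonstration only uses a local scratch directory; nothing here actually talks to NFS.

```python
import os
import tempfile


def stat_all_entries(directory):
    """lstat() every entry in a directory, much as 'ls -l' style tools do.

    Each os.lstat() call is a point where the program could block if the
    entry is the mount point of an NFS server that is not responding.
    """
    results = {}
    for name in os.listdir(directory):
        results[name] = os.lstat(os.path.join(directory, name))
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        os.mkdir(os.path.join(scratch, "local"))
        os.mkdir(os.path.join(scratch, "nfs"))
        print(sorted(stat_all_entries(scratch)))
```

The loop has no idea which entries are mount points; it finds out only when one of the lstat() calls fails to come back.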

It's unavoidable that programs that really want to deal with filesystems from an unavailable NFS server will have problems. But we would like unrelated processes to not be hampered by a hung NFS server; if your process or session doesn't care about the unavailable filesystems and would be unaffected if they weren't mounted at all, it shouldn't hang. Which means that any directory traversal that you do needs to be kept away from such NFS mounts, so that you don't wind up stalling yourself because you stat()'ed a directory entry for an NFS mount that you don't care about.

Segregating NFS mount points from regular directories and then further segregating them by their server minimizes the chances that you'll trip over an unrelated NFS filesystem during this sort of directory traversal.

(And putting NFS mounts directly in / means that any program that looks at the root directory and stat()'s things in it might hang or be delayed due to any of the NFS servers having problems.)
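The difference between the layouts can be sketched as a small model: given a list of mount points, which ones does a program touch if it lists a directory and stat()s everything in it? The server and filesystem names below are hypothetical, and the model ignores everything except path structure.

```python
import posixpath


def mounts_stat_would_touch(listed_dir, mount_points):
    """Return the mount points that are direct children of listed_dir.

    These are the ones a program would stat() if it listed listed_dir
    and stat()'ed every entry it found there.
    """
    listed_dir = posixpath.normpath(listed_dir)
    return sorted(
        m for m in mount_points
        if posixpath.dirname(posixpath.normpath(m)) == listed_dir
    )


# Hypothetical layouts; all names are invented for illustration.
FLAT = ["/home", "/scratch", "/data"]
SEGREGATED = ["/nfs/fs1/home", "/nfs/fs1/scratch", "/nfs/fs2/data"]

if __name__ == "__main__":
    print(mounts_stat_would_touch("/", FLAT))
    print(mounts_stat_would_touch("/", SEGREGATED))
    print(mounts_stat_would_touch("/nfs/fs1", SEGREGATED))
```

In the flat layout, looking at / touches every NFS filesystem. In the segregated layout it touches none of them (/nfs itself is an ordinary local directory), and looking at /nfs/fs1 only touches filesystems from that one server.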

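As a pragmatic matter, some of this is no longer applicable on many modern Unix systems; for one thing, finding the current directory no longer involves walking up the filesystem and stat()'ing things along the way. So this is probably on its way to sliding into a Unix superstition (or at least a sysadmin one).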

Sidebar: how classic pwd works

The classic version of pwd has a simple algorithm:

  • stat . and remember its identity
  • read through .., stat'ing every entry until we find the one for .; we now know the name of the current directory
  • go up one directory and repeat the process
  • stop when we hit a directory where .. points to itself, because that means we've hit the root directory and we should be done

I call this pwd's algorithm, but it also appeared as getcwd() and was used by anything that needed to know the current directory, such as 'df .'.
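The steps above can be sketched in Python. This is an illustration of the algorithm under the assumption that comparing device and inode numbers is how a directory's 'identity' is checked, not a transcription of any historical pwd source; the function name is made up.

```python
import os


def classic_pwd(start="."):
    """Work out the absolute path of 'start' by walking up through '..'."""
    names = []
    here = start
    here_stat = os.stat(here)          # stat . and remember its identity
    while True:
        parent = os.path.join(here, "..")
        parent_stat = os.stat(parent)
        # If .. is the same directory as ., we have reached the root.
        if os.path.samestat(here_stat, parent_stat):
            break
        # Read through .., stat'ing every entry until one matches '.'.
        # Any of these lstat() calls can hang on a dead NFS mount point.
        for name in os.listdir(parent):
            try:
                entry_stat = os.lstat(os.path.join(parent, name))
            except OSError:
                continue
            if os.path.samestat(entry_stat, here_stat):
                names.append(name)
                break
        else:
            raise FileNotFoundError("current directory not found in its parent")
        # Go up one directory and repeat.
        here, here_stat = parent, parent_stat
    return "/" + "/".join(reversed(names))


if __name__ == "__main__":
    print(classic_pwd())
```

Note that the inner loop stat()s the siblings of every directory on the way up, including everything in /. That is why a single unresponsive NFS mount in the wrong place could stall a program that only wanted to know where it was.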

Modern systems make getcwd() into a system call because they can; they keep enough extra information in kernel memory to return the information immediately.

NFSMountsInSubdirectories written at 01:25:10



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.