Wandering Thoughts archives

2006-07-31

The limitations of Unix atime

Recently, I've come to believe that Unix's file (okay, inode) access time is increasingly useless. The problem isn't that Unix is slipshod about maintaining atime; the problem is that there are now an increasing number of things that read files as a routine matter, and so file atime is increasingly nothing more than the last time one of them ran.

The most obvious example is any backup program that works through the filesystem, which most of them do (especially the commercial ones). Modern GUI desktops often open files that they come anywhere near (to show you pretty pictures and other previews of them); after that there's the growing popularity of desktop search systems, which read all of your files to index them.

(Note that resetting the access time with utime(2) is a cure worse than the disease, since doing that updates ctime and in turn makes any competent backup system think that all of the files have changed and need to be backed up.)
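A minimal sketch of that trade-off in Python, assuming a filesystem with sub-second timestamps. The helper name read_preserving_atime is made up for illustration. It reads a file the way a backup program might, puts atime back with os.utime(), and returns the stat results so you can see what happened to ctime.

```python
import os
import tempfile
import time


def read_preserving_atime(path, pause=0.05):
    """Read a file, then restore its atime; return (before, after) stat results.

    Hypothetical helper for illustration only. The pause just makes sure the
    clock has moved on before the inode is changed again.
    """
    before = os.stat(path)
    with open(path, "rb") as f:
        f.read()
    time.sleep(pause)
    # Put atime (and mtime) back exactly as they were. The kernel records
    # this inode change by setting ctime to the current time.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return before, os.stat(path)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"some data\n")
    try:
        before, after = read_preserving_atime(tmp.name)
        print("atime restored:", after.st_atime_ns == before.st_atime_ns)
        print("ctime changed: ", after.st_ctime_ns != before.st_ctime_ns)
    finally:
        os.unlink(tmp.name)
```

The atime comes back, but the changed ctime is exactly what an incremental backup keys on.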

Despite all of this, atime is still somewhat useful, so I haven't disabled it on most of my systems, since the overhead is low (and it requires explicit steps to disable).

Sidebar: why disable atime?

The reason to disable atime updates is disk IO bandwidth; you don't have to write all of those updated inodes to disk, which in turn can save you a boatload of seeks since inodes are often scattered more or less randomly around the disk. Since seeks are the really time-consuming thing on modern disks, avoiding a bunch of them can be a serious win, especially in environments where writes hit multiple disks like RAID-1 or RAID-5 setups.

(One of the original groups that really loved the ability to turn off atime updates was Usenet server admins, back in the days of 'one article per file' news spool directories where even skimming through a newsgroup might require reading a bit of hundreds of files. News was often IO constrained and the atime of Usenet articles was basically pointless, so.)

AtimeLimitations written at 17:41:42

2006-07-29

Another little sysadmin twitch or two

One of my little sysadmin twitches is that when I am using mv to move things into a different directory, I try to always write it as:

mv foo bar/

(Note the trailing slash on the directory.)

This is a safety measure: if I typo the directory name, mv will error out instead of quietly renaming the file to 'br' or the like. Speaking from personal experience, tracking down just what happened to your file when you make this mistake and don't notice right away is immensely frustrating.
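Here is a rough way to see the difference for yourself, sketched in Python and driving the system's mv through subprocess. The function names and the 'foo', 'bar', and 'br' scratch names are only for illustration, and it assumes an ordinary mv (such as the GNU one) on a Unix system.

```python
import os
import subprocess
import tempfile


def try_mv(workdir, src, dest):
    """Run `mv src dest` inside workdir; return True if mv reported success."""
    result = subprocess.run(["mv", src, dest], cwd=workdir,
                            capture_output=True, text=True)
    return result.returncode == 0


def demo():
    """Try the typo 'br' with and without a trailing slash; report the results."""
    outcomes = {}
    for dest in ("br", "br/"):
        with tempfile.TemporaryDirectory() as workdir:
            os.mkdir(os.path.join(workdir, "bar"))   # the directory I meant
            with open(os.path.join(workdir, "foo"), "w") as f:
                f.write("precious\n")
            ok = try_mv(workdir, "foo", dest)
            outcomes[dest] = {
                "mv_succeeded": ok,
                "foo_still_there": os.path.exists(os.path.join(workdir, "foo")),
                "stray_br_file": os.path.isfile(os.path.join(workdir, "br")),
            }
    return outcomes


if __name__ == "__main__":
    for dest, outcome in demo().items():
        print(repr(dest), outcome)
```

Without the slash the typo silently turns 'foo' into a file called 'br'; with it, mv refuses and 'foo' stays where it was.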

(The difficulty is compounded by two of my habits: my shell history is per-shell, not global, and I discard shells/windows once I'm done with the particular thing I was using them for, which of course destroys that shell's history. Thus if I typo the mv and don't notice before I discard the shell, the history that would let me back up and see the typo vanishes.)

Of course I am also a strange mutant who likes having rm, mv, and cp aliased so that they have '-i' on. (My personal aliases for them turn off this behavior if I explicitly use '-f', so that things like 'rm -rf blah' are not annoying.)

(As an aside, the habit of advising new sysadmins that they should on no account do this for their own accounts because they'll screw themselves up when they work as root or whatever without it has always struck me as an exercise in masochism. The right solution is to fix your root environment so that it also has things set this way. Nor is it terribly difficult to arrange for different people to have different root environments in most situations, to accommodate co-workers with different preferences.)

ASysadminTwitchII written at 00:47:23

2006-07-27

Unix files do not have creation times

There is a persistent belief, held by a large assortment of even fairly experienced people, that Unix files have a creation time. They don't.

The belief generally arises because one of the three times that all Unix files have is called ctime, going along with mtime and atime. However, the 'c' stands for 'change', not 'creation'; an inode's ctime is updated more or less any time various data fields in the inode itself are changed.

(The exact details of which system calls should update ctime are best dug out of specifications. Then you get to test it to be sure, because some systems and some filesystems get it wrong, most commonly by not updating the ctime of a rename()ed file.)
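As a small illustration of 'change, not creation', here is a Python sketch that changes only a file's permissions and compares timestamps. The helper name times_after_chmod is invented for this example, and it assumes a filesystem with sub-second timestamp resolution.

```python
import os
import tempfile
import time


def times_after_chmod(path, pause=0.05):
    """chmod a file and return (before, after) os.stat() results."""
    before = os.stat(path)
    time.sleep(pause)            # let the clock advance past the old ctime
    os.chmod(path, 0o640)        # changes only inode metadata, not file data
    return before, os.stat(path)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"contents\n")
    try:
        before, after = times_after_chmod(tmp.name)
        print("mtime unchanged:", after.st_mtime_ns == before.st_mtime_ns)
        print("ctime advanced: ", after.st_ctime_ns > before.st_ctime_ns)
    finally:
        os.unlink(tmp.name)
```

The file's contents and its age are exactly what they were, but ctime now says 'just now'.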

So while a file creation time might be nice to have, Unix doesn't have it, and ctime is not what you might think it is.

The problems with an inode creation time

I suspect that two of the reasons that Unix doesn't have an inode creation time are that it's hard to come up with just what should count as file 'creation' in order to be useful, and that inode creation time would be less useful than people think because it's too low level.

The most straightforward answer for inode creation time would be 'whenever an inode is allocated'. This has the virtue of being completely technically correct, and the flaw of being near useless in practice, since things like '> file' in the shell on an existing file merely truncate the file and don't delete it and create a new one. If you add 'truncate a file to 0 length' in there, the question becomes what else is destructive enough to count.

The need to account for truncation points to the reason inode creation time is too low-level: what people are really interested in is the creation time of logical files, not of inodes. A careful program writing to a logical file may use several inodes under the covers; at this point the 'creation time' of your document, as reported by Unix, turns into 'the most recent time you asked your editor to save it'.
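To make the inode versus logical file distinction concrete, here is a sketch in Python of the two ways a file's contents commonly get replaced. The function names are made up for illustration, and careful_save is only one plausible version of what a careful editor does, not a description of any particular program.

```python
import os
import tempfile


def inode_of(path):
    """Return the inode number currently behind this name."""
    return os.stat(path).st_ino


def truncate_in_place(path, data):
    """What '> file' does: reuse the existing inode, discarding old contents."""
    with open(path, "w") as f:
        f.write(data)


def careful_save(path, data):
    """Write a new file next to the old one, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(data)
    os.replace(tmp_path, path)   # the name now points at a brand new inode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        doc = os.path.join(d, "document.txt")
        truncate_in_place(doc, "first draft\n")
        original = inode_of(doc)
        truncate_in_place(doc, "second draft\n")
        print("same inode after truncation: ", inode_of(doc) == original)
        careful_save(doc, "third draft\n")
        print("same inode after careful save:", inode_of(doc) == original)
```

An 'inode allocated' timestamp would survive the truncation but be reset by the careful save, which is backwards from what most people would want.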

Sidebar: Why rename() updates ctime, and should

There are two reasons why rename() updates ctime: the theoretical and the practical.

The theoretical one is that, logically speaking, rename is a link followed by an unlink; if you implemented it literally as that, you would update the ctime (twice, not that you'd notice).

The practical one is that updating ctime on rename makes life easier for backup programs during incremental backups; it makes ctime a reliable indication of 'something important has changed with this file, better back it up'.

(The pragmatic reason is that real backup programs expect this behavior, because rename() has done this ever since BSD introduced it.)
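Whether a particular system and filesystem actually behaves this way is easy to probe. The following is a sketch in Python; the rename_probe name and the scratch file names are invented for this example, and the short pause assumes sub-second timestamps. It deliberately reports the ctime result rather than assuming it.

```python
import os
import tempfile
import time


def rename_probe(directory, pause=0.05):
    """Rename a scratch file; return (same_inode, ctime_changed)."""
    old = os.path.join(directory, "probe-old")
    new = os.path.join(directory, "probe-new")
    with open(old, "w") as f:
        f.write("x\n")
    before = os.stat(old)
    time.sleep(pause)            # so an updated ctime is distinguishable
    os.rename(old, new)
    after = os.stat(new)
    os.unlink(new)
    return (after.st_ino == before.st_ino,
            after.st_ctime_ns != before.st_ctime_ns)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        same_inode, ctime_changed = rename_probe(d)
        print("same inode after rename:", same_inode)
        print("ctime updated by rename:", ctime_changed)
```

Run it in a directory on the filesystem you care about; a 'False' for the second line means incremental backups there may miss renamed files.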

UnixCtimeMyth written at 01:36:39

2006-07-26

xterm's ziconbeep feature

One of the cornerstones of my X setup is an xterm feature that a lot of people probably haven't heard of: -ziconbeep. The manpage explanation is too long and obscure to quote, so I'll summarize: if an iconified (aka minimized) xterm with -ziconbeep gets output, it puts '*** ' at the front of its title (technically only the icon name) and optionally beeps. (When you de-minimize the xterm, the title addition goes away.)

This probably sounds like a dinky little feature, but it is marvelously convenient for having scads of mostly inactive things sitting around. When one of them wakes up, you know about it right away; better yet, if you use some sort of taskbar equivalent that sorts windows alphabetically by name, you can zoom right in on the activity because the windows with unseen output are right at the front.

(In my X setup I put the start of my taskbar equivalent at the top left corner of the screen. To pop open whatever just woke up I just have to slam my mouse to the top left corner and click, which is damn fast and convenient (at least as long as I'm in the first virtual screen). If several things woke up, I just have to click several times.)

The one thing the xterm manpage doesn't really explain about -ziconbeep is how to suppress the bell. From the right viewpoint it's obvious: use a negative value like '-100' for the volume percentage.

I don't know if 'modern' terminal programs like gnome-terminal and konsole have similar features. If not, they should get them. (I put modern in quotes because I tend to have strong and not always positive feelings about attempts at supplanting xterm.)

XtermZiconbeep written at 02:14:10

2006-07-24

A brief history of cut and paste in X

Here's a brief history and explanation of how cut and paste works in the X Window System (to give it its formal name), prompted by a blog entry from Martin Krafft with the entirely reasonable gripe that selecting something in a program and then exiting the program causes the selection to disappear.

There are two separate cut and paste mechanisms in X, from two separate eras: the cut buffer and selections. The cut buffer is the older mechanism, and works by having programs store the cut data into an X property on the root window, which you can see directly with 'xprop -root CUT_BUFFER0'. (Technically there are eight cut buffers, CUT_BUFFER0 through CUT_BUFFER7, but almost no one uses anything except the first one.)

The cut buffer has some limitations:

  • it doesn't handle large amounts of data, because it stores things in X properties.
  • it only really handles text, and text without character set information at that.
  • you can only make the data available in one representation, when it may have several possible ones.

These were fine as long as X was mostly used to run xterm and xclock (to swipe a slam from the Unix-Haters Handbook), but not so good when people started trying to do more sophisticated things. Enter selections.

Selections solve the cut buffer limitations by making the program that generated the selection hold onto the selection itself. When other programs want to paste something, they talk to the holding program directly to get a copy, and negotiate things like what format it's going to be in.

The drawback of the selection model is what Martin Krafft experienced: if the program that set up a selection goes away, so does the selection, because no one else has a copy. (In fact the creating program can make the selection go away any time it feels like, and often has to take extra steps to take a private copy just for the selection.)

As a pragmatic matter, any program that selects text should export the selection into the cut buffer and anything that can paste text in should read from the cut buffer if there's no current selection, because that avoids most of the problem. Unfortunately there has been something of a movement for ignoring the cut buffer as 'obsolete', so things like Firefox and gnome-terminal never go near it.

(A more complete technical explanation is in Jamie Zawinski's X Selections, Cut Buffers, and Kill Rings.)

Sidebar: selections and the clipboard

The clipboard is effectively a second selection; it uses the same mechanisms and also goes away when the owning program exits. What shows up in the clipboard versus what is just a plain selection depends on the application, but the general rule is that copying or selecting something explicitly will make it into the clipboard instead of a mere selection. For most purposes you can ignore the difference.

The xclipboard program keeps a copy of the clipboard contents, which can sometimes be convenient. It also makes a handy text window to scribble stuff in.

(A similar xselection program is possible, but I don't know if anyone's already written it. It might cause significantly more X traffic and CPU usage, since programs change the selection a fair bit more than they change the clipboard.)

XCutAndPasteHistory written at 13:43:51

2006-07-19

A sysadmin habit: screen locking

Years and years ago, I bound a function key in my window manager to start xlock and then carefully trained myself to hit F10 (the aforementioned function key) the moment I got up from my chair to leave my cubicle-office. It didn't matter how trivial the errand or how short it would be; I'd hit F10 and lock the screen. By now I have this habit so ingrained that it's become one of my little twitches.

We have an access-controlled office area and everyone in here is some sort of system administrator, so it's probably pretty harmless to not lock your display when you walk away from the computer. Probably.

The reason I bound starting xlock to a function key was to make it as fast and easy to do as possible; the faster it is to lock my screen, the more likely it is that I'll do it all the time. No exceptions, no 'I'll just be away for 30 seconds and it'd be too much of a pain to find the menu entry'. I figured this was well worth stealing a function key away from programs that might want to use it.

(This reminds me that I need to turn off the automatic display locking timeout on Fedora Core 5, because it interacts badly with a KVM; I'm a bit tired of finding the display on my test machine locked just because I switched to another machine on the KVM for a while.)

ScreenlockHabit written at 17:02:58

2006-07-07

Sysadmins are an overhead

It's important to remember that system administration is pretty much just overhead, much like the janitorial staff (but rather less important; the university can run a lot longer without people tending the machines than without people taking out the trash).

The people who are important are those who are doing the organization's work. The actual work of universities is research and then teaching students, so the people with real power here are professors (especially if they have grant funding) and, paradoxically, students. (Students have a great deal of power because if they go away, so does the university. It's just that it's usually difficult for them to exercise that power directly.)

As a sysadmin, I try to remember what this means: my job is to get out of the way. In fact, my job is to quietly sweep things away, much as the janitors quietly sweep dust out. If I am doing my job right people don't actually notice anything, much like people rarely notice the absence of dust and grime.

For some reason, system administration seems to build a kind of arrogance that makes this difficult. I suspect that a fair part of it is because good system administrators know a great deal about a lot of fairly deep technical issues, and we become annoyed when ignorant outsiders wander by to shove some oars in.

SysadminOverhead written at 02:01:37



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.