2010-11-09
Directory link counts and a find trick
In the beginning, Unix directories were remarkably non-special. One
of the ways that they were non-special is that connections between
directories were not special in any way; a directory entry for a
directory was just a hard link to the directory's inode, the same as
with any other sort of inode. This applied to the two special names .
and .. too; each of them was a regular directory entry, the first a
hard link to the directory itself and the second a hard link to the
parent directory.
(Thus if you tried hard or had bad luck with disk corruption it was
possible to wind up with a directory where chdir(".") would put you in
a different directory, because the . entry was pointing to the wrong
inode.)
So far this is just Unix being simple and regular. But it had an
interesting consequence: the inode link count for directories meant
something. A directory always had at least two links (one for its own
. entry, one for its entry in its parent directory) and then it had
one more link for every subdirectory it had (each subdirectory had a
.. entry pointing back to the directory).
This leads immediately to two related optimizations for find and
other directory walkers:
- when you encounter a directory that has a link count of 2, you know
that it has no subdirectories and you do not need to stat() any of its
directory entries at all.
- when you encounter a directory with a link count larger than 2, you
can keep track of how many subdirectories you've seen; once you've seen
N-2 of them (where N is the link count), you can stop stat()'ing all
the remaining directory entries because none of them can point to
subdirectories.
(These are the clever optimizations I was talking about yesterday.)
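To make this concrete, here is a minimal sketch in C of a directory
walker that uses both optimizations. It assumes a filesystem with
traditional Unix directory link counts; the function name and the
fixed-size path buffer are illustrative, not taken from any real find.

#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

static void walk(const char *dir)
{
    struct stat st;
    struct dirent *de;
    DIR *dp;

    if (lstat(dir, &st) != 0)
        return;
    /* A link count of 2 is just '.' plus the parent's entry, so the
       directory has st_nlink - 2 subdirectories left to find.
       (Guard against corrupt link counts below 2.) */
    nlink_t subdirs = st.st_nlink >= 2 ? st.st_nlink - 2 : 0;

    if ((dp = opendir(dir)) == NULL)
        return;
    while ((de = readdir(dp)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        /* ... match against the entry's name here ... */
        if (subdirs == 0)
            continue;   /* cannot be a subdirectory; skip the stat() */
        char path[4096];
        struct stat est;
        snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
        if (lstat(path, &est) == 0 && S_ISDIR(est.st_mode)) {
            subdirs--;  /* one less subdirectory left to look for */
            walk(path);
        }
    }
    closedir(dp);
}

int main(int argc, char **argv)
{
    walk(argc > 1 ? argv[1] : ".");
    return 0;
}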
There are two flaws in these optimizations. The first is that they assume that you're dealing with a sufficiently Unix-like filesystem to actually have valid directory link counts. This is no longer always the case, although I believe that it's still true for all serious Unix filesystems.
The second is that they assume that you're dealing with static,
unchanging directories where you can look at the link count once and
have it remain valid for your entire scan of the directory. This is
not necessarily the case. You can sort of get around this by repeatedly
fstat()'ing the directory; this is still a system call, but at least
it isn't going to touch the disk since you're guaranteed that the
directory's inode is in memory while you have it open.
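As a sketch of that, a walker that keeps the directory open can re-check
the link count through dirfd(), the POSIX way to get at the file
descriptor behind an open DIR:

#include <sys/stat.h>
#include <dirent.h>

/* Re-read the directory's current link count in mid-scan. This costs
   a system call but no disk IO; the inode stays in memory while the
   directory is held open. */
static nlink_t current_nlink(DIR *dp)
{
    struct stat st;

    if (fstat(dirfd(dp), &st) != 0)
        return 0;
    return st.st_nlink;
}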
(I don't consider it a flaw that these optimizations can break in the face of sufficient filesystem corruption. Nothing works in the face of sufficient filesystem corruption.)
2010-11-08
A find optimization and a piece of history, all in one
One of the floating pieces of modern Unix lore is that if you are doing
a find that matches against both filenames and other properties of the
file, it's best to put the filename match first. That is, if you want to
find zero-sized object files, the right order is:
find . -name '*.o' -size 0 -print
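For contrast, the order this lore tells you to avoid is:

find . -size 0 -name '*.o' -print

which asks find to check the size before it ever looks at the name.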
I called this a piece of modern Unix lore for good reason; this wasn't
necessarily true in the old days (and even today it isn't always true,
depending on the filesystem and how smart your version of find is).
First, let's cover why this can be the faster order. When find is
processing a given directory entry it already has the name, but it
doesn't know the file size; to find out the file size it would have to
stat() the file, which takes an extra system call and possibly an
extra disk read IO. So if find can make a decision on the directory
entry just by checking its name, it can save a stat().
But wait. In order to properly traverse a directory tree, find needs
to know if a directory entry is a subdirectory or something else, and in
the general case that takes a stat(). This gets us back to being just
as slow, because regardless of the order of find operations find is
going to have to stat() the name sooner or later just to find out if it
needs to chdir() into it. So how can find still optimize this?
(There are some clever optimizations that find can do under some
circumstances, but we'll skip those for now.)
What happened is that a while back, Unix filesystem developers realized
that it was very common for programs reading directories to need to know
a bit more about directory entries than just their names, especially
their file types (find is the obvious case, but also consider things
like 'ls -F'). Given that the type of an active inode never changes,
it's possible to embed this information straight in the directory entry
and then return this to user level, and that's what developers did; on
some systems, readdir(3) will now return directory entries with an
additional d_type field that has the directory entry's type.
(This required changes to both filesystems, to embed the information in the on-disk information, and the system call API, to get it to user space. Hence it only works on some filesystems on some versions of Unix.)
Given d_type, find can completely avoid stat()'ing directory
entries if it only needs to know their name or their type. However,
it has to stat() the directory entry if it needs to know more
information, such as the size.
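Here is a minimal sketch in C of how a directory walker can use
d_type, including the fallback it needs on filesystems that don't fill
the field in (those report DT_UNKNOWN); the helper name and fixed-size
buffer are illustrative:

#define _DEFAULT_SOURCE   /* glibc: expose d_type and the DT_* constants */
#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>

/* Is this directory entry a subdirectory? Only falls back to lstat()
   when the filesystem didn't record a type in the entry. */
static int is_subdir(const char *dirpath, const struct dirent *de)
{
    struct stat st;
    char path[4096];

    if (de->d_type == DT_DIR)
        return 1;
    if (de->d_type != DT_UNKNOWN)
        return 0;   /* the type is known and it isn't a directory */
    snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
    return lstat(path, &st) == 0 && S_ISDIR(st.st_mode);
}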
(And if the d_type of directory entries ever gets corrupted, you
can get very odd results.)
2010-11-05
GNU sort and -k: a gotcha
Sometimes I don't like GNU utilities.
Suppose that you have a file that looks like this:
oxygen    fred
nitrogen  bob
xenon     fred
carbon    cks
iron      jim
Further suppose that you want to sort it by the second field, to group
machines by the users that own them. So of course you do 'sort -k2
file', because that's the obvious answer. Except that it doesn't work;
it sorts in some peculiar, non-obvious way, and it's not that you need
to specify field 1 or field 3 or anything like that. Perhaps you scratch
your head, grind your teeth, and move on. (That's what I did until
recently.)
Congratulations, you've been hit by a GNU sort gotcha; sort doesn't define fields the way you think it does. Pretty much every other sensible Unix program that deals with multi-field lines says that fields are separated by one or more whitespace characters. GNU sort, just to be different, says that fields are not so much separated by whitespace characters but created by whitespace characters and the whitespace characters become part of the next field.
(This is spelled out in the info document for GNU sort in the section
on the -t argument. Read it carefully.)
This works out the way you innocently expect if each line separates
fields with the same number of whitespace characters, or if you are
using -n even with a variable number of whitespace characters (at
least in my testing). It goes off the rails badly in cases like this
example, where fields are separated by a variable number of whitespace
characters.
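To see how, take the example file (and assume the C locale;
locale-based collation can shuffle the details). With plain 'sort -k2',
each line's key starts with the run of spaces in front of the second
field:

oxygen's key is    "    fred"
nitrogen's key is  "  bob"
xenon's key is     "     fred"
carbon's key is    "    cks"
iron's key is      "      jim"

Spaces compare before letters, so the lines with the most whitespace
before their second field sort first: iron, xenon, carbon, oxygen,
nitrogen. The two fred machines don't even wind up next to each other.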
The solution is to add the -b argument, which makes GNU sort work
the way you expect it to. I am tempted to make an alias (well, a
cover script) that always supplies -b, because I can't think of
any situation where I don't want this sane behavior.
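Concretely, for the example file:

sort -b -k2 file

sorts on keys of just 'bob', 'cks', 'fred', 'fred', and 'jim', so the
machines group by owner the way you wanted.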
(GNU sort's behavior is in fact in violation
of the Single Unix Specification for sort;
see the description of the -t option.)