2009-11-25
Why I love Unix, number N (for some N)
Suppose that you want to find all two and three digit primes where all of the component digits are also prime (eg, 37 would qualify but 41 would not, since 4 is not prime).
Here is the simple Unix approach:
factor $(seq 10 999) | awk 'NF == 2 {print $2}' | egrep '^[12357]*$'
(I won't claim that this is obvious until you're immersed in the Unix pipe-bashing mindset, at which point it becomes all too easy to get sucked into the Turing tar pit.)
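To unpack the pipeline a little: factor (the GNU one, at least) prints each number followed by its prime factorization on one line, so 'factor 42 43' prints:
42: 2 3 7
43: 43
A prime thus produces exactly two fields, 'NF == 2' picks out just those lines (with $2 being the prime itself), and the egrep keeps only the primes whose digits all come from its character class.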
On a side note, seq is one of those programs that gets more and more
useful the more often I use it. It's not at all interesting on its own,
but its real power shows when it's used as an ingredient in
shell script iteration.
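(A trivial sketch of the kind of thing I mean, with made-up names:
for n in $(seq -w 1 20); do mkdir "scratch$n"; done
where seq -w pads the numbers to equal width so the directories sort nicely.)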
And, oh yeah, it's a GNU tool. Another way that they've contributed to Unix in the Unix way.
(Okay, credit where credit is due; I believe that seq first showed up
in Plan 9. But I will point out
that the GNU tools people are the only people smart enough to reuse the
idea.)
Update: Oops. As pointed out by a commentator, 1 is not a prime.
This shows one drawback of these neat one-line things: they're so
short and simple that you may be tempted not to double-check your
work. (This is especially embarrassing for me because I looked at the
output of 'factor $(seq 1 9)' to make sure that I had the right set of
single-digit primes, but discounted what factor printed for 1 without
looking into it further.)
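(For the record, the corrected version just drops the 1 from the character class, still assuming GNU factor and seq:
factor $(seq 10 999) | awk 'NF == 2 {print $2}' | egrep '^[2357]*$'
)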
2009-11-05
Why the NFS client is at fault in the multi-filesystem NFS problem
In yesterday's entry, I said that the NFS clients were at fault for the duplicate inode number problem. Now it's time for the details, because on first look this appears a bit odd: how can it be the client's responsibility to avoid duplicate inode numbers, when the server gives it the inode numbers?
In the NFS v3 specification,
inode numbers only appear in one spot; they're part of the file
attribute structure that the server returns for GETATTR requests.
Although GETATTR is used for more than just stat(), it is the NFS
analog of the stat() system call; the fattr3 structure that it
returns is the analog of the kernel's struct stat that stat() fills
in, and much the same information appears in both.
In particular, the fattr3 structure has both a fileid (the inode
number) and a fsid, the 'file system identifier for [the file's] file
system'. While NFS v3 requires that the inode number be unique, it
only requires that it be unique within a single server filesystem,
that is, for files with the same fsid. And an NFS server is free to
give you files with different fsids even though you have only made one
NFS mount from it, of what you think is a single filesystem.
The simple way for clients to map between GETATTR and stat() is to
turn the fileid into the inode number, fill in st_dev based on
some magic internal number you're using for this NFS mount, and throw
away the fsid. A kernel that does this has the duplicate inode number
problem.
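(You can see the result from the client with, say, GNU stat; the paths
here are made up, and assume the client mounted only /home while
/home/group1 is a separate filesystem on the server:
stat -c 'dev=%d ino=%i  %n' /home/fred/afile /home/group1/jim/bfile
On a client that does the simple mapping, both files report the same
dev value, so any collision between the two server filesystems' inode
numbers makes unrelated files look like hardlinks of each other.)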
Unfortunately, fixing this is complicated. The NFS client cannot simply
use the fsid for st_dev, because st_dev must be unique on
the local machine and the fsid comes from the server; thus, it can
potentially collide both with local filesystems and with filesystems
from other NFS servers. Using fsid at all in the stat() results
requires somehow inventing a relatively persistent and unique st_dev
value for every different fsid that every NFS server gives you, which
is non-trivial.
(If you have a very big st_dev you can deal with the problem by
mangling the fsid together with a unique local number for this NFS
mount. But fsid is a 64-bit number, so you'd need a pretty epic
st_dev.)
Sidebar: the Linux solution to this problem
The Linux NFS client has a creative solution to this problem: it
actually creates new NFS-mounted filesystems on the fly, complete
with new local st_dev values, every time you traverse through
a point where the fsid changes. Comments in the source code say
that this has the side effect of making df work correctly, at
least as long as you are not dealing with something like ZFS.
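(Assuming I have the details right, you can watch this from a Linux
client that mounted only /home; the paths are made up:
stat -c 'dev=%d  %n' /home /home/group1
grep /home /proc/self/mounts
Once you've crossed into /home/group1 it reports a different dev value
than /home does, and it shows up in /proc/self/mounts as its own NFS
mount.)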
2009-11-04
The cause of the multi-filesystem NFS export problem
There is a famous irritation with managing NFS filesystems, one that boils
down to the fact that NFS clients have to know about your filesystem
boundaries.
It goes like this: suppose that /home and /home/group1 are separate
filesystems and you NFS export both of them. What you would like is
that clients NFS mount /home and automatically get /home/group1
too, because this lets you transparently add /home/group2 next month.
However, this doesn't work (although some systems will try hard to fake
it if you tell them to).
(This issue is a lot more pertinent these days in light of things like ZFS, where filesystems are cheap objects.)
Although it superficially looks like the NFS re-export problem, the problem here isn't telling NFS filehandles for the different real filesystems apart. Provided that all of the filesystems can be NFS exported normally, your NFS server can just give out the same filehandles it would if the client had explicitly mounted the filesystems separately (the filehandle is opaque to the client, after all).
The real problem is what common NFS clients expect about the inode numbers; specifically, they expect the inode number to be unique in the client's view of the filesystem, and from the client's view it only mounted one filesystem. Meanwhile, on the server there are multiple filesystems and their inode numbers are almost certain to overlap. The result is explosions in some programs on the client under some circumstances, as the programs see duplicate inode numbers for files that are not actually hardlinks to each other.
(The client kernels generally don't care; the inode numbers that user programs see are unrelated to the NFS filehandles that the kernel uses.)
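A quick way to see this on such a client (made-up paths; the root directory of an ext3 filesystem is normally inode 2, so the collision often shows up right at the mount points):
ls -id /home /home/group1
If the server let the client cross the filesystem boundary, both directories can report inode 2, at which point anything that trusts (st_dev, st_ino) pairs thinks they are the same directory.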
Technically this is a client side problem, but I doubt that any NFS client implementation actually gets it right. (And it is very hard to get right, since the client has to somehow make up unique yet ideally persistent inode numbers.)
(This is the kind of thing that I write down in part so that I can remember the logic the next time I wonder about it.)
Sidebar: the more subtle failures
Okay, that's not quite all that goes wrong if the server lets NFS
clients transparently cross filesystem boundaries, because there are
various operations that don't work across server filesystem boundaries
despite looking like they should on the client. For example, if
/home on the client is all one single NFS mount, a program is
rationally entitled to believe that it can hardlink /home/fred/a to
/home/group1/jim/b. In practice this is going to fail with an error
because on the server that's a cross-filesystem hardlink.
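For instance (made-up paths again), on the client:
ln /home/fred/a /home/group1/jim/b
The server rejects the LINK request with NFS3ERR_XDEV, which the
client generally passes back as EXDEV ('Invalid cross-device link'),
an error that makes no sense from the client's point of view because
both paths look like they're on the same filesystem.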