Wandering Thoughts archives

2007-05-30

A gotcha with the automounter and loopback mounts

On Solaris, there is a combination gotcha with the automounter and mounts on the same host. It goes like this:

  1. your fileserver normally has /dev/whatever mounted on /export/foo.
  2. your generic automounter configuration mounts fileserver:/export/foo as /foo.
  3. you need to do some maintenance to the filesystem, so you unshare and unmount /export/foo.
  4. after you're done you try to remount it, but you get a message that the mount point is busy. The only mention of /export/foo that mount shows is something that looks like:
    /foo on /export/foo ...
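
As a concrete sketch of the sequence (using the hypothetical /dev/whatever and /export/foo names from the list; your device names and exact error text will differ):

    # on the fileserver:
    unshare /export/foo
    umount /export/foo
    # ... do your filesystem maintenance on /dev/whatever ...
    mount /dev/whatever /export/foo
    # fails with a 'mount point is busy' style error (exact text varies);
    # you never get as far as re-sharing it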

What has happened is that something on your fileserver touched /foo during your maintenance, so the automounter went ahead and mounted it from where you told it to. Loopback mounts don't check NFS share permissions, so the mount wasn't denied; and since loopback mounts (like NFS mounts) just put directory A on top of directory B, the automounter wasn't stopped by the fact that no filesystem was mounted on /export/foo. The directory itself existed, and that was good enough.
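
You can reproduce what the automounter did by hand (a sketch, assuming nothing is currently mounted on /export/foo):

    # /export/foo is just an empty directory right now, but a
    # loopback mount of it succeeds anyway:
    mount -F lofs /export/foo /foo
    # 'mount' now shows: /foo on /export/foo ...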

The direct way out is umount /foo. Unfortunately this may not be enough if something is actively banging on that name, because the automounter will just mount it again; you may need to find that something and shoot it.
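
A sketch of digging yourself out; fuser is the usual Solaris tool for finding what is holding a mount point busy:

    umount /foo
    # if the automounter immediately remounts it, find the culprit;
    # 'fuser -cu' lists the processes (and users) with files open
    # on that mount:
    fuser -cu /foo
    # stop or kill those processes, then umount /foo again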

(In our case it was mail delivery. Why we are doing mail delivery directly on the fileservers is a long story.)

AutomountLoopbackGotcha written at 22:48:38

2007-05-28

Why ZFS's data integrity is less important than Solaris's usability

Mark Musante:

The bottom line is that Solaris is hard to administer (yeah, it's a fair cop), so server data is just going to have to suffer. Hopefully some day Solaris will be as easy as redhat, or debian, or ubuntu, or <insert name of distro here>. Some day. Meanwhile, I'll choose data integrity over ease of administration.

The problem with this is that quiet disk corruption is not currently a big issue for most people; it just doesn't happen all that often, at least not that people notice, or people would be howling in pain right now. You can argue that people just haven't noticed the corruption that they're experiencing now, but the counter-argument is that if people haven't noticed it, it's clearly not that important to them (yet, more or less).

Or to put it another way: the problem for Sun is that they are trying to sell a better mousetrap when people don't feel that they have a mouse problem (or at least not a mouse problem that their existing mousetraps can't deal with).

(Perhaps Sun has done studies that show that disk systems and so on are going wrong much more often than people expect, or that future disk systems will inevitably have higher error rates, or the like. That would be newsworthy and I would expect to find that sort of stuff mentioned at the ZFS pages.)

Even without a mouse problem, people would still go for the better mousetrap if it was otherwise a more or less neutral choice, but it is not. To extend the metaphor, the better Sun mousetrap is uncomfortable and has sharp bits that poke you reasonably frequently. That it is cool and nifty starts to fade after the first few times you have to apply bandaids.

And that is why ZFS's data integrity features are less important than Solaris's ease of administration. In practice, ease of administration matters to more people, because right now relatively few people are seriously worried about silent data corruption, whereas everyone has to administer their machines.

(In other words, people will indeed often choose practical ease of administration over (theoretical) data integrity, whether or not they are willing to admit it out loud.)

ZFSvsSolaris written at 21:56:04
