A Solaris 8 Disksuite single user mode surprise

September 8, 2006

If you boot a Solaris 8 machine that uses Disksuite into single-user mode to do maintenance and run metastat, you'll discover that all of your mirrored metadevices are marked as needing to be metasync'd, even if they are actually fully consistent.

What seems to be going on is that Disksuite doesn't update the mirrors' status from the on-disk metadata state database when the kernel brings up the metadevices themselves in early boot. Instead, it defers this until you explicitly run 'metasync -r'. That is normally done by /etc/init.d/lvm.sync, which is only run as part of going into runlevel 2.

(At least I assume that the kernel is bringing up the Disksuite devices itself in early boot, since these machines have their root filesystem on Disksuite mirrors. I am not quite up on the black box of early Solaris boot.)
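One way to check where the lvm scripts are hooked into boot on a particular machine is to list the rc directory entries that mention them. This is only a minimal sketch: it assumes the conventional /etc/rc?.d layout, and the function name and its optional alternate-root argument are made up for illustration.

```shell
# List rc script entries whose names mention "lvm", one path per line.
# $1 is an optional alternate root (a hypothetical parameter, mainly useful
# for trying this out on a scratch directory); by default the live /etc is used.
lvm_rc_entries() {
    root=${1:-}
    for entry in "$root"/etc/rc?.d/*lvm*; do
        # An unmatched glob is left as the literal pattern, so check the entry exists.
        [ -f "$entry" ] && echo "$entry"
    done
    return 0
}
```

Run as plain `lvm_rc_entries`, anything listed under /etc/rcS.d should run even on a boot straight into single-user mode, while something that appears only under /etc/rc2.d will not run until the system goes to runlevel 2.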

The fix is pretty simple; once you're up in single-user mode, just remember to run '/etc/init.d/lvm.sync start' before you start futzing around much with the disks.

(In our experience it goes like lightning unless something is genuinely wrong, which is about what you'd expect. But check with metastat afterwards, just to be sure. You probably only need to do this if you're booting straight into single-user mode, not if you're bringing a system down from normal operation, but I haven't tested this.)
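The "check with metastat afterwards" step can be made a little less error-prone with a small filter. The following is a sketch under stated assumptions, not a tested Disksuite tool: it assumes that each metadevice's stanza in metastat output starts with a line like `d0: Mirror`, that health is reported on `State:` lines, and that `Okay` is the healthy value. Check those assumptions against your own metastat output first. The function name is made up.

```shell
# Read metastat-style output on stdin and print the name of each metadevice
# whose stanza contains a "State:" line with any value other than "Okay".
# Assumption: a stanza starts with a line such as "d0: Mirror".
unsettled_metadevices() {
    awk '
        # Remember which metadevice the following lines belong to,
        # dropping the trailing colon from the name.
        /^d[0-9]+:/ { dev = substr($1, 1, length($1) - 1) }

        # Report each affected metadevice once, the first time it shows
        # a state other than Okay.
        $1 == "State:" && $2 != "Okay" && flagged[dev]++ == 0 { print dev }
    '
}
```

After running '/etc/init.d/lvm.sync start', something like `metastat | unsettled_metadevices` should print nothing if every state line reads Okay. A mirror is listed if any of its submirrors reports another state, since the submirror state lines appear inside the mirror's own stanza.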

This makes a certain sort of sense from the right viewpoint, since it means that the system is doing as little as possible when coming up into single-user mode. I have no idea how the kernel picks which submirror to write to when it has to write to a metadevice in this state, though. And it does mean you have to remember an extra step for most routine boots into single-user mode.

(The good news is that the excitement this caused us when we stumbled over it will probably ensure that I don't forget this any time soon.)


Comments on this page:

By Dan.Astoorian at 2006-09-11 16:01:00:

(At least I assume that the kernel is bringing up the Disksuite devices itself in early boot, since these machines have their root filesystem on Disksuite mirrors. I am not quite up on the black box of early Solaris boot.)

Note that the reason that the Solaris 8 boot disk must be on a simple metadevice or simple submirror is that the boot code has no concept of striping: AFAIK there is no Disksuite magic at all in the boot block.

Metadevices are configured in /etc/rcS.d/S35lvm.init (via metainit -r), just before remounting / and /usr read-write.

What seems to be going on is that Disksuite doesn't update things from the on-disk metadata state database when the kernel brings up the metadevices themselves in early boot. Instead, it defers this until you explicitly run 'metasync -r', which is normally done in /etc/init.d/lvm.sync, which is only run as part of going into runlevel 2.

Which seems sensible, given that it's foreseeable that the reason you've booted single-user could have been to change the configuration of your mirrors. Disksuite 4.2.1 mirrors or RAID5 devices cannot be detached or otherwise unconfigured while they're being sync'd (short of yanking a drive to force the resync to fail), so if the metasync were done automatically in single-user mode, you might not be able to reconfigure your devices until the useless resyncs completed.

What's wrong with waiting until you're done in multi-user mode before resyncing the mirrors? (I.e., what's the actual advantage of running /etc/init.d/lvm.sync start from single-user mode?)

--Dan Astoorian

By cks at 2006-09-11 17:08:38:

For some reason I missed the lvm.init in /etc/rcS.d/ when I looked to see if it was run, but things make some more sense now. (I wonder if the metainit -r yanks around the kernel's idea of the root disk in the process.)

In this case we were about to break the mirror apart to defragment the filesystem 'in place', so we really wanted to be sure that the submirrors were consistent with each other before we started blowing things up.

By cks at 2006-09-11 20:11:49:

Whoops, I missed the general answer while giving a specific one:

What's wrong with waiting until you're done in multi-user mode before resyncing the mirrors?

Because I'd expect any filesystem writes one does to make the mirrors inconsistent and require a full resync at some point. (It's possible that Disksuite is smart enough to not do that, and replicates writes to all copies of even nominally unsynced mirrors.)

Unless Disksuite is really smart, this means that I'm not actually protected by its mirroring when in single-user mode, unless I specifically activate it. But I want to be protected by mirroring as much as possible, even in single-user mode when I'm installing patches, moving data around, and so on; disks can fail in single-user mode just as much as they can in multi-user mode.

(In some ways disk failure in single-user mode is likely to be more severe than in multi-user, because I'm probably in single-user mode because I'm doing something especially delicate and dangerous.)

By Dan.Astoorian at 2006-09-12 11:56:08:

What's wrong with waiting until you're done in multi-user mode before resyncing the mirrors?

Because I'd expect any filesystem writes one does to make the mirrors inconsistent and require a full resync at some point. (It's possible that Disksuite is smart enough to not do that, and replicates writes to all copies of even nominally unsynced mirrors.)

I don't know whether it does that. My intuition suggests it would instead use the same mechanism as metaoffline / metaonline, which keeps track of the dirty regions that need to be resync'd.

By cks at 2006-09-12 16:58:19:

The problem with Disksuite merely keeping track of the dirty region for later resyncs is that it still leaves you with a non-redundant mirror, although less of your data is exposed. I'd much rather be operating with full redundancy.

Written on 08 September 2006.

Last modified: Fri Sep 8 22:15:48 2006