Wandering Thoughts archives

2006-09-08

A Solaris 8 Disksuite single user mode surprise

If you boot a Disksuite-using Solaris 8 machine into single-user mode to do maintenance and do a metastat, you'll discover that all of your mirrored metadevices are marked as needing to be metasync'd, even if they actually are fully consistent.

What seems to be going on is that Disksuite doesn't update the metadevices' status from the on-disk state database (the metadb) when the kernel brings up the metadevices in early boot. Instead, it defers this until you explicitly run 'metasync -r', which is normally done by /etc/init.d/lvm.sync, which is only run as part of going into runlevel 2.

(At least I assume that the kernel is bringing up the Disksuite devices itself in early boot, since these machines have their root filesystem on Disksuite mirrors. I am not quite up on the black box of early Solaris boot.)

The fix is pretty simple; once you're up in single-user mode, just remember to run '/etc/init.d/lvm.sync start' before you start futzing around much with the disks.
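In concrete terms, the maintenance sequence looks something like this sketch (the commands are the ones mentioned above; the exact metastat wording varies by Disksuite version, so don't take the comments as literal output):

```sh
# After booting straight into single-user mode:
metastat                    # mirrors will appear to need a resync
/etc/init.d/lvm.sync start  # runs 'metasync -r' to bring status up to date
metastat                    # confirm the mirrors now look consistent
```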

(Our experience is that it goes like lightning unless something is genuinely troublesome, which is about what you'd expect. But check with metastat afterwards, just to be sure. You probably don't need to do this if you're bringing a system down from normal operation into single-user mode, just if you're booting straight into single-user mode, but I haven't tested this to be sure.)

This makes a certain sort of sense from the right viewpoint, since it means that the system is doing as little as possible when coming up into single-user mode. I have no idea how the kernel picks what to write to when it has to write to a metadevice, though. And it does mean you have to remember an extra step for most routine boots into single-user mode.

(The good news is that the excitement this caused us when we stumbled over this will probably ensure that I don't forget it any time soon.)

solaris/SingleUserDisksuite written at 22:15:48

I hate hardware (AMD CPU edition)

I have been trying to spec out a new machine lately, which is reminding me all over again how I hate hardware. This time around, the target of my particular hate is AMD CPUs, especially the new AM2 ones, where the performance picture has become so complicated that you need a large chart to understand it.

Choosing CPUs used to be simple: within a given CPU family, the only thing that changed performance was the clock speed, so you could just buy the fastest CPU your budget and desires afforded and be done with it.

Athlons are no longer like that. Within the Athlon 64 X2 AM2 family, there are now three variables: clock speed, L2 cache size per core, and achievable main memory speed (the clock multiplier, as explained by AnandTech). Models with increasing nominal clock speeds zig-zag in the other attributes, to the point where I had to consult the large Wikipedia page of Athlon 64 processors to keep things straight.

(Thank god for Wikipedia. Good luck finding AMD discussing this anywhere you can conveniently find it; I'm not sure they even have a comparison chart of L2 cache sizes on their website.)

Then, once I'd worked all this out, it turns out that the supply of 1MB L2 parts seems to have dried up around here; local computer shops can't even get the Socket 939 versions with 1MB L2 caches, much less the AM2 ones. (Rumour has it that AMD has starved the distributor pipeline in favour of redirecting most of the supply to certain large computer vendors.)

I could try to view the 1MB L2 part drought as a way of simplifying my life, but instead it just irritates me that I can't spec the CPUs I really want.

(I care about the cache size and main memory speed because I tend to think that they dominate performance for the kind of CPU-intensive things I'm likely to do with my machines. Not that I've actually measured this to find out for sure, which makes me some sort of fool.)

tech/AMDCpuIrritation written at 15:44:49

Link: IRON File Systems

IRON File Systems [PDF] is a paper from the 2005 ACM Symposium on Operating Systems Principles. To quote from the abstract:

Commodity file systems trust disks to either work or fail completely, yet modern disks exhibit more complex failure modes. We suggest a new fail-partial failure model for disks, which incorporates realistic localized faults such as latent sector errors and block corruption. We then develop and apply a novel failure-policy fingerprinting framework, to investigate how commodity file systems react to a range of more realistic disk failures. [...]

They did their primary analysis on Linux ext3, ReiserFS 3, and (Linux) JFS; the results are comprehensive, interesting, and sometimes scary.

links/IronFileSystems written at 12:06:17
