Wandering Thoughts archives

2010-09-20

Dear Solaris boot sequence: SHUT UP

Here is how Solaris reduced me to a grim rage just now.

We had a power glitch this weekend, and one of our Solaris servers wouldn't come up afterwards; it hit some problem that left it needing attention in single-user mode.

What problem? I couldn't see. Solaris printed the problem to the machine console, but then it kept producing more boot-time output. Lots of boot time output, all of it unimportant (one or two lines per iSCSI target that it could connect to). This was more than enough to scroll the problem message off the top, and of course the Solaris console has no scrollback buffer. So I was left with a system that needed manual attention, but I had no idea of exactly what manual attention it needed.

The Solaris boot sequence is an egregious offender in the category of bad boot time messages; it is a peculiar mixture of completely uninformative and pointlessly verbose. In our environment, possibly the entire blame for this rests on iSCSI, but it's an OS component and so I blame Solaris cultural attitudes as a whole.

(Frankly, it's yet another really odd way that an 'enterprise' OS is not actually enterprise ready. One defining trait of enterprise environments is that they have a lot of whatever; a lot of disks, a lot of iSCSI targets, a lot of RAM, and so on. Thus I'd think that an enterprise-ready OS would think about scale issues in messages and the like. But Solaris? Not so much.)

As usual, I got out of this by figuring out how to turn off iSCSI entirely, so that I could sort of see the boot messages. Fortunately our Solaris servers don't actually require iSCSI to boot, or we'd be in much more trouble.

PS: for people who are wondering why we don't have a serial console, one reason is this. I refuse to touch Solaris 10 serial support until there is some user friendly guide that actually works and explains what is going on and how you do things.

Sidebar: what it took to turn off iSCSI

Roughly:

  • fsck, so that the filesystems were clean
  • mount -o remount,rw /; this complained but seems to have actually worked.
  • iscsiadm to turn off discovery. Without the read-write remount of /, this completed without visible problems but didn't actually turn things off permanently.
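As a rough sketch only, the steps above might look something like the following when typed at the single-user prompt. The post doesn't record the exact commands, so the details here are assumptions: plain fsck with no arguments, and switching off all three iSCSI discovery methods (static, SendTargets, and iSNS) with 'iscsiadm modify discovery'. The run helper, the RUN variable and the function name are hypothetical, and exist so that the script only prints what it would do unless you set RUN=1.

```shell
#!/bin/sh
# Sketch of the sidebar's steps; NOT the exact commands that were used.
# Dry run by default: set RUN=1 in the environment to really run things.

# Hypothetical helper: run the command, or just print it in dry-run mode.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

disable_iscsi_discovery() {
    # 1. Get the filesystems clean (the post gives no fsck arguments).
    run fsck

    # 2. Remount / read-write so the iscsiadm changes persist.
    #    In the post this complained but appeared to work.
    run mount -o remount,rw /

    # 3. Turn off discovery. Assumption: all three methods are disabled,
    #    since the post doesn't say which ones were in use.
    run iscsiadm modify discovery --static disable
    run iscsiadm modify discovery --sendtargets disable
    run iscsiadm modify discovery --iSNS disable

    # Show the resulting discovery settings as a check.
    run iscsiadm list discovery
}

disable_iscsi_discovery
```

The order matters: as noted above, without the read-write remount of / the iscsiadm step appears to succeed but doesn't stick across the next boot.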

Ironically, something in this whole sequence made the 'needs attention' issue go away too, but I don't know exactly what it was (or what the issue was), since I couldn't see the message that Solaris printed and as far as I know, it wasn't logged anywhere. Possibly it was my old friend boot archives.

solaris/ShutUpPlease written at 11:18:05

