2010-09-26
The problems with OpenSolaris
Even before Oracle summarily killed off OpenSolaris, I didn't find it very attractive. I had two serious general problems with it.
The first is that Sun didn't seem to make any effort to make OpenSolaris into a useful and usable distribution. Instead what you got (at least from an outside perspective) was periodic snapshots of a development tree, something like Debian testing or Fedora Rawhide. This made perfect sense from Sun's perspective, but I didn't have any particular interest in building things on sand.
(When Sun offered OpenSolaris support, the support period was too short to be useful to us.)
The second is that OpenSolaris has always depended on Sun's goodwill, and not just for continual source updates. The OpenSolaris source code is incomplete; both building a binary system from source and important bits of system functionality (like a good chunk of the NFS code) required using closed source components that Sun generously gave people for free. If Sun had wanted to cripple OpenSolaris usage in practice, all it ever had to do was change that and we would have been left with something that was about as directly useful as 4.4 BSD Lite.
(Part of this illustrates how big a debt the open source world owes to gcc, which in many ways is the unsung foundation of both the *BSDs and Linux. Yes, today we have open source alternatives. Today.)
These problems meant that I couldn't consider OpenSolaris to be an insurance policy and viable alternative in case Sun went crazy with Solaris. Although Sun might not be able to retract the source code, they could go just as crazy with OpenSolaris and make it unusable if they ever felt like it.
2010-09-20
Dear Solaris boot sequence: SHUT UP
Here is how Solaris reduced me to a grim rage just now.
We had a power glitch this weekend, and one of our Solaris servers wouldn't come up afterwards; it hit some problem that left it needing attention in single-user mode.
What problem? I couldn't see. Solaris printed the problem to the machine console, but then it kept producing more boot-time output. Lots of boot time output, all of it unimportant (one or two lines per iSCSI target that it could connect to). This was more than enough to scroll the problem message off the top, and of course the Solaris console has no scrollback buffer. So I was left with a system that needed manual attention, but I had no idea of exactly what manual attention it needed.
The Solaris boot sequence is an egregious offender in the category of bad boot time messages; it is a peculiar mixture of completely uninformative and pointlessly verbose. In our environment, possibly the entire blame for this rests on iSCSI, but it's an OS component and so I blame Solaris cultural attitudes as a whole.
(Frankly, it's yet another really odd way that an 'enterprise' OS is not actually enterprise ready. One defining trait of enterprise environments is that they have a lot of whatever: a lot of disks, a lot of iSCSI targets, a lot of RAM, and so on. Thus I'd think that an enterprise ready OS would think about scale issues in messages and the like. But Solaris? Not so much.)
As usual, I got out of this by figuring out how to turn off iSCSI entirely, so that I could sort of see the boot messages. Fortunately our Solaris servers don't actually require iSCSI to boot, or we'd be in much more trouble.
PS: for people who are wondering why we don't have a serial console, one reason is this. I refuse to touch Solaris 10 serial support until there is some user friendly guide that actually works and explains what is going on and how you do things.
Sidebar: what it took to turn off iSCSI
Roughly:
fsck, so that the filesystems were clean.
mount -o remount,rw /; this complained but seems to have actually worked.
iscsiadm to turn off discovery. Without the read-write remount of /, this completed without visible problems but didn't actually turn things off permanently.
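As a rough sketch, the sequence above might look something like the following from a single-user shell. The fsck and mount lines are the ones from the list. The iscsiadm arguments are an assumption on my part: the list doesn't record which discovery methods were turned off, so this disables all three that Solaris 10's iscsiadm knows about and then lists the result. The `run` wrapper and the `recover_without_iscsi` name are hypothetical helpers, and everything is a dry run unless RUN=1 is set, so by default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: dry run by default, set RUN=1 to really execute the commands.

# Hypothetical helper: print a command, or run it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Hypothetical wrapper for the steps in the list above.
recover_without_iscsi() {
    # Make sure the filesystems are clean first.
    run fsck

    # Remount / read-write; without this the iscsiadm changes
    # appeared to work but did not persist.
    run mount -o remount,rw /

    # Assumption: the original does not say which discovery methods
    # were in use, so disable all of them.
    for method in --sendtargets --static --iSNS; do
        run iscsiadm modify discovery "$method" disable
    done

    # Check what iscsiadm now thinks is enabled.
    run iscsiadm list discovery
}

recover_without_iscsi
```

The order matters more than the exact commands: the read-write remount has to happen before iscsiadm is run, because otherwise the change is silently not saved.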
Ironically, something in this whole sequence made the 'needs attention' issue go away too, but I don't know exactly what it was (or what the issue was), since I couldn't see the message that Solaris printed and as far as I know, it wasn't logged anywhere. Possibly it was my old friend boot archives.