Solaris is not an enterprise operating system

November 16, 2011

Why I say this is best explained as a two-part quiz. The first question: supposing that you are developing an operating system, when do you bring iSCSI disks online during boot?

  1. before bringing up ZFS pools and mounting all non-system filesystems.
  2. after bringing up ZFS pools and so on.
  3. while bringing up ZFS pools and so on, so that the two activities can race with each other in case you have ZFS pools on iSCSI disks.

The second question: suppose that your original answer to the first question was #1 and your new answer is #3. How long do you allow this bug to remain unfixed?

  1. clearly this is important, so a more or less immediate fix.
  2. certainly a fix in six months, and definitely before the next significant release.
  3. more than a year and one significant release (so far).

Oracle's current answers are #3 and #3 respectively. Perhaps all of their 'enterprise' customers avoid iSCSI and it is strictly a minor hobbyist protocol, or perhaps all of their enterprise customers are busy assiduously avoiding ZFS in favour of other options (I can't say I blame them).

Regardless of other issues, what this says to me is that Oracle does not consider iSCSI support to be a priority at all, and therefore we are being extremely unwise to build anything on top of Solaris plus iSCSI. Evidently when iSCSI works it is because we are lucky, not because Oracle actually thinks it's important. Among other things, this renders pretty much moot all of my thinking about enticing ZFS features that might cause us to upgrade from Solaris 10 Update 8 (which does not have this bug, obviously).

(Fortunately my co-worker discovered and isolated this while our recently built S10U9 server was not yet in real production. And clearly we are going to have to add more things to our testing procedures, such as 'reboot machine to make sure pools return'.)
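
A 'reboot machine to make sure pools return' check could be scripted along the following lines. This is only a minimal sketch, not what we actually run: it assumes Python 3.7 or later is available on the server, it relies on the output of `zpool list -H -o name,health`, and the pool names used with it are made up for illustration.

```python
import subprocess


def parse_zpool_health(output):
    """Turn `zpool list -H -o name,health` output into {pool: health}."""
    pools = {}
    for line in output.splitlines():
        fields = line.split()  # -H output is whitespace (tab) separated
        if len(fields) >= 2:
            pools[fields[0]] = fields[1]
    return pools


def missing_or_unhealthy(expected, pools):
    """Return the expected pools that are absent or not ONLINE, sorted."""
    return sorted(name for name in expected if pools.get(name) != "ONLINE")


def check_pools(expected):
    """Run zpool on the live system and report pools that did not return."""
    out = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health"],
        capture_output=True, text=True, check=True,
    ).stdout
    return missing_or_unhealthy(expected, parse_zpool_health(out))
```

After a test reboot, something like `check_pools(["rpool", "iscsipool"])` (hypothetical pool names) should return an empty list; anything else means a pool failed to come back or came back unhealthy.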

PS: if you chase the cross references from bug 6992124 to bug 6907687, you wind up at patch 144501. Well, our test system has this patch installed and I can assure you that the problem is not fixed.

Update: this issue is fixed in x86 patch 147441-05 (with the bug listed as '7012256 pools on iSCSI devices unavailable upon boot'). This was released (to general Solaris users) no earlier than November 4th and possibly later, more than a year after the bug was first seen, so I believe that my point remains. It's nice to see that Oracle did finally get around to fixing this, though.

Comments on this page:

From at 2011-11-16 17:40:07:

The Solaris SMF can be told what services depend on what - or does SMF start too late to be relevant to this issue?

By cks at 2011-11-17 16:57:45:

How SMF interacts with bringing up ZFS pools is extremely unclear to me. The shortest summary I can write right now is that there does not seem to be any particular distinct SMF service whose job it is to load ZFS pool information or activate ZFS pools.

(The kernel itself loads ZFS pool information from /etc/zfs/zpool.cache fairly early but I don't think it immediately tries to activate all of the ZFS pools. I think that pool activation happens as a side effect of doing other operations such as mounting all of the ZFS filesystems, but I can't trace the SMF dependencies to see how ordering is forced between bringing up iSCSI targets and mounting ZFS filesystems.)
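
To illustrate what "ordering is forced" would mean here, the following is a small sketch of explicit dependency ordering, assuming Python 3.9 or later for `graphlib`. The step names are hypothetical labels for illustration only; they are not real SMF service names, and this is not how SMF itself is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical boot steps, each mapped to the steps it must wait for.
# These are illustrative labels, not real SMF FMRIs.
BOOT_DEPS = {
    "network": set(),
    "iscsi-initiator": {"network"},
    "zfs-pool-activation": {"iscsi-initiator"},
    "mount-filesystems": {"zfs-pool-activation"},
}


def boot_order(deps):
    """Return one start order in which every step follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())
```

With the explicit edge from "zfs-pool-activation" to "iscsi-initiator", any valid order brings up iSCSI first. Without that edge, the two steps are unordered relative to each other, which is the racing situation described in the entry.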

From at 2011-11-30 11:43:03:

Ugh - not sure this is my exact problem, but rebooting a particular machine lost one of two iSCSI LUNs, along with its zpool information (nothing left in zpool.cache). Of course the missing LUN has 11 zones, so patching without them is even more fun. Yes, iSCSI -> ZFS -> 11 zones. Best part of all: it sees one target on the iSCSI array but not the other, and iscsiadm is not that helpful.

Written on 16 November 2011.

Last modified: Wed Nov 16 00:45:44 2011