A grumpy remark about Solaris's scalability

February 2, 2009

For an operating system that is theoretically all about being ready to run on enterprise-sized systems (i.e. big ones), Solaris 10 software has an awfully bad habit of not dealing well with lots of (iSCSI) disks. I wouldn't be so put out about a single tool having this flaw, but Solaris programmers seem to make this sort of scaling mistake over and over.

The first case I ran into was the version of iscsiadm from the current version of patch 119091 (for x86), where the programmers made 'iscsiadm list target -S' stat() every file in /dev/ for every iSCSI LUN. Since our systems have around 140 LUNs defined and roughly 16,000 non-directory entries in /dev, that is on the order of 2.2 million stat() calls per run, which works about as well as you'd expect.

(Fortunately the previous version of 119091, 119091-31, does not have this problem and the -32 version doesn't add any bugfixes that we care about, so we reverted. And yes, this bug has been reported to Sun. Four months ago.)
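
To make the cost concrete, here is a minimal Python sketch of the two access patterns. It is not the actual iscsiadm code, which I haven't seen; the function names are made up, a temporary directory stands in for /dev, and inode numbers stand in for whatever device identity the real tool compares. The first function re-stats every entry once per LUN, the second stats each entry once and builds an index.

```python
import os
import tempfile


def scan_per_lun(dev_dir, wanted_inodes):
    """Naive pattern: stat every entry in dev_dir again for each LUN."""
    stat_calls = 0
    found = {}
    for ino in wanted_inodes:
        for name in os.listdir(dev_dir):
            st = os.stat(os.path.join(dev_dir, name))
            stat_calls += 1
            if st.st_ino == ino:
                found[ino] = name
    return found, stat_calls


def scan_once(dev_dir, wanted_inodes):
    """Stat each entry once, index by identity, then look up each LUN."""
    stat_calls = 0
    index = {}
    for name in os.listdir(dev_dir):
        st = os.stat(os.path.join(dev_dir, name))
        stat_calls += 1
        index[st.st_ino] = name
    found = {ino: index[ino] for ino in wanted_inodes if ino in index}
    return found, stat_calls


def demo(n_entries=200, n_luns=10):
    """Build a fake /dev (assumption: plain files, not device nodes)."""
    with tempfile.TemporaryDirectory() as dev_dir:
        names = [f"node{i}" for i in range(n_entries)]
        for name in names:
            open(os.path.join(dev_dir, name), "w").close()
        wanted = [os.stat(os.path.join(dev_dir, n)).st_ino
                  for n in names[:n_luns]]
        naive = scan_per_lun(dev_dir, wanted)
        indexed = scan_once(dev_dir, wanted)
    return naive, indexed
```

Calling demo() returns the (result, stat count) pair for each approach. Both find the same entries, but the first does n_entries times n_luns stat() calls and the second does n_entries. At 140 LUNs and 16,000 entries that is the difference between about 2.2 million calls and 16,000.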

Today's offender was Solaris Live Upgrade, where lucreate does something mysterious that causes prtconf to loop repeatedly examining lots of nodes in /dev. The result is a total stall when attempting to create a new Live Upgrade boot environment (I let it sit for at least 45 minutes without any apparent progress).

It is possible that the Live Upgrade problem is specific to having lots of iSCSI targets, but still, didn't it occur to any programmer at Sun that repeatedly doing any operation to all of /dev or 'all disks in /dev/' might not be the greatest idea?

Actually, I can answer that: I suspect that they never had the issue occur to them because they're using an abstraction layer and the underside of that abstraction layer has an unfortunate implementation, one that makes sense if you call it once or twice but not if you call it a lot. Then the problem is that Sun programmers do not routinely test their programs on systems that are big enough to expose issues like this.

(Alternately, the problem is that they continue to turn out low-level implementations of abstractions that behave catastrophically badly if used repeatedly on big systems. If your sales pitch is 'enterprise ready', you should think about such scalability issues as a matter of course.)
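
As an illustration of how an innocent-looking abstraction can hide this, here is a small Python sketch. Everything in it is hypothetical (the DeviceLookup class, path_for(), and the fake device names are mine, not anything from Solaris); it only shows the general shape of an interface whose underside re-enumerates everything on each call, next to the same interface with the enumeration done once.

```python
class DeviceLookup:
    """Hypothetical abstraction: map a disk ID to its device path."""

    def __init__(self, enumerate_devices, cache=False):
        # enumerate_devices is the expensive part: it walks every device.
        self._enumerate = enumerate_devices
        self._cache = cache
        self._index = None
        self.enumerations = 0  # how many full walks we have done

    def _load(self):
        self.enumerations += 1
        return dict(self._enumerate())

    def path_for(self, disk_id):
        if not self._cache:
            # Fine for one or two calls, painful for hundreds.
            return self._load().get(disk_id)
        if self._index is None:
            self._index = self._load()
        return self._index.get(disk_id)


def fake_enumerate(n=16000):
    """Stand-in for walking /dev; yields made-up (id, path) pairs."""
    return ((f"disk{i}", f"/dev/dsk/fake{i}") for i in range(n))
```

A caller looping over 140 disks with the uncached version triggers 140 full walks without ever seeing that in its own code; the cached version does one. The catch is that a cache can go stale if devices appear or disappear, so this is a tradeoff, not a free fix.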

