An important gotcha with iSCSI multipathing in Solaris 10
Here's something important to know about Solaris's MPxIO multipathing: MPxIO identifies disks only by their serial numbers and identifiers. So if two Solaris devices have the same serial number, MPxIO concludes that they are two paths to the same physical disk; it has no actual knowledge of underlying path issues, such as iSCSI target identifiers.
This matters a great deal on iSCSI, because at least some iSCSI targets have serial numbers that are set in software. If you accidentally duplicate some serial numbers between different disks, Solaris's MPxIO will happily decide that they are all the same disk and start distributing IO among them. The result will not make your filesystem very happy. (If you are using ZFS, you have probably just lost the entire pool and possibly the system as well.)
(This is similar to my previous mistake along these lines, but much bigger. I am fortunate that I made this mistake in testing.)
Or in short: when you set up iSCSI targets, make very sure that they have unique SCSI serial numbers et al.
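One way to catch this before a new backend goes live is to check your inventory of exported LUNs for repeated serial numbers. What follows is only a minimal sketch of such a check, not anything Solaris or MPxIO provides. The `find_duplicate_serials` function, the inventory layout, and the target names and serial numbers are all made up for illustration, and it assumes that in your environment two different target/LUN pairs are never supposed to be the same disk.

```python
from collections import defaultdict


def find_duplicate_serials(luns):
    """Return {serial: sorted [(target, lun), ...]} for every serial number
    that is claimed by more than one distinct (target, lun) pair.

    `luns` is an iterable of (target_name, lun_number, serial) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    seen = defaultdict(set)
    for target, lun, serial in luns:
        # A set means listing the same target/LUN twice is not a conflict.
        seen[serial].add((target, lun))
    return {serial: sorted(where)
            for serial, where in seen.items()
            if len(where) > 1}


# Hypothetical inventory; every name and serial here is invented.
inventory = [
    ("iqn.2009-03.example:backend1", 0, "SN-0001"),
    ("iqn.2009-03.example:backend1", 1, "SN-0002"),
    # A copied configuration that kept the old serial number:
    ("iqn.2009-03.example:backend2", 0, "SN-0001"),
]

for serial, where in find_duplicate_serials(inventory).items():
    print("serial %s is claimed by: %s" % (serial, where))
```

How you gather the inventory depends entirely on your iSCSI target software; the point is only that the comparison has to be across all backends, not within one.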
It's hard to fault MPxIO for this behavior, since part of MPxIO's job as a high level multipathing system is to recognize the same drive when it's visible over multiple different transports (for example, a drive that is visible over both FibreChannel and iSCSI, however peculiar that may be). Still, it makes adding new targets a bit nerve-wracking, since I know that one mistake or oversight in the configuration of a new iSCSI backend may destroy a pool on an unrelated set of storage.
(This is where I wish Solaris (and our iSCSI backends) had iSCSI specific multipathing, which would avoid this problem because it knows that two completely different targets can never be the same disk.)
Written on 16 March 2009.