Wandering Thoughts archives


Solaris Volume Manager and iSCSI: a problematic interaction

Solaris Volume Manager (which I still call DiskSuite) keeps information about the state of its logical volumes in what it calls a 'metadevice state database' (a metadb for short). You normally keep a number of replicas of this state database, scattered around the physical devices that DiskSuite is managing for you. When you are using metasets, all of the metadb replicas for a metaset have to be on disks in that metaset. This is a logical consequence of the DiskSuite tools needing to update the metadata to reflect which machine owns a metaset; if there were metadata on a disk outside the metaset, DiskSuite on another machine wouldn't necessarily be able to update it.

DiskSuite's approach to dealing with unavailable metadb replicas is simple: DiskSuite panics the system if it loses metadb quorum, where quorum is half of the metadb replicas plus one. This is actually spelled out explicitly in the metadb manpage, along with the reasoning.

(Technically it may survive with exactly half of the metadb replicas; I can't test right now.)
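To make the arithmetic concrete, here is a minimal sketch of the quorum rule as described above. It is not DiskSuite code; the function names are made up for illustration, and I'm assuming that 'half' rounds down for an odd number of replicas (so the rule amounts to a strict majority). The `exactly_half_ok` flag models the untested possibility that exactly half of the replicas is enough to keep running.

```python
def quorum_needed(total_replicas: int) -> int:
    """Replicas required under the 'half plus one' rule.

    Assumption: half is rounded down, so 5 replicas need 3 and 6 need 4.
    """
    return total_replicas // 2 + 1


def survives(available: int, total_replicas: int,
             exactly_half_ok: bool = False) -> bool:
    """Return True if the system would keep running with `available` replicas.

    exactly_half_ok=True models the (untested) case where having exactly
    half of the replicas is still enough to avoid a panic.
    """
    if exactly_half_ok and available * 2 == total_replicas:
        return True
    return available >= quorum_needed(total_replicas)


if __name__ == "__main__":
    # Six replicas in total, with progressively fewer still reachable.
    for available in (4, 3, 2):
        print(available,
              survives(available, 6),
              survives(available, 6, exactly_half_ok=True))
```

With six replicas, the only case where the two interpretations disagree is when exactly three are still available.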

Now we get to the iSCSI side of the problem. If the Solaris iSCSI initiator loses connectivity to an iSCSI target, it offlines all of the disks exported by that target, which in turn immediately tells DiskSuite that the metadb replicas on all of those disks are now unavailable. If this drops you below quorum in DiskSuite (for any metaset), your system promptly panics.

(This is different from the behavior of FibreChannel, where glitches in FC connectivity just produce IO errors for any ongoing IO and don't yank the metadb replicas out from under DiskSuite.)

The net result is that if you are using Solaris Volume Manager to manage iSCSI-based storage in metasets, you need to build metasets that include disks (logical or otherwise) from at least three different iSCSI targets. With only two targets, at least one of them necessarily holds half or more of the metaset's replicas, so the loss of connectivity to that single target will kill your entire machine.

(And you need to carefully balance the number of metadb replicas across all of your targets so that one target doesn't have too many replicas.)

solaris/DiskSuiteiSCSIProblem written at 17:15:47
