
2007-07-12

An interesting mistake with ZFS and iSCSI

First I'll show you the symptoms, then I'll explain what I did to shoot myself in the foot:

# zpool create tank01 c0t38d0
# zpool create tank02 c0t39d0
# zpool replace tank01 c0t38d0 c0t42d0

(Time passes, the resilver finishes, and zpool status tank01 shows no use of c0t38d0.)
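At that point zpool status tank01 looks roughly like this; this is a reconstructed sketch rather than the actual output:

# zpool status tank01
  pool: tank01
 state: ONLINE
 scrub: resilver completed with 0 errors
config:

        NAME        STATE     READ WRITE CKSUM
        tank01      ONLINE       0     0     0
          c0t42d0   ONLINE       0     0     0

errors: No known data errors

The pool now contains only c0t42d0; as far as anything reports, c0t38d0 is completely free.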

# zpool attach -f tank02 c0t39d0 c0t38d0
invalid vdev specification
the following errors must be manually repaired:
/dev/dsk/c0t42d0s0 is part of active ZFS pool tank01. Please see zpool(1M).

(Note the important detail: the error complains about c0t42d0, not the c0t38d0 that I was actually trying to attach.)

All of these disks are iSCSI disks being exported from a Linux machine. The error was persistent, surviving reboots, zpool export and zpool import, and so on, yet at the same time nothing reported that c0t38d0 itself was in use or active.

How I shot myself in the foot is simple: I configured all of the iSCSI disks with the same ScsiId value. When I set up the Linux target software, I'd assumed that its 'SCSI ID' was something like a model name, partly because there's also a per-disk ScsiSN parameter for the nominal serial number. I was totally wrong; ScsiId needs to be a unique identifier, just like the ScsiSN values (and if I'd left it alone, the target software would have handled this for me).
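To make this concrete, here is a hedged sketch of how the mistake can look in an ietd.conf for the iSCSI Enterprise Target (one plausible choice of Linux target software; the target names, paths, and values here are made up):

# Wrong: unique ScsiSN values, but every LUN shares the same ScsiId,
# so initiators see what looks like one disk reachable over several paths.
Target iqn.2007-07.com.example:disk38
        Lun 0 Path=/dev/vg0/disk38,Type=fileio,ScsiId=cksdisk,ScsiSN=00038
Target iqn.2007-07.com.example:disk42
        Lun 0 Path=/dev/vg0/disk42,Type=fileio,ScsiId=cksdisk,ScsiSN=00042

The fix is to make each ScsiId unique as well, or simply to leave ScsiId out and let the target software pick one.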

What is presumably going on is that ZFS noticed that c0t38d0 has the same ScsiId as c0t42d0, concluded that they were two names for the same actual disk (which is entirely possible in a multipath setup), and sensibly refused to let me shoot myself in the foot. The one thing I don't understand is why the complaint only ever names c0t42d0, which is the last of the iSCSI disks.
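Presumably the identity involved is the Solaris device ID (devid), which is derived from the SCSI inquiry data that ScsiId feeds into. One way to look at what ZFS has recorded for the disk that really is in the pool is to dump its vdev label; the devid value below is just a placeholder:

# zdb -l /dev/dsk/c0t42d0s0 | grep -E "path|devid"
    path='/dev/dsk/c0t42d0s0'
    devid='id1,sd@...'

If c0t38d0 resolves to the same devid, it's easy to see why the 'is this disk already in use' check decides that it is really c0t42d0 and thus part of tank01.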

solaris/ZFSiSCSIMistake written at 21:23:15

