2007-07-12
An interesting mistake with ZFS and iSCSI
First I'll show you the symptoms, then I'll explain what I did to shoot myself in the foot:
# zpool create tank01 c0t38d0
# zpool create tank02 c0t39d0
# zpool replace tank01 c0t38d0 c0t42d0
(Time passes, the resilver finishes, and zpool status tank01 shows no use of c0t38d0.)
# zpool attach -f tank02 c0t39d0 c0t38d0
invalid vdev specification
the following errors must be manually repaired:
/dev/dsk/c0t42d0s0 is part of active ZFS pool tank01. Please see zpool(1M).
(Emphasis mine: note that it complains about c0t42d0, not the c0t38d0 I was actually trying to attach.)
All of these disks are iSCSI disks being exported from a Linux machine.
The error condition was persistent, lasting through reboots, zpool export and zpool import, and so on, while at the same time nothing said that c0t38d0 was in use or active.
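One way to cross-check this is to dump a disk's on-disk ZFS labels, which record the name and GUID of any pool the disk belongs to. A minimal check, assuming ZFS was given the whole disk so the label lives on slice 0:

# zdb -l /dev/dsk/c0t38d0s0

A disk that is really part of an active pool will name that pool in its labels; a free disk shows nothing (or at most stale labels left over from old pools).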
How I shot myself in the foot is simple: I configured all of the iSCSI disks with the same ScsiId value.
When I set up the Linux target software, I'd assumed that its 'SCSI ID' was something like a model name, partly because there's also a ScsiSN parameter for each disk's nominal serial number. I was totally wrong; it needs to be a unique identifier, just like the ScsiSN values (and if I'd left it alone, the target software would have generated unique values on its own).
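To illustrate the mistake in iSCSI Enterprise Target style ietd.conf syntax (the target name and backing paths here are invented):

Target iqn.2007-07.com.example:disks
    # wrong: every LUN is given the same ScsiId
    Lun 0 Path=/dev/vg0/lun0,Type=fileio,ScsiId=cksdisk,ScsiSN=00001
    Lun 1 Path=/dev/vg0/lun1,Type=fileio,ScsiId=cksdisk,ScsiSN=00002

The fix is to give each Lun line a distinct ScsiId, or simply to leave the parameter out and let the target software generate one per LUN.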
What is presumably going on is that ZFS noticed that c0t38d0 has the same ScsiId as c0t42d0, concluded that they were two names for the same actual disk (which is easily possible in a multi-path setup), and sensibly refused to let me shoot myself in the foot. (Solaris derives a device ID for each disk from its SCSI inquiry data, and ZFS tracks pool disks partly by device ID, so two LUNs reporting identical identity information presumably look like one disk reachable over two paths.) The one thing I don't understand is why it only happens with c0t42d0, which is the last of the iSCSI disks.