The many names of Linux SATA devices

November 23, 2011

If you have SATA devices on your Linux system, the kernel gives them no fewer than four different names in four different namespaces (although arguably one of the namespaces is not really a kernel one). Because I recently had to deal with this, I feel like running down all of them.

The four namespaces are:

  • Traditional sdX names like sdj. Unless udev is playing around with them for you, these are allocated sequentially in the order that the SATA (or SCSI-like) drives are encountered. If you hotswap or hot-add a new drive, it will generally (but not always) get the next highest unused drive name. These names start from sda and go up.

    (If you hotswapped the highest drive, it will sometimes reuse its old name.)

    As I have found out the hard way, there is no particular guarantee that what Linux sees as sda will be either the first BIOS drive or the drive plugged into the SATA connector on your motherboard that is labeled 'SATA 1', and there is equally no guarantee that the first BIOS drive is what is plugged into 'SATA 1'. This gets very fun.

    sdX names also appear in sysfs as the primary name of drives, in /sys/block.
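
As a minimal sketch of the /sys/block point above, the following Python lists the sdX names the kernel currently knows about. The function name `list_sd_names` and its `sys_block` parameter are made up for illustration; the only thing it relies on is that each drive has a directory entry named after it in /sys/block.

```python
import os
import re


def list_sd_names(sys_block="/sys/block"):
    """Return the sdX-style names found in sys_block (sda, sdb, ..., sdz, sdaa)."""
    # /sys/block also holds non-sd devices (loop0, sr0, ...), so filter by name.
    names = [n for n in os.listdir(sys_block) if re.fullmatch(r"sd[a-z]+", n)]
    # Sort shorter names first so that sdz comes before sdaa.
    return sorted(names, key=lambda n: (len(n), n))
```

Calling `list_sd_names()` with no arguments reads the real /sys/block; the parameter exists only so the function can be pointed at a test directory.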

Unlike sdX names, none of the following three naming schemes change when you hotswap a drive; they stay constant.

  • What I will call 'SCSI host names' that look like 'sd 4:3:0:0' or 'scsi 4:3:0:0' (kernel messages use both forms). Note that 'host' here does not mean what you think it might mean; it is special SCSI terminology that roughly corresponds to what I would call a channel.

    The first number is the SCSI host (for SATA, a single port) and for SATA the second is the drive on that host. If the port doesn't have a SATA port multiplier, the second number is always 0; if it does, as we have on some machines, it counts up for each disk reached through the port. Note that a single SATA adaptor card can have multiple ports; each port is numbered separately.

    (The actual definition of the numbers can be seen in /proc/scsi/scsi. Note that non-SATA drivers can do numbering quite differently than SATA drivers; on our Dell 2950s using the mptsas driver, there is only one SCSI host, the second number is always 0, and the third number counts up for individual disks.)

    SCSI host numbers start from 0 and are allocated in the order the system encounters disk 'adaptors', broadly construed. They are never reused. Because USB materializes and de-materializes host adaptors as well as the hard drives themselves, on systems where you frequently insert and remove things like USB drives or camera memory cards you can get very high 'sd NN:...' numbers; my home system routinely reaches 'sd 20' (despite the actual memory card generally showing up as sde).

    These days Linux treats almost every disk as a SCSI disk regardless of the physical attachment method, so almost every disk gets a name like this. Although the kernel prints no messages about this, SCSI host numbers are allocated for every SCSI-like disk adaptor, regardless of whether any disks are attached to it at the time that it is seen. This means that 'scsi X' numbers for actual disks can be non-contiguous and your first disk is not necessarily 'scsi 0:0:0:0'.

    (We have machines where the disk numbering is 'sd 2:0:0:0', 'sd 3:0:0:0', 'sd 5:0:0:0', 'sd 5:1:0:0', and so on. Yes, this is fun.)

    Whether IDE devices are pulled into this numbering seems to vary based on the kernel version and device drivers involved. On our SunFire X2100s, Ubuntu 8.04's kernel considers them sufficiently SCSI-like devices to be enumerated this way but a stock kernel does not.

    SCSI hosts more or less appear in sysfs in /sys/class/scsi_host; their disks appear directly in /sys/class/scsi_disk. You can also see the same information in /proc/scsi/scsi.

    Kernel messages that use SCSI host names usually also include the sdX device name.

  • (s)ATA port and disk names, which look like 'ata5' and 'ata5.03'. Unlike SCSI host names, these are only allocated to real (s)ATA ports; USB drives and the like do not get them. The two parts of the name are the port number and the drive on the port; again, if you have no port multiplier the drive number will always be '00'.

    Just to confuse you, ATA ports are numbered starting from 1, unlike SCSI hosts (which start from 0). sd 4:3:0:0 is thus ata5.03, assuming that SCSI hosts 0 through 4 are all (s)ATA ports. And yes, the Linux kernel will happily mix both sorts of names in drive error messages.

    SATA port multipliers themselves seem to be called 'ataN.15' in some kernel versions.

    These names do not appear in sysfs at all as far as I can see; they only appear in kernel messages. Unfortunately a drive that's experiencing transient problems is sometimes only (clearly) identified by ATA disk name, especially if you're trying to pick out which drive on a port multiplier port is the one that's having problems.

  • Per-PCI-device names that you find in /dev/disk/by-path, which have the form 'pci-0000:03:00.0-scsi-2:3:0:0'. The bit after the -scsi- has the same general meaning as in SCSI host names except that the host number is per PCI device, not global; several different PCI devices can have a '-scsi-0:0:0:0' disk.

    How SATA ports on your motherboard are split between PCI devices is what you could politely call extremely varied. SATA ports on actual PCI cards are usually more predictable; one card is normally one PCI device. Finding the appropriate PCI device identifier for your card is up to you.

    In sysfs, PCI devices hide out in the depths of /sys/devices/pci* but understanding the layout (and finding your specific device) requires more understanding of PCI bus topology issues than I have.
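
The relationships between these name formats can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `scsi_to_ata` assumes, as in the example above, that every SCSI host up to and including the one in question is a (s)ATA port, so the ATA port number is simply the host number plus one; `parse_by_path` only understands the 'pci-...-scsi-H:C:T:L' form shown here, optionally followed by a '-partN' partition suffix. Both function names are hypothetical.

```python
import re


def scsi_to_ata(scsi_addr):
    """Guess the ATA disk name for a SCSI 'H:C:T:L' address.

    Assumption: SCSI hosts 0..H are all (s)ATA ports, so ATA port = H + 1.
    The second number (the drive on the port) becomes the two-digit suffix.
    """
    host, drive, _target, _lun = (int(x) for x in scsi_addr.split(":"))
    return "ata%d.%02d" % (host + 1, drive)


# Only matches the by-path form discussed in this entry.
BY_PATH_RE = re.compile(
    r"^pci-(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
    r"-scsi-(?P<addr>\d+:\d+:\d+:\d+)"
    r"(?:-part(?P<part>\d+))?$"
)


def parse_by_path(name):
    """Split a by-path name into (PCI device, per-device SCSI address, partition)."""
    m = BY_PATH_RE.match(name)
    if m is None:
        return None
    addr = tuple(int(x) for x in m.group("addr").split(":"))
    part = int(m.group("part")) if m.group("part") else None
    return m.group("pci"), addr, part
```

Note that the SCSI address inside a by-path name is per PCI device, so it cannot be fed to `scsi_to_ata` unless that PCI device happens to own the first SCSI hosts on the system.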

Only sdX names and the per-PCI-device names appear in /dev, at least conveniently, and so these are your only real choices for referring to specific drive slots. (Specific physical HDs can be referred to in a number of other ways that appear in /dev/disk.)

In our case, today we had to deal with a drive under all four names; sdj, sd 4:3:0:0, ata5.03, and pci-0000:03:00.0-scsi-2:3:0:0 are (or were) all the same drive. The first three names appeared, intermixed, in various kernel messages about retried errors and then permanent errors; the fourth name we use so that the iSCSI backend software has a stable pathname for the drive and its partitions.

(In the case of the sdj name, when we hot-swapped the physical HD the new version of the drive became sdo. None of the other three names for it changed.)
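
Going from an sdX name to its per-PCI-device name is a matter of seeing which /dev/disk/by-path symlinks point at it. The following is a rough Python sketch of that lookup, not anything the iSCSI backend software itself does; the function name `by_path_names` and its parameters are invented for this example, and it assumes the by-path entries are symlinks to device nodes named after the drive.

```python
import os


def by_path_names(dev_name, by_path_dir="/dev/disk/by-path"):
    """Return the by-path link names that resolve to a device called dev_name."""
    matches = []
    for entry in sorted(os.listdir(by_path_dir)):
        link = os.path.join(by_path_dir, entry)
        if not os.path.islink(link):
            continue
        # The links are normally relative (e.g. ../../sdj); resolve them fully.
        target = os.path.realpath(link)
        if os.path.basename(target) == dev_name:
            matches.append(entry)
    return matches
```

Something like `by_path_names("sdj")` would return the whole-disk name only; partitions such as sdj1 have their own links and need to be looked up under their own names.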

Comments on this page:

(s)ATA port and disk names, which look like 'ata5' [..] These names do not appear in sysfs at all as far as I can see; they only appear in kernel messages.

Actually, they do - under /sys/devices.

Basically you can grep for paths that match a pattern like '/ata[0-9]/.*/sd[a-zA-Z]$'

See also my answer on Unix-SE:

I don't know if 2011-ish kernels already exposed this sysfs structure. FWIW, at that time I also searched for something like this and didn't notice it.
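
A small Python sketch of the grep idea above: resolve /sys/block/sdX to its real path under /sys/devices and pull the names out of the path components. The path layout assumed here (an 'ataN' component, later an 'H:C:T:L' component, then 'block/sdX') is what some kernels expose and may not hold for older ones; the function names are hypothetical, and the pattern accepts multi-digit port numbers, which is slightly looser than the one quoted above.

```python
import os
import re

# Assumed layout: .../ataN/hostH/targetH:C:T/H:C:T:L/block/sdX
SYSFS_ATA_RE = re.compile(
    r"/ata(?P<port>\d+)/.*?/(?P<addr>\d+:\d+:\d+:\d+)/block/(?P<dev>sd[a-z]+)$"
)


def names_from_sysfs_path(path):
    """Extract ATA port, SCSI address and sdX name from a resolved sysfs path."""
    m = SYSFS_ATA_RE.search(path)
    if m is None:
        return None  # not reached through an ataN port (e.g. a USB drive)
    return {
        "ata_port": int(m.group("port")),
        "scsi_addr": m.group("addr"),
        "dev": m.group("dev"),
    }


def describe(dev, sys_block="/sys/block"):
    """Resolve /sys/block/<dev> and report the names found in its real path."""
    return names_from_sysfs_path(os.path.realpath(os.path.join(sys_block, dev)))
```

For an illustrative (made-up) path such as '/sys/devices/pci0000:00/0000:00:1f.2/ata5/host4/target4:0:0/4:0:0:0/block/sda', this ties ata5, SCSI address 4:0:0:0 and sda together without relying on the host-plus-one guess.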

Written on 23 November 2011.

Last modified: Wed Nov 23 02:40:47 2011