Working out which of your NVMe drives is in what slot under Linux

December 13, 2019

One of the perennial problems with machines that have multiple drives is figuring out which of your physical drives is sda, which is sdb, and so on; the mirror problem is arranging things so that the drive you want to be the boot drive actually is the first drive. In sanely made server hardware this is generally relatively easy, but with desktops you can run into all sorts of problems, such as desktop motherboards that wire things up oddly. In some situations, NVMe drives make this easier than SATA drives do, because NVMe drives are PCIe devices and so have distinct PCIe bus addresses and possibly distinct PCIe bus topologies.

First off, I will admit something. The gold standard for doing this reliably under all circumstances is to record the serial numbers of your NVMe drives before you put them into your system and then use 'smartctl -i /dev/nvme0n1' to find each drive from its serial number. It's always possible for a motherboard with multiple M.2 slots to do perverse things with its wiring and PCIe bus layout, so that what it labels as the first and perhaps best M.2 slot is actually the second NVMe drive as Linux sees it. But I think that generally it's pretty likely that the first M.2 slot will be earlier in PCIe enumeration than the second one (if there is a second one). And if you have only one M.2 slot on the motherboard and are using a PCIe to NVMe adapter card for your second NVMe drive, the PCIe bus topology of the two NVMe drives is almost certain to be visibly different.
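If smartctl isn't handy, the serial number is generally also readable straight from sysfs. What follows is a minimal sketch, assuming that each /sys/block/nvmeXnY/device directory exposes a 'serial' attribute (recent kernels appear to, but check yours). The nvme_serials function name and its optional sysfs root argument are made up for illustration.

```shell
#!/bin/sh
# Print "<block device> <serial number>" for every NVMe namespace.
# Assumption: <sysroot>/block/nvmeXnY/device/serial exists on this kernel.
# The optional first argument (hypothetical) overrides the sysfs root.
nvme_serials() {
    sysroot="${1:-/sys}"
    for dev in "$sysroot"/block/nvme*n*; do
        # Skip the unexpanded glob and devices without a serial attribute.
        [ -r "$dev/device/serial" ] || continue
        # sysfs may pad the serial with trailing spaces; strip them.
        serial=$(sed 's/[[:space:]]*$//' "$dev/device/serial")
        printf '%s %s\n' "${dev##*/}" "$serial"
    done
}

nvme_serials "$@"
```

Matching this output against the serial numbers you wrote down (or the ones printed on the drive labels) tells you which nvmeXn1 is which physical drive.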

All of this raises the question of how you get the PCIe bus address of a particular NVMe drive. We can do this by using /sys, because Linux makes your PCIe devices and topology visible in sysfs. Specifically, every NVMe device appears as a symlink in /sys/block that gives you the path to its PCIe node (and in fact the full topology). So on my office machine in its current NVMe setup, in /sys/block I have:

; readlink nvme0n1
../devices/pci0000:00/0000:00:03.2/0000:0b:00.0/[...]
; readlink nvme1n1
../devices/pci0000:00/0000:00:01.1/0000:01:00.0/[...]

The order here surprised me, because the two NVMe drives are not in the order I expected. In fact they're apparently not in the order that the kernel initially detected them in, as the kernel messages in 'dmesg' show:

nvme nvme0: pci function 0000:01:00.0
nvme nvme1: pci function 0000:0b:00.0

This is the enumeration order I expected, with the motherboard M.2 slot at 01:00.0 detected before the adapter card at 0b:00.0 (for more on my current PCIe topology, see this entry). Indeed the original order appears to be preserved in bits of sysfs, with path components like nvme/nvme0/nvme1n1 and nvme/nvme1/nvme0n1. Perhaps the kernel assigned actual nvmeXn1 names backward, or perhaps udev renamed my disks for reasons known only to itself.
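One way to see this controller versus block device mismatch at a glance is to walk /sys/class/nvme. This is only a sketch, assuming that each controller directory there has a 'device' link to its parent PCIe device and holds its namespaces as nvmeXnY subdirectories (which is what the path components above suggest). The nvme_controllers name and the sysfs root argument are hypothetical, and the layout may differ if NVMe multipath is enabled.

```shell
#!/bin/sh
# Print "<controller> <PCIe address> <namespace block device>" lines.
# Assumption: <sysroot>/class/nvme/nvmeX/device links to the PCIe device.
# The optional first argument (hypothetical) overrides the sysfs root.
nvme_controllers() {
    sysroot="${1:-/sys}"
    for ctrl in "$sysroot"/class/nvme/nvme*; do
        [ -e "$ctrl/device" ] || continue
        # The PCIe device's directory name is its bus address.
        addr=$(basename "$(readlink -f "$ctrl/device")")
        # Namespaces appear as subdirectories of the controller.
        for ns in "$ctrl"/nvme*n*; do
            [ -d "$ns" ] || continue
            printf '%s %s %s\n' "${ctrl##*/}" "$addr" "${ns##*/}"
        done
    done
}

nvme_controllers "$@"
```

On a machine in the state described here, you would expect the controller number in the first column and the number in the namespace name in the third column to disagree.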

(But at least now I know which drive to pull if I have trouble with nvme1n1. On the other hand, I'm now doubting the latency numbers that I previously took as a sign that the NVMe drive on the adapter card was slower than the one in the M.2 slot, because I assumed that nvme1n1 was the adapter card drive.)
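Reading the PCIe addresses out of those symlink targets by eye gets old, so here is a small sketch that extracts them. It assumes PCIe path components always look like domain:bus:device.function in lowercase hex; pci_chain is a made-up helper name.

```shell
#!/bin/sh
# Print each PCIe address (domain:bus:device.function) found in a sysfs
# path, one per line, in path order. The last one is the drive itself;
# earlier ones are the bridges or root ports above it.
pci_chain() {
    printf '%s\n' "$1" | tr '/' '\n' |
        grep -E '^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$'
}

# Show the chain for every NVMe block device present on this machine.
for dev in /sys/block/nvme*n*; do
    [ -e "$dev" ] || continue
    chain=$(pci_chain "$(readlink -f "$dev")" | tr '\n' ' ')
    printf '%s: %s\n' "${dev##*/}" "$chain"
done
```

Piping pci_chain through 'tail -n 1' gives just the drive's own address, which is the form that tools like lspci want.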

Once you have the PCIe bus address of an NVMe drive, you can look for additional clues as to what physical M.2 slot or PCIe slot that drive is in, beyond just how it fits into your PCIe bus topology. For example, some motherboards (including my home machine) may wind up running the 'second' M.2 slot at x2 instead of x4 under some circumstances, so if you can find one NVMe drive running at x2 instead of x4, you have a strong clue as to which is which (assuming that your NVMe drives are x4 drives). You can also have a PCIe slot be forced to x2 for other reasons, such as motherboards where some slots share lanes and bandwidth. I believe that the primary M.2 slot on most motherboards always gets x4 and is never downgraded (except perhaps if you ask the BIOS to do so).
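The negotiated link width can be checked from sysfs as well. This is a sketch that assumes your kernel provides the current_link_width and max_link_width attributes under /sys/bus/pci/devices/ADDRESS (older kernels may not, in which case 'lspci -vv' reports the same thing as LnkSta and LnkCap). The link_width name and its second argument are hypothetical.

```shell
#!/bin/sh
# Report the negotiated PCIe link width for one PCIe address, noting when
# it is narrower than what the device itself supports.
# Usage: link_width 0000:0b:00.0 [sysroot]   (sysroot is hypothetical)
link_width() {
    addr="$1"
    sysroot="${2:-/sys}"
    dev="$sysroot/bus/pci/devices/$addr"
    [ -r "$dev/current_link_width" ] || return 1
    cur=$(cat "$dev/current_link_width")
    max=$(cat "$dev/max_link_width")
    if [ "$cur" -lt "$max" ]; then
        printf '%s: x%s (device supports up to x%s)\n' "$addr" "$cur" "$max"
    else
        printf '%s: x%s\n' "$addr" "$cur"
    fi
}

# Check every PCIe address given on the command line.
for a in "$@"; do
    link_width "$a" || echo "no link width information for $a" >&2
done
```

An x4 drive that shows up as x2 here is most likely the one in the slot that the motherboard downgrades.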

You can also get the same PCIe bus address information (and then a lot more) through udevadm, as noted by a commenter on yesterday's entry; 'udevadm info /sys/block/nvme0n1' will give you all of the information that udev keeps. This doesn't seem to include any explicit information on whether the device was renamed, but it does include the kernel's assigned minor number. On my machine, nvme0n1 has minor number 1 while nvme1n1 has minor number 0, which suggests that the drive now called nvme1n1 was the one the kernel set up first.
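To compare minor numbers across drives without wading through the full udevadm output, something like the following sketch may help. It assumes udevadm's property output includes DEVNAME, MINOR, and DEVPATH lines for these devices; udev_summary is a made-up helper name.

```shell
#!/bin/sh
# Condense "KEY=VALUE" udev property lines read from stdin into one line:
# "<device name> minor=<minor number> <sysfs device path>".
udev_summary() {
    awk -F= '
        $1 == "DEVNAME" { name = $2 }
        $1 == "MINOR"   { minor = $2 }
        $1 == "DEVPATH" { path = $2 }
        END { printf "%s minor=%s %s\n", name, minor, path }
    '
}

# Summarize the first namespace of every NVMe controller on this machine.
for dev in /dev/nvme*n1; do
    [ -b "$dev" ] || continue
    udevadm info --query=property --name="$dev" | udev_summary
done
```

The DEVPATH in the last column carries the same PCIe topology as the /sys/block symlink, so one line per drive shows both the bus address and the assignment order hint.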

(It would be nice if udev would log somewhere when it renames a device.)

PS: Looking at the PCIe bus addresses associated with SATA drives usually doesn't help, because most of the time all of your SATA drives are attached to the same PCIe device.

Last modified: Fri Dec 13 00:41:42 2019