Linux's /dev/disk/by-id unfortunately often puts the transport in the name

October 27, 2024

Filippo Valsorda ran into an issue that involved, in part, the naming of USB disk drives. To quote the relevant bit:

I can't quite get my head around the zfs import/export concept.

When I replace a drive I like to first resilver the new one as a USB drive, then swap it in. This changes the device name (even using by-id).

[...]

My first reaction was that something funny must be going on. My second reaction was to look at an actual /dev/disk/by-id with a USB disk, at which point I got a sinking feeling that I should have already recognized from a long time ago. If you look at your /dev/disk/by-id, you will mostly see names that start with things like 'ata-', 'scsi-0ATA-', 'scsi-1ATA', and maybe 'usb-' (and perhaps 'nvme-', but that's a somewhat different kettle of fish). All of these names have the problem that they burn the transport (how you talk to the disk) into the /dev/disk/by-id name, which is supposed to be a stable identifier for the disk as a standalone thing.

As Filippo Valsorda's case demonstrates, the problem is that some disks can move between transports. When this happens, the theoretically stable name of the disk changes; what was 'usb-' is now likely 'ata-' or vice versa, and in some cases other transformations may happen. Your attempt to use a stable name has failed and you will likely have problems.
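To make the problem concrete, here is a minimal Python sketch that pulls the transport prefix out of a by-id name. The prefix list is only the set of prefixes mentioned in this entry, the function name transport_of is my own, and the two disk names are made up for illustration (real names will differ, and the part after the prefix is not guaranteed to stay the same across transports either).

```python
# Prefixes taken from the names mentioned above. The list is an assumption
# for illustration, not a catalogue of everything udev can generate.
TRANSPORT_PREFIXES = ("ata-", "usb-", "nvme-", "scsi-0ATA", "scsi-1ATA")


def transport_of(name):
    """Return the transport prefix a by-id name starts with, or None."""
    for prefix in TRANSPORT_PREFIXES:
        if name.startswith(prefix):
            return prefix.rstrip("-")
    return None


# Hypothetical names for one physical disk seen over two transports.
# Real names differ, and the part after the prefix may change too.
as_usb = "usb-EXAMPLE_DISK_0001"
as_sata = "ata-EXAMPLE_DISK_0001"

print(transport_of(as_usb), transport_of(as_sata))
print(as_usb == as_sata)
```

Anything that stored the first name (a ZFS pool configuration, for example) will not find the disk under the second one, even though it is the same physical drive.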

Experimentally, there seem to be some /dev/disk/by-id names that are more stable. Some but not all of our disks have 'wwn-' names (one USB attached disk I can look at doesn't). Our Ubuntu based systems have 'scsi-<hex digits>' and 'scsi-SATA-<disk id>' names, but one of my Fedora systems with SATA drives has only the 'scsi-<hex>' names and the other one has neither. One system we have a USB disk on has no names for the disk other than 'usb-' ones. It seems clear that it's challenging at best to give general advice about how a random Linux user should pick truly stable /dev/disk/by-id names, especially if you have USB drives in the picture.
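If you want to see what your own systems offer, one rough way is to group the by-id names by the device they point to and look at which name families each device has. The following Python is a sketch under the assumption that /dev/disk/by-id contains only symlinks to device nodes; the helper names are mine, and treating the text before the first '-' as the 'family' is just a convenient approximation.

```python
import os
from collections import defaultdict


def names_by_device(by_id="/dev/disk/by-id"):
    """Map each resolved device node to the by-id names that point at it."""
    if not os.path.isdir(by_id):
        return {}
    groups = defaultdict(list)
    for name in os.listdir(by_id):
        target = os.path.realpath(os.path.join(by_id, name))
        groups[target].append(name)
    return {dev: sorted(names) for dev, names in groups.items()}


def name_families(names):
    """Return the sorted set of prefixes (the text before the first '-')."""
    return sorted({name.split("-", 1)[0] for name in names})


for device, names in sorted(names_by_device().items()):
    print(device, name_families(names))
```

A device that only shows 'usb' here is in the same position as our USB disk with nothing but 'usb-' names, while one that also shows 'wwn' has at least one candidate for a more stable name.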

(See also Persistent block device naming in the Arch Wiki.)

This whole current situation seems less than ideal, to put it one way. It would be nice if disks (and partitions on them) had names that were as transport independent and usable as possible, especially since most disks have theoretically unique serial numbers and model names available (and if you're worried about cross-transport duplicates, you should already be at least as worried about duplicates within the same type of transport).

PS: You can find out what information udev knows about your disks with 'udevadm info --query=all --name=/dev/...'. The information for a SATA disk differs between my two Fedora machines (one of them has various SCSI_* and ID_SCSI* stuff and the other doesn't), but I can't see any obvious reason for this.
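To compare what udev knows on two machines, it can help to turn the udevadm output into a dictionary and diff the property names. This Python sketch relies on udevadm printing properties as 'E: KEY=VALUE' lines; the function names are my own and the sample text in the usage note is invented.

```python
import subprocess


def parse_udev_properties(text):
    """Collect the 'E: KEY=VALUE' lines of udevadm output into a dict."""
    props = {}
    for line in text.splitlines():
        if line.startswith("E: ") and "=" in line:
            key, value = line[3:].split("=", 1)
            props[key] = value
    return props


def udev_properties(device):
    """Run udevadm for one device node and return its properties."""
    result = subprocess.run(
        ["udevadm", "info", "--query=all", f"--name={device}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_udev_properties(result.stdout)


def only_in_first(first, second):
    """Property names present in the first dict but not in the second."""
    return sorted(set(first) - set(second))
```

For example, parse_udev_properties("E: DEVNAME=/dev/sda\nE: ID_BUS=ata\n") gives a two-entry dictionary, and calling only_in_first() on the results saved from two machines would list things like the SCSI_* properties that only one of them has.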


Comments on this page:

By stafford at 2024-10-28 00:18:53:

It would be nice if disks (and partitions on them) had names that were as transport independent and usable as possible

Partitions do: /dev/disk/by-partuuid/. It's not clear why you want to use by-id. Maybe some concern about duplicating disks via dd, or about security attacks from USB keys?

I'm also curious why you want the disks to have names. I could guess it's for running smartctl, setting power-saving modes via hdparm, or some such thing (although it's also a bit of a crap-shoot whether these work across USB).

Anyway, if you look through /usr/lib/udev/rules.d/60-persistent-storage.rules, you'll see where the names come from, and should be able to guess enough of the syntax to drop something into /etc/udev/rules.d/ that will create non-prefixed names.

That file also has a comment: "Previously, ata_id in the above might not be able to retrieve attributes correctly, and properties from usb_id were used as a fallback. … To keep backward compatibility…". Perhaps that case of having only a "usb-" name is due to an out-of-date udev, or the bug still exists unbeknownst to the comment's author. There are some bug numbers, but it's anyone's guess as to what tracker they're in (the file has a SuSE email address, so maybe theirs?).

On naming disks:

A name is to distinguish a unique thing from all other things with a description that is "shorter" than that thing. In this case it seems like there is more than one HDD that someone wants to refer to "by name". Other common cases are unix user names (on the same machine), unix group IDs (on the same machine), Unix file names, DNS names, IP addresses, physical memory addresses, Hardware Ethernet MAC addresses, database table row IDs, and product serial numbers (see below).

(For that special kind of person who only accepts definitions by authority and proofs by authority, read what Guy Lewis Steele, Jr., Rich Hickey, John von Neumann, Haskell Brooks Curry, Donald Ervin Knuth, Edsger Wyben Dijkstra, and Daniel Julius Bernstein have to say in this matter. No, I don’t like namedroppers, either.)

This idea is so common, that manufacturers of HDDs and SSDs provide unique serial numbers for their storage drives. This idea is shockingly close to unique hardware Ethernet MAC addresses.

Bad surprise: I seem to have misremembered that udev and persistent device naming both of networking devices (enp3s0 vs eth0) and storage devices (/dev/disk/by-id/) was supposed to solve this naming problem. Apparently, we need an Ulrich-Drepper-of-the-year award now for missing the mark so gracefully.

Aside: This also assumes that uniqueness follows some equivalence relation that is interesting to the person naming things. (Someone I know rejects the notion of equivalence relations as a concept. That conversation was short.)

By stafford at 2024-10-28 19:35:45:

This idea is so common, that manufacturers of HDDs and SSDs provide unique serial numbers for their storage drives. This idea is shockingly close to unique hardware Ethernet MAC addresses.

For disk naming, I think a big part of the problem is that it's not obvious which names are the preferred ones. The hardware itself exposes 3: USB iSerial, ATA serial, and WWN. Then udev derives names from those and throws all the results at the user; one of my partitions has 16 names. The Arch wiki page documents most of it, without really expressing an opinion. (Also, some of the information seems kind of obsolete, like "Partitions will be referenced by their number in the partition table and that can change if the partitions are reordered". With GPT, why would anyone do that? A partition's number is a GPT slot number, which is just an arbitrary number from 1 to 128 that should never need to change—even if partitions are added, deleted, or moved/resized.)

Distributors could clean things up slightly by dropping the backward-compatibility names on fresh installations.

Based on the comment I quoted, it seems the "ata-" names are preferred when using "by-id", and "usb-" should only be used on old buggy systems that lack the "ata-" names. But if such "bugs" are really fixed now, that must be somewhat recent. I have several USB hard drives less than 4 years old, never removed from their enclosures (sometimes that changes the serial number), connected to a computer that's newer than them. And I see them listed with multiple $ID_SERIAL values in a udev script. Like "WD_easystore_264D_…" and "WDC_WD140EDFZ-…", with the elided suffixes having no apparent relation to each other (they are iSerial and ATA serial respectively). Some kernel or udev upgrade must have changed them.

This is all assuming good hardware. In reality, a specification saying that a number has to be unique doesn't mean it will be. Back when I did embedded software development, I had several USB-to-serial adapters at my desk without unique serial numbers; I exchanged some till I could get udev to tell them all apart. A co-worker had seen just about every kind of bad behaviour from USB-to-SATA enclosures, and I think duplicate or missing serial numbers was one such thing. We also had a bunch of boards with identical MAC addresses, which was fun—or maybe it was that they provided no addresses, and the boot-loader and OS images we were all sharing had hard-coded numbers. It annoyed the network admins either way.

From 193.219.181.219 at 2024-10-29 01:33:58:

(Also, some of the information seems kind of obsolete, like "Partitions will be referenced by their number in the partition table and that can change if the partitions are reordered". With GPT, why would anyone do that? A partition's number is a GPT slot number, which is just an arbitrary number from 1 to 128 that should never need to change—even if partitions are added, deleted, or moved/resized.)

1) Conventionally, partition table slots are normally kept sorted and packed. Perhaps they don't need to be, but nearly all partitioning tools will automatically enforce ascending order regardless. (Indeed if you have a GPT disk, then you're supposed to reference the partition GUID, not the partition index, anyway.)

2) Not all disks are GPT.

By stafford at 2024-10-29 12:16:49:

nearly all partitioning tools will automatically enforce ascending order regardless

Hmm. I wasn't aware of that, but I guess the wiki's text makes sense in that context.

gdisk certainly doesn't enforce it, and that's what I use. Apart from being able to specify the slot, I'm able to give exact start and end sector numbers, which is vital when moving stuff around and enlarging partitions; and when the sector size changes, as sometimes happens when moving between USB and SATA.

Indeed if you have a GPT disk, then you're supposed to reference the partition GUID, not the partition index, anyway.

The GUID could too easily be duplicated onto a removable disk, and udev by default provides no combination of disk serial plus partition GUID. It'd also be a bit harder to maintain a partition's GUID when re-creating the GPT due to a sector-size-change. So I just use by-id and always make my "main" partition number 4, leaving the lower numbers for junk like /boot, /boot/efi, and swap. It works fine; nothing's ever complained that 3's starting sector is after 4's, for example, nor that I skipped some numbers.

(Now that I think about it, it's kind of odd that udev doesn't make a link for the disk based on its GPT's "disk GUID" value. One could work backward from a by-partuuid link by using "readlink" and stripping the numeric suffix, but that's awkward.)
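A rough Python sketch of that readlink-and-strip approach follows. The function names are hypothetical, and the suffix handling is an assumption based on the usual kernel naming (sda4 belongs to sda, while names whose disk part ends in a digit, such as nvme0n1p4, use a 'p' separator); it expects a partition name, not a whole-disk name.

```python
import os
import re


def parent_disk(partition_name):
    """Guess the whole-disk name for a partition name such as 'sda4'.

    Only meaningful for partition names; a whole-disk name like
    'nvme0n1' would be mangled.
    """
    # Disks whose name ends in a digit get a 'p' before the partition
    # number, e.g. nvme0n1p4 or mmcblk0p1.
    match = re.fullmatch(r"(.*\d)p\d+", partition_name)
    if match:
        return match.group(1)
    # Otherwise the partition number is simply appended, e.g. sda4.
    return re.sub(r"\d+$", "", partition_name)


def disk_for_partuuid(partuuid, by_partuuid="/dev/disk/by-partuuid"):
    """Resolve a by-partuuid link and return the parent disk's device path."""
    target = os.path.realpath(os.path.join(by_partuuid, partuuid))
    directory, name = os.path.split(target)
    return os.path.join(directory, parent_disk(name))
```

This is still awkward in the way described above: it guesses from the name instead of asking the system which disk the partition actually belongs to.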

Not all disks are GPT.

I know there are still MBR-partitioned disks around, but I count those as "obsolete" (still, out-of-order and skipped numbers will work there, although things get iffy with indices above 4). FreeBSD considers disklabels obsolete. Is there something else that's relevant in the context of Linux?

Written on 27 October 2024.

Last modified: Sun Oct 27 23:24:56 2024