Wandering Thoughts archives

2016-04-23

Why I think Illumos/OmniOS uses PCI subsystem IDs

As I mentioned yesterday, PCI has both vendor/device IDs and 'subsystem' vendor/device IDs. Here is what this looks like (in Linux) for a random device on one of our machines here (from 'lspci -vnn', more or less):

04:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0086] (rev 05)
Subsystem: Super Micro Computer Inc Device [15d9:0691]
[...]

This is the integrated motherboard SAS controller on a SuperMicro motherboard (part of our fileserver hardware). It's using a standard LSI chipset, as reported in the main PCI vendor and device ID, but the subsystem ID says it's from SuperMicro. Similarly, this is an Intel chipset-based motherboard, so there are a lot of things with standard Intel vendor and device IDs but SuperMicro-specific subsystem vendor and device IDs.
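(As a small illustration of the two ID pairs, here is a sketch in Python that pulls them out of the 'lspci -vnn' lines quoted above. The function name parse_ids and the regular expression are my own; this assumes the bracketed 'vvvv:dddd' layout shown in that output and is not a general lspci parser.)

```python
import re

# Matches a bracketed "vendor:device" pair such as [1000:0086].
# The class code "[0107]" has no colon, so it is not matched.
ID_PAIR = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def parse_ids(lines):
    """Return ((vendor, device), (sub_vendor, sub_device)) for one device."""
    main = subsystem = None
    for line in lines:
        match = ID_PAIR.search(line)
        if match is None:
            continue
        if line.strip().startswith("Subsystem:"):
            subsystem = match.groups()
        elif main is None:
            main = match.groups()
    return main, subsystem


sample = [
    "04:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic "
    "SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0086] (rev 05)",
    "Subsystem: Super Micro Computer Inc Device [15d9:0691]",
]
main_ids, subsystem_ids = parse_ids(sample)
print("main:", main_ids, "subsystem:", subsystem_ids)
```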

As far as I know, most systems use the PCI vendor and device IDs and mostly ignore the subsystem vendor and device IDs. It's not hard to see why; the main IDs tell you more about what the device actually is, and there are fewer of them to keep track of. Illumos is an exception, where much of the PCI information you see reported uses subsystem IDs. I believe that a significant reason for this is that Illumos is essentially trying to fingerprint devices.

Illumos tries hard to have some degree of constant device naming (at least for their definition of it), so that, say, 'e1000g0' is always the same thing. This requires being able to identify specific hardware devices as much as possible, so you can tie them to the visible system-level names you've established. This is the purpose of /etc/path_to_inst and the systems associated with it; it fingerprints devices on first contact, assigns them an identifier (in the form of a driver plus an instance number), and thereafter tries to keep them exactly the same.
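(To make the 'fingerprint on first contact, then keep it stable' idea concrete, here is a toy model in Python. It is only a sketch of the concept; the InstanceTable class and the fingerprint strings are invented for illustration and do not reflect how Illumos actually stores or parses /etc/path_to_inst.)

```python
class InstanceTable:
    """Toy model: map (driver, fingerprint) to a stable instance number."""

    def __init__(self):
        self._instances = {}      # (driver, fingerprint) -> instance number
        self._next_instance = {}  # driver -> next unused instance number

    def assign(self, driver, fingerprint):
        key = (driver, fingerprint)
        if key not in self._instances:
            # First contact: hand out the next free number for this driver.
            number = self._next_instance.get(driver, 0)
            self._instances[key] = number
            self._next_instance[driver] = number + 1
        # Later contacts return the same name as before.
        return f"{driver}{self._instances[key]}"


table = InstanceTable()
first = table.assign("e1000g", "hypothetical-fingerprint-A")
again = table.assign("e1000g", "hypothetical-fingerprint-A")
other = table.assign("e1000g", "hypothetical-fingerprint-B")
print(first, again, other)
```

The point of the model is that the name depends only on whether the fingerprint has been seen before, which is why the quality of the fingerprint matters so much.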

(From Illumos's perspective the ideal solution would be for each single PCI device to have a UUID or other unique identifier. But such a thing doesn't exist, at least not in general. So Illumos must fake a unique identifier by using some form of fingerprinting.)

If you want a device fingerprint, the PCI subsystem IDs are generally going to be more specific than the main IDs. A whole lot of very different LSI SAS controllers have 1000:0086 as their PCI vendor and device IDs, after all; that's basically the purpose of having the split. Using the SuperMicro subsystem vendor and device IDs ties it to 'the motherboard SAS controller on this specific type of motherboard', which is much closer to being a unique device identifier.
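(A quick sketch of why the subsystem IDs make a better fingerprint. The first entry uses the IDs from the lspci output above; the second device and its ffff:0001 subsystem IDs are made up purely for contrast, as is the group_by helper.)

```python
from collections import defaultdict

devices = [
    {"name": "onboard SAS controller",
     "main": ("1000", "0086"), "subsystem": ("15d9", "0691")},
    # Hypothetical second device sharing the same main IDs.
    {"name": "hypothetical add-in card",
     "main": ("1000", "0086"), "subsystem": ("ffff", "0001")},
]


def group_by(devices, key):
    """Group device names by the chosen ID pair ('main' or 'subsystem')."""
    groups = defaultdict(list)
    for dev in devices:
        groups[dev[key]].append(dev["name"])
    return dict(groups)


by_main = group_by(devices, "main")
by_subsystem = group_by(devices, "subsystem")
print(len(by_main), "group(s) by main IDs;",
      len(by_subsystem), "group(s) by subsystem IDs")
```

Grouping by the main IDs lumps both devices together, while grouping by the subsystem IDs keeps them apart.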

Note that Illumos's approach more or less explicitly errs on the side of declaring devices to be new. If you shuffle which slots your PCI cards are in, Illumos will declare them all to be new devices and force you to reconfigure things. However, this is broadly much more conservative than doing it the other way. Essentially Illumos says 'if I can see that something changed, I'm not going to go ahead and use your existing settings'. Maybe it's a harmless change where you just shuffled card slots, or maybe it's a sign of something more severe. Illumos doesn't know and isn't going to guess; you get to tell it.
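(The 'err on the side of new' policy can be sketched as an exact-match lookup. This is an assumption-laden model, not Illumos code: I'm assuming the fingerprint includes some notion of location, and the slot labels, the 'ctrl0' name and the settings_for helper are all invented for the example.)

```python
def settings_for(saved, fingerprint):
    """Reuse saved settings only on an exact fingerprint match.

    Anything that does not match exactly is treated as a new device,
    so this returns None and the administrator has to reconfigure it.
    """
    return saved.get(fingerprint)


subsystem_ids = ("15d9", "0691")
# Hypothetical saved state: the device was first seen in "slot 4".
saved = {("slot 4", subsystem_ids): {"name": "ctrl0"}}

unchanged = settings_for(saved, ("slot 4", subsystem_ids))
moved = settings_for(saved, ("slot 5", subsystem_ids))
print("unchanged:", unchanged, "moved:", moved)
```

There is deliberately no 'closest match' fallback here; a guess that a changed fingerprint is the same device is exactly what this approach refuses to make.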

(I do wish there were better tools to tell Illumos that certain changes were harmless and expected. It's kind of a pain that, e.g., moving cards between PCI slots can cause such a commotion.)

solaris/IllumosWhyUsePCISubsystemIDs written at 02:46:54

