Some important things about how PCIe works out involve BIOS magic

December 7, 2019

I'll start with my remark on Mastodon:

I still don't know why my Radeon graphics card and the PCIe bridge it's behind dropped down from PCIe 3.0 all the way to PCIe 1.0 bandwidth, but going into the BIOS and wandering around appears to have magically fixed it, so I'll take that.

PCIe: this generation's SCSI.

When I added some NVMe drives to my office machine and ran into issues, I discovered that its Radeon graphics card was operating at 2.5 GT/s instead of 8 GT/s, which is to say at PCIe 1.0 data rates instead of the PCIe 3.0 ones it should be operating at. At the end of the last installment I speculated that I had accidentally set something in the BIOS that limited that PCIe slot to PCIe 1.0, because that's something some BIOSes let you do through their settings. I went through the machine's BIOS today and found nothing that would explain this; in fact it doesn't seem to have any real settings for PCIe slot bandwidth. However, when I rebooted the machine after searching through the BIOS, I discovered that my Radeon and the PCIe bridge it's behind were magically now at PCIe 3.0's 8 GT/s.
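To put a rough number on what dropping from 8 GT/s to 2.5 GT/s costs, here is a small sketch of the per-direction bandwidth arithmetic. The transfer rates and line encodings (8b/10b for PCIe 1.0 and 2.0, 128b/130b for PCIe 3.0) are the standard ones. The x16 link width is an assumption for illustration, since nothing above says how many lanes the card has, and the function name is made up for this example. The figures are before packet and protocol overhead, so real throughput will be somewhat lower.

```python
# Per-lane transfer rate in GT/s and line encoding (payload bits, total bits)
# for each PCIe generation discussed here.
PCIE_GENERATIONS = {
    "1.0": (2.5, 8, 10),     # 8b/10b encoding
    "2.0": (5.0, 8, 10),     # 8b/10b encoding
    "3.0": (8.0, 128, 130),  # 128b/130b encoding
}


def usable_gbytes_per_sec(generation: str, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s after line encoding.

    Ignores packet headers and other protocol overhead.
    """
    gt_per_sec, payload_bits, total_bits = PCIE_GENERATIONS[generation]
    gbits_per_sec = gt_per_sec * payload_bits / total_bits * lanes
    return gbits_per_sec / 8  # bits to bytes


if __name__ == "__main__":
    # Assumption: an x16 link, as is typical for a graphics card slot.
    for gen in ("1.0", "3.0"):
        print(f"PCIe {gen} x16: {usable_gbytes_per_sec(gen, 16):.2f} GB/s")
```

Under the x16 assumption this works out to about 4 GB/s at PCIe 1.0 rates versus a bit under 16 GB/s at PCIe 3.0 rates, so the link was running at roughly a quarter of what it should have had.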

I already knew that PCIe device enumeration involved a bunch of actions and decisions by the BIOS. I believe that the BIOS is also deeply involved in deciding how many PCIe lanes are assigned to particular slots (although there are physical constraints there too). Now it's pretty clear that your BIOS also has its fingers in decisions about what PCIe transfer rate gets used. As far as I know, all of these decisions happen before your machine's operating system comes into the picture; the OS mostly has to accept whatever the BIOS set up, for good or bad. Modern BIOSes are large opaque black boxes of software, and like all such black boxes they can have both bugs and mysterious behavior.

(Even when their PCIe setup behavior isn't a bug and is in fact necessary, they don't explain themselves, either to you or to the operating system so that your OS can log problems and inefficiencies.)

How do you know that your system is operating in a good PCIe state instead of one where PCIe cards and onboard controllers are being limited for some reason? Well, you probably don't, not unless you go and look carefully (and understand a reasonable amount about PCIe). If you're lucky you may detect this through side effects, such as increased NVMe latency or lower than expected GPU performance (if you know what GPU performance to expect in your particular environment). Such is the nature of magic.

Written on 07 December 2019.


Last modified: Sat Dec 7 01:26:03 2019