PCIe slot bandwidth can change dynamically (and very rapidly)
When I added some NVMe drives to my office machine and started looking into its PCIe setup, I discovered that its Radeon graphics card seemed to be operating at 2.5 GT/s (PCIe 1.0) instead of 8 GT/s (PCIe 3.0). The last time around, I thought I had fixed this just by poking into the BIOS, but in a comment, Alex suggested that this was actually a power-saving measure and not necessarily done by the BIOS. I'll quote the comment in full because it summarizes things better than I can:
Your GPU was probably running at lower speeds as a power-saving measure. Lanes consume power, and higher speeds consume more power. The GPU driver is generally responsible for telling the card what speed (and lane width) to run at, but whether that works (or works well) with the Linux drivers is another question.
It turns out that Alex is right, and what I saw after going through the BIOS didn't quite mean what I thought it did.
To start with the summary, the PCIe bandwidth being used by my graphics card can vary very rapidly from 2.5 GT/s up to 8 GT/s and then back down again based on whether or not the graphics driver needs the card to do anything (or the aggregate Linux and X software stack as a whole, since I don't know where these decisions are being made). The most dramatic and interesting difference is between two apparently very similar ways of seeing if the Radeon's bandwidth is currently downgraded, either automatically scanning through lspci's output with 'lspci -vv | fgrep downgrade' or manually looking through it with 'lspci -vv | less'. When I used less, the Radeon normally showed up downgraded to 2.5 GT/s. When I used fgrep, other things before the Radeon showed up as downgraded but the Radeon never did; it was always at 8 GT/s.
(Some of those other things have been downgraded to 'x0' lanes, which I suspect means that they've been disabled as unused.)
What I think is happening here is that when I pipe lspci to less, lspci gets the Radeon's bandwidth before any output is written to the screen (less reads it all in a big gulp and then displays it), so at the time the graphics chain is inactive. When I use the fgrep pipe, some output is written to the screen before lspci gets to the Radeon and so the graphics chain lights up the Radeon's bandwidth to display things. What this suggests is that the graphics chain can and does vary the Radeon's PCIe bandwidth quite rapidly.
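One way to take the screen out of the picture is to collect readings in memory and only print once sampling is done. What follows is a minimal sketch of that idea, not something from my original testing. It assumes a Linux kernel that exposes the 'current_link_speed' attribute in sysfs for PCIe devices; the default address 0000:01:00.0 is a hypothetical placeholder, and the function names are made up for illustration. I can't promise that reading sysfs never wakes the card up on its own.

```python
#!/usr/bin/env python3
"""Sample a PCIe device's link speed, printing nothing until the end."""
import sys
import time
from collections import Counter
from pathlib import Path

# Assumption: Linux sysfs layout for PCI devices.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def read_attr(dev_dir, name):
    """Return a sysfs attribute as a stripped string, or None if unreadable."""
    try:
        return (Path(dev_dir) / name).read_text().strip()
    except OSError:
        return None


def sample_link_speed(dev_dir, count=20, interval=0.05):
    """Collect `count` readings of current_link_speed into a list."""
    samples = []
    for _ in range(count):
        samples.append(read_attr(dev_dir, "current_link_speed"))
        time.sleep(interval)
    return samples


if __name__ == "__main__":
    # Hypothetical default address; find the real one with `lspci -D`.
    addr = sys.argv[1] if len(sys.argv) > 1 else "0000:01:00.0"
    results = sample_link_speed(SYSFS_PCI / addr)
    # Only now is anything written to the terminal.
    for speed, n in Counter(results).most_common():
        print(f"{n:3d} x {speed}")
```

Run it with the graphics card's PCI address as the argument. If the explanation above is right, an idle desktop should mostly report the low speed here, because nothing is drawn while the samples are being taken.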
Another interesting case is that running the venerable glxgears doesn't bring the PCIe bandwidth up from 2.5 GT/s, but running GpuTest's 'fur' test does (it goes to 8 GT/s as you might expect).
(It turns out that nVidia's Linux drivers also do this.)
Of course all of this may make seeing whether you're getting full PCIe bandwidth a little bit interesting. It's clearly not enough to just look at your system, even when it's moderately active (I have several X programs that update once a second); you really need to put it under some approximation of full load and then check. So far I've only seen this happen with graphics cards, but who knows what's next (NVMe drives could be one candidate to drop their bandwidth to save power and thus reduce heat).
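As a rough illustration of "put it under load and then check", here is a sketch that lists every PCIe device whose current link speed is below the maximum it advertises. It assumes the Linux sysfs attributes 'current_link_speed' and 'max_link_speed' are present and contain strings such as '2.5 GT/s PCIe'; the helper names are hypothetical. Note that it compares against the device's own maximum, so it is an approximation of lspci's 'downgraded' marker, not a replacement for it.

```python
#!/usr/bin/env python3
"""List PCIe devices currently running below their advertised maximum speed."""
from pathlib import Path


def parse_gts(text):
    """Turn a string such as '2.5 GT/s PCIe' into 2.5; None if it can't be parsed."""
    try:
        return float(text.split()[0])
    except (AttributeError, IndexError, ValueError):
        return None


def read_attr(dev_dir, name):
    """Return a sysfs attribute as a stripped string, or None if unreadable."""
    try:
        return (Path(dev_dir) / name).read_text().strip()
    except OSError:
        return None


def below_max(pci_root):
    """Return (address, current, maximum) tuples for links under their maximum."""
    found = []
    root = Path(pci_root)
    if not root.is_dir():
        return found
    for dev in sorted(root.iterdir()):
        cur = parse_gts(read_attr(dev, "current_link_speed"))
        top = parse_gts(read_attr(dev, "max_link_speed"))
        # Devices without these attributes (non-PCIe, for example) are skipped.
        if cur is not None and top is not None and cur < top:
            found.append((dev.name, cur, top))
    return found


if __name__ == "__main__":
    # Assumption: standard Linux sysfs location for PCI devices.
    for addr, cur, top in below_max("/sys/bus/pci/devices"):
        print(f"{addr}: {cur} GT/s now, {top} GT/s maximum")
```

Running this once while the machine is idle and once while something like GpuTest is busy should show whether a device only looks slow because it is resting.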