2021-12-08
NVMe drives and the case of opaque bandwidth limits
Given that my home desktop's two M.2 NVMe slots seem to share a single PCIe x4 uplink in the end, it's probably not worth losing two (unused) SATA ports to bring the second NVMe drive up from x2 to x4 since I'm mirroring them anyway.
My current home desktop has an Asus PRIME Z370-A motherboard. This motherboard has two M.2 NVMe slots: one that always runs at PCIe x4, and one that runs at either x2 or x4, where setting it to x4 costs you two of the six SATA ports. Unlike with my work machine, I don't have to worry about conflicts with PCIe slots (and cards in them), because I don't have any PCIe cards in my home machine (I use the integrated graphics, so I don't even have a GPU).
(There are no PCIe card slot conflicts documented in the motherboard's manual. That doesn't mean they don't exist, of course.)
My two (new) NVMe drives are the same model and so should be equally fast. The observed behavior is that I can do streaming block reads from the x2 drive at about 1.15 GBytes/sec, from the x4 drive at about 2.45 GBytes/sec, and from both at once at about 1.2 and 1.35 GBytes/sec respectively. Now, there certainly could be Linux kernel things going on (especially since the x2 drive is somewhat faster when both drives are active than when it's read alone), but this definitely feels like some sort of total bandwidth limit.
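(For illustration, here's a minimal Python sketch of one way to time streaming block reads from a raw device. The device names are hypothetical, you need root to read block devices this way, and it's not the actual tooling I used; it also doesn't use O_DIRECT, so the page cache is nominally in the picture.)

import os, time

def read_bw(device, total=8 * 1024**3, blocksize=4 * 1024**2):
    # Read 'total' bytes sequentially in 'blocksize' chunks, return decimal GB/sec.
    fd = os.open(device, os.O_RDONLY)
    try:
        start = time.monotonic()
        done = 0
        while done < total:
            buf = os.read(fd, blocksize)
            if not buf:
                break
            done += len(buf)
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return done / elapsed / 1e9

# Hypothetical device names; adjust for your own drives.
for dev in ("/dev/nvme0n1", "/dev/nvme1n1"):
    print(dev, "%.2f GBytes/sec" % read_bw(dev))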
However, I don't see an obvious source for this bandwidth limit in my PCIe topology, at least as Linux's tools report it. Both NVMe drives are connected to 'Intel Corporation 200 Series PCH PCI Express Root Port' PCIe devices that are listed as part of what I think of as the PCI root bus. Since this is an Intel thing, PCH presumably stands for Intel's Platform Controller Hub, which here is the Z370 chipset; the chipset is connected to the CPU over a DMI link. From what I can find, this DMI link runs at about the speed of PCIe 3.0 x4, which could explain how I'm running into bandwidth limits. If neither NVMe drive is directly connected to any CPU PCIe lanes, the combined bandwidth of the two of them together is limited by the roughly PCIe 3.0 x4 of bandwidth between the PCH and the CPU.
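(One way to check the negotiated PCIe link width and speed of each NVMe controller, without picking through lspci output, is to read the standard PCIe sysfs attributes. The following Python sketch is just an illustration of that, not the tool I actually used.)

import glob, os

def attr(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return "?"

# /sys/class/nvme/nvmeN/device is a symlink to the controller's PCI device,
# which carries the current_link_width and current_link_speed attributes.
for ctrl in sorted(glob.glob("/sys/class/nvme/nvme[0-9]*")):
    pcidev = os.path.join(ctrl, "device")
    print("%s: x%s at %s (max x%s at %s)" % (
        os.path.basename(ctrl),
        attr(os.path.join(pcidev, "current_link_width")),
        attr(os.path.join(pcidev, "current_link_speed")),
        attr(os.path.join(pcidev, "max_link_width")),
        attr(os.path.join(pcidev, "max_link_speed"))))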
(With that said, I don't think I'm quite saturating PCIe 3.0 x4 here. Probably there are additional overheads involved.)
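(As a back of the envelope check, assuming PCIe 3.0's 8 GT/s per lane and 128b/130b encoding, and ignoring protocol overheads like TLP headers and DMI's exact characteristics:)

lanes = 4
gts = 8e9              # PCIe 3.0: 8 GT/s per lane
encoding = 128 / 130   # 128b/130b line encoding
theoretical = lanes * gts * encoding / 8   # bits to bytes: about 3.94e9
observed = 1.2e9 + 1.35e9                  # my two drives read at once
print("theoretical x4: %.2f GBytes/sec, observed total: %.2f GBytes/sec"
      % (theoretical / 1e9, observed / 1e9))

This prints about 3.94 GBytes/sec theoretical against 2.55 GBytes/sec observed, which is why I say I'm not quite saturating it.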
Does this matter in practice? In one sense, probably not; I'm unlikely to ever hit this bandwidth limit in real usage. In another sense, it actually does matter. As I mentioned in my tweet, this bandwidth limit means that I probably won't bother to change the x2 NVMe slot to x4, so I get to keep two extra SATA ports that I may actually want to use at some point.