Wandering Thoughts archives

2024-08-09

The Broadcom 'bnxt' Ethernet driver and RDMA (in Ubuntu 24.04)

We have a number of Supermicro machines with dual 10G-T Broadcom-based networking; specifically, what they have is the 'BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller'. Under Ubuntu 22.04, everything is fine with these cards (or at least seems to be in non-production use), using the normal bnxt_en kernel driver module. Unfortunately this is not our experience in Ubuntu 24.04.

In Ubuntu 24.04, these machines also load an additional Broadcom bnxt driver, bnxt_re, which is the 'Broadcom NetXtreme-C/E RoCE' driver. RoCE is short for RDMA over Converged Ethernet, and to confuse you, this driver is found in the 'Infiniband' area of the Linux kernel drivers tree. Unfortunately, on our hardware the 24.04 bnxt_re doesn't work (or maybe the hardware doesn't work and bnxt_re is failing to detect that, although with 'RDMA' in the name of the hardware one sort of suspects it's supposed to work). The driver stalls during boot and spits out kernel messages like:

bnxt_en 0000:ab:00.0: QPLIB: bnxt_re_is_fw_stalled: FW STALL Detected. cmdq[0xf]=0x3 waited (102721 > 100000) msec active 1
bnxt_en 0000:ab:00.0 bnxt_re0: Failed to modify HW QP
infiniband bnxt_re0: Couldn't change QP1 state to INIT: -110
infiniband bnxt_re0: Couldn't start port
bnxt_en 0000:ab:00.0 bnxt_re0: Failed to destroy HW QP
[... more fun ensues ...]

This causes systemd-udev-settle.service to fail:

udevadm[1212]: Timed out for waiting the udev queue being empty.
systemd[1]: systemd-udev-settle.service: Main process exited, code=exited, status=1/FAILURE

This then causes Ubuntu 24.04's ZFS services to fail to completely start, which is a bad thing on hardware that we want to use for our ZFS fileservers.
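If you want to check a batch of machines for this failure, the kernel messages above are regular enough to scan for. The following is only a rough sketch: the pattern is modelled on the one 'FW STALL Detected' line quoted above, and the function name and the sample list are made up for illustration. Other kernel versions may well word the message differently.

```python
import re

# Pattern modelled on the single "FW STALL Detected" line quoted in the post.
# This is an assumption about the format, not something taken from the driver.
STALL_RE = re.compile(
    r"bnxt_en (?P<pci>[0-9a-f:.]+): QPLIB: bnxt_re_is_fw_stalled: "
    r"FW STALL Detected\..*waited \((?P<waited>\d+) > (?P<limit>\d+)\) msec"
)


def find_fw_stalls(lines):
    """Return (pci_address, waited_ms, limit_ms) for each stall message found."""
    stalls = []
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            stalls.append((m["pci"], int(m["waited"]), int(m["limit"])))
    return stalls


# The kernel messages quoted above, used here as sample input.
SAMPLE_LINES = [
    "bnxt_en 0000:ab:00.0: QPLIB: bnxt_re_is_fw_stalled: FW STALL Detected. "
    "cmdq[0xf]=0x3 waited (102721 > 100000) msec active 1",
    "bnxt_en 0000:ab:00.0 bnxt_re0: Failed to modify HW QP",
    "infiniband bnxt_re0: Couldn't change QP1 state to INIT: -110",
]

stalls = find_fw_stalls(SAMPLE_LINES)
```

In practice you would feed it the lines of the kernel log from wherever you collect it, rather than a hard-coded list.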

We aren't the only people with this problem, so I was able to find various threads about it on the Internet. These gave me the solution, which is to blacklist the bnxt_re kernel module, but at the time they left me with the mystery of how and why the bnxt_re module was even being loaded in the first place.
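The usual way to blacklist a module is a 'blacklist bnxt_re' line in a file under /etc/modprobe.d/. As a small sketch of how you might verify that such a line is actually present, here is a parser for modprobe.d-style text. The function name and the sample file contents are hypothetical, and it only understands plain 'blacklist <module>' lines and whole-line '#' comments.

```python
def blacklisted_modules(config_text):
    """Return the set of module names that appear in 'blacklist' lines."""
    names = set()
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) == 2 and fields[0] == "blacklist":
            names.add(fields[1])
    return names


# Hypothetical contents of a file under /etc/modprobe.d/
# (the exact file name is up to you).
SAMPLE_CONF = """\
# Keep the RoCE driver from being loaded on top of bnxt_en.
blacklist bnxt_re
"""

is_blocked = "bnxt_re" in blacklisted_modules(SAMPLE_CONF)
```

This only checks the configuration text; it says nothing about whether the module is already loaded on a running system.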

The answer is that bnxt_re is being loaded through the second sort of kernel driver module loading. It is an 'auxiliary' module for handling RDMA on top of the normal bnxt_en network driver, and the bnxt_en module basically asks for it to be loaded (which also suggests that at least the module thinks the hardware should be able to do RDMA properly). More specifically, bnxt_en asks for bnxt_en.rdma to be loaded, and that is an alias for bnxt_re. Fortunately you don't have to know all of this in order to block bnxt_re from loading.

We don't have any 22.04 installs on this specific hardware any more, so I can't be completely sure what happened under 22.04, but it appears that 22.04 didn't load the bnxt_re module on these servers. Running 'modinfo' on the 22.04 module shows that it lacks the bnxt_en.rdma module alias that it has in 24.04, so maybe you had to manually load it if your hardware had RDMA and you wanted to use it.

(Looking at kernel source history, it appears that bnxt_re support for using this 'auxiliary driver interface' only appeared in kernel 6.3, which is much too late for Ubuntu 22.04's normal server kernel, which is based on 5.15.0.)
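The 'modinfo' comparison can be sketched in code as well. This is an illustration rather than a tested tool: the helper names are mine, and the two sample texts are hand-written stand-ins for 'modinfo bnxt_re' output, not captured from real machines. In particular, the 'auxiliary:' prefix on the alias is my assumption about how the alias is displayed.

```python
def module_aliases(modinfo_text):
    """Collect the value of every 'alias:' field in modinfo-style output."""
    aliases = []
    for line in modinfo_text.splitlines():
        # Split on the first colon only, since alias values contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "alias":
            aliases.append(value.strip())
    return aliases


def has_alias(modinfo_text, needle):
    """True if any alias in the modinfo-style text contains 'needle'."""
    return any(needle in alias for alias in module_aliases(modinfo_text))


# Illustrative stand-ins for 'modinfo bnxt_re' output (not real captures).
SAMPLE_WITH_ALIAS = """\
name:           bnxt_re
alias:          auxiliary:bnxt_en.rdma
"""
SAMPLE_WITHOUT_ALIAS = """\
name:           bnxt_re
"""

autoload_possible = has_alias(SAMPLE_WITH_ALIAS, "bnxt_en.rdma")
```

If the alias is absent, as it appears to be on 22.04, nothing asks for the module by that name and it would only get loaded if you load it yourself.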

One of my lessons learned from this is that in today's Linux kernel environment, drivers may enable additional functionality that you neither asked for nor wanted, just because it's there. We don't use RDMA and never asked for anything related to RoCE, but because the hardware is (theoretically) capable of it, we got it anyway.

linux/BroadcomNetworkDriverAndRDMA written at 23:16:16;

