What the Linux rcu_nocbs kernel argument does (and my Ryzen issues again)
It turns out that my Linux Ryzen kernel hangs appear to be a known bug or issue (Ubuntu, Fedora, kernel); more fortunately, people have found magic incantations that appear to work around the issue. Part of the magic is some kernel command line arguments, usually cited as:
rcu_nocbs=0-N processor.max_cstate=1
(where N is the number of CPUs you have minus one.)
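As a rough sketch of how the N in that incantation falls out of the CPU count, here is a small Python helper. The function name is made up for illustration, and it assumes that "CPUs" means logical CPUs (hardware threads) as the kernel numbers them, which is what os.cpu_count() reports.

```python
import os


def ryzen_workaround_args(ncpus: int) -> str:
    """Build the commonly cited arguments for a machine with ncpus logical CPUs."""
    if ncpus < 1:
        raise ValueError("need at least one CPU")
    last = ncpus - 1
    # A single-CPU machine is just "0"; otherwise use the range "0-N".
    cpus = "0" if last == 0 else f"0-{last}"
    return f"rcu_nocbs={cpus} processor.max_cstate=1"


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to 1 in that case.
    print(ryzen_workaround_args(os.cpu_count() or 1))
```

On a 16-thread Ryzen this would give rcu_nocbs=0-15, matching the usual advice.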
Magic incantations that I don't understand bug me, especially when they seem to be essential to keeping my system from hanging, so I had to go digging.
What processor.max_cstate does is relatively straightforward. As briefly mentioned in kernel-parameters.txt, it limits the C-states that Linux will allow processors to go into. Limiting the CPU to C1 at most doesn't allow for much idling and power saving; it might be safe to go as far as C5, since the usual additional advice is to disable C6 in the BIOS (if your BIOS supports doing this). On the other hand, I don't know if Ryzens do anything between C1 and C6.
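One way to see which idle states Linux actually knows about on a given machine is to look at the cpuidle directory in sysfs. The following is a minimal sketch, assuming the usual /sys/devices/system/cpu/cpuN/cpuidle/stateM layout with name and latency files; the function name is hypothetical. The names reported there are what the cpuidle driver calls the states, which may not line up one-to-one with the hardware's own C-state numbering.

```python
from pathlib import Path


def list_idle_states(cpu_dir):
    """Return [(state directory, state name, exit latency in usec), ...] for one CPU."""
    cpuidle = Path(cpu_dir) / "cpuidle"
    # Sort numerically so that state10 comes after state2.
    state_dirs = sorted(cpuidle.glob("state[0-9]*"), key=lambda p: int(p.name[5:]))
    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        latency = int((state / "latency").read_text().strip())
        states.append((state.name, name, latency))
    return states


if __name__ == "__main__":
    # Prints nothing if this CPU has no cpuidle directory.
    for entry in list_idle_states("/sys/devices/system/cpu/cpu0"):
        print(*entry)
```

Running this before and after adding processor.max_cstate=1 should show whether the deeper states have disappeared, at least on systems where the ACPI processor driver is the one handling idle.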
The rcu_nocbs parameter is more involved (and mysterious). To more or less understand it, we need to start with Read-Copy-Update (RCU; see also Wikipedia). To simplify, RCU handles updates to shared data structures by setting up a new version of the data structure, changing a master location to point to it instead of the old version, and then waiting for everyone to have passed a synchronization point where they're guaranteed to be using the new version instead of the old version. At that point you know the old version is unused and you can free it.
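The simplified description above can be modelled in a few lines of Python. This is only a toy under loud assumptions: it is single-threaded, the class and method names are invented, and it tracks exactly which version each reader holds, which real RCU deliberately does not do (the kernel only waits for every CPU to pass a synchronization point). It is meant to show the order of events, not the kernel's implementation.

```python
class ToyRCU:
    """Toy model of publish, wait for old readers, then free."""

    def __init__(self, initial):
        self.current = initial  # the "master location"
        self.readers = {}       # reader id -> version that reader is using
        self.retired = []       # old versions that may still be in use
        self.freed = []         # old versions known to be unused

    def read_lock(self, reader):
        # A reader picks up whatever version is current right now.
        self.readers[reader] = self.current
        return self.current

    def read_unlock(self, reader):
        # The reader has passed its synchronization point.
        del self.readers[reader]
        self._reclaim()

    def update(self, make_new):
        old = self.current
        self.current = make_new(old)  # build a new version, then publish it
        self.retired.append(old)
        self._reclaim()

    def _reclaim(self):
        in_use = {id(v) for v in self.readers.values()}
        still_retired = []
        for version in self.retired:
            if id(version) in in_use:
                still_retired.append(version)
            else:
                self.freed.append(version)  # nobody can see it any more
        self.retired = still_retired


if __name__ == "__main__":
    rcu = ToyRCU({"limit": 1})
    rcu.read_lock("A")
    rcu.update(lambda old: {**old, "limit": 2})
    print(rcu.freed)   # reader A still holds the old version
    rcu.read_unlock("A")
    print(rcu.freed)   # now the old version has been released
```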
The Linux kernel's main RCU code handles the RCU algorithm for you but it doesn't know how to free up your data structures. For that it relies on RCU callbacks that you give it; when RCU determines that the old version of your data structure can be disposed of, it will invoke your callback to do this. Normally, RCU callbacks are invoked in interrupt context as part of software interrupt (softirq) handling. Various people didn't like this because softirqs preempt whatever happens to be running at the time whenever an appropriate interrupt happens, so people came up with an alternate approach of having these potentially quite time-consuming RCU callbacks handled by regularly scheduled kernel threads instead. This is said to 'offload' RCU callbacks to these threads. Each offloaded CPU gets its own set of RCU offload kernel threads, but these kernel threads can run on any CPU, not just the CPU they're offloading.
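The difference between the two callback styles can be illustrated with another toy, again in Python and again with made-up names. Here a plain function call stands in for running a callback in softirq context (it runs immediately in whatever context noticed that the old version is disposable), while the offloaded variant hands callbacks to an ordinary thread that the scheduler runs whenever and wherever it likes. Nothing here reflects the kernel's actual data structures or thread naming.

```python
import queue
import threading


class CallbackRunner:
    """Run 'RCU callbacks' either inline or on a separate worker thread."""

    def __init__(self, offload):
        self.offload = offload
        if offload:
            self._queue = queue.Queue()
            self._worker = threading.Thread(
                target=self._loop, name="toy-offload", daemon=True
            )
            self._worker.start()

    def _loop(self):
        while True:
            callback = self._queue.get()
            if callback is None:  # shutdown marker
                break
            callback()

    def grace_period_ended(self, callbacks):
        for callback in callbacks:
            if self.offload:
                self._queue.put(callback)  # a regularly scheduled thread runs it later
            else:
                callback()  # runs right now, in the caller's context

    def shutdown(self):
        if self.offload:
            self._queue.put(None)
            self._worker.join()


if __name__ == "__main__":
    def free_old_version():
        print("freeing in thread", threading.current_thread().name)

    for mode in (False, True):
        runner = CallbackRunner(offload=mode)
        runner.grace_period_ended([free_old_version])
        runner.shutdown()
```

In the inline case the callback reports the calling thread's name; in the offloaded case it reports the worker thread's name, which is the whole point of offloading.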
This is what rcu_nocbs controls; it's a list of the CPUs in your system that should have their RCU callbacks offloaded to threads. Normally, people use it to fence off a few CPUs from the random interruptions of softirq RCU callbacks.
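To make the "list of CPUs" part concrete, here is a small sketch that pulls rcu_nocbs out of a kernel command line and expands it into CPU numbers. The function names are hypothetical, and the parser only handles the simple comma-and-range form (such as 0-3,8); the kernel's CPU list syntax has additional forms that this deliberately ignores.

```python
def parse_cpu_list(spec):
    """Expand a simple CPU list such as '0-3,8' into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            if low > high:
                raise ValueError(f"bad CPU range: {part}")
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return cpus


def offloaded_cpus(cmdline):
    """Return the CPUs named by rcu_nocbs= in a command line, or an empty set."""
    for arg in cmdline.split():
        if arg.startswith("rcu_nocbs="):
            return parse_cpu_list(arg.split("=", 1)[1])
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(sorted(offloaded_cpus(f.read())))
    except OSError:
        print("no /proc/cmdline available here")
```

A setting like rcu_nocbs=2-3 fences off just two CPUs, which is the normal use.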
However, the rcu_nocbs=0-N setting we're using specifies all CPUs, so it shifts all RCU callbacks from softirq context during interrupt handling (on whatever specific CPU is involved) to kernel threads (on any CPU). As far as I can see, this has two potentially significant effects, given that Matt Dillon of DragonFly BSD has reported an issue with IRETQ that completely stops a CPU under some circumstances.
First, our Ryzen CPUs will spend less time in interrupt handling,
possibly much less time, which may narrow any timing window required
to hit Matt Dillon's issue. Second, RCU callback processing will
continue even if a CPU stops responding to IPIs, although
I expect that a CPU not responding to IPIs is going to cause the
Linux kernel various other sorts of heartburn.
(Unfortunately, Matt Dillon's issue doesn't correspond well with the observed symptoms, where Ryzens hang under Linux not while busy but while idle. My kernel stack backtraces do suggest that at least one CPU is spinning waiting for its IPI to other CPUs to be fully acknowledged, though, so perhaps there is a related problem. Perhaps there are even several problems.)