Non-uniform CPU hyperthreading is here and can cause fun issues
Today I said something on the Fediverse:
Today my co-worker discovered that the SLURM job scheduler requires your hyperthreading to be uniform across your CPU cores. Our latest SLURM GPU nodes have Intel hybrid CPUs, which aren't uniform; they have 24 cores but 32 threads total, because only the 8 performance cores are hyperthreaded.
I guess we'll turn off hyperthreading. Thanks, Intel and SLURM.
(I'm sure people are going to discover much more fun with this.)
These new GPU machines have Intel i9-13900K CPUs. Modern higher-end Intel desktop CPUs have a split core model, with a mix of better 'performance' cores and more power-efficient 'efficient' cores. The 'efficient' cores are lower performance and don't have hyperthreading. In the case of the i9-13900K, the split is 8 performance and 16 efficient cores; with hyperthreading on, you have 8 performance cores, 8 extra logical CPUs from the hyperthreads on those cores, and then 16 efficient cores, for a total of 32.
(See my entry on sorting out Intel desktop hyper-threading for more. This Intel CPU quirk has actually been around for some time.)
The lscpu(1) information for this Intel CPU is a little hard to decode unless you know what's going on:
  CPU(s):              32
  [...]
  Thread(s) per core:  2
  Core(s) per socket:  24
  Socket(s):           1
According to 'lscpu -e', Linux logical CPUs 0 through 15 are the performance cores, with successive logical CPUs being hyperthread pairs (so 0 and 1 are the same core, 2 and 3 are the same core, and so on). Logical CPUs 16 through 31 are 'efficient' cores with lower maximum clock speeds. This pairing isn't always how (Intel) hyperthreading is done; my home desktop has a 6-core hyperthreaded CPU, with the pairs being CPU 0 and 6, 1 and 7, and so on.
(I don't know what decides how this pairing works.)
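You can see the pairing directly in sysfs; here's a little sketch, assuming the standard per-CPU topology files that modern Linux kernels expose:

```shell
# Print each logical CPU and the sibling(s) it shares a physical
# core with. On the i9-13900K layout described above, CPUs 0-15
# report pairs like "0-1", "2-3", and so on, while the efficient
# cores 16-31 each report only themselves.
for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
    printf '%s: ' "${cpu##*/}"
    cat "$cpu/topology/thread_siblings_list"
done
```

On my home desktop this would instead print things like "0: 0,6" and "1: 1,7", matching the other pairing scheme.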
It's not news that this non-uniform CPU distribution is likely to cause heartburn for software; this is just our first encounter with it. That's partly because these are probably our first machines with Intel's non-uniform core and CPU structure. Future versions of SLURM will probably be updated to deal with both the non-uniform hyperthreading and perhaps the non-uniform CPU speeds.
It's worth noting that in theory you can already have non-uniform hyperthreading on a system even without Intel doing weird things in their CPUs. On a multi-socket server, you might wind up with hyperthreading enabled on only one socket for some reason. It's also possible to have non-contiguous Linux CPU numbers, for example because you've offlined one socket on a dual-socket machine and have hyperthreading on.
Since I looked it up, there are two ways to disable SMT (Simultaneous Multithreading), aka hyperthreading, in the Linux kernel whether or not your BIOS supports doing so. First, you can add 'nosmt' to your kernel command line parameters. Second, you can change it during startup by writing 'off' to /sys/devices/system/cpu/smt/control, which will also tell you the state of SMT on your system. I don't know what either option does to Linux's logical CPU numbering; if you need (or want) sequential CPU numbering with SMT off, you may need to disable SMT in the BIOS.
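Checking and changing the runtime state looks something like this (the control file is the kernel's standard SMT interface; writing to it needs root):

```shell
# Read the current SMT state; the kernel reports values such as
# "on", "off", "forceoff", "notsupported", or "notimplemented".
smt_state=$(cat /sys/devices/system/cpu/smt/control 2>/dev/null || echo notsupported)
echo "SMT is currently: $smt_state"

# To turn SMT off at runtime (as root); the hyperthread sibling
# CPUs go offline immediately:
#   echo off > /sys/devices/system/cpu/smt/control
```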
(This might be a sysfs file you want to check or monitor if for some reason you need to be sure that SMT is disabled or not available on your systems.)
PS: Another option on these i9-13900Ks might be to offline the efficiency cores and see if SLURM will be happy calling the result a good old-fashioned 8-core/16-thread socket. Since we're using these as SLURM GPU nodes, where we traditionally don't care about the CPU, losing the efficiency cores may not really matter.
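A sketch of what that offlining could look like, assuming the CPU numbering described above (the efficient cores are logical CPUs 16 through 31). The actual writes need root, so this version only reports what it would do unless you set DO_OFFLINE=1:

```shell
# Take the 16 efficient cores offline, leaving the 8 hyperthreaded
# performance cores looking like a conventional 8-core/16-thread CPU.
for cpu in $(seq 16 31); do
    f="/sys/devices/system/cpu/cpu$cpu/online"
    if [ "${DO_OFFLINE:-0}" = 1 ] && [ -w "$f" ]; then
        echo 0 > "$f"          # actually offline the CPU (root only)
        echo "offlined cpu$cpu"
    else
        echo "would offline cpu$cpu"
    fi
done
```

Writing '1' back to the same files brings the cores back, although 'nosmt' or the BIOS remain the more permanent options.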
(I'm aware that some GPU computation jobs still want plenty of CPU. People with those sort of jobs probably won't be happy with our SLURM GPU nodes in general, which are mostly not 'powerful machines with GPUs' but instead 'a (once) decent GPU in any machine we can put it in', although we did at least bring all of the GPU nodes up to 32 GB of RAM.)