I'm unsure of the security of simultaneous multithreading on modern x86 CPUs

November 11, 2021

We're planning to get some high core count machines to be new compute machines in our environment of general multi-user Unix login servers, and in the process of working on this we found ourselves with an important question: is it reasonably safe and secure to turn on simultaneous multithreading on modern CPUs from either Intel or AMD or both? It's been surprisingly hard to come up with a decent answer.

(For reasons beyond the scope of this entry, we can assume that SMT is worthwhile for us.)

The original security issues with Intel's Hyper-Threading and often AMD's SMT back in 2018 were Spectre and Meltdown and their speculative execution friends. At the time, disabling HT/SMT on affected CPUs was a required mitigation (and so we disabled it on most of our machines), and OpenBSD felt that SMT was fundamentally unfixable and disabled it universally. Since then, both Intel and AMD have released new CPU generations which they claim mitigate many of these vulnerabilities in general (cf. Intel's statement about MDS), although it's hard to find statements specifically about SMT/HT.

On the side of still disabling SMT on both Intel and AMD, you can find Azure's advice to disable SMT if you run untrusted code in your VMs and the Linux kernel's "core scheduling" documentation saying that disabling SMT is the only full mitigation (also the LWN story and the nice version of the kernel documentation on hardware vulnerabilities). And in general, the OpenBSD people have a good point. The usual purpose of SMT is to dynamically share some of the resources of a single core between two threads, so it seems quite likely that there are all sorts of ways to extract information about what the other thread is doing.

On the side of enabling SMT on sufficiently modern hardware, there are things like Intel's claims that these things aren't an issue in their latest CPUs. There is also what the Linux kernel reports in /sys/devices/system/cpu/vulnerabilities. On older Intel processors, some of these will talk about SMT being enabled (or disabled), while on at least some 2nd generation Xeon Scalable processors the Linux kernel reports that it's not affected even with HT still on (the one I know about is a 'Cascade Lake' CPU, the Silver 4215R). In addition, various online media sources often report that there are few to no known in-the-wild exploits for all of these speculative execution vulnerabilities, which reduces the practical risks to us (for reasons that don't fit in this entry).
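As a rough illustration of checking what the kernel reports, here is a minimal Python sketch that reads every file under /sys/devices/system/cpu/vulnerabilities and flags the ones whose status text mentions SMT. The function names are made up for this example, and the "mentions SMT" check is only a crude substring match, not a parse of the kernel's status format. It also assumes a kernel recent enough to expose /sys/devices/system/cpu/smt/active; if that file is missing, the sketch reports None.

```python
from pathlib import Path

# Standard Linux sysfs locations; the SMT one only exists on newer kernels.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")
SMT_ACTIVE = Path("/sys/devices/system/cpu/smt/active")


def read_vulnerabilities(vuln_dir=VULN_DIR):
    """Return {vulnerability name: status line} for each file in vuln_dir."""
    vuln_dir = Path(vuln_dir)
    if not vuln_dir.is_dir():
        return {}
    report = {}
    for entry in sorted(vuln_dir.iterdir()):
        if entry.is_file():
            try:
                report[entry.name] = entry.read_text().strip()
            except OSError:
                report[entry.name] = "(unreadable)"
    return report


def mentions_smt(report):
    """Names whose status text mentions SMT at all (a crude substring match)."""
    return [name for name, status in report.items() if "smt" in status.lower()]


def smt_active(path=SMT_ACTIVE):
    """True or False from the kernel's smt/active file, None if unavailable."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    report = read_vulnerabilities()
    flagged = set(mentions_smt(report))
    print("SMT active:", smt_active())
    for name, status in report.items():
        marker = "  <-- mentions SMT" if name in flagged else ""
        print(f"{name}: {status}{marker}")
```

Running this on a handful of machines from different CPU generations is a quick way to see which ones the kernel thinks still care about SMT, but it only tells you the kernel's view of known vulnerabilities, not whether SMT is safe in general.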

If you operate servers in a strongly hostile environment, it seems most likely that you should disable SMT universally even on modern Intel and AMD CPUs. It's not as clear for people like us, who have a somewhat exposed but not outright hostile environment. As a practical matter, we could probably enable SMT without getting a security breach as a result.


Comments on this page:

The entire mess is suitably interesting.

It's obvious to me that secure computation should happen at a higher level. Consider an expression reducer entirely divorced from the ability to determine time, representation provenance, and other such things; it's clearly safe to run such a program without worries. People could argue it would be too inefficient, but it's not as if most programs properly use the resources afforded to them anyway.

Perhaps I should write about this on my website at some point.

By Walex at 2021-11-13 09:20:21:

«If you operate servers in a strongly hostile environment»

Write-down and side-channels have been known issues for many decades (well before the Orange Book, 1983) and I think that they cannot be solved cheaply. As to "strongly hostile environment" sites, that is not universities and not "cloud" providers, I suspect they run gapped systems for every security level or compartment, so they might as well not worry about side channels (yes "only the paranoid survive", though).

«we could probably enable SMT without getting a security breach as a result»

Depends on which “security breach”. Side channels usually allow write-down or read-up, that is information leakage, but usually do not allow write-up directly, that is privilege escalation. If someone is storing credentials that allow privilege escalation unencrypted in memory on shared computers with side-channels (never mind "giant backdoor" VM hosts) they deserve what they get :-).

Written on 11 November 2021.


Last modified: Thu Nov 11 23:54:03 2021