Intel's MDS issues have now made some old servers almost completely useless to us

June 14, 2019

Over on Twitter, I said:

One unpleasant effect of MDS is that old Intel-based machines (ones with CPUs that will not get microcode updates) are now effectively useless to us, unlike before, because it's been decided that the security risks are too high for almost everything we use machines for.

If Intel releases all of the MDS microcode updates they've promised to do (sometime), this will have only a small impact on our available servers. If they decide not to update some older CPUs they're currently promising updates for, we could lose a significant number of servers.

To have MDS mitigated at all on Intel CPUs, you need updated microcode (as well as an updated kernel and hyperthreading turned off; see eg Ubuntu's MDS page). According to people who have carefully gone through the CPU lists available in PDFs that are linked from Intel's MDS advisory page, Intel is definitely not releasing microcode updates for anything older than the 2nd generation 'Core' "Sandy Bridge" architecture, which started appearing in 2011 or so (so my 2011 home machine could in principle receive a microcode update and be fixed against MDS). In theory they are going to release microcode updates for everything since then.
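As a rough illustration of how those three pieces show up on a running machine: Linux kernels with MDS support report the state in /sys/devices/system/cpu/vulnerabilities/mds. The sketch below classifies that one-line status. The function name classify_mds and the dictionary keys are made up for this example, and it assumes the status wording documented for mainline kernels, which distribution kernels may vary.

```python
from pathlib import Path

# sysfs file provided by kernels that know about MDS
MDS_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/mds")


def classify_mds(status: str) -> dict:
    """Classify the kernel's one-line MDS status string.

    Hypothetical helper for illustration; it only does substring
    checks against the wording used by mainline kernels.
    """
    text = status.strip()
    lowered = text.lower()
    return {
        "raw": text,
        # "Not affected" means the CPU does not have the flaw at all
        "affected": not lowered.startswith("not affected"),
        # "Mitigation: ..." means microcode and kernel are both in place
        "mitigated": lowered.startswith("mitigation"),
        # the kernel tried to clear buffers but the microcode is too old
        "needs_microcode": "no microcode" in lowered,
        # hyperthreading is still on, so sibling threads can leak
        "smt_exposed": "smt vulnerable" in lowered,
    }


if __name__ == "__main__":
    if MDS_SYSFS.exists():
        print(classify_mds(MDS_SYSFS.read_text()))
    else:
        print("this kernel does not report MDS status")
```

A machine that reports needs_microcode as true and has a CPU Intel is not going to update is the sort of unpatchable machine this entry is about.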

For reasons beyond the scope of this entry, we have decided that we have almost no roles where we are comfortable deploying an unpatchable machine that is vulnerable to MDS. In normal places this might not be an issue, since they would long since have turned over old server hardware. We are not a normal place; we run hardware into the ground, which means that we often have very old servers. For example, up until we reached this decision about MDS, we were still using Dell 1950s as scratch and test machines.

In theory, the direct effects of MDS on our stock of live servers are modest but present; we're losing a few machines, including one quite nice one (a Westmere Xeon with 512 GB of RAM), and some machines that were on the shelf are now junk. However, at the moment we have a significant number of servers that are based around Sandy Bridge Xeons. Intel has promised microcode updates for these CPUs but has not yet delivered them. If Intel never delivers updated microcode, we'll lose a significant number of machines and pretty much decimate the compute infrastructure we provide to the department.

(A great many of our current compute machines are donated Dell C6220 blades with Xeon E5-2680s, specifically CPUID 206D7 ones. Don't ask how much work it took to get the raw CPUID values that Intel puts in their PDF.)
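For what it's worth, a raw CPUID value like 206D7 can be turned into the family, model, and stepping numbers that Linux shows in /proc/cpuinfo. This is a minimal sketch using the standard x86 signature layout (stepping in bits 0-3, model in bits 4-7, family in bits 8-11, extended model in bits 16-19, extended family in bits 20-27). The function name decode_cpuid is made up for this example.

```python
def decode_cpuid(signature: int) -> tuple[int, int, int]:
    """Split an x86 CPUID signature into (family, model, stepping).

    Hypothetical helper for illustration only.
    """
    stepping = signature & 0xF
    model = (signature >> 4) & 0xF
    family = (signature >> 8) & 0xF
    ext_model = (signature >> 16) & 0xF
    ext_family = (signature >> 20) & 0xFF

    # The extended model only counts for family 6 and family 15 CPUs.
    if family in (6, 15):
        model |= ext_model << 4
    # The extended family only counts when the base family is 15.
    if family == 15:
        family += ext_family
    return family, model, stepping


if __name__ == "__main__":
    family, model, stepping = decode_cpuid(0x206D7)
    print(f"cpu family {family}, model {model}, stepping {stepping}")
```

For 206D7 this works out to family 6, model 45, stepping 7, which can be compared against the "cpu family", "model" and "stepping" lines in /proc/cpuinfo on the machine itself.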

If further significant MDS-related attacks get discovered and Intel is more restrictive with microcode updates, this situation will get worse. We have a significant number of in-service Dell R210 IIs and R310 IIs, and they are almost as old as these C6220s (although some of them have Ivy Bridge generation CPUs instead of Sandy Bridge ones). Losing these would really hurt.

(In general we are not very happy with Intel right now, and we are starting to deploy AMD-based machines where we can. I would be happy if someone started offering decent basic 1U or 2U AMD-based servers at competitive prices.)


Comments on this page:

By Liam Greenwood at 2019-06-14 08:47:39:

Are you able to deploy the AMD boxes in parallel with Intel compute boxes? Is running on AMD vs Intel totally transparent to your users running jobs, and to your software builds?

Thanks Cheers, Liam UNC

By cks at 2019-06-14 11:24:16:

Intel versus AMD is totally transparent to us, and it's transparent enough to our users that none of them have complained (and even if they did, we would probably shrug and say 'those are the compute machines we have available, feel free to fund more from your grants'). In general, our collection of Intel-based servers isn't a deliberate choice; it's just that most of the servers available to buy off the shelf have historically been Intel-based (as have the donated ones). Our current round of AMDs are actually hand-built from parts, which is an exception to our usual practice.

(We have had some AMD compute servers in the past, but only sporadically. We did have a long run of 1U AMD-based servers, in the form of Sun's Sunfire X2100 and X2200 units, which we were pretty fond of in general. Some of them are still alive and in service, even.)

By Computer at 2019-06-14 20:16:06:

You say "at competitive prices". It was my understanding that comparable AMD chips cost much less?

By cks at 2019-06-14 22:07:23:

We care about overall server price, and right now it appears that at least Dell is targeting higher end servers with their AMD-based units. On a CPU to CPU basis the AMD servers might be cheaper or competitive (I haven't tried to check), but we do not generally buy higher end servers; we mostly only have the money to get inexpensive 1U servers.

(Most servers that we deploy don't have high performance needs, but we want to get as many of them as we can for a fixed amount of money. More servers is more important to us than more powerful servers.)

In general, vendors like Dell seem to only be dipping their toes into the AMD waters, especially in 1U. Most of their servers are still Intel CPUs.

Written on 14 June 2019.


Last modified: Fri Jun 14 01:13:04 2019