I don't expect to see competitive RISC-V servers any time soon

June 22, 2023

Recently on the Fediverse, Dan Luu was dubious about a prediction that RISC-V would take over in datacenters in the next 5 to 10 years (here's the EETimes article being quoted from). Much like Dan Luu, I was skeptical, considering that under nearly ideal circumstances AMD didn't make much of a dent. But let's take this from the top, and ask what RISC-V would need, and by when, if it's going to do this.

(This is implicitly 64-bit RISC-V. No one is going to put 32-bit RISC-V into datacenters, much less have it take over.)

Obviously if RISC-V is going to take over in datacenters, there need to be RISC-V servers that people can buy, including off the shelf. This is especially the case for non-cloud datacenter usage of servers; only the cloud players and a few other big places design and manufacture their own servers. These servers need suitably good RISC-V CPUs and chipsets (either as systems on a chip or separately). Apart from performance, these systems need multi-socket support, lots of PCIe lanes, ECC support for large amounts of RAM in modern standards, and so on. Given that moving to RISC-V will make people's lives harder, these servers and their CPUs need to be unambiguously better than the x86 (and ARM) server systems available at the same time. Given that domination has a lead time, these servers need to be available in quantity and with proven quality before that five (or ten) year deadline, probably years before.

(Realistically the first generation of RISC-V datacenter servers would probably not take over, unless they were amazing marvels that utterly eclipsed the competition. I would expect it to need two or three generations, just to prove things, shake issues out, and convince people that these servers really are sufficiently better than the competition.)

These RISC-V datacenter servers will also need proven operating systems and other software to run, and that software will need proven and good compilers and other tools to build it. Shaking the architecture-specific bugs out of compilers and operating systems takes time, probably years of increasingly serious usage. The developers of all of this software will need RISC-V hardware to use for this, and this hardware mostly can't be early versions of those datacenter servers (datacenter servers are too loud, too large, and too expensive for many people). Some developers will want to use RISC-V hardware as their daily desktop, but I suspect many others will want a quiet mini-sized box they can put in the corner (and use over the network). There will also need to be early servers that can be used to set up the infrastructure of open source (Linux) development, for things like dedicated builders for Debian and other large projects (GCC, clang, Rust, the Linux kernel, etc), CI/CD build servers that smaller open source projects can use, and so on.

(As a practical matter, the quality of compiler optimization, kernel tuning, and so on has a significant effect on the realized CPU performance of anything. Bringing all of this optimization up to speed to take advantage of the raw capabilities of good RISC-V CPUs will take (more) time.)

All of this will take money both literally, for hardware, and possibly figuratively, for people's time. The amount of time this RISC-V bringup takes will be influenced by how much actual money is spent on it. If interested companies wait for Linux developers and other parties to spend their own money and time on buying developer hardware and working on RISC-V kernels, software, and Linux distributions, it's probably going to take quite a while. If interested companies spend money, they can to some extent accelerate this process.

At the moment, RISC-V has very little of this as far as I know (based partly on replies to my Fediverse post about this). RISC-V is probably in a somewhat better place than ARM64 was a decade ago (partly because RISC-V people have learned lessons from ARM's experiences), but it's not all that far along. On top of that, even ARM is not doing all that well in competition with x86. I believe that the only competitive ARM64 servers available today are the proprietary ones Amazon made for AWS, and while those see real usage (as covered in comments on my earlier entry), they haven't exactly taken over even AWS.

Given all of the steps between current reality and the prediction, I believe there's no way it can be reached in five years. Ten years might be possible, but it feels like an aggressive timeline that needs a lot of fast development. I'd want to see the first generation of RISC-V datacenter servers in five years, which means we need high-performance RISC-V CPUs in only a couple of years, along with developer hardware. That developer hardware probably needs to exist in large quantity, in order to kickstart the development that will be necessary if those first generation datacenter servers are going to sell to anyone in any quantity.

(If we have the first generation datacenter servers in five years, that gives two years to get a better second or even third generation out, a year for people to come to trust those servers, and then two years to ramp up purchases to take over the installed base at year ten. If people keep datacenter servers long enough that RISC-V servers need to be dominating sales well before year eight, the timeline gets worse and thus less plausible.)
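To make the arithmetic in that ten-year schedule explicit, here is a minimal sketch that simply adds up the phases. The phase labels and the helper name `cumulative_timeline` are illustrative assumptions; only the durations (5, 2, 1, and 2 years) come from the paragraph above.

```python
# Phase durations (in years) are taken from the paragraph above.
# The labels and the helper function are hypothetical, for illustration only.
phases = [
    ("first generation datacenter servers available", 5),
    ("better second or third generation released", 2),
    ("people come to trust those servers", 1),
    ("purchases ramp up to take over the installed base", 2),
]


def cumulative_timeline(phases):
    """Return (label, year_completed) pairs for phases that run back to back."""
    year = 0
    timeline = []
    for label, duration in phases:
        year += duration
        timeline.append((label, year))
    return timeline


timeline = cumulative_timeline(phases)
for label, year in timeline:
    print(f"year {year:2d}: {label}")

# The whole schedule has to fit inside the ten-year prediction.
total_years = timeline[-1][1]

# The purchase ramp-up starts once the trust phase ends, which is the
# "year eight" the text refers to.
sales_dominance_start = timeline[-2][1]
```

There is no slack in this schedule: any phase that runs long, or a longer server replacement cycle that pushes the start of the ramp-up earlier than year eight, moves the total past ten years.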


Last modified: Thu Jun 22 22:51:03 2023