Illustrating the importance of fully multi-core program building today

January 29, 2021

I have an enduring interest in comparing the from-scratch Firefox build time on my office AMD machine and my home Intel machine, which are from the same era and have very similar configurations but drastically different Firefox build times. One of the things that I have noticed about the difference, and about building Firefox in general, is, well, I will quote my tweet:

One major area where the Firefox Nightly build takes longer on my AMD machine than on my Intel one is the end stage of building the Rust webrender, webrender_bindings, and especially gkrust. This seems to be single-core in Rust, so the lower single-core performance really hurts.

As it happens, I'm sort of wrong here, at least in general (writing Wandering Thoughts gives me plenty of opportunities to be wrong once I research things more). I don't know much about Firefox's overall build process, but it definitely builds its Rust components using Cargo, the standard Rust tool for this. Cargo itself will do building in parallel by default and Firefox doesn't turn that off; as a result, there's a cargo process running from very early on in the Firefox build process with multiple concurrent rustc processes beneath it.

(I don't know how the Firefox build process balances the Cargo concurrency with the simultaneous C++ compilation concurrency it's getting. It doesn't seem to invoke cargo with any special flags to limit concurrency but it also doesn't flood my machine.)

However, toward the end of the Firefox build process, my AMD machine will spend a significant portion of the build time (multiple minutes) with rustc running alone on a single core, apparently primarily building gkrust itself. This single-core build time is a clear bottleneck in building Firefox on my AMD machine (and is visible to some extent on my Intel one). Since rustc's memory usage keeps climbing, this may be some final step of assembling the gkrust crate instead of actually compiling things, but it's still a clear single-core bottleneck. Depending on how long the whole process takes, this single-core Rust time can be a quarter of my entire Firefox build time on my AMD machine.
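As a rough illustration of why the lower single-core performance hurts so much, here is a minimal sketch of the arithmetic. The 0.25 share is the "a quarter" figure above; the function name and the single-core speedup factors are hypothetical, picked only for illustration, and are not measurements from either machine.

```python
def build_time_after_speedup(serial_share, single_core_speedup):
    """Relative build time if only the single-core stage gets faster.

    serial_share: fraction of today's wall-clock build time spent in the
        single-core stage (0.25 is the "a quarter" figure from the text).
    single_core_speedup: hypothetical factor by which that one stage
        speeds up; everything else is assumed to stay the same.
    """
    if not 0.0 <= serial_share <= 1.0:
        raise ValueError("serial_share must be between 0 and 1")
    if single_core_speedup <= 0:
        raise ValueError("single_core_speedup must be positive")
    parallel_share = 1.0 - serial_share
    return parallel_share + serial_share / single_core_speedup


if __name__ == "__main__":
    # Hypothetical single-core speedup factors, not measured values.
    for factor in (1.0, 1.25, 1.5, 2.0):
        relative = build_time_after_speedup(0.25, factor)
        print(f"{factor:.2f}x faster single core: "
              f"{relative:.3f} of the current build time")
```

Under these assumptions, even doubling single-core speed only takes the build to 87.5% of its current time, while adding more cores cannot shrink that quarter at all.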

I'm not picking on Rust here; it's just that Firefox and Rust's role in building it makes a handy example. Building things concurrently is hard in general, and if it is the linking stage that's the single-core bottleneck, that's even harder; linking has historically been challenging to make a multi-threaded activity. But at the same time it's increasingly important to do as much as you can here, in both the language and the build system. Any single-threaded build stage in a large program can kill build speeds.

(This is kind of an inverted version of Amdahl's law. Although I suppose if the final rustc is churning through a lot of memory, that might not help, especially if it's relatively random memory accesses; RAM latency remains comparatively terrible, and my office AMD machine doesn't have the fastest memory.)
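For reference, the standard form of Amdahl's law can be sketched as below. This assumes the parallel part of the build scales perfectly with core count, which a real build will not quite do, and the 8-core figure is a made-up assumption; the core counts of the machines are not given here. Both function names are hypothetical.

```python
def amdahl_speedup(serial_fraction, cores):
    """Amdahl's law: speedup over one core for a given serial fraction of work."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def serial_fraction_from_wall_share(wall_share, cores):
    """Convert 'share of wall-clock time spent single-core on `cores` cores'
    into the serial fraction of total work, assuming ideal parallel scaling."""
    serial_work = wall_share
    parallel_work = (1.0 - wall_share) * cores  # work done across all cores
    return serial_work / (serial_work + parallel_work)


if __name__ == "__main__":
    assumed_cores = 8  # hypothetical core count, not taken from the text
    s = serial_fraction_from_wall_share(0.25, assumed_cores)
    print(f"serial fraction of total work: {s:.3f}")
    for n in (assumed_cores, 16, 32):
        print(f"{n:2d} cores: {amdahl_speedup(s, n):.2f}x speedup over one core")
    print(f"upper bound with unlimited cores: {1.0 / s:.1f}x")
```

The point of the conversion is that a stage taking a quarter of the wall-clock time on a many-core machine can be a very small fraction of the total work; under the 8-core assumption it is only about 4%, yet it still caps the achievable speedup.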

Written on 29 January 2021.