Portability has ongoing costs for code that's changing

March 22, 2021

I recently had a hot take on Twitter:

Hot take: no evolving project should take patches for portability to an architecture that it doesn't have developers using, or at least a full suite of tests and automated CI on. Especially if the architecture is different from all supported ones in some way (eg, big-endian).

Without regular usage or automated testing, it's far too likely that the project will break support on the weird architecture. And keeping the code up to date for the weird architecture is an ongoing tax on other development.

Some proponents of alternate architectures like to maintain that portability is free, or at least a one-time cost (one that can be paid by outside contributors in the form of a good patch to 'add support' for something). It would be nice if our programming languages, habits, and techniques made that so, but they don't. The reality is that maintaining portability to alternate environments is an ongoing cost.

(This should not be a surprise, because all code has costs and by extension more code has more costs.)

To start with, we can extend the general rule that 'if you don't test it, it doesn't work'. If alternate environments (including architectures) aren't tested by the project and aren't regularly used by its developers, they will be broken sooner or later. Not because people are necessarily breaking them deliberately, but because people overlook things and never have the full picture. This happens even to well-intentioned projects that genuinely want to support something they don't test, as shown by Go's self-tests. Believing that an alternate architecture will work and keep working without regular testing goes against everything we've learned about the need for regular automated testing.

(I believe that for social reasons it's almost never good enough to just have automated tests without developers who are using the architecture and are committed to it.)

Beyond the need for extra testing, portability is an ongoing cost for code changes. The alternate architecture's differences and limits have to be considered during programming, which complicates changes, and some changes will turn out to be broken only on the alternate architecture and will require extra work to fix. Sooner or later some desired change won't be possible or feasible on the alternate architecture, or will require extra work to implement options, alternatives, or fallbacks. In some cases it will take extreme measures to meet the project's quality of implementation standards for changes and new features. And when something slips through the cracks anyway, the project gets additional bug reports, ones that will be difficult to deal with unless the project has ongoing, easy access both to the alternate environment and to experts in it.
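To make that concrete, here is a hypothetical sketch in C (the function names are invented for illustration, not taken from any real project) of decoding a 32-bit length field from a byte buffer, one of the classic places where a big-endian port quietly differs:

```c
#include <stdint.h>
#include <string.h>

/* Decode by copying the bytes into a uint32_t: the result is in host
 * byte order, so the same four bytes decode to different values on
 * little-endian and big-endian machines. This passes every test on
 * x86 and silently misbehaves on a big-endian port. */
uint32_t decode_host_order(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Decode explicitly as big-endian (network order) by assembling the
 * bytes arithmetically: the result is the same on every architecture. */
uint32_t decode_big_endian(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

On a little-endian machine the two functions disagree about the same four bytes, and nothing short of testing on (or consciously thinking about) a big-endian machine tells you which one a given caller actually needed.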

More broadly, the alternate architecture is implicitly yet another thing that programmers working on the project have to keep in mind. The human mind has a limited capacity for this; by forcing people to remember one thing, you make it so that they can't remember other things.

The very fact that an alternate architecture actually needs changes now, instead of just building and working, shows that it will most likely need more changes in the future. And the need for those changes did not arise from nothing. Programmers mostly don't deliberately write 'unportable' code to be perverse. All of that code that needs changes for your alternate architecture is the natural outcome of our programming environment. That code is the easiest, most natural, and fastest way for the project to work. The project has optimized what matters to it, although not necessarily actively and consciously.
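As a hypothetical C sketch of what 'easiest, most natural, and fastest' looks like in practice (again, the function names here are my own, purely for illustration): reading an integer out of a buffer with a straight pointer cast is the code you write first, and it works fine everywhere you test on x86:

```c
#include <stdint.h>
#include <string.h>

/* The natural, fast version: just cast the pointer. This works on
 * x86, which tolerates unaligned loads, but it is undefined behavior
 * in strict C and can crash outright (SIGBUS) on architectures that
 * require aligned accesses, such as SPARC. */
uint32_t read_u32_natural(const unsigned char *p)
{
    return *(const uint32_t *)p;
}

/* The portable version: go through memcpy, which modern compilers
 * optimize into a single load wherever that is safe. It is slightly
 * more to write, and nothing pushes you to write it until the cast
 * version breaks on a machine you don't test on. */
uint32_t read_u32_portable(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}
```

The unportable version isn't perverse; it's the obvious one, which is exactly why it keeps getting written.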

(My tweets and this entry were sparked by portions of this, via. The whole issue has come up in the open source world before, but this time I felt like writing my own version of this rejoinder.)


Comments on this page:

By James at 2021-03-24 03:37:22:

Two things:

  • Should this not also apply to OSes (e.g. how many devs on a project use a BSD, or even Windows)? Arguably supporting different architectures finds more logic/design bugs than other OSes, and avoids the code becoming too tied to a single architecture, which causes problems down the line when the default changes (e.g. the 32->64 transition, a possible future change to ARM).

  • Who is a developer (and when are they not)? Is it someone who has commit/merge privileges, someone who regularly sends patches (but without commit/merge privileges) or a packager who follows upstream and regularly runs tests (maybe even with a private test infrastructure, in order to manage the load on systems), and sends portability patches?

Arguably the answer for smaller projects now comes down to whether they can get free (in terms of cost) CI services (e.g. how many projects truly test on 32-bit systems), and how long said CI takes to run (e.g. projects switching away from AppVeyor to Azure Pipelines/GitHub Actions for testing on Windows due to the latter's increased throughput, or projects avoiding macOS testing on Travis due to its long wait times). I wonder whether, before the time of widespread free CI, projects were more accepting of patches adding/fixing support for different architectures/OSes.

By cks at 2021-03-24 15:49:58:

I think that this definitely applies to OSes, and in fact many projects have run into this issue. For example, I know that darktable spent a long time telling people that they couldn't accept patches for Windows support because they had no one working on Windows and so couldn't keep the patches working (they later got active Windows people and now support it).

If portability to various environments and future proofing is a goal for the project (it isn't always), then supporting additional architectures is a good thing. But 'supporting' requires more than accepting patches once; you have to be able to verify that things still work and to fix issues when they don't. This is true even if your goal is only future proofing (which is hard, since pure future proofing can't easily be tested in environments you don't have). I feel that this is partly a matter of our current tools: how many programming environments have linters and so on that look for and can warn about portability issues, especially subtle ones?

(This assumes that you're doing something that even allows for easy portability if you get the code right. Chromium is not, for example, because it requires a high quality JIT for each architecture in order to meet its quality goals.)

My version of 'developer' here would be more or less 'someone who regularly works on the code and contributes changes'. I feel that you want at least some of the people who are evolving the program in general to be working on any architecture that you support, so they can notice things and raise issues in places like discussions of new features, plans for code reorganizations, and code reviews. Portability issues are much like bugs; the sooner you catch and consider them, the better.

