2024-04-06
GNU Autoconf is not replaceable in any practical sense
In the wake of the XZ Utils backdoor, which involved GNU Autoconf, it's been somewhat popular to call for Autoconf to go away. Over on the Fediverse I said something about that:
Hot take: autoconf going away would be a significant net loss to OSS, perhaps as bad as the net loss of the Python 2 to Python 3 transition, and for much the same reason. There are a lot of projects out there that use autoconf/configure today and it works, and they would all have to do a bunch of work to wind up in exactly the same place ('a build system that works and has some switches and we can add our feature checks to').
(The build system can never supply all needed tests. Never.)
Autoconf can certainly be replaced in general, either by one of the existing and more modern configuration and build systems, such as CMake, or by something new. New projects today often opt for one of the existing alternative build systems and (I believe) often find them simpler. But what can't be replaced easily is autoconf's use in existing projects, especially projects that use autoconf in non-trivial ways.
You can probably convert most projects to alternate build systems. However, much of this work will have to be done by hand, by each project that is converted, and this work (and the time it takes) won't particularly move the project forward. That means you're asking (or demanding) projects to spend their limited time to merely wind up in the same place, with a working build system. Further, some projects will still wind up running a substantial amount of their own shell code as part of the build system in order to determine and do things that are specific to the project.
(Although it may be an extreme example, you can look at the autoconf pieces that OpenZFS has in its config/ subdirectory. Pretty much all of that work would have to be done in any build system that OpenZFS used, and generally it would have to be significantly transformed to fit.)
There likely would be incremental security-related improvements even for such projects. For example, I believe many modern build systems don't expect you to ship their generated files the way that autoconf sort of expects you to ship its generated configure script (and the associated infrastructure), which was one part of what let the XZ backdoor slip files into the generated tarballs that weren't in their repository. But this is not a particularly gigantic improvement, and as mentioned it requires projects to do work to get it, possibly a lot of work.
You also can't simplify autoconf by declaring some standard checks obsolete and dropping everything to do with them. It may indeed be the case that few autoconf based programs today are actually going to cope with, for example, there being no string.h header file (cf), but that doesn't mean you can remove mentioning it from the generated header files and so on, since existing projects require those mentions to work right. The most you could do would be to make the generated 'configure' scripts simply assume a standard list of features and put them in the output those scripts generate.
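To make concrete what "assume a standard list of features" could look like, here's a minimal shell sketch. The HAVE_* names are the macros Autoconf conventionally generates for header checks; 'confdefs.h' is borrowed here purely as an illustrative file name for the defines a configure script accumulates, and the whole thing is a toy, not real Autoconf output:

```shell
# A hypothetical "assume everything" generator: instead of compile-testing
# each standard header, just emit the defines that existing projects expect
# to find in the generated header.
: > confdefs.h
for hdr in STRING_H STDLIB_H UNISTD_H; do
    echo "#define HAVE_${hdr} 1" >> confdefs.h
done
cat confdefs.h
```

Existing projects that test these macros would keep building unchanged; the only thing removed is the (now nearly pointless) runtime check.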
(Of course it would be nice if projects using autoconf stopped making superstitious use of things like 'HAVE_STRING_H' and just assumed that standard headers are present. But projects generally have more important things to spend limited time on than cleaning up header usage.)
PS: There's an entire additional discussion that we could have about whether 'supply chain security' issues such as Autoconf and release tarballs that can't be readily reproduced by third parties are even the project's problem in the first place.
Solving the hairpin NAT problem with policy based routing and plain NAT
One use of Network Address Translation (NAT) is to let servers on your internal networks be reached by clients on the public internet. You publish public IP addresses for your servers in DNS, and then have your firewall translate those public IPs to their internal IPs as the traffic passes through. If you do this with straightforward NAT rules, someone on the same internal network as those servers may show up with a report that they can't talk to those public servers. This is because you've run into what I call the problem of 'triangular' NAT, where only part of the traffic is flowing through the firewall.
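As a concrete sketch of the straightforward setup (all values invented: 192.0.2.10 as the published public IP, 10.0.0.5 as the server's internal IP, and eth0 as the firewall's external interface), a plain destination NAT rule in Linux nftables might look something like:

```shell
# All addresses and interface names here are made up for illustration.
nft add table ip nat
nft 'add chain ip nat prerouting { type nat hook prerouting priority -100; }'
# Rewrite the public IP to the internal one as traffic enters from outside.
nft add rule ip nat prerouting iifname "eth0" ip daddr 192.0.2.10 dnat to 10.0.0.5
```

An internal client talking to 192.0.2.10 is the triangular case: its packets go to the firewall and get rewritten, but the server's replies go straight back across the local network from 10.0.0.5, an address the client's connection doesn't recognize, so the connection fails.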
The ability to successfully NAT traffic to a machine that is actually on the same network is normally called hairpin NAT (after the hairpin turn packets make as they turn around to head out the same firewall interface they arrived on). Not every firewall likes hairpin NAT or makes it easy to set up, and even if you do set it up through cleverness, using hairpin NAT necessarily means that the server won't see the real client IP address; it will instead see some IP address associated with the firewall, as the firewall has to NAT the client IP to force the server's replies to flow back through it.
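A typical hairpin setup adds a source rewrite on top of the destination rewrite, which is exactly why the server stops seeing real client IPs. A hedged nftables sketch, again with invented values (public IP 192.0.2.10, internal server 10.0.0.5, internal interface lan0):

```shell
# Invented addresses and interface names; a real firewall needs more than this.
nft add table ip nat
nft 'add chain ip nat prerouting { type nat hook prerouting priority -100; }'
nft 'add chain ip nat postrouting { type nat hook postrouting priority 100; }'
# Internal clients hitting the public IP get redirected to the internal server...
nft add rule ip nat prerouting iifname "lan0" ip daddr 192.0.2.10 dnat to 10.0.0.5
# ...and their source is rewritten to the firewall's address so the server's
# replies are forced back through the firewall instead of going directly.
nft add rule ip nat postrouting oifname "lan0" ip daddr 10.0.0.5 masquerade
```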
However, it recently struck me that there is another way to solve this problem, by using policy based routing. If you add an additional IP address on the server, set a routing policy so that outgoing traffic from that IP can never be sent to the local network but is always sent to the firewall, and then make that IP the internal IP that the firewall NATs to, you avoid the triangular NAT problem without the firewall having to change the client IP (which means that the internal server gets to see the true client IP for its logs or other purposes). This sort of routing policy is possible with at least some policy based routing frameworks, because at one point I accidentally did this on Linux.
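On Linux this can be sketched with an extra address plus policy routing (all values invented: 10.0.0.6 as the server's additional NAT-target address, 10.0.0.1 as the firewall's internal address, eth0 as the server's interface, and routing table 100 chosen arbitrarily; this assumes the server's primary address is already on that network so the firewall is reachable):

```shell
# Add the second address, the one the firewall will NAT the public IP to.
ip addr add 10.0.0.6/32 dev eth0
# Table 100 contains only a default route via the firewall, so lookups in it
# send everything to the firewall, even local-network destinations.
ip route add default via 10.0.0.1 dev eth0 table 100
# Any packet sourced from the NAT-target address is routed using table 100.
ip rule add from 10.0.0.6 table 100
```

Because the 'ip rule' entry is consulted before the main table, traffic from 10.0.0.6 never takes the direct on-link path to local clients; replies always flow back through the firewall, which is what makes the NAT symmetric.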
(You almost certainly don't want to set up this routing policy for the internal server's primary IP address, the one it will use when making its own connections to machines. I'd expect various problems to come up.)
You still need a firewall that will send NAT'd packets back out the same interface they came in on. Generally, routers will do this for ordinary traffic, but firewall rules on routers may come with additional requirements. However, it should be possible on any routing firewall that can do full hairpin NAT, since that also requires sending packets back out the same interface after applying firewall rules. I believe this is generally going to be challenging, or outright impossible, on a bridging firewall (we once ran into issues with changing the destination on a bridging firewall, although I haven't checked the state of affairs today).