Wandering Thoughts archives

2024-04-06

GNU Autoconf is not replaceable in any practical sense

In the wake of the XZ Utils backdoor, which involved GNU Autoconf, it's been somewhat popular to call for Autoconf to go away. Over on the Fediverse I said something about that:

Hot take: autoconf going away would be a significant net loss to OSS, perhaps as bad as the net loss of the Python 2 to Python 3 transition, and for much the same reason. There are a lot of projects out there that use autoconf/configure today and it works, and they would all have to do a bunch of work to wind up in exactly the same place ('a build system that works and has some switches and we can add our feature checks to').

(The build system can never supply all needed tests. Never.)

Autoconf can certainly be replaced in general, either by one of the existing and more modern configuration and build systems, such as CMake, or by something new. New projects today often opt for one of the existing alternative build systems and (I believe) generally find them simpler. But what can't be replaced easily is autoconf's use in existing projects, especially projects that use autoconf in non-trivial ways.

You can probably convert most projects to alternate build systems. However, much of this work will have to be done by hand, by each project that is converted, and this work (and the time it takes) won't particularly move the project forward. That means you're asking (or demanding) that projects spend their limited time merely to wind up in the same place, with a working build system. Further, some projects will still wind up running a substantial amount of their own shell code as part of the build system in order to determine and do things that are specific to the project.

(Although it may be an extreme example, you can look at the autoconf pieces that OpenZFS has in its config/ subdirectory. Pretty much all of that work would have to be done in any build system that OpenZFS used, and generally it would have to be significantly transformed to fit.)
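
To give a concrete feel for what such checks look like, here is a minimal sketch of a project-specific compile test in configure.ac (the specific test and the HAVE_STAT_ST_MTIM name are illustrative, not taken from OpenZFS):

    dnl Illustrative check: does 'struct stat' have a st_mtim member?
    AC_MSG_CHECKING([whether struct stat has st_mtim])
    AC_COMPILE_IFELSE(
      [AC_LANG_PROGRAM([[#include <sys/stat.h>]],
                       [[struct stat st; (void) st.st_mtim;]])],
      [AC_MSG_RESULT([yes])
       AC_DEFINE([HAVE_STAT_ST_MTIM], [1],
                 [Define to 1 if struct stat has a st_mtim member.])],
      [AC_MSG_RESULT([no])])

Every check of this sort has to be re-expressed in whatever language and macro system a replacement build system uses, one check at a time.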

There likely would be incremental security-related improvements even for such projects. For example, I believe many modern build systems don't expect you to ship their generated files the way that autoconf sort of expects you to ship its generated configure script (and the associated infrastructure); this was one part of what let the XZ backdoor slip files into the generated tarballs that weren't in the project's repository. But this is not a particularly gigantic improvement, and as mentioned it requires projects to do work to get it, possibly a lot of work.

You also can't simplify autoconf by declaring some standard checks obsolete and dropping everything to do with them. It may indeed be the case that few autoconf-based programs today are actually going to cope with, for example, there being no string.h header file (cf), but that doesn't mean you can remove mention of it from the generated header files and so on, since existing projects require those mentions to work right. The most you could do would be to make the generated 'configure' scripts simply assume a standard list of features and put them in the output those scripts generate.
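
(As a sketch of the chain involved, using the standard AC_CHECK_HEADERS macro:

    dnl In configure.ac:
    AC_CHECK_HEADERS([string.h])
    dnl The generated configure script then either writes
    dnl     #define HAVE_STRING_H 1
    dnl into the generated header or leaves it out, and existing project
    dnl code is written against exactly that name.

A 'just assume it' version of configure would always emit the '#define HAVE_STRING_H 1' line instead of actually testing for the header.)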

(Of course it would be nice if projects using autoconf stopped making superstitious use of things like 'HAVE_STRING_H' and just assumed that standard headers are present. But projects generally have more important things to spend their limited time on than cleaning up header usage.)

PS: There's an entire additional discussion that we could have about whether 'supply chain security' issues such as Autoconf and release tarballs that can't be readily reproduced by third parties are even the project's problem in the first place.

programming/AutoconfNotReplaceable written at 22:50:15;

Solving the hairpin NAT problem with policy based routing and plain NAT

One use of Network Address Translation (NAT) is to let servers on your internal networks be reached by clients on the public internet. You publish public IP addresses for your servers in DNS, and then have your firewall translate those public IPs to their internal IPs as the traffic passes through. If you do this with straightforward NAT rules, someone on the same internal network as those servers may show up with a report that they can't talk to those public servers. This is because you've run into what I call the problem of 'triangular' NAT, where only part of the traffic is flowing through the firewall.
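
(As a concrete sketch of the straightforward rule, in nftables syntax with made-up addresses, assuming the usual 'ip nat' table with NAT hook chains:

    # On the firewall: translate the public IP to the server's internal IP.
    # 203.0.113.10 (public) and 192.168.1.10 (internal) are illustrative.
    nft add rule ip nat prerouting ip daddr 203.0.113.10 dnat to 192.168.1.10

With only this rule, an internal client that connects to 203.0.113.10 sends its packets through the firewall, but the server's replies go straight back across the local network from 192.168.1.10, so the client gets answers from an IP address it never talked to and discards them.)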

The ability to successfully NAT traffic to a machine that is actually on the same network is normally called hairpin NAT (after the hairpin turn packets make as they turn around to head out the same firewall interface they arrived on). Not every firewall likes hairpin NAT or makes it easy to set up, and even if you do set it up through cleverness, using hairpin NAT necessarily means that the server won't see the real client IP address; it will instead see some IP address associated with the firewall, as the firewall has to NAT the client IP to force the server's replies to flow back through it.
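
(In the same illustrative nftables terms, hairpin NAT adds a source NAT for same-subnet clients on top of the destination NAT:

    # Rewrite the source of hairpinned connections so the server's
    # replies are forced back through the firewall.
    nft add rule ip nat postrouting ip saddr 192.168.1.0/24 \
        ip daddr 192.168.1.10 masquerade

That masquerade is exactly what costs you the real client IP; the server sees every hairpinned connection as coming from the firewall's address.)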

However, it recently struck me that there is another way to solve this problem, by using policy based routing. If you add an additional IP address on the server, set a routing policy so that outgoing traffic from that IP can never be sent to the local network but is always sent to the firewall, and then make that IP the internal IP that the firewall NATs to, you avoid the triangular NAT problem without the firewall having to change the client IP (which means that the internal server gets to see the true client IP for its logs or other purposes). This sort of routing policy is possible with at least some policy based routing frameworks, because at one point I accidentally did this on Linux.
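
On Linux with iproute2, a sketch of this looks roughly like the following (the addresses, interface, and table number are all made up, and I'm assuming the server's primary address already lives on eth0 in 192.168.1.0/24):

    # Add a second, NAT-target address; the /32 means no local-network
    # route is created for it.
    ip addr add 192.168.1.11/32 dev eth0

    # Traffic sourced from that address uses routing table 100, whose
    # only route points at the firewall, never at the local network.
    ip rule add from 192.168.1.11 lookup 100
    ip route add default via 192.168.1.1 table 100

The firewall then NATs the public IP to 192.168.1.11 instead of the server's primary address, and the server's replies to internal clients flow back through the firewall even though client and server share a network.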

(You almost certainly don't want to set up this routing policy for the internal server's primary IP address, the one it will use when making its own connections to machines. I'd expect various problems to come up.)

You still need a firewall that will send NAT'd packets back out the same interface they came in on. Generally, routers will do this for ordinary traffic, but firewall rules on routers may come with additional requirements. However, it should be possible on any routing firewall that can do full hairpin NAT, since that also requires sending packets back out the same interface after firewall rules. I believe this is generally going to be challenging on a bridging firewall, or outright impossible (we once ran into issues with changing the destination on a bridging firewall, although I haven't checked the state of affairs today).

tech/HairpinNATAndPolicyBasedRouting written at 00:06:59;

