2024-09-07
I wish (Linux) WireGuard had a simple way to restrict peer public IPs
WireGuard is an obvious tool to build encrypted, authenticated connections out of, over which you can run more or less any network service. For example, you might expose the rsync daemon only over a specific WireGuard interface, instead of running rsync over SSH. Unfortunately, if you want to use WireGuard as an SSH replacement in this fashion, it has one limitation: unlike SSH, there's no simple way to restrict the public IP address of a particular peer.
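(As a concrete illustration of that rsync case, here's a minimal sketch of an rsyncd.conf for the standalone rsync daemon, bound only to a WireGuard-internal address; the 10.10.0.1 address and the module details are hypothetical placeholders:)

  # /etc/rsyncd.conf -- bind the daemon only to our WireGuard-internal
  # address, so it's unreachable except over the tunnel.
  # 10.10.0.1 stands in for your WireGuard interface's internal IP.
  address = 10.10.0.1

  [backups]
      path = /srv/backups
      read only = false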
The rough equivalent of a WireGuard peer is an SSH keypair. In SSH, you can restrict where a keypair will be accepted from with the 'from="..."' restriction in your .ssh/authorized_keys. This provides an extra layer of protection against the key being compromised; not only does an attacker have to acquire the key, they have to be able to use it from the expected IP (or IPs). However, more or less by design, WireGuard has no such restriction on where a peer key can be used from. You can set an expected public IP for the peer, but if the peer contacts you from another IP, your (Linux kernel) WireGuard will update its idea of where the peer is. This is handy for WireGuard's usual use cases (roaming peers, for example), but not necessarily what we want for a wired down connection where the IPs should never change.
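(For illustration, a 'from="..."' restriction in .ssh/authorized_keys looks like this; the IP and the key material are placeholders:)

  # Only accept this key from 192.0.2.10; an attacker who steals the
  # key still can't use it from anywhere else.
  from="192.0.2.10" ssh-ed25519 AAAAC3Nza...rest-of-key backup@client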
(I don't think this is a technical restriction in the WireGuard protocol, just something not done in most or all implementations.)
The normal answer is firewall rules that restrict access to the WireGuard port, but this has two limitations. The first and lesser limitation is that it's external to WireGuard, so it's possible to have WireGuard active without your firewall rules properly applied, theoretically allowing more access than you intend. The bigger limitation is that if you have more than one such wired down WireGuard peer, firewall rules can't tell which WireGuard peer key is being used by which external peer. So in a straightforward implementation of firewall rules, any allowed peer IP can impersonate any other peer (if it has that peer's WireGuard key), which is different from the SSH 'from="..."' situation, where each key is restricted separately.
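(Such firewall rules might look like the following nftables sketch, where 51820 is WireGuard's usual port and the source addresses are hypothetical peer public IPs. Note how both peers get lumped together into one rule on one port; nftables has no way to know which peer key each address will present, which is exactly the impersonation problem:)

  # Accept WireGuard traffic only from our two known peer IPs.
  table inet wgfilter {
      chain input {
          type filter hook input priority 0; policy accept;
          udp dport 51820 ip saddr { 192.0.2.10, 198.51.100.20 } accept
          udp dport 51820 drop
      }
  }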
(On the other hand, the firewall situation is better in one way in that you can't accidentally add a WireGuard peer that will be accepted from anywhere the way you can with a SSH key by forgetting to put in a 'from="..."' restriction.)
To get firewall rules that can tell peers apart, you need to use different listening ports for each peer on your end. Today, this requires different WireGuard interfaces (and probably different server keys) for each peer. I think you can probably give all of the interfaces the same internal IP to simplify your life, although I haven't tested this.
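(Here's a sketch of that per-peer arrangement, one wg-quick style config per peer plus matching per-port firewall rules; all keys, ports, and addresses are placeholders:)

  # /etc/wireguard/wg0.conf -- interface dedicated to peer A
  [Interface]
  PrivateKey = <server private key for wg0>
  ListenPort = 51821
  Address = 10.10.0.1/24

  [Peer]
  PublicKey = <peer A's public key>
  AllowedIPs = 10.10.0.2/32

  # /etc/wireguard/wg1.conf -- interface dedicated to peer B
  # (same idea, with ListenPort = 51822 and peer B's key)

  # nftables rules that can now tell the peers apart by port:
  udp dport 51821 ip saddr 192.0.2.10 accept
  udp dport 51822 ip saddr 198.51.100.20 accept

Each listening port now maps to exactly one peer key, so the firewall restriction becomes per-peer in the way that SSH's 'from="..."' is.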
(Having written this entry, I now wonder if it would be possible to write an nftables or iptables extension that hooked into the kernel side of WireGuard deeply enough to know peer identities and let you match on them. Existing extensions can already be aware of various things like cgroup membership, and there's an existing extension for IPsec. Possibly you could do this with eBPF programs, since there's a BPF/eBPF iptables extension.)
Operating system threads are always going to be (more) expensive
Recently I read Asynchronous IO: the next billion-dollar mistake? (via). Among other things, it asks:
Now imagine a parallel universe where instead of focusing on making asynchronous IO work, we focused on improving the performance of OS threads [...]
I don't think this would have worked as well as you'd like, at least not with any conventional operating system. One of the core problems with making operating system threads really fast is the 'operating system' part.
A characteristic of all mainstream operating systems is that the operating system kernel operates in a separate hardware security domain from regular user (program) code. This means that any time the operating system becomes involved, the CPU must do at least two transitions between these security domains (into kernel mode and then back out). Doing these transitions is always more costly than not doing them, and on top of that the CPU's ISA often requires the operating system to go through non-trivial work in order to be safe from user-level attacks.
(The whole speculative execution set of attacks has only made this worse.)
A great deal of the low-level work of modern asynchronous IO is about not crossing between these security domains, or doing so as little as possible. This is summarized as 'reducing system calls because they're expensive', which is true as far as it goes, but even the cheapest possible system call still has to cross between the domains (if it is an actual system call; some operating systems have 'system calls' that manage to execute entirely in user space, such as Linux's vDSO versions of clock_gettime() and friends).
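(To make the cost concrete, here's a minimal Go benchmark sketch comparing a real system call, getpid, against time.Now(), which on Linux goes through the vDSO and stays in user space. The exact numbers will vary with your hardware and with speculative-execution mitigations:)

  // syscall_bench_test.go -- run with 'go test -bench .'
  package kernelcost

  import (
      "syscall"
      "testing"
      "time"
  )

  // A real system call: crosses into the kernel every iteration.
  func BenchmarkGetpid(b *testing.B) {
      for i := 0; i < b.N; i++ {
          syscall.Getpid()
      }
  }

  // time.Now() uses the vDSO on Linux, so it never enters the kernel.
  func BenchmarkTimeNowVDSO(b *testing.B) {
      for i := 0; i < b.N; i++ {
          time.Now()
      }
  }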
The less that doing things with threads crosses the CPU's security boundary into (and out of) the kernel, the faster the threads go, but the less we can really describe them as 'OS threads' and the harder it is to get things like forced thread preemption. And this applies not just to the 'OS threads' themselves but also to their activities. If you want 'OS threads' that perform 'synchronous IO through simple system calls', those IO operations are also transitioning into and out of the kernel. If you work to get around this purely through software, I suspect that what you wind up with is something that looks a lot like 'green' (user-space) threads with asynchronous IO, once you peer behind the scenes of the abstractions that programs see.
(You can do this today, as Go's runtime demonstrates. And you still benefit significantly from the operating system's high-efficiency asynchronous IO, even if you're opting to use a simpler programming model.)
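(As a sketch of what this looks like from the programmer's side in Go: every connection below gets its own goroutine doing apparently synchronous reads and writes, while the runtime parks and wakes goroutines over the operating system's asynchronous IO machinery, epoll on Linux, behind the scenes:)

  package main

  import (
      "log"
      "net"
  )

  // handle does what looks like synchronous, blocking IO; in reality
  // the Go runtime parks this goroutine and multiplexes the wait over
  // epoll, so no OS thread sits blocked in the kernel for it.
  func handle(c net.Conn) {
      defer c.Close()
      buf := make([]byte, 4096)
      for {
          n, err := c.Read(buf) // 'blocking' read, asynchronous underneath
          if err != nil {
              return
          }
          if _, err := c.Write(buf[:n]); err != nil {
              return
          }
      }
  }

  func main() {
      l, err := net.Listen("tcp", "127.0.0.1:9000")
      if err != nil {
          log.Fatal(err)
      }
      for {
          c, err := l.Accept()
          if err != nil {
              log.Fatal(err)
          }
          go handle(c) // a cheap goroutine per connection, not an OS thread
      }
  }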
(See also thinking about event loops versus threads.)