Wandering Thoughts archives

2025-05-01

The complexity of mixing mesh networking and routes to subnets

One of the in things these days is encrypted (overlay) mesh networks, where you have a bunch of nodes and the nodes have encrypted connections to each other that they use for (at least) internal IP traffic. WireGuard is one of the things that can be used for this. A popular thing to add to such mesh network solutions is 'subnet routes', where nodes will act as gateways to specific subnets, not just endpoints in themselves. This way, if you have an internal network of servers at your cloud provider, you can establish a single node on your mesh network and route to the internal network through that node, rather than having to enroll every machine in the internal network.

(There are various reasons not to enroll every machine, including that on some of them it would be a security or stability risk.)

In simple configurations this is easy to reason about and easy to set up through the tools that these systems tend to give you. Unfortunately, our network configuration isn't simple. We have an environment with multiple internal networks, some of which are partially firewalled off from each other, and where people would want to enroll various internal machines in any mesh networking setup (partly so they can be reached directly). This creates problems for a simple 'every node can advertise some routes and you accept the whole bundle' model.

The first problem is what I'll call the direct subnet problem. Suppose that you have a subnet with a bunch of machines on it and two of them are nodes (call them A and B), with one of them (call it A) advertising a route to the subnet so that other machines in the mesh can reach it. The direct subnet problem is that you don't want B to ever send its traffic for the subnet to A; since it's directly connected to the subnet, it should send the traffic directly. Whether or not this happens automatically depends on various implementation choices the setup makes.

The second problem is the indirect subnet problem. Suppose that you have a collection of internal networks that can all talk to each other (perhaps through firewalls and somewhat selectively). Not all of the machines on all of the internal networks are part of the mesh, and you want people who are outside of your networks to be able to reach all of the internal machines, so you have a mesh node that advertises routes to all of your internal networks. However, if a mesh node is already inside your perimeter and can reach your internal networks, you don't want it to go through your mesh gateway; you want it to send its traffic directly.

(You especially want this if mesh nodes have different mesh IPs from their normal IPs, because you probably want the traffic to come from the normal IP, not the mesh IP.)

You can handle the direct subnet case with a general rule like 'if you're directly attached to this network, ignore a mesh subnet route to it', or by some automatic system like route priorities. The indirect subnet case can't be handled automatically because it requires knowledge about your specific network configuration and what can reach what without the mesh (and what you want to reach what without the mesh, since some traffic you want to go over the mesh even if there's a non-mesh route between the two nodes). As far as I can see, to deal with this you need the ability to selectively configure or accept (subnet) routes on a mesh node by mesh node basis.

(In a simple topology you can get away with accepting or not accepting all subnet routes, but in a more complex one you can't. You might have two separate locations, each with their own set of internal subnets. Mesh nodes in each location want the other location's subnet routes, but not their own location's subnet routes.)

sysadmin/MeshNetworksRoutesComplexity written at 22:51:16;


Page tools: See As Normal.
Search:
Login: Password:

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.