What symmetric and asymmetric IP routing are

June 28, 2022

In a recent entry I talked somewhat informally about symmetric (IP) routing. Symmetric and asymmetric IP routing are ideas that I'm familiar with from working on firewalls and networking, but they're not necessarily common knowledge in the broader community. We can approach what they are from two directions, so I'm going to start from how conventional IP routing works.

The traditional and normal way that your IP stack decides where an outgoing IP packet should be sent is based (only) on the destination IP address. If the destination IP is in a directly attached network, your system sends it out the relevant interface. If there's a specific route that applies to the destination IP, the packet is sent to the gateway the route lists. And if all else fails, the packet is sent to your default route's gateway (or dropped, if you have no default route).
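
To make that lookup order concrete, here is a minimal Python sketch of destination-only routing. It isn't how any real IP stack is written; the table layout, the interface name 'eth1', and the 172.16.0.0/12 route via 192.168.100.2 are all made up for illustration. The most specific matching route wins, and the source IP never enters into it.

```python
import ipaddress

# Hypothetical routing table: (destination network, gateway, interface).
# A gateway of None means the network is directly attached.
ROUTES = [
    ("192.168.100.0/24", None, "eth1"),               # directly attached
    ("172.16.0.0/12", "192.168.100.2", "eth1"),       # a specific route
    ("0.0.0.0/0", "192.168.100.254", "eth1"),         # the default route
]

def lookup(dest, routes=ROUTES):
    """Choose a route from the destination IP alone (longest prefix wins).

    Returns (gateway, interface), or None if no route matches, which
    corresponds to dropping the packet.
    """
    addr = ipaddress.ip_address(dest)
    best = None
    for net_text, gateway, iface in routes:
        net = ipaddress.ip_network(net_text)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, gateway, iface)
    if best is None:
        return None
    return best[1], best[2]
```

With the default route removed from the table, a destination that matches nothing else gets None back, which is the 'dropped' case.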

However, if you have a multi-homed host, a host with multiple interfaces and IP addresses, this approach to routing outgoing traffic can create a situation where outgoing and incoming packets for the same connection (or flow) use different interfaces. To have this happen you normally need at least two of your networks to be routable, which is to say that hosts not on those networks can reach them and hosts on those networks can reach other networks.

To make this concrete, say you have a host with two interfaces and IP addresses on each, with 10.20.0.10 on 10.20.0.0/16 and 192.168.100.1 on 192.168.100.0/24. Your default route is to 192.168.100.254 and you have no other special routes. There are two situations that will create a difference between incoming and outgoing packets. First, if any host not on 10.20.0.0/16 pings your 10.20.0.10 IP address, your replies will use your default route and go out your 192.168.100.0/24 network interface (despite coming from 10.20.0.10). Second, if a host on 10.20.0.0/16 pings your 192.168.100.1 IP address, your replies will go directly out your 10.20.0.0/16 interface despite coming from 192.168.100.1.
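
Here is a small Python sketch of this example host that checks whether a given conversation ends up asymmetric. The interface names are invented, and it assumes that incoming packets arrive on the interface that owns the IP address they were sent to (which is what happens if the rest of the network routes normally).

```python
import ipaddress

# The example host: interface name -> (local IP, attached network).
# The interface names are made up for illustration.
INTERFACES = {
    "eth0": ("10.20.0.10", ipaddress.ip_network("10.20.0.0/16")),
    "eth1": ("192.168.100.1", ipaddress.ip_network("192.168.100.0/24")),
}
DEFAULT_ROUTE_IFACE = "eth1"  # the default route is via 192.168.100.254

def incoming_iface(local_ip):
    """Interface a packet for local_ip arrives on (assumed: the one owning it)."""
    for iface, (ip, _net) in INTERFACES.items():
        if ip == local_ip:
            return iface
    raise ValueError(f"{local_ip} is not one of our addresses")

def outgoing_iface(remote_ip):
    """Interface our reply leaves on, chosen only from the destination IP."""
    addr = ipaddress.ip_address(remote_ip)
    for iface, (_ip, net) in INTERFACES.items():
        if addr in net:
            return iface  # directly attached network
    return DEFAULT_ROUTE_IFACE

def is_asymmetric(remote_ip, local_ip):
    """True if the reply leaves on a different interface than the request used."""
    return incoming_iface(local_ip) != outgoing_iface(remote_ip)
```

Both situations described above come out as asymmetric here, while a 10.20.0.0/16 host talking to 10.20.0.10 does not.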

Both of these situations are asymmetric routing, where packets in one direction take a different path through the network than packets in the other direction. In a completely reliable network with no special features, asymmetric routing is things working as intended, with IP packets taking what your system believes is the most efficient available path to their destinations. However, in a network that may be having faults along some paths and that has firewalls, asymmetric routing can cause artificial connectivity failures (or hide them). It's especially a problem with stateful firewalls, because such a firewall will be seeing only one half of the conversation and will normally block it.
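
The stateful firewall problem can be illustrated with a toy model. This Python sketch is a deliberate oversimplification: flows are just pairs of addresses (no ports or protocols), any packet marked as starting a flow is allowed, and the class and method names are invented for the example.

```python
class StatefulFirewall:
    """Toy stateful firewall: passes only flows whose start it has seen."""

    def __init__(self):
        self.known_flows = set()

    def handle(self, src, dst, starts_flow=False):
        """Return "pass" or "drop" for one packet going from src to dst."""
        flow = frozenset((src, dst))  # the same key for both directions
        if starts_flow:
            # Toy policy: every flow-starting packet is allowed and recorded.
            self.known_flows.add(flow)
            return "pass"
        # Anything else must belong to a flow this firewall already knows.
        return "pass" if flow in self.known_flows else "drop"
```

If a request from 203.0.113.5 to 10.20.0.10 passes through one firewall instance and the reply leaves through a different instance (because it took the default route), the second firewall has no record of the flow and drops the reply, even though nothing is actually broken.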

In symmetric routing, we arrange (somehow) for packets to take the same path in both directions in all of these situations. If you're pinged at 192.168.100.1, your replies always go out on 192.168.100.0/24 even if they're from a host in 10.20.0.0/16; if you're pinged at 10.20.0.10 by some random IP, your replies always go out on 10.20.0.0/16 even if your normal default route is through 192.168.100.254 (you'll need a second default route for 10.20.0.0/16 to make this work). This also extends to traffic that your host originates. If you ping a host in 10.20.0.0/16 with the source IP of 192.168.100.1, your pings should go to 192.168.100.0/24's default gateway of 192.168.100.254, not directly out your 10.20.0.0/16 interface. If your 'source IP 192.168.100.1' pings did go out your 10.20.0.0/16 interface, the ICMP replies from the innocent 10.20.0.0/16 host would take a different return path and create asymmetric routing.

There are a variety of ways to create a situation with symmetric routing. One general approach is to create separate network worlds, each with only one (routed) network interface in it, and to confine packets (and connections) to their appropriate world. Another general approach goes by the name of policy based routing, which is the broad idea of using more than just the destination IP to decide on packet routing. To do symmetric routing through policy based routing, you make routing choices depend on the source IP as well as the destination IP.
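
As a rough illustration of the policy based routing approach, here is a Python sketch where the source IP selects a routing table and the destination IP is then looked up within it. The table structure and interface names are invented, and 10.20.0.254 as the gateway for 10.20.0.0/16 is purely an assumption (the example host needs some second default route, but nothing above says what it is).

```python
import ipaddress

# One routing table per source IP: (destination network, gateway, interface).
# A gateway of None means the network is directly attached.
TABLES = {
    "10.20.0.10": [
        ("10.20.0.0/16", None, "eth0"),
        ("0.0.0.0/0", "10.20.0.254", "eth0"),      # assumed gateway address
    ],
    "192.168.100.1": [
        ("192.168.100.0/24", None, "eth1"),
        ("0.0.0.0/0", "192.168.100.254", "eth1"),
    ],
}

def route(src, dest):
    """Choose (gateway, interface) from both the source and destination IP."""
    addr = ipaddress.ip_address(dest)
    best = None
    for net_text, gateway, iface in TABLES[src]:
        net = ipaddress.ip_network(net_text)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, gateway, iface)
    if best is None:
        return None
    return best[1], best[2]
```

Because each source IP's table only contains routes through that IP's own interface, a packet from 192.168.100.1 to a host in 10.20.0.0/16 goes to 192.168.100.254 instead of directly out the 10.20.0.0/16 interface.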

(Policy based routing is potentially much more general than mere symmetric routing, and I believe that it originates from the world of routers, not hosts. Sophisticated routing environments may have various complex rules, such as 'traffic from these networks can only use these links'. Symmetric routing itself is mostly a host issue.)


Comments on this page:

By Walex at 2022-07-01 18:03:54:

This is a short discussion of an interesting topic, which is that in the IP architecture there is no "multihoming", because there are only "interfaces" that may have one or more IP addresses. That a "host" may have multiple interfaces is something that is not part of the IP architecture.

However hosts with multiple interfaces are common enough, and there are several interpretations of them, and this is described quite well in RFC1122 "Requirements for Internet Hosts -- Communication Layers" (1989), section 3.3.4.2. https://datatracker.ietf.org/doc/html/rfc1122.html#page-61

 "There are two key requirement issues related to multihoming:
 (A)  A host MAY silently discard an incoming datagram whose
      destination address does not correspond to the physical
      interface through which it is received.
 (B)  A host MAY restrict itself to sending (non-source-routed)
      IP datagrams only through the physical interface that
      corresponds to the IP source address of the datagrams."

 "Internet host implementors have used two different conceptual
 models for multihoming, briefly summarized in the following
 discussion. This document takes no stand on which model is
 preferred; each seems to have a place. This ambivalence is
 reflected in the issues (A) and (B) being optional.

 o    Strong ES Model
      The Strong ES (End System, i.e., host) model emphasizes
      the host/gateway (ES/IS) distinction, and would therefore
      substitute MUST for MAY in issues (A) and (B) above. It
      tends to model a multihomed host as a set of logical
      hosts within the same physical host. [...]
      Under the Strong ES model, the route computation for an
      outgoing datagram is the mapping:
        route(src IP addr, dest IP addr, TOS) -> gateway
      [...]

 o    Weak ES Model
      This view de-emphasizes the ES/IS distinction, and would
      therefore substitute MUST NOT for MAY in issues (A) and
      (B). This model may be the more natural one for hosts
      that wiretap gateway routing protocols, and is necessary
      for hosts that have embedded gateway functionality. [...]
      In the Weak ES model, the route computation for an
      outgoing datagram is the mapping:
        route(dest IP addr, TOS) -> gateway, interface"

The Linux network stack implements by default an extreme version of the "weak ES Model", and our blogger prefers the strong one.
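
The difference the two models make for issue (A) can be sketched in a few lines of Python. This is only an illustration of the RFC text, not of any real stack; the host layout borrows the addresses from the entry's example and the interface names are invented.

```python
# Hypothetical host: interface name -> IP addresses configured on it.
HOST_ADDRS = {
    "eth0": {"10.20.0.10"},
    "eth1": {"192.168.100.1"},
}

def accept_incoming(recv_iface, dest_ip, strong=False):
    """Decide whether to accept a datagram received on recv_iface.

    Strong ES: the destination must be an address of the receiving interface.
    Weak ES: the destination may be any address the host has.
    """
    if strong:
        return dest_ip in HOST_ADDRS.get(recv_iface, set())
    return any(dest_ip in addrs for addrs in HOST_ADDRS.values())
```

A datagram for 192.168.100.1 that arrives on eth0 is accepted under the weak model and discarded under the strong one.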

I agree with RFC1122 that since the IP architecture does not really describe "hosts", their behavior is implementation dependent, but there is a third option that I think fits better with the overall IP architecture:

  • Each host has either zero interfaces and no IP networking or, if it has IP networking, an internal "virtual" network.

  • Each host has a distinguished "virtual" internal interface on that network, and it is not connected to any "real" network.

  • That interface has at least one routable IP address.

  • All processes by default "bind" to that interface and one of its IP addresses.

  • The host may have other interfaces to external networks, each with zero or more IP addresses.

It is particularly nice if the distinguished interface address is routed /32 with something like OSPF, and all processes bind solely to it. Then host reachability is entirely independent of network topology. If that is done, in effect that /32 routable address becomes the host identifier.

It is even better if ECMP is also used: then the distinguished address/identifier is reachable not only by parallel routes (as many as it has external interfaces), but IP traffic can flow on all of them in parallel, and if one route goes away, that is even hard to notice (but for capacity reduction).
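
One common way to spread traffic over equal-cost routes is to hash each flow onto one of the available next hops. The following Python sketch shows the idea only; SHA-256 is a stand-in for whatever hash a real router uses, and the function name and the 5-tuple layout are assumptions made for the example.

```python
import hashlib

def pick_next_hop(flow, next_hops):
    """Map a flow onto one of several equal-cost next hops.

    flow is assumed to be (src_ip, dst_ip, proto, src_port, dst_port).
    The same flow always gets the same next hop as long as the list of
    next hops does not change, so packets of one flow stay in order.
    """
    if not next_hops:
        return None  # no route left at all
    digest = hashlib.sha256(repr(flow).encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]
```

If one next hop disappears from the list, every flow still maps onto one of the remaining ones, which is why losing a route mostly shows up as reduced capacity.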

This scheme works well because, as the IP architecture has no notion of "host" but only of networks and interfaces, the natural thing is to model a host as a (virtual) network with an interface; it also offers all the flexibility of both models without any drawbacks (other than more routing, which is currently cheap).

Note: In "strong ES" there is no internal network and distinguished interface; in rather "weak ES" like in Linux all interfaces in effect are merged into a single distinguished interface attached to all external networks.

Note: Compare IPv6 link scope addresses, and the XNS/IPX datagram addressing scheme based on network identifiers and Ethernet identifiers.

By Walex at 2022-07-02 07:44:59:

«without any drawbacks (other than more routing, which is currently cheap)»

There is a very important story there related to the TTL, header checksum, CISCO, and VLAN tags, but since it goes against the grain of "conventional wisdom" it is unpopular.

http://www.sabi.co.uk/blog/13-two.html?130713#130713
http://www.sabi.co.uk/blog/13-two.html?130726#130726
http://www.sabi.co.uk/blog/14-two.html?141226#141226
http://www.sabi.co.uk/blog/17-two.html?170704#170704

«rather "weak ES" like in Linux all interfaces in effect are merged into a single distinguished interface attached to all external networks»

That goes beyond asymmetric routing: by default there is "asymmetric" ARPing (AKA "ARP flux") too, and that to me is a lot more objectionable (see 'arp_filter' and 'hidden').

«XNS/IPX datagram addressing scheme based on network identifiers and Ethernet identifiers.»

Addresses and identifiers are subtle topics; the XNS/IPX scheme is in effect "mixed": network addresses are composed of a 32-bit network number and a 48-bit Ethernet identifier, both of which are globally unique. In principle, traffic to a target 80-bit address can be routed solely on the Ethernet identifier, but that would require huge routing tables. Therefore the network numbers, which can be arranged by range, can be used first to route a datagram to the network on which the target's Ethernet identifier can be used. http://www.sabi.co.uk/blog/13-one.html?130119#130119

This is a special case of a general idea, to use both a locator and an identifier, with the locator used to shortcut the search for the identifier, which I first found in the MUSS OS from the University of Manchester: it had process ids that contained both a process table slot number, and a "unique" process id. So there was no need to search for the process id in the table, the slot number sufficed, and then that slot's process id was compared with the sought-for process id.

More generally, when using a (locator, identifier) scheme there are two options. Depending on the design, the locator can be:

  • definite: if the identifier is not at that location, the search ends as the identifier is invalid;
  • a hint: if the identifier is not at that location, the search continues more slowly with just the identifier.
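
The two options can be sketched in Python with a small slot table, loosely modelled on the process table description above. The class, its method names, and the linear fallback search are all invented for illustration; this is not how MUSS or XNS/IPX actually implemented it.

```python
class SlotTable:
    """Table addressed by (locator, identifier): a slot number plus a unique id."""

    def __init__(self, size):
        self.slots = [None] * size  # each slot holds (identifier, value) or None

    def put(self, slot, identifier, value):
        self.slots[slot] = (identifier, value)

    def find(self, slot, identifier, definite=True):
        """Look up identifier, trying the slot named by the locator first."""
        entry = self.slots[slot] if 0 <= slot < len(self.slots) else None
        if entry is not None and entry[0] == identifier:
            return entry[1]  # fast path: the locator was right
        if definite:
            return None  # definite locator: a miss means the id is invalid
        # Hint locator: fall back to a slow search by identifier alone.
        for entry in self.slots:
            if entry is not None and entry[0] == identifier:
                return entry[1]
        return None
```

With a definite locator a stale slot number simply fails, while with a hint the lookup still succeeds but at the cost of scanning the table.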

Last modified: Tue Jun 28 22:11:43 2022