Wandering Thoughts archives

2010-08-02

How OpenBSD's pf source-hash maps internal IPs to NAT pool IPs

OpenBSD's pf packet filter has a feature where you can NAT your internal IP addresses to a pool of gateway IP addresses, where the pool is specified as a CIDR netblock. The pf.conf manpage describes the various options one has for how this happens. If you need a randomly distributed but consistent mapping, source-hash is what you want.

(Random distribution makes all of your gateway IP addresses more or less evenly used, regardless of how you distribute IPs around your internal IP address space. Consistent means that an internal IP address is always going to be assigned the same gateway IP, instead of rotating around to different ones over time.)

OpenBSD does not document the specifics of how the source-hash option determines the NAT gateway IP address given an internal IP (at least as far as I know), which is a pity since there are a number of situations where you want to be able to determine this for yourself. Instead you are left to reverse engineer the relevant kernel code.

Let me save you the time. Here is how it works for IPv4 addresses, at least as of OpenBSD 3.x and 4.x. The whole process uses three things: the internal IP address, the CIDR netblock of the NAT pool (split into the IP and the netmask), and the source-hash key. Source-hash keys are 128 bits, but the IPv4 hashing algorithm only uses the first 96 bits.

(As the manpage says, if you don't specify a key pfctl will make up a random key every time you reload pf.conf. If you care about doing post-facto mapping yourself, you will want to specify a static key.)

The overview: the internal IP address is hashed together with the first 96 bits of the key, generating a 32-bit hash result. As many of the (logical) low-order bits of that hash result as are needed are then used to fill in the host portion of the NAT pool CIDR, which gives you the gateway IP address.

(Assuming a strong hash, all of the bits of the result should be evenly distributed, low order bits included, so internal IPs will be spread evenly over the NAT pool.)

The actual code has some complications. First, the internal IP address, the NAT pool base IP, and the NAT pool netmask are all in network byte order. Source-hash keys are written as 32 hex characters, but are stored and used in the code as four 32-bit words (IPv4 hashing uses words 0 through 2); word 0 is the leftmost 8 characters of a printed key, in host byte order.
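As a rough illustration of the key layout just described, here is a minimal Python sketch that splits a printed key into its four 32-bit words. The function name is made up for this example, and accepting an optional leading '0x' is an assumption about how you have the key written down.

```python
def parse_source_hash_key(printed_key):
    """Split a printed source-hash key into four unsigned 32-bit words.

    Hypothetical helper for illustration; not part of pf or pfctl.
    """
    hex_digits = printed_key.lower()
    if hex_digits.startswith("0x"):   # assumption: tolerate an optional 0x prefix
        hex_digits = hex_digits[2:]
    if len(hex_digits) != 32:
        raise ValueError("expected 32 hex characters, got %d" % len(hex_digits))
    # Word 0 is the leftmost 8 characters, read as an ordinary number.
    return [int(hex_digits[i:i + 8], 16) for i in range(0, 32, 8)]


words = parse_source_hash_key("0x0123456789abcdef0011223344556677")
ipv4_words = words[:3]   # IPv4 hashing only uses words 0 through 2
```

Each word comes out as a plain number (word 0 here is 0x01234567), which is what 'host byte order' amounts to once you are outside of C.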

Finally, the hashing function used for this is a custom one. The setup code is in pf_hash(), currently found in pf_lb.c; the guts of the custom hash function are in the mix() macro. All of the hashing code uses unsigned 32-bit arithmetic and definitely counts on all of the usual rollover artifacts from it. Because this is a custom hash, you will have to copy the code from pf_lb.c and translate it into your favorite language, or at least remove all of the kernel-isms from it and put some scaffolding around it.
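If you translate the hash into a language without fixed-width integers, you have to recreate the rollover yourself. The following is a sketch of how that might look in Python; the helper names are my own invention, and this is deliberately not the pf hash itself (for that you still need mix() and pf_hash() from pf_lb.c).

```python
MASK32 = 0xFFFFFFFF


def add32(a, b):
    """a + b as C unsigned 32-bit addition (wraps on overflow)."""
    return (a + b) & MASK32


def sub32(a, b):
    """a - b as C unsigned 32-bit subtraction (wraps on underflow)."""
    return (a - b) & MASK32


def shl32(a, n):
    """a << n, discarding any bits shifted past bit 31."""
    return (a << n) & MASK32


def shr32(a, n):
    """Logical a >> n on a 32-bit value."""
    return (a & MASK32) >> n
```

Python integers never overflow, so every addition, subtraction, and left shift in the translated code needs to go through something like this; forgetting one mask is enough to get a different hash than the kernel does.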

(When reading and translating the code, note that all of the addr32 and key32 structure elements are unsigned 32-bit integers. Because IPv4 addresses are only 32 bits long anyways, addr32[0] means the whole internal IP address, pool base address, or pool netmask, always in network byte order. key32[0] is word 0 of the key, as discussed above.)
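To make the byte order point concrete, here is a small Python sketch of turning a dotted quad into the number the kernel would see in addr32[0]. The function name and the gateway_byteorder parameter are hypothetical; the default assumes that your OpenBSD gateway is a little-endian machine such as i386 or amd64.

```python
import socket
import struct


def addr32_0(dotted_quad, gateway_byteorder="<"):
    """Return the 32-bit value a gateway CPU sees for an IPv4 address
    that is stored in memory in network byte order.

    gateway_byteorder is a struct prefix: "<" for a little-endian
    gateway (assumed default), ">" for a big-endian one.
    """
    packed = socket.inet_aton(dotted_quad)   # 4 bytes, network byte order
    return struct.unpack(gateway_byteorder + "I", packed)[0]


little = addr32_0("10.0.0.1")        # 0x0100000a on a little-endian gateway
big = addr32_0("10.0.0.1", ">")      # 0x0a000001 on a big-endian gateway
```

The arithmetic in the hash operates on whichever of these numbers the gateway actually sees, so a translation has to use the same one.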

Combining the hash result with everything else is done in pf_poolmask(), currently found in pf.c, which uses the NAT pool base IP, the netmask, and the hash result. While the code is complicated by OpenBSD kernel data structures and variable naming, it boils down to:

gwip = (paddr & pmask) | (invert(pmask) & hashresult)

(where invert(pmask) inverts the pool network mask so that it is all 1s in the host portion and all 0s in the network portion, i.e. it is 'pmask ^ 0xffffffff'.)
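Here is a sketch of that combination step in Python. The function name and the byteorder parameter are hypothetical, the pool and hash value are made-up example numbers, and it assumes that the hash result is the 32-bit number as the kernel holds it and that the pool address and netmask sit in memory in network byte order.

```python
import ipaddress


def pool_address(pool_cidr, hash_result, byteorder="little"):
    """Combine a NAT pool CIDR with a 32-bit hash result.

    byteorder models the gateway CPU ("little" for i386/amd64 is an
    assumption; use "big" for a big-endian gateway).
    """
    net = ipaddress.IPv4Network(pool_cidr, strict=False)
    # Read the network-order bytes the way the gateway CPU would.
    paddr = int.from_bytes(net.network_address.packed, byteorder)
    pmask = int.from_bytes(net.netmask.packed, byteorder)
    inverted = pmask ^ 0xFFFFFFFF
    gwip = (paddr & pmask) | (inverted & hash_result & 0xFFFFFFFF)
    # Convert back to network-order bytes, then to a dotted quad.
    return str(ipaddress.IPv4Address(gwip.to_bytes(4, byteorder)))


print(pool_address("192.0.2.16/28", 0x12345678))          # 192.0.2.18
print(pool_address("192.0.2.16/28", 0x12345678, "big"))   # 192.0.2.24
```

The two answers differ because on a little-endian gateway the host bits of a network byte order netmask land in the top byte of the number, so the bits taken from the hash are only low order in the logical sense.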

Doing this reverse engineering for IPv6 is left as an exercise to someone who is actually using IPv6 and needs this.

When I was doing this I used Python, which worked fine once I sorted out the unsigned 32-bit arithmetic issues and realized things had to be in network byte order. You will definitely want to test your code against reality to make sure that it works. (If there is any interest, I can clean my code up and make it available somewhere.)

unix/OpenBSDPfHash written at 23:39:19

