2010-08-02
How OpenBSD's pf source-hash maps internal IPs to NAT pool IPs
OpenBSD's pf packet filter has a feature where you can NAT your
internal IP addresses to a pool of gateway IP addresses, where
the pool is specified as a CIDR netblock. The pf.conf manpage describes the
various options one has for how this happens. If you need a randomly
distributed but consistent mapping, source-hash is what you want.
(Random distribution makes all of your gateway IP addresses more or less evenly used, regardless of how you distribute IPs around your internal IP address space. Consistent means that an internal IP address is always going to be assigned the same gateway IP, instead of rotating around to different ones over time.)
OpenBSD does not document the specifics of how the source-hash option
determines the NAT gateway IP address given an internal IP (at least as
far as I know), which is a pity since there are a number of situations where you want to be able to determine this
for yourself. Instead you are left to reverse engineer the relevant
kernel code.
Let me save you the time. Here is how it works, at least for IPv4
addresses as of OpenBSD 3.x and 4.x. The whole process uses three things: the
internal IP address, the CIDR netblock of the NAT pool (split into the
IP and the netmask), and the source-hash key. Source-hash keys are
128 bits, but the IPv4 hashing algorithm only uses the first 96 bits.
(As the manpage says, if you don't specify a key pfctl will make up a random key every time you reload pf.conf. If you care about doing post-facto mapping yourself, you will want to specify a static key.)
The overview: the internal IP address is hashed together with the first 96 bits of the key, generating a 32-bit hash result. As many (logical) low-order bits of that hash result as are needed are then used to fill in the host portion of the NAT pool CIDR, which gives the gateway IP address.
(Assuming a strong hash, all of the bits of the result should be evenly distributed, low order bits included, so internal IPs will be spread evenly over the NAT pool.)
The actual code has some complications. First, the internal IP address, the NAT pool base IP, and the NAT pool netmask are all in network byte order. Source-hash keys are written as 32 hex characters, but are stored and used in the code as four 32-bit words (IPv4 hashing uses words 0 through 2); word 0 is the leftmost 8 characters of a printed key, in host byte order.
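As a rough illustration of the key layout described above, here is a minimal Python sketch that splits a printed key into four 32-bit words. The function name parse_source_hash_key and the sample key are made up for this example, and accepting an optional leading "0x" is an assumption about how you have the key written down.

```python
def parse_source_hash_key(text):
    """Split a printed 128-bit source-hash key into four 32-bit words."""
    digits = text.lower()
    # Assumption: the key may be written with a leading "0x"; strip it.
    if digits.startswith("0x"):
        digits = digits[2:]
    if len(digits) != 32:
        raise ValueError("expected 32 hex characters, got %d" % len(digits))
    # Word 0 is the leftmost 8 hex characters, word 1 the next 8, and so on.
    # Each group is read as an ordinary hexadecimal number.
    return [int(digits[i:i + 8], 16) for i in range(0, 32, 8)]


# Made-up sample key, purely for illustration.
key = parse_source_hash_key("0x0123456789abcdef0011223344556677")
ipv4_words = key[:3]  # IPv4 hashing only uses words 0 through 2
```

Reading each 8-character group as a plain number is what "host byte order" means here: the words are ordinary integers as far as the hashing arithmetic is concerned.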
Finally, the hashing function used for this is a custom one. The
setup code is in pf_hash(), currently found in pf_lb.c; the guts of
the custom hash function are in the mix() macro. All of the hashing
code uses unsigned 32-bit arithmetic and definitely counts on all of the
usual rollover artifacts from it. Because this is a custom hash, you
will have to copy the code from pf_lb.c and translate it into your
favorite language, or at least remove all of the kernel-isms from it and
put some scaffolding around it.
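If you translate the hash into a language with arbitrary-precision integers, such as Python, the unsigned 32-bit rollover has to be emulated by hand. The helpers below are one possible sketch of that; their names are invented for this example and they are not a translation of pf_hash() or mix(), only the arithmetic building blocks such a translation would need.

```python
MASK32 = 0xffffffff


def add32(x, y):
    """Unsigned 32-bit addition that wraps around like C's u_int32_t."""
    return (x + y) & MASK32


def sub32(x, y):
    """Unsigned 32-bit subtraction; a negative result wraps to a large value."""
    return (x - y) & MASK32


def shl32(x, n):
    """Left shift that discards bits pushed past bit 31."""
    return (x << n) & MASK32


def shr32(x, n):
    """Logical right shift of a 32-bit value."""
    return (x & MASK32) >> n
```

The important habit is to mask after every subtraction, addition and left shift, since a single unmasked step lets a negative or oversized value leak into the rest of the calculation.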
(When reading and translating the code, note that all of the addr32
and key32 structure elements are unsigned 32-bit integers. Because
IPv4 addresses are only 32 bits long anyways, addr32[0] means the
whole internal IP address, pool base address, or pool netmask, always
in network byte order. key32[0] is word 0 of the key, as discussed
above.)
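To make the byte order point concrete, here is a small Python sketch of what the 32-bit value of an address looks like when its network-order bytes are read as a native integer. The function name addr32_value and its little_endian parameter are hypothetical; whether you need the little-endian reading depends on the machine your pf gateway runs on (i386 and amd64 are little-endian), so treat this as an assumption to check against real data.

```python
import socket
import struct


def addr32_value(dotted, little_endian=True):
    """Return the integer a host sees when it loads a network-order IPv4
    address from memory as a single unsigned 32-bit word."""
    packed = socket.inet_aton(dotted)  # four bytes, most significant octet first
    # "<I" reads the bytes as a little-endian machine would, ">I" as a
    # big-endian machine would.
    fmt = "<I" if little_endian else ">I"
    return struct.unpack(fmt, packed)[0]


value_le = addr32_value("10.0.0.1")                       # 0x0100000a
value_be = addr32_value("10.0.0.1", little_endian=False)  # 0x0a000001
```

On a big-endian machine the two orders coincide with the usual numeric value of the address; on a little-endian machine the integer that goes into the hash is the byte-swapped one.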
Combining the hash result with everything else is done
in pf_poolmask(), currently found in pf.c, which uses
the NAT pool base IP, the netmask, and the hash result. While the code
is complicated by OpenBSD kernel data structures and variable naming, it
boils down to:
gwip = (paddr & pmask) | (invert(pmask) & hashresult)
(where invert(pmask) inverts the pool network mask so that it is all
1s in the host portion and all 0s in the network portion, i.e. it is
'pmask ^ 0xffffffff'.)
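The combining step can be sketched in Python as follows. The function name gateway_ip is invented for this example, the pool netblocks are documentation addresses, and the hash values are arbitrary placeholders rather than real pf output. The sketch assumes the hash result has already been converted to the same integer convention as the addresses (most significant octet first); since the operation is purely bitwise, any convention works as long as all three values share it.

```python
import ipaddress


def gateway_ip(pool_cidr, hashresult):
    """Fill the host portion of the NAT pool netblock from a 32-bit hash."""
    pool = ipaddress.ip_network(pool_cidr, strict=False)
    paddr = int(pool.network_address)
    pmask = int(pool.netmask)
    # invert(pmask): all 1s in the host portion, all 0s in the network portion.
    hostmask = pmask ^ 0xffffffff
    gwip = (paddr & pmask) | (hostmask & hashresult & 0xffffffff)
    return str(ipaddress.ip_address(gwip))


# Placeholder hash values, not produced by pf.
print(gateway_ip("192.0.2.0/28", 0xdeadbeef))     # 192.0.2.15
print(gateway_ip("198.51.100.0/24", 0x12345678))  # 198.51.100.120
```

With a /28 pool only the low four bits of the hash survive the mask, and with a /24 pool the low eight bits do.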
Doing this reverse engineering for IPv6 is left as an exercise to someone who is actually using IPv6 and needs this.
When I was doing this I used Python, which worked fine once I sorted out the unsigned 32-bit arithmetic issues and realized things had to be in network byte order. You will definitely want to test your code against reality to make sure that it works. (If there is any interest, I can clean my code up and make it available somewhere.)