Making sense of OpenBSD 'pfctl -ss' output for firewall state tables

August 21, 2019

Suppose, not entirely hypothetically, that you have some OpenBSD firewalls and every so often you wind up looking at the state table listing that's produced by 'pfctl -ss'. On first impression, this output looks sort of understandable, with entries like:

all tcp <- 128.100.3.X:46392       ESTABLISHED:ESTABLISHED
all tcp 128.100.3.X:46392 ->       ESTABLISHED:ESTABLISHED

I won't say that appearances are deceptive here, but things are not as straightforward as they look once you start wanting to know what this is really telling you. For instance, there is no documentation on what that 'all' actually means. Since I've been digging into this, here's what I've learned.

The general form of a state table entry as printed by 'pfctl -ss' is:

INTERFACE PROTOCOL LEFT-ADDRESS DIR RIGHT-ADDRESS       LEFT-STATE:RIGHT-STATE

At least for our firewalls, the interface is generally 'all'. The protocol can be any number of things, including tcp, udp, icmp, esp, ospf, and pfsync. For TCP connections, the listed states are the TCP states (and you can get all of the weird and wonderful conditions where the two directions of the connection are in different states, such as half-closed connections). For other protocols there's a smaller list; see the description of 'set timeout' in pf.conf's OPTIONS section for a discussion of most of them. There's also a NO_TRAFFIC state for when no traffic has happened in one direction.
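As a rough illustration of pulling these fields apart, here is a minimal Python sketch. It assumes the simple layout without any NAT (interface, protocol, left address, direction, right address, then the two states joined by a colon). The StateEntry and parse_state_line names are made up for this example, and the addresses come from the reserved documentation ranges rather than from any real firewall.

```python
from typing import NamedTuple


class StateEntry(NamedTuple):
    iface: str
    proto: str
    left_addr: str
    direction: str    # '<-' (in) or '->' (out)
    right_addr: str
    left_state: str
    right_state: str


def parse_state_line(line: str) -> StateEntry:
    """Split one simple (no NAT) 'pfctl -ss' line into its fields."""
    fields = line.split()
    if len(fields) != 6 or fields[3] not in ("<-", "->"):
        raise ValueError(f"not a simple state line: {line!r}")
    iface, proto, left_addr, direction, right_addr, states = fields
    # Assumption: the last field is always LEFT:RIGHT with a single colon.
    left_state, right_state = states.split(":", 1)
    return StateEntry(iface, proto, left_addr, direction, right_addr,
                      left_state, right_state)


# Hypothetical input line, using documentation addresses.
entry = parse_state_line(
    "all tcp 192.0.2.10:22 <- 198.51.100.7:46392       ESTABLISHED:ESTABLISHED"
)
print(entry.direction, entry.left_state, entry.right_state)
```

Lines with NAT information have extra parenthesized fields, so this sketch deliberately rejects them instead of guessing.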

So let's talk about directions. The field that I have called DIR will always be either '<-' or '->', which mean in and out respectively. By that I mean PF_IN and PF_OUT (plus PF_FWD for forwarded packets), not 'inside' and 'outside'. OpenBSD PF doesn't have any notion of inside and outside interfaces, but it does have a notion of incoming traffic and outgoing traffic, and that is what ultimately determines the direction. If a packet is matched or handled during input and that creates a state table entry, that will be an in entry; similarly, matching or passing it during output will create an out entry. Sometimes this is through explicit 'pass in' and 'pass out' rules, but other times you have a bidirectional rule (eg 'match on <IF> ... binat-to ...') and then the direction depends on packet flow.

The first thing to know is that contrary to what I believed when I started writing this entry, all state table entries are created by rules. As far as I can tell, there are no explicit state table entries that get added to handle replies; the existing 'forward' state table entries are just used in reverse to match the return traffic. The reason that state table entries usually come in pairs (at least for us) is that we have both 'pass in' and 'pass out' rules that apply to almost all packets, and so both rules create a corresponding state table entry for a specific connection. An active, permitted connection will thus have two state table entries, one for the 'pass in' rule that allows it in and one for the 'pass out' rule that allows it out.

The meaning of the left and the right address changes depending on the direction. For an out state table entry, the left address is the (packet or connection) source address and the right address is the destination address; for an in state table entry it's reversed, with the left address the destination and the right address the source. The LEFT-STATE and RIGHT-STATE fields are associated with the left and the right addresses respectively, whatever they are, and for paired up state table entries I believe they're always going to be mirrors of each other.
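The left/right rule is easy to get backwards, so here is a small Python sketch of it. The by_role function is a hypothetical helper of mine, and the addresses are documentation examples; it works the same way for the pair of addresses and the pair of states, since the states follow their addresses.

```python
def by_role(left, direction, right):
    """Reorder a (left, right) pair into (source side, destination side)."""
    if direction == "->":     # out entry: left is the source
        return left, right
    if direction == "<-":     # in entry: left is the destination
        return right, left
    raise ValueError(f"unknown direction: {direction!r}")


# The in and out entries for one connection name the same source and destination.
in_pair = by_role("192.0.2.10:22", "<-", "198.51.100.7:46392")
out_pair = by_role("198.51.100.7:46392", "->", "192.0.2.10:22")
print(in_pair == out_pair)

# States follow their addresses, so the same helper reorders them too.
src_state, dst_state = by_role("CLOSED", "<-", "SYN_SENT")
print(src_state, dst_state)
```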

(I believe that the corollary of this is that the NO_TRAFFIC state can only appear on the destination side, ie the side that didn't originate the packet flow. This means that for an out state NO_TRAFFIC will always be the right state, and on an in state it will always be the left one.)

So far I have shown a pair of state table entries from a simple firewall without any sort of NAT'ing going on (which includes 'rdr-to' rules). If you have some sort of NAT in effect, the output changes and generally that change will be asymmetric between the pair of state table entries. Here is an example:

all tcp 128.100.X.X:22 <-       ESTABLISHED:ESTABLISHED
all tcp 128.100.3.Y:60689 ( -> 128.100.X.X:22       ESTABLISHED:ESTABLISHED

This machine has made an outgoing SSH connection that was first matched by a 'pass in' rule and then NAT'd on output. Inbound NAT creates a different set of state table entries:

all tcp 10.X.X.X:22 (128.100.20.X:22) <-       ESTABLISHED:ESTABLISHED
all tcp -> 10.X.X.X:22       ESTABLISHED:ESTABLISHED

The rule is that the pre-translation address is in () and the post-translation address is not. On outbound NAT, the pre-translation address is the internal address and the post-translation one is the public IP; on inbound NAT it's the reverse. Notice that this time the NAT was applied on input, not on output, and of course there was a 'pass in' rule that matched.
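To make the () rule concrete, here is a Python sketch that separates the post-translation address from the optional pre-translation one on either side of the direction marker. It assumes that each address field is a single token optionally followed by a second token in parentheses, which matches the examples here but is not something I've checked against every pfctl output variation. The function names and the addresses are invented for illustration.

```python
import re

# One address field: the post-translation address, optionally followed by
# the pre-translation address in parentheses.
_ADDR = r"\S+(?:\s+\([^)]*\))?"
_LINE = re.compile(
    rf"^(\S+)\s+(\S+)\s+({_ADDR})\s+(<-|->)\s+({_ADDR})\s+(\S+)$"
)


def split_address(field):
    """Return (post_translation, pre_translation or None) for one address field."""
    parts = field.split(None, 1)
    pre = parts[1].strip("()") if len(parts) > 1 else None
    return parts[0], pre


def parse_nat_line(line):
    """Parse a state line whose left or right address may carry NAT info."""
    m = _LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised state line: {line!r}")
    iface, proto, left, direction, right, states = m.groups()
    return {
        "iface": iface,
        "proto": proto,
        "direction": direction,
        "left": split_address(left),
        "right": split_address(right),
        "states": states,
    }


# Hypothetical inbound NAT entry: public address in (), internal address outside.
parsed = parse_nat_line(
    "all tcp 10.0.0.5:22 (192.0.2.20:22) <- 198.51.100.7:39304       ESTABLISHED:ESTABLISHED"
)
print(parsed["left"], parsed["right"])
```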

(If you have binat-to machines they can have both sorts of entries at once, with some connections coming in from outside and some connections going outside from the machine.)

If you do your NAT through bidirectional rules (such as 'match on <IF> ...'), where NAT is applied is determined by what interface you specify in the rule combined with packet flow. This is our case; all of our NAT rules are applied on our perimeter firewall's external interface. If we applied them to the internal interface, we could create situations where the right address had the NAT mapping instead of the left one. The resulting state table entries would look like this (for an inbound connection that was RDR'd):

all tcp 128.100.3.X:25 <- 128.100.A.B:39304       ESTABLISHED:ESTABLISHED
all tcp 128.100.A.B:39304 -> 128.100.3.YYY:25 (128.100.3.X:25)       ESTABLISHED:ESTABLISHED

This still follows the rule that the pre-translation address is in the () and the post-translation address is not.

In general, given only a set of state table entries, you don't know what is internal and what is external. This is true even when NAT is in effect, because you don't necessarily know where NAT is being applied (as shown here; all NAT'd addresses are internal ones, but they show up almost everywhere). If you know certain things about your rules, you can know more from your state table entries (without having to do things like parse IP addresses and match network ranges). Given how and where we apply NAT, it's always going to appear in our left addresses, and if it appears on an in state table entry it's an external machine making an inbound connection instead of an internal machine making an outgoing one.

PS: According to the pfctl code, you may sometimes see extra text in the left or right address that looks like '{ <IP address> }'. I believe this appears only if you use af-to to do NAT translation between IPv4 and IPv6 addresses. I'm not sure if it lists the translated address or the original.

PPS: Since I just tested this, the state of an attempted TCP connection in progress to something that isn't responding is SYN_SENT for the source paired with CLOSED for the destination. An attempted TCP connection that has been refused by the destination with a RST has a TIME_WAIT:TIME_WAIT state. Both of these are explicitly set in the relevant pf.c code; see pf_create_state and pf_tcp_track_full (for the RST handling). Probably those are what you'd expect from the TCP state transitions in general.

Sidebar: At least three ways to get singleton state table entries

I mentioned that state table entries usually come in pairs. There are at least three exceptions. The first is state table entries for traffic to the firewall itself, including both pings and things like SSH connections; these are accepted in 'pass in' rules but are never sent out to anywhere, so they never get a second entry. The second is traffic that is accepted by 'pass in' rules but then matches some 'block out' rule so that it's not actually sent out. The third and most obvious exception is when you match in one direction with 'no state' but keep state in the other one, perhaps by accident or omission.

(Blocked traffic tends to have NO_TRAFFIC as the state for one side, but not all NO_TRAFFIC states are because of blocks; sometimes they're just because you're sending traffic to something that doesn't respond.)
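If you want to hunt for singletons in a saved copy of the output, one approach is to normalize each entry to (protocol, source, destination) and look for keys seen in only one direction. The Python sketch below does that under a big assumption: no NAT is involved, so both entries of a pair name the same two addresses. On a NAT'ing firewall it would misreport things, because it skips NAT'd lines and so their un-NAT'd partners look unpaired. The find_singletons name and the sample lines are hypothetical.

```python
from collections import defaultdict


def find_singletons(lines):
    """Return state lines whose connection was seen in only one direction."""
    seen = defaultdict(dict)    # (proto, src, dst) -> {direction: line}
    for line in lines:
        fields = line.split()
        if len(fields) != 6 or fields[3] not in ("<-", "->"):
            continue            # NAT'd or unrecognised line; skipped in this sketch
        _iface, proto, left, direction, right, _states = fields
        # Out entries list source first; in entries list destination first.
        src, dst = (left, right) if direction == "->" else (right, left)
        seen[(proto, src, dst)][direction] = line
    return [line
            for by_dir in seen.values() if len(by_dir) == 1
            for line in by_dir.values()]


# Hypothetical sample: one forwarded connection (a pair) and one SSH
# connection to the firewall itself (an in entry with no out partner).
sample = [
    "all tcp 192.0.2.10:22 <- 198.51.100.7:46392       ESTABLISHED:ESTABLISHED",
    "all tcp 198.51.100.7:46392 -> 192.0.2.10:22       ESTABLISHED:ESTABLISHED",
    "all tcp 192.0.2.1:22 <- 198.51.100.9:50112       ESTABLISHED:ESTABLISHED",
]
singletons = find_singletons(sample)
print(len(singletons))
```

Since one entry of a pair can expire before the other, some of what this reports will just be timing rather than anything about your rules.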

I was going to say things about the relative number of in and out states as a consequence and corollary of this, but now that I've looked at our actual data I'm afraid I have no idea what's going on.

(I think that part of it is that for TCP connections, you can have closed down or inactive connections where one state table entry expires before the other. This may apply to non-TCP connections too, but my head hurts. For that matter, I'm not certain that 'pfctl -ss' is guaranteed to report a coherent copy of the state table. Pfctl does get it from the kernel in a single ioctl(), but the kernel may be mutating the table during the process.)

Written on 21 August 2019.