A few small notes about OpenBSD PF (as of 4.4 and 5.1)

December 15, 2012

Suppose that you read the pf.conf manpage (in OpenBSD 4.4 or 5.1) and stumble across the following:

max-src-conn <number>
Limits the maximum number of simultaneous TCP connections which have completed the 3-way handshake that a single host can make.

Great, you say, this is just what you need to make sure that bad people are not holding too many connections to your web server open at once. So you write a PF rule more or less like this:

table <BRUTES> persist
block quick log on $EXT_IF proto tcp from <BRUTES> to any port 80
pass in quick on $EXT_IF proto tcp from any to any port 80 \
     keep state \
     (max-src-conn 20, overload <BRUTES> flush)

Shortly after you activate this rule you may discover an ever-increasing number of web crawler IPs listed in your BRUTES table, which will probably surprise you. What is going on is that the OpenBSD manpage is misleading you. max-src-conn does not limit the number of concurrent TCP connections. It limits the number of state table entries for TCP connections that have been fully established. If you examine the state tables as a web crawler is walking your site, you will discover any number of entries sitting around in FIN_WAIT_2. These connections are thoroughly closed but, guess what, they count against max-src-conn until they expire completely.

An extremely technical reading of the pf.conf manpage might let you claim that this behavior is allowed (if you hold that a TCP connection still exists in FIN_WAIT_2), but at the very least I think it is going to surprise almost everyone. It also makes this max-src-conn rule useless for limiting the number of concurrent real TCP connections. Since states linger in FIN_WAIT_2 for a minute or more, there is no feasible setting for max-src-conn that will let a crawler make one or two requests a second without getting blocked while also giving you a useful limit on concurrent connections.
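As a rough back-of-the-envelope sketch of why no setting works, assume (hypothetically) that every request arrives on a fresh TCP connection at a rate of r per second and that each closed state lingers for T = 60 seconds. Both figures are illustrative assumptions, not measurements. The number of lingering states counted against one source is then roughly:

```latex
N \approx r \cdot T
\qquad
r = 1\,\mathrm{s}^{-1},\ T = 60\,\mathrm{s} \;\Rightarrow\; N \approx 60
\qquad
r = 2\,\mathrm{s}^{-1},\ T = 60\,\mathrm{s} \;\Rightarrow\; N \approx 120
```

Under those assumptions max-src-conn would have to sit well above 60 to 120 to leave such a crawler alone, which is far too high to be a meaningful cap on genuinely concurrent connections.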

(This almost certainly applies to max-src-states too, but at least that is explicitly documented in terms of state table entries.)

But wait, the fun isn't done yet. You decide that you really need to limit the number of concurrent real TCP connections. You don't really care if stray out-of-sequence packets from fully closed connections get rejected by the firewall (they'd only get rejected by the host anyways), so the obvious solution is to set a very short timeout for those lingering FIN_WAIT_2 states. You read the fine pf.conf manpage again and spot some timeout settings (which can be set either globally or per state-creating rule):

tcp.closed
The state after one endpoint sends an RST.
tcp.finwait
The state after both FINs have been exchanged and connection is closed. [...]

There is no pleasant way to put this: the pf.conf manpage is lying to you. Setting tcp.finwait to a very low value will do exactly nothing to help you; you need to set tcp.closed. The state timeouts are actually:

tcp.closed       Both sides in FIN_WAIT_2 or TIME_WAIT.
tcp.finwait      Both sides in CLOSING, or one side CLOSING and the other side has progressed a bit further.
tcp.closing      One but not both sides in CLOSING, i.e. a FIN has been sent.
tcp.established  Both sides ESTABLISHED.
tcp.opening      At least one side not ESTABLISHED yet.

(All of this is expressed in terms of what 'pfctl -ss' will print as the states. There are a few intermediate transient states that may show up which I am eliding because my head hurts. See the logic in sys/net/pf.c and the list of states in sys/netinet/tcp_fsm.h if you really care.)

The manpage is partly technically correct in that after an RST is sent, PF puts the state into TIME_WAIT and tcp.closed applies. This is also the only time that a state winds up in TIME_WAIT.

(I have verified this behavior on OpenBSD 4.4. I have not verified it on OpenBSD 5.1, but the sys/net/pf.c code involved is basically the same and reads just like the 4.4 version; in fact my table above comes from reading the 5.1 pf.c source code, and my manpage quotes are from the 5.1 manpages. I have not looked at the 5.2 source or manpages.)
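To tie the pieces together, here is a hedged sketch of the earlier rule with a short tcp.closed timeout attached to it, so that states sitting in FIN_WAIT_2 stop counting against max-src-conn quickly. The interface name em0 and the 5-second value are assumptions made purely for illustration; pick values that suit your own setup, and treat this as untested on any particular OpenBSD release.

```
# Sketch only: the interface name and the 5-second timeout are assumed values.
EXT_IF = "em0"

table <BRUTES> persist

# Global alternative to the per-rule timeout below (affects every rule):
# set timeout tcp.closed 5

block quick log on $EXT_IF proto tcp from <BRUTES> to any port 80
pass in quick on $EXT_IF proto tcp from any to any port 80 \
     keep state \
     (max-src-conn 20, tcp.closed 5, overload <BRUTES> flush)
```

Setting the timeout on the rule rather than globally keeps the shorter expiry confined to states created by this one rule. Checking 'pfctl -ss' afterwards should show whether the FIN_WAIT_2 entries really are going away sooner.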

Last modified: Sat Dec 15 00:14:03 2012