## A realization about ratelimit time horizons

September 11, 2012

Here's something that's smacked me in the nose recently as I started working with Exim's ratelimits.

When you have a ratelimit it's usually expressed in terms of 'X events in Y time', and you generally get to pick both X and Y. Mathematically, scaling X and Y together makes no difference to how many total events the ratelimit allows over a long time period; 20 events per 10 minutes is the same long-run rate as 120 events per hour. But that doesn't make the two ratelimits equivalent. They differ in how they treat bursts: the shorter the ratelimit time interval is, the harsher it is on burst traffic. If you send to 50 recipients in a burst in five minutes, you will trip the '20 in 10 minutes' limit but not the '120 in an hour' version.
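To make the burst case concrete, here is a minimal sketch of the 50-recipient example. It assumes a simple sliding-window limiter (at most `limit` accepted events in any `period` seconds), which is an illustration only and not how Exim actually computes its rates; the function name `allowed_events` is made up for this example.

```python
from collections import deque

def allowed_events(timestamps, limit, period):
    """Count events accepted by a hypothetical sliding-window limiter.

    At most `limit` accepted events may fall within any `period` seconds.
    This is a simplification for illustration, not Exim's algorithm.
    """
    window = deque()  # timestamps of accepted events still inside the window
    accepted = 0
    for t in timestamps:
        # Forget accepted events that have aged out of the window.
        while window and window[0] <= t - period:
            window.popleft()
        if len(window) < limit:
            window.append(t)
            accepted += 1
    return accepted

# 50 recipients spread evenly over five minutes (one every 6 seconds).
burst = [i * 6 for i in range(50)]

short_ok = allowed_events(burst, limit=20, period=10 * 60)
long_ok = allowed_events(burst, limit=120, period=60 * 60)

print(f"20 per 10 minutes accepts {short_ok} of {len(burst)}")
print(f"120 per hour accepts {long_ok} of {len(burst)}")
```

Under these assumptions the short limit rejects most of the burst while the long one lets all of it through, even though both allow the same long-run rate.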

The result is that picking a time period for your ratelimit is a tradeoff. On the one hand, the shorter the ratelimit period, the less allowance it gives for bursts. On the other hand, a short period also limits the amount of damage that an aggressor can do, because it cuts in sooner. If you are sending messages as fast as possible, the '20 in 10 minutes' limit will let you send to 20 recipients and then cut you off, while the '120 in an hour' version will let you spam up to 120 recipients before it stops you.
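The other side of the tradeoff can be sketched the same way: a sender going flat out, here assumed to be one recipient per second for an hour. As before, this uses a simple sliding-window limiter as a stand-in rather than Exim's real rate calculation, and `accepted_times` and `count_before` are hypothetical helper names.

```python
from collections import deque

def accepted_times(timestamps, limit, period):
    """Return the timestamps accepted by a hypothetical sliding-window limiter.

    At most `limit` accepted events may fall within any `period` seconds.
    This is a simplification for illustration, not Exim's algorithm.
    """
    window = deque()  # accepted events still inside the window
    accepted = []
    for t in timestamps:
        # Forget accepted events that have aged out of the window.
        while window and window[0] <= t - period:
            window.popleft()
        if len(window) < limit:
            window.append(t)
            accepted.append(t)
    return accepted

def count_before(times, cutoff):
    """Count how many accepted events happened before `cutoff` seconds."""
    return sum(1 for t in times if t < cutoff)

# An aggressor trying one recipient every second for a full hour.
flat_out = range(60 * 60)

short = accepted_times(flat_out, limit=20, period=10 * 60)
long_ = accepted_times(flat_out, limit=120, period=60 * 60)

ten_minutes = 10 * 60
print("first 10 minutes:", count_before(short, ten_minutes), "vs", count_before(long_, ten_minutes))
print("whole hour:      ", len(short), "vs", len(long_))
```

In this model both limits let the same total through over the full hour, but the hour-long limit hands over its entire allowance in the first couple of minutes, which is the damage you have to be willing to accept.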

I think this means that my first step in setting ratelimit numbers in the future should be to figure out how much we're willing to let a bad guy get away with. The larger that number is, the more flexibility I have with the time period, and generally longer is going to be better unless the events aren't very bursty.

(I'm sure that this is well known among people who deal with ratelimits regularly, and it was probably even mentioned in the documentation for Exim's ratelimits. I'm slow sometimes and writing things down helps me make them stick.)

Written on 11 September 2012.