2013-04-05
Authoritative, non-recursive DNS servers now need ratelimiting
Pretty much all of the coverage about the recent DNS amplification DDoS attacks, including advice to sysadmins, has been about open recursive DNS servers and how they are bad. I followed the issue enough to check that none of our subnets appeared in the recently-available databases of open recursive DNS servers and otherwise ignored it.
It turns out that this is not good enough, because authoritative DNS servers can be used for DNS amplification DDoS attacks too. Attackers prefer open recursive DNS servers because it's easy to use them to create big replies, but many DNS zones have enough records for various things that an authoritative DNS server can be coaxed into giving relatively big replies (on the order of 500 bytes and up) to small query packets. This is not theoretical. My awakening about this came about because someone appears to have done a test run against our authoritative DNS server, achieving around a 20x amplification (partly through a very small query packet); the resulting temporary traffic spike was picked up by university-wide network monitoring and then reported to us.
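To make the amplification arithmetic concrete, here is a minimal sketch of how the factor works out. The packet sizes are hypothetical round numbers picked to match the rough figures above (a reply somewhat over 500 bytes and about 20x amplification); they are not measurements from our server.

```python
def amplification_factor(query_bytes: int, reply_bytes: int) -> float:
    """Ratio of reply size to query size, both measured the same way."""
    if query_bytes <= 0:
        raise ValueError("query_bytes must be positive")
    return reply_bytes / query_bytes


# Hypothetical sizes for illustration only, not measured values.
query_bytes = 30
reply_bytes = 600

factor = amplification_factor(query_bytes, reply_bytes)
print(f"{factor:.0f}x amplification")  # prints "20x amplification"
```

The point is that the attacker only pays for the small query; the forged source address receives the big reply.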
Open recursive DNS servers are generally easy to fix; you close them, because very few of them are intended to be used by the world. Authoritative DNS servers can't be fixed this way because their entire purpose is answering queries from the world about your zones. Instead our only option is to implement some sort of query ratelimiting. Unfortunately this is likely to become essential (especially for people handling big, complex zones) because DDoSes seem unlikely to go away any time soon.
(While people are working on adding ratelimiting to various DNS servers, it is not yet generally available or ready anywhere. That leaves some form of firewall-based rate limiting as the only option, if your firewall supports it. It's only important to ratelimit UDP DNS queries because those are the only ones useful for DNS amplification attacks; a TCP query needs a completed handshake, so it can't usefully come from a forged source address.)
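As an illustration of what per-source query ratelimiting amounts to, here is a minimal token bucket sketch in Python. It is not how any particular firewall or DNS server implements it; the class name, the `rate` and `burst` parameters, and the idea of keying on the source address string are all assumptions made for the example.

```python
import time


class PerSourceRateLimiter:
    """Token bucket per source address (illustrative sketch only).

    Each source may send `rate` queries per second on average, with
    bursts of up to `burst` queries.
    """

    def __init__(self, rate: float, burst: float, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        # source address -> (tokens remaining, time of last update)
        self.buckets = {}

    def allow(self, source: str) -> bool:
        now = self.clock()
        # A source we have not seen before starts with a full bucket.
        tokens, last = self.buckets.get(source, (self.burst, now))
        # Refill for the time that has passed, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[source] = (tokens - 1.0, now)
            return True
        self.buckets[source] = (tokens, now)
        return False
```

In an amplification attack the "source" is the forged address of the victim, so limiting per source caps how much traffic we can be made to reflect at any one target. A real implementation would also need to expire idle buckets, which this sketch does not do.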
It's worth noting that one of the drawbacks of ratelimiting is that, well, you have to figure out what rate to limit things at (and also what time interval to do it over). There is no generic answer (disbelieve anyone who offers you one), so your only real choices are to measure your normal traffic ahead of time or to experiment and see which limits trigger when.
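One way to do the measuring is to take query timestamps (from query logs or a packet capture) and find the busiest window of the interval you plan to limit over. This is a rough sketch; the function name is made up for the example, and it assumes you have already extracted the timestamps as seconds, ideally separately for each source address.

```python
from collections import deque


def peak_queries_per_window(timestamps, window_seconds):
    """Largest number of queries seen in any sliding window of the given length."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    window = deque()
    peak = 0
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop queries that are window_seconds or more older than the newest one.
        while ts - window[0] >= window_seconds:
            window.popleft()
        peak = max(peak, len(window))
    return peak
```

Running this over a few days of normal traffic gives a floor for the limit; where to set it above that floor is still a judgment call.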
(You can try bandwidth limits instead of or in addition to query limits. Again you'll want to measure actual normal and peak DNS bandwidth usage.)