2008-06-15
Why DNS blocklists return information as IP addresses
Especially in light of the difficulties that they present in returning multiple bits of information, you might sensibly ask why DNS blocklists opt to return information as IP addresses, instead of something more flexible. A related question is why DNSBLs are queried in such an odd way, by reversing the octets of the host address you're interested in.
The IP address thing is simple: everyone has gethostbyname() or
its equivalent in their standard library. Especially if you just want
a yes or no answer (and that's what DNSBLs started as and are still
mostly used for), gethostbyname() gives you a simple and basically
hassle-free interface. It also has the right failure mode for a
cautious system; if you can't talk to the DNSBL for any reason, the
lookup fails and you assume that the host isn't blocklisted.
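As a rough illustration of how little code the yes-or-no check needs, here is a minimal Python sketch built on socket.gethostbyname(). The function name is_listed, the zone dnsbl.example, and the injectable resolve parameter are all assumptions made for this example, not part of any real DNSBL's interface.

```python
import socket


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the DNSBL `zone` returns any address for IPv4 `ip`.

    `resolve` is a hypothetical hook so the lookup can be swapped out;
    by default it is the standard library's gethostbyname().
    """
    # Build the query name by reversing the octets, e.g. 192.0.2.1 in
    # dnsbl.example is looked up as 1.2.0.192.dnsbl.example.
    name = ".".join(reversed(ip.split("."))) + "." + zone
    try:
        resolve(name)
    except OSError:
        # No such name, timeout, unreachable nameserver: every failure
        # is treated the same way, as "not blocklisted".
        return False
    return True
```

Note that any lookup failure, not just a clean "no such name" answer, ends up as "not listed", which is the cautious failure mode described above.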
Using any other DNS record type to return information requires people to write custom code to do direct DNS queries, which is usually much more complicated even if you have a high level library to handle a bunch of the details, and far more complicated if you don't. (Trust me, decoding DNS response packets is not fun for any part of the family.)
The reversed octets query format is for historical reasons. The first
DNSBL was mostly aimed at listing
entire bad networks and subnets, not individual hosts, and was probably
not using a custom nameserver. If you want to list a lot of things
with bind you want to use wildcard records, but wildcards can only
come at the start of a DNS name (which is the end of the lookup, since
lookups go right to left), not in the middle. So if you are going to use
wildcard records to cover subnets, you need to reverse the octets so you
can put the wildcard in the right place, letting you list 24.10.0.0/16 by
just creating a *.10.24.your.dnsbl A record.
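To make the octet reversal concrete, here is a small Python sketch of how a query name lines up with a wildcard record. The helper names are made up for this example, and covered_by() is a deliberately simplified suffix check, not real DNS wildcard matching (which has extra rules about more specific names).

```python
def dnsbl_query_name(ip, zone):
    """Return the DNSBL query name for an IPv4 address in `zone`."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address: %r" % ip)
    return ".".join(reversed(octets)) + "." + zone


def wildcard_for_subnet(network_octets, zone):
    """Return the wildcard owner name covering a subnet.

    `network_octets` is the fixed leading part of the network, so
    ["24", "10"] stands for 24.10.0.0/16.
    """
    return "*." + ".".join(reversed(network_octets)) + "." + zone


def covered_by(query_name, wildcard_name):
    """Simplified check: does the name fall under the wildcard?"""
    # Drop the leading "*" and compare the remaining ".suffix".
    return query_name.endswith(wildcard_name[1:])


name = dnsbl_query_name("24.10.3.7", "your.dnsbl")
wild = wildcard_for_subnet(["24", "10"], "your.dnsbl")
print(name, wild, covered_by(name, wild))
```

Without the reversal, the fixed part of the subnet would be at the left of the name and the wildcard would have to sit in the middle, which DNS doesn't allow.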
(These days none of this applies any more; most DNSBLs that
people use list individual hosts instead of subnets, and I believe that
most of them use custom nameservers because loading several hundred
thousand records into bind doesn't work too well.)
2008-06-11
Designing a usable DNS Blocklist result format
It's relatively common for DNS blocklists to want to encode a certain amount of information in their results, ranging from the source of the information to how reliable they consider the results. For sensible reasons, DNS blocklists have to encode this information in the IP address or addresses that they return.
As it turns out, there is a useful way and a not so useful way to encode this information, because of the limitations of mailer support for DNSBL lookups. Most mailers can ask only two questions: 'is this host listed at all?' and 'is this host listed with a specific IP address?'
(Even when mailers support more, I believe that those two are the easiest two conditions to use.)
There are two natural ways to encode multiple pieces of information in
DNSBL results. One of them is to return multiple IP addresses, each
one representing one piece of information; the other is to encode all
of the information into a single IP address (using several octets, or
encoding an octet by ORing flags together, or both). Now consider
what happens if you want to know only one piece of information in your
mailer, for example 'is this host blocked with high confidence'.
If the DNSBL encodes multiple pieces of information in a single IP address and you want only one piece, you probably have no good way of extracting it and matching on it; instead you have to inventory all of the different IP addresses that it might be encoded into. However, if the DNSBL encodes the information into multiple IP addresses, you have a simple check; 'does the DNSBL return IP <X> for this host'.
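Here is a small Python sketch of the difference, under made-up assumptions: the DNSBL's results live in 127.0.0.x, a hypothetical 'high confidence' flag is bit 0x02 of the last octet, and there are three flag bits in total. None of these values come from a real blocklist.

```python
HIGH_CONFIDENCE = 0x02  # hypothetical flag bit in the last octet


def listed_with(returned_addresses, wanted):
    """One address per fact: the mailer needs only an exact match."""
    return wanted in returned_addresses


def addresses_with_flag(flag, flag_bits=3):
    """Flags ORed into one octet: list every address with `flag` set.

    A mailer that can only match specific IP addresses has to be
    configured with all of these to test for a single flag.
    """
    return ["127.0.0.%d" % value
            for value in range(1, 2 ** flag_bits)
            if value & flag]


# Multiple addresses: one check against one known address.
print(listed_with(["127.0.0.2", "127.0.0.4"], "127.0.0.2"))

# Single ORed address: four addresses to inventory for one flag.
print(addresses_with_flag(HIGH_CONFIDENCE))
```

With only three flag bits the inventory is already four addresses, and it doubles with every additional flag the DNSBL defines.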
Thus, I believe that the most useful way for DNSBLs to encode multiple pieces of information is to return multiple IP addresses for each lookup, each one encoding one specific bit of information. Encoding several pieces of information in one IP address only makes sense if you are very confident that most people will want to use them together and will never want to check just one.