2005-07-28
Doing DNS queries in Python
Every now and then, there comes a time when you need to make DNS
queries more complicated than gethostbyname and gethostbyaddr. Or
at least, that's what happens to me.
When this happens, my best Python tool is the dnspython module, a pure Python module for doing all sorts of DNS manipulation. But dnspython is a little bit obscure to use, so here's a sample program.
This program looks up IP addresses on the command line that are in the SBL and reports the URLs to their SBL records, so you can see exactly why something is listed. (If an IP address isn't actually in the SBL, nothing gets printed.)
#!/usr/bin/python
import sys
from dns.resolver import query
from dns.exception import DNSException

def revip(ip):
    # Reverse the octets: 1.2.3.4 becomes 4.3.2.1
    n = ip.split('.')
    n.reverse()
    return '.'.join(n)

sstr = '%s.sbl.spamhaus.org.'

def sblip(ip):
    qstr = sstr % revip(ip)
    try:
        qa = query(qstr, 'TXT')
    except DNSException:
        # Not listed (or the lookup failed): print nothing.
        return
    # One rr per record in the answer, one or more strings per rr.
    for rr in qa:
        for s in rr.strings:
            print(s)

def process(args):
    for ip in args:
        sblip(ip)

if __name__ == '__main__':
    process(sys.argv[1:])
The SBL DNS blocklist zone lists the SBL record URLs as TXT DNS
records, so all we have to do is to make a TXT query against the
proper zone. dns.resolver.query is the easiest interface for most
simple query operations, so that's what we use. There are two tricky
bits having to do with how DNS and TXT DNS records work.
First, a DNS reply can include more than one record in the answer; for
example, a single IP address might be in several SBL records. So we
have to iterate over the answer to get every rr (which I believe is
short for 'resource record', and is a standard DNS thing).
Second, a single TXT record can include multiple pieces of text. I
don't believe the SBL uses this, but other places do; for example, the
routeviews.org DNS-queryable database of IP to ASN mappings does. So
in this program, we iterate over all the strings in each rr and print
each one.
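To make the two levels of iteration concrete, here is a minimal sketch that runs without a network or dnspython installed. FakeTXT, revname and txt_strings are hypothetical names invented for this illustration; the only thing assumed about real answer records is the .strings attribute the program above already uses. The bytes check is there because, depending on the dnspython and Python versions, the strings may come back as bytes rather than str.

```python
from collections import namedtuple

# Hypothetical stand-in for a TXT answer record, so the sketch runs offline.
# The only attribute it mimics is .strings, as used in the program above.
FakeTXT = namedtuple("FakeTXT", ["strings"])


def revname(ip, zone="sbl.spamhaus.org."):
    """Build the blocklist query name: reversed octets, then the zone."""
    octets = ip.split(".")
    return ".".join(reversed(octets)) + "." + zone


def txt_strings(answer):
    """Flatten an answer into a flat list of text pieces.

    Outer loop: one rr per record in the reply.
    Inner loop: one entry per string inside that TXT record.
    """
    out = []
    for rr in answer:
        for s in rr.strings:
            # Decode if the library handed us bytes instead of str.
            if isinstance(s, bytes):
                s = s.decode("ascii", "replace")
            out.append(s)
    return out


if __name__ == "__main__":
    print(revname("192.0.2.1"))
    # Two records; the first carries two strings, the second carries one.
    answer = [FakeTXT(strings=(b"first", b"second")), FakeTXT(strings=("third",))]
    print(txt_strings(answer))
```

With a real answer from dnspython, txt_strings(qa) would take the place of the nested loops in sblip.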
(Perhaps next I'll do a version of this program in Perl for comparison purposes. Feel free to beat me to it in comments.)
2005-07-19
Exceptions as efficient programming
Python likes to make programming efficient: that is, to make programs fast and easy to write, and eliminate the tedious drudgery. Pretty much all dynamically typed languages do, since static typing is a lot of effort (except in languages with a lot of type inference, where it is only some effort).
After writing yesterday's entry, it's struck me that exceptions are a form of efficient programming, because they let you aggregate error checking and error handling across a large block of code.
Without exceptions, proper error checking means a lot of tedious code to check each possibly failing operation and handle the errors. Usually the code is identical or almost identical, making the tedium more pronounced. (This tedium is one reason why people can wind up skipping some error checks.)
With exceptions, you only write new code when you want to do something differently on an error. Otherwise, you can cover a large swath of code, full of various calls that may fail, with just one exception and just one block of code to handle the problem. (Especially if you let the exception, or ongoing state variables, capture the fine details of what went wrong for error messages.)
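As an illustration of covering several possibly-failing calls with one handler, here is a minimal sketch. The function name load_limit and the idea of a file holding an integer on its first line are assumptions made up for the example, not anything from a real program.

```python
import sys


def load_limit(path):
    """Read an integer limit from the first line of a file.

    Three steps can fail here: open(), the read, and int().
    A single handler covers all of them.
    """
    try:
        with open(path) as f:
            line = f.readline()
        return int(line.strip())
    except (OSError, ValueError) as e:
        # The exception carries the details of what went wrong,
        # so the error message needs no per-step bookkeeping.
        sys.stderr.write("cannot load limit from %s: %s\n" % (path, e))
        return None
```

Without exceptions, each of the three steps would need its own check and its own nearly identical error report.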
Result: less tedium, more efficiency, happier programmers, and Python helps one of its goals along again.
Exceptions and casual programming
Exceptions have something of a bad reputation in serious programming circles. People feel that they are troublesome for various reasons, including that they are non-local gotos and that they make it hard to guarantee that you really are cleaning up after problems. (You can find posts about this at Joel on Software and in Raymond Chen's blog, among other places.)
I don't know about that, but I do know that exceptions are really good for casual programs, which is what I mostly write. Casual programs is my label for the small and narrow-scope utilities that crop up all over system administration: shell scripts, Perl and Python programs of a few hundred lines at most, and so on.
I like exceptions for casual programs because they mean that blind optimism and blithe ignorance of potential system-level problems are not an option. Either I explicitly write code to handle a problem like an IO error, or my program aborts with a stack puke and I find out about it.
Exceptions force you to deal with errors. You may not deal with them well, but you do not get to tacitly ignore them. (You can explicitly ignore them, sometimes semi-accidentally as I mentioned in BroadTrys. Solution: don't do that.)
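To show what semi-accidentally ignoring an error can look like, here is a small sketch. It assumes the problem is a try block that covers more than was intended; the function names and the 'port' field are hypothetical.

```python
def parse_port_broad(fields):
    """Meant to tolerate a missing 'port' key; accidentally hides more."""
    try:
        return int(fields["port"])
    except Exception:
        # Also swallows the ValueError from a garbage value, so bad
        # input is silently treated the same as missing input.
        return 0


def parse_port_narrow(fields):
    """Only the one error we decided to tolerate is handled."""
    try:
        raw = fields["port"]
    except KeyError:
        return 0
    # A bad value still raises ValueError, so we find out about it.
    return int(raw)
```

Given {"port": "abc"}, the broad version quietly returns 0 while the narrow version raises ValueError, which is the behavior that makes the problem visible.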
In theory, serious industrial-strength programs are written to carefully check everything that can go wrong and handle all of the errors. However, even if the theory actually were the reality, casual programs are not written that way, either because the authors don't know better or because they consider it too much nit-picking work for the expected gains.
This is one reason I've come to really like Python for casual programs (as annoying as exceptions sometimes are to deal with). Exceptions are also a useful security backstop; many security exposures are created by failing to notice bad conditions, buffer overflows being the classic example.
2005-07-02
What shouldn't be a method function
I dislike excessive religion, so I don't hold to the view that everything should be a method function. Fortunately, Python lets me indulge this view.
My rule of thumb is that any method function that doesn't use its
self argument to get at things gets made into a normal module level
function. This reduces clutter on the class (and in the source code)
and makes what's really going on clear to me and any later readers.
Of course there are some exceptions, such as:
- the function is only sometimes self-less; some implementations have to use self.
- the function varies on a per-class basis. This is a good case for StaticMethodUse.
- the function is a generic interface to multiple classes of objects, especially when they're spread out over multiple modules.
For me, most of the functions that wind up being de-method-ized this
way are internal helper functions. If a bunch of public interface
functions turn out to not use self, it's usually because my design
of the object's interface is bad.