A Python surprise: exiting is an exception
Once upon a time I wrote a program to scan incoming mail messages during the SMTP conversation for signs of spam. Because this was running as part of our mailer, reliability was very important; unhandled errors could cause us to lose mail.
As part of reliability, I decided I wanted to catch any unhandled exception (which would normally abort the program with a backtrace), log the backtrace, save a copy of the message that caused the bug, and so on. So I wrote code that went like this:
    def catchall(M, routine):
        try:
            routine(M)
        except:
            # log backtrace, etc
            pass
The mail checking routine told the mailer whether to accept or reject the message through the program's exit status code, so the routine called sys.exit() when it was done.
Then I actually ran this code, and the first time through it dutifully spewed a backtrace into the logs about the code raising an unhandled SystemExit exception.
Naively (without paying attention to the documentation for the sys module), I had expected sys.exit() to mostly just call the C library exit() function. As the sys module documentation makes clear, this is not what happens. Instead it raises SystemExit, the whole chain of exception handling happens (in part so that finally clauses get executed), and at the top of the interpreter you finally exit.
Unless, of course, you accidentally catch
SystemExit. Then very odd
things can happen.
(This is another example of why broad excepts are dangerous.)
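To make the trap concrete, here is a minimal sketch of the difference between a bare except and a narrower one. It assumes a current Python 3, where SystemExit derives from BaseException rather than Exception; the routine and the two catchall variants are made-up stand-ins, not the original program.

```python
import sys


def routine(message):
    # Stand-in for the mail checker: it reports its verdict
    # through the process exit status.
    sys.exit(0)


def catchall_bare(message, routine):
    try:
        routine(message)
    except:  # a bare except also catches SystemExit
        return "caught %s" % sys.exc_info()[0].__name__
    return "returned normally"


def catchall_narrow(message, routine):
    try:
        routine(message)
    except Exception:  # SystemExit is not an Exception subclass in Python 3
        return "caught an ordinary exception"
    return "returned normally"


print(catchall_bare("msg", routine))

try:
    catchall_narrow("msg", routine)
except SystemExit as e:
    # The exit request passed straight through the narrower handler.
    print("SystemExit propagated with code", e.code)
```

With the narrower handler the exit request reaches the top of the interpreter as intended, while real bugs in the routine are still caught and can be logged.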
Accidentally shooting yourself in the foot in Python
Recently, I stumbled over a small issue in Python's cgi module that is a good illustration of how unintended consequences in Python can wind up shooting you in the foot.
    form = cgi.FieldStorage()
    for k in form.keys():
        ... stuff ...
Then one day a cracker tried an XML-RPC based exploit against DWiki and this code blew up, getting a TypeError from the form.keys() call.
This is at least reasonable, because an XML-RPC
POST is completely
different than a form
POST and doesn't actually have any form
parameters. (TypeError is a bit strong, but it did ensure that DWiki
No problem; I could just guard the form.keys() call with an 'if not form: return'. Except that the 'not form' got the same TypeError. Which is startling, because you don't normally expect 'not obj' to throw an error.
This surprising behavior of the cgi module happens through three steps. First, Python decides whether objects are True or False like this:
- if there is a __nonzero__ method (__bool__ in Python 3), call that.
- if there is a __len__ method, a zero length is False and otherwise you're True (because Python usefully makes collections false if they're empty and true if they contain something).
- if there is neither, you're always True.
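The three rules above can be checked directly. This is a small sketch written for Python 3, where the first hook is spelled __bool__ instead of __nonzero__; the class names are invented purely for illustration.

```python
class WithBool:
    def __bool__(self):  # spelled __nonzero__ in Python 2
        return False

    def __len__(self):
        return 5  # never consulted for truth testing, because __bool__ exists


class WithLen:
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Plain:
    pass


print(bool(WithBool()))    # the explicit truth hook wins
print(bool(WithLen([])))   # zero length, so false
print(bool(WithLen([1])))  # non-zero length, so true
print(bool(Plain()))       # neither hook, so always true
```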
Second, as a dictionary-like thing, FieldStorage defines a __len__ method in the obvious way:
    def __len__(self):
        return len(self.keys())
Finally, FieldStorage decided to let instances represent several
different things and that calling
.keys() on an instance that wasn't
dealing with form parameters should throw TypeError. (This is more
sensible than this description may make it sound.)
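Putting the three steps together reproduces the surprise. The class below is a hypothetical stand-in written for this illustration, not the real cgi.FieldStorage; it only assumes the two behaviours described above (keys() raising TypeError when there are no form parameters, and __len__ being defined in terms of keys()).

```python
class FakeFieldStorage:
    """Hypothetical stand-in for illustration, not the real cgi.FieldStorage."""

    def __init__(self, fields=None):
        # None models a request body that is not form data,
        # such as an XML-RPC POST.
        self.fields = fields

    def keys(self):
        if self.fields is None:
            raise TypeError("not form data")
        return list(self.fields)

    def __len__(self):
        return len(self.keys())


form = FakeFieldStorage()
try:
    if not form:  # truth testing calls __len__, which calls keys()
        print("empty form")
except TypeError as e:
    print("truth test raised TypeError:", e)
```

No method on the object was called explicitly; the 'not form' test alone was enough to reach keys().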
Apart from a practical illustration of unintended consequences and complex interactions, what I've taken away from this is to remember that __len__ on objects is used for more than just the len() function. (Other special methods also have multiple uses.)
Sidebar: so how did I solve this?
My solution was lame:
    try:
        form.keys()
    except TypeError:
        return
I suspect that the correct solution is to check
form.type to make
sure that the
POST came in with a Content-Type header of
'application/x-www-form-urlencoded'. (Except I don't know enough to
know if all
POSTs to DWiki will always arrive like that. Ah, HTTP,
we love you so.)
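For what it's worth, the Content-Type check could be sketched along these lines. This is an assumption-laden illustration rather than DWiki's code: it looks at the standard CGI environment variables REQUEST_METHOD and CONTENT_TYPE directly instead of form.type, the helper name is made up, and it deliberately treats only 'application/x-www-form-urlencoded' as a form POST, which may well be too strict for the reason given above.

```python
def is_urlencoded_form_post(environ):
    """Return True if the CGI environment describes a urlencoded form POST.

    Hypothetical helper for illustration only.
    """
    if environ.get("REQUEST_METHOD", "").upper() != "POST":
        return False
    content_type = environ.get("CONTENT_TYPE", "")
    # Drop any parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/x-www-form-urlencoded"


form_post = {
    "REQUEST_METHOD": "POST",
    "CONTENT_TYPE": "application/x-www-form-urlencoded; charset=utf-8",
}
xmlrpc_post = {"REQUEST_METHOD": "POST", "CONTENT_TYPE": "text/xml"}

print(is_urlencoded_form_post(form_post))
print(is_urlencoded_form_post(xmlrpc_post))
```

In a real CGI program the dictionary passed in would be os.environ.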