2006-10-29
Python's assert is a weak debugging tool
Here's something (perhaps obvious) that recent experience has taught me:
Python's assert is a pretty bad debugging tool.
The problem is that a failed assert gives you almost no information,
just that it did fail. Almost all the time, you need to know something
about what the bad values are in order to actually debug the problem,
so your first step is to transform 'assert condition' into something
like:
if not condition:
    print <stuff>
assert condition
(The exceptions that prove the rule are assertions added just to see if a particular condition was as impossible as you thought; there, once you know it's not impossible, you're going to be adding code to handle it properly.)
One quick fix is to start using the two-expression form of assert,
which at least lets you bundle a useful message (complete, hopefully,
with some state information) with the assertion failure. But it's
difficult to predict in advance just what information you're going
to need to debug something that you thought was impossible when you
wrote the code.
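As a small illustration of the two-expression form, here is a sketch in current Python 3 syntax. The function name average and the wording of its message are made up for the example, not taken from any real code.

```python
def average(values):
    # The second expression is only evaluated if the condition is false;
    # its value becomes the argument of the AssertionError.
    assert len(values) > 0, "average() got an empty sequence: %r" % (values,)
    return sum(values) / len(values)


try:
    average([])
except AssertionError as e:
    message = str(e)
    print(message)
```

The message carries whatever state you thought to include when you wrote it, which is exactly the limitation: anything you did not anticipate is still missing.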
(Writing these entries is educational, as it forces me to actually do
careful research so that I don't write something truly stupid instead of
just relying on my memories. In other words, I didn't know that assert
could also be given a message to assert with until just now.)
I consider this especially annoying in assert's case because it is a
part of the language. As a language builtin, it could break the normal
rules constraining functions in order to be more useful and do clever
things like print information about the variables involved in the
failing expression.
It's possible to do your own version of assert, or to use a traceback
hook in order to make it smarter; possible things to do include dumping
local variables and entering the debugger. So far I haven't tried to
build anything like this myself, although the automatic variable dumping
code would make an interesting exercise in playing around with deep
Python introspection.
2006-10-06
A Python quoting irritation
I was writing code today where I needed to turn a single backslash ('\') into two backslashes ('\\') in order to quote it (so that the shell wouldn't eat it, through a combination of annoyances). My first attempt was:
s = s.replace(r"\", r"\\")
To my surprise, this gave me an error, and a peculiar one: Python
reported 'SyntaxError: invalid token' (at the closing ')'). It took a
bunch of head-scratching to figure out that Python was blowing up on the
first string. I tried it with triple-quoted strings and that didn't work
either (but with a different error).
Ultimately, this is because Python uses backslash in strings for two
separate jobs: quoting end of string characters, and introducing special
characters like \n. The former is active all the time; it is only the
latter that is turned off by the r string modifier.
Where things get really confusing is the handling of '\\' in r
strings. The double-backslash stops the second backslash from escaping
anything it normally would have, but is not turned into '\' in the
parsed string; it remains intact as '\\'.
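A quick way to see this behaviour is to look at string lengths. The following is just an illustrative check (written in Python 3 syntax; the variable names are arbitrary):

```python
# A raw string keeps every backslash, including ones that "escape" something.
two = r"\\"        # two characters: backslash, backslash
quoted = r"\""     # two characters: backslash, double quote
newline = r"\n"    # two characters: backslash, letter n

print(len(two), len(quoted), len(newline))

# The non-raw spellings, for comparison, are one character each.
print(len("\\"), len("\""), len("\n"))
```

Note that in the r"\"" case the backslash does stop the quote from ending the string, but it also stays in the result.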
So the ultimate answer I wound up with is:
s = s.replace("\\", r"\\")
You can write the second string as "\\\\", but that sort of thing
annoys the heck out of me so I wound up leaving it as an r string.
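To convince yourself that the final version does what is wanted, and that the two spellings of the replacement string are the same thing, a small check like this works (Python 3 syntax; double_backslashes and sample are names made up for the example):

```python
def double_backslashes(s):
    # "\\" is a single backslash; r"\\" is two backslashes.
    return s.replace("\\", r"\\")


sample = "a\\b"    # three characters: a, backslash, b
result = double_backslashes(sample)
print(result)
print(r"\\" == "\\\\")
```

Here result comes out as four characters, with the one backslash in sample doubled.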
(This is all actually documented in the string literal section of the reference manual. If it is covered in the tutorial, I read it sufficiently long ago to have forgotten it.)
All of this (coupled with wrestling with shell quoting, which is how I wound up dealing with this) has reminded me of how much I hate complicated quoting schemes and the need for quoting in general.