2008-10-31
Banging rocks together in Python
One of the things I continue to like Python for is what I call 'banging rocks together': quickly programming relatively small but non-trivial things, on the order of a few hundreds lines and anywhere from an afternoon to a few days of time. A representative example would be the basic UDP request/response bandwidth tester that I recently wrote; it came to a bit under three hundred lines of Python with some comments, and took me perhaps a day or two of time to write, tune, make more complex but more useful, and polish a bit.
I find that Python has a number of advantages for this:
- it has some pretty big rocks, so you can get pretty far without too
much work.
- it runs surprisingly fast enough; for example, my UDP ping-pong
tester could saturate gigabit Ethernet without very much tuning.
This too means that you can get pretty far without too much work.
- it has enough power that you can scale your program up to more sophistication if you need it. The UDP ping-pong tester started out doing very simple things, but they turned out not to be at all representative of how UDP NFS behaved on problematic networks; I was able to make it more complicated without particular much work.
(My end result turned out to still not be representative of what happens to UDP NFS performance, but at least it stopped giving hopelessly optimistic answers and melting networks down in the process.)
This is not unique to Python, of course; I'm sure that Perl would be as good, as would a number of other modern 'scripting' languages. What matters is a certain expressive power coupled with large rocks, and Perl certainly has a very good collection of large rocks.
2008-10-06
A problem with Python's help()
I mentioned in passing that when I am
poking around an unfamiliar object (well, class), I usually reach first
for dir() and only somewhat later try help(). There's several reasons
for this, but one of them is that help() is overly verbose in a useless
way.
The problem is that many objects have a lot of completely standard
methods like __repr__, generally with absolutely no useful
documentation text, and help() will gleefully tell you all about them
(for an example of this, apply help() to itself). The more advanced an
object is, the more such standard methods it will have (to implement all
the protocols that make it a sequence object or whatever) and the more
noise that help() produces.
(And it is a lot of noise; each such member produces two or three lines
of output. This can add up very fast, especially if you use help() on
a module that has a bunch of classes with a bunch of standard methods.)
There's only two interesting things about standard methods; first, what
'protocols' the object supports (is it a sequence object, a callable
one, an iterable one, does it have a string value, etc), and second,
does it do anything unusual for any of them. The first is directly
obvious, and we can guess at the second by seeing if any of the standard
methods have non-standard docstrings. Thus, help() would be more
helpful if it did not report such generic standard methods individually,
but instead condensed all of them into a single line of information
about what the object supports.
(Note that writing better docstrings can't help the situation; at most it can drown the noise about standard methods in verbose signal.)
2008-10-04
Consider having obvious interfaces too
Python offers any number of ways to create very Pythonic
interfaces to your objects, natural ways of getting various
bits of information out of them. For an example that is fresh
to my mind, the len() of M2Crypto's Cipher
objects is the number of bits that the cipher uses.
The M2Crypto example also illustrates the problem with this way of
creating interfaces: it is not necessarily very intuitive to people
using your code. The len() of a cipher being its bits strikes me
as Pythonic (and it certainly is the closest to the 'length' of a
cipher object), but it is not obvious. Short of reading documentation
(or in this case, reading the source), there is nothing that points
to len(Cipher-object) being the way to get this particular piece of
information out of Cipher objects.
So I would like to suggest that you consider the merits of giving your
objects obvious interfaces too. Adding a .bits() method may not be as
elegant as len(), but it has the virtue that it is more likely to be
noticed by people in the output of dir() and other introspection tools.
(I don't know about other people, but I use introspection in the
interactive interpreter a lot when I am trying to figure out how to bang
together some code. I also tend to skip entirely over the __X__
entries when reading dir()'s output, partly because all objects have
a certain amount of noise there.)
Having said this, I will admit that in retrospect I have committed my
share of clever, Pythonic, and not all that obvious interfaces in my own
code. I'll have to remember this the next time I am tempted to add a
__len__ method to something.
(I don't think that docstrings are entirely the answer. Again, perhaps
it's just me but I'm more likely to look first at dir()'s output and
only second at docstrings. And of course, dir() is always there while
docstrings need someone to write them, which doesn't always happen.)