Wandering Thoughts archives

2008-05-26

Shimming modules for testing (and fun)

Suppose that you have a chunk of Python code that is supposed to properly map IP addresses to hostnames, and you want to test it (ideally with unit tests) to make sure that it actually works. To do this you need to make various IP address and hostname lookups fail in various ways, and to do so on command.

The easy way to do this is to exploit Python's freedom by shimming (well, replacing) the gethostbyname() and gethostbyaddr() functions in the socket module with completely fake versions, such as simple functions that just consult an internal table for the results they should return for each lookup. You then run your IP to hostname mapping code against these known fake IP addresses and check that it returns the correct results (which you already know for each IP address, since you specified them).

(Make sure that your test framework saves the original functions and puts them back into place after the test finishes; otherwise things may get very confused.)
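Here is a minimal sketch of what such a shim might look like with the standard unittest module. The lookup tables, the ip_to_hostname() function (a stand-in for the real code under test), and the test class name are all made up for illustration; the addresses come from the documentation-only 192.0.2.0/24 range.

```python
import socket
import unittest

# Hypothetical lookup tables for the fake resolver functions.
FAKE_FORWARD = {
    "good.example.com": "192.0.2.10",
    "liar.example.com": "192.0.2.21",
}
FAKE_REVERSE = {
    "192.0.2.10": ("good.example.com", [], ["192.0.2.10"]),
    "192.0.2.20": ("liar.example.com", [], ["192.0.2.20"]),
}

def fake_gethostbyname(name):
    # Fail the way the real function does for unknown names.
    if name not in FAKE_FORWARD:
        raise socket.gaierror(socket.EAI_NONAME, "fake: name not known")
    return FAKE_FORWARD[name]

def fake_gethostbyaddr(addr):
    # Returns the usual (hostname, aliases, addresses) triple.
    if addr not in FAKE_REVERSE:
        raise socket.herror(1, "fake: unknown host")
    return FAKE_REVERSE[addr]

def ip_to_hostname(ip):
    """Stand-in for the code under test: a verified hostname or None."""
    try:
        name = socket.gethostbyaddr(ip)[0]
        if socket.gethostbyname(name) == ip:
            return name
    except (socket.herror, socket.gaierror):
        pass
    return None

class ShimmedLookupTest(unittest.TestCase):
    def setUp(self):
        # Save the real functions so tearDown can put them back.
        self._orig = (socket.gethostbyname, socket.gethostbyaddr)
        socket.gethostbyname = fake_gethostbyname
        socket.gethostbyaddr = fake_gethostbyaddr

    def tearDown(self):
        socket.gethostbyname, socket.gethostbyaddr = self._orig

    def test_good_host(self):
        self.assertEqual(ip_to_hostname("192.0.2.10"), "good.example.com")

    def test_no_reverse_entry(self):
        self.assertIsNone(ip_to_hostname("192.0.2.99"))

    def test_forward_mismatch(self):
        self.assertIsNone(ip_to_hostname("192.0.2.20"))
```

Saved in a file, this can be run with 'python -m unittest'. Doing the restore in tearDown() means the real functions come back even when a test fails.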

Shimming ordinary module functions is usually a relatively simple thing (another useful module to shim is the time module, if you are testing time-dependent things). With more work you can shim entire classes, such as socket.socket, so that code that creates its own sockets and does things to them can be tested under completely controlled conditions.
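As an illustration of shimming the time module, here is a small sketch that replaces time.time() with a controllable fake clock. The FakeClock class and the is_expired() function are hypothetical, invented for this example.

```python
import time

class FakeClock:
    """A hypothetical fake clock that only moves when told to."""
    def __init__(self, now):
        self.now = now

    def time(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds

def is_expired(created_at, max_age):
    # Stand-in for time-dependent code under test. It looks up
    # time.time through the module each call, so the shim is seen.
    return time.time() - created_at > max_age

clock = FakeClock(1000.0)
orig_time = time.time
time.time = clock.time
try:
    fresh = is_expired(1000.0, 60)   # no fake time has passed yet
    clock.advance(61)
    stale = is_expired(1000.0, 60)   # 61 fake seconds later
finally:
    # Always put the real function back.
    time.time = orig_time
```

Note that this only works for code that calls time.time() through the module. Code that did 'from time import time' has its own reference to the original function, which replacing the module attribute does not touch.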

(Watch out, though; it's easy for your shims to get overly complex and clever. I was probably there by the end of my unit testing fun. Also, remember to document what all of this clever testing code does, or you may have more excitement than you want in a year or so.)

Disclaimer: this is unlikely to be the officially TDD-approved way of unit testing this sort of stuff. But it has the two great virtues of not contorting your actual code and being relatively simple.

ShimmingModulesForTests written at 23:25:37

2008-05-01

What the co_names attribute on Python code objects is

As a trap for the unwary, Python code objects have both a co_names and a co_varnames attribute. Since I just confused myself about which was which the other day, here is what the co_names one is.

Put simply, co_names is a tuple of names of globals and attributes that are used by the function's code. For example, if you have 'a = self.bar()' in the function, the 'bar' will show up in co_names, as will the 'foo' from 'a = foo()'.
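To see the difference between the two attributes, here is a small sketch; the Thing class and its method names are made up for illustration, and the code object is reached through __code__ (which assumes Python 2.6 or later).

```python
def foo():
    return 1

class Thing:
    def bar(self):
        return 2

    def method(self):
        a = self.bar()
        b = foo()
        return a + b

code = Thing.method.__code__

# The attribute 'bar' and the global 'foo' are in co_names.
print(code.co_names)

# The argument 'self' and the locals 'a' and 'b' are in co_varnames.
print(code.co_varnames)
```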

(Perhaps I should call these 'identifiers' instead of 'names'. In Ruby and Lisp and probably elsewhere these are called symbols.)

Ultimately this is part of how the CPython bytecode interpreter is implemented. When the bytecode interpreter refers to a global or an attribute (as opposed to a local variable), it has to do a lookup by name to get the actual object involved. Rather than put the name that's being looked up directly in the bytecode instructions, CPython puts all the names into a table and has the instructions refer to table slots, so the LOAD_GLOBAL instruction says 'look up name 3' instead of 'look up "somevar"'. And co_names is that table (or at least a representation of that table).
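You can watch this happen with the dis module. This is a rough sketch that assumes Python 3.4 or later for dis.get_instructions(); the lookup() function and its global names are invented, and it is never called, so the globals don't need to exist. Some newer CPython versions pack extra flag bits into the raw instruction argument, so the sketch relies on dis to decode the name (argval) instead of indexing co_names by hand.

```python
import dis

def lookup():
    return somevar + othervar

# Instructions whose opcode is in dis.hasname carry a reference
# into co_names; dis resolves that reference for us as argval.
for instr in dis.get_instructions(lookup):
    if instr.opcode in dis.hasname:
        print(instr.opname, instr.arg, instr.argval)

named = [
    instr.argval
    for instr in dis.get_instructions(lookup)
    if instr.opcode in dis.hasname
]
print(lookup.__code__.co_names)
```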

Each name only appears once in co_names, no matter how many times it's used in your function and no matter if it's used in different contexts; if you have both 'obj.foo' and 'foo()' in your code, there will only be one "foo" in co_names, even though one use of the name is for an object attribute and one is for a global. As far as I can tell from reading CPython source, names are always interned strings and so are globally unique.
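A quick way to check the 'only once' behaviour, using a made-up function that uses the same name both as an attribute and as a global:

```python
def uses_foo(obj):
    x = obj.foo      # 'foo' as an attribute
    y = foo()        # 'foo' as a global
    z = foo()        # and again
    return x, y, z

names = uses_foo.__code__.co_names
print(names)
print(names.count("foo"))
```

The function is only compiled here, never called, so it doesn't matter that no global foo exists.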

(The co_names table slot numbers are of course a per-function thing; slot 0 in different functions will refer to completely different names.)

WhatCoNamesIs written at 22:48:26

