Wandering Thoughts archives

2011-09-30

Unit testing by analogy to scientific hypotheses

In the popular and currently dominant view of what makes something a proper scientific hypothesis, an important criterion is falsifiability. To simplify a great deal, you test a scientific hypothesis not just by looking for what it says should be there but also by looking for what it says should not be there. If the hypothesis is 'all swans are white' you don't just look for white swans, you also look for ones that are not white.

Let us consider a theoretical function that returns True if a number is a prime (and False if it is not). We need to write a test for this function, so we fire up an editor:

def testPrimeness():
  for i in 2, 3, 5, 7, 883:
    mustBeTrue(isprime(i))

We're done, right? (Ignoring that this is only a short list of primes.)

No, not at all. What we've done is the testing equivalent of only looking for white swans. We need to also see if there are any black swans around by testing to see if the function returns False for numbers that are not prime.
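As a minimal sketch of what looking for black swans might look like here: the isprime() below is a stand-in written for this example (simple trial division), the bodies of mustBeTrue() and mustBeFalse() are guesses at what such helpers would do, and brokenIsprime() is a hypothetical bad implementation. The tests take the function under test as a parameter only so that the broken version can be run through them too.

```python
def isprime(n):
    """Stand-in implementation (an assumption): plain trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def brokenIsprime(n):
    """Hypothetical broken version: claims that everything is prime."""
    return True


def mustBeTrue(value):
    if value is not True:
        raise AssertionError("expected True, got %r" % (value,))


def mustBeFalse(value):
    if value is not False:
        raise AssertionError("expected False, got %r" % (value,))


def testPrimeness(fn=isprime):
    # The white swans: numbers that are prime.
    for i in 2, 3, 5, 7, 883:
        mustBeTrue(fn(i))


def testNonPrimeness(fn=isprime):
    # The black swans: numbers that are not prime, odd as well as even.
    for i in 4, 6, 9, 15, 885:
        mustBeFalse(fn(i))
```

brokenIsprime() passes testPrimeness() without complaint; only testNonPrimeness() raises an AssertionError for it.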

Another way to look at this is that we are implicitly testing the wrong hypothesis. The hypothesis that this test checks is that isprime() returns True for prime numbers, but this is not the correct hypothesis; the actual specification is that it returns True for prime numbers and only for prime numbers. Although it's not literally the case, we have essentially formed a non-falsifiable hypothesis without noticing and are cheerfully testing it.

It's my gut feeling that this is a relatively easy testing mistake to fall into. It's human nature (or at least our cognitive biases) to look for confirmation of what we think is the case, so we verify that isprime() returns True for primes and forget the other half of the specification.

There's a variant of this hypothesis falsification approach for test planning. One way to form tests is to imagine a whole series of hypotheses about how the function might work incorrectly and then attempt to falsify each one of them with a test. For example, I have two such falsification checks in the list of test primes (2 and 883), and a test series for mustBeFalse(isprime(n)) would likely throw in testing odd numbers as well as even ones.

(Checking the proper handling of corner cases is one common instance of this.)
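One way to sketch this style of test planning is as a table where each row names an imagined bug and a number chosen to expose it. Everything here is illustrative: the isprime() is a stand-in trial division implementation, the list of imagined bugs is mine, and treating 0, 1, and negative numbers as not prime is an assumption about the specification (it is the usual mathematical convention, but the entry doesn't say). What 2 and 883 are labelled as falsifying is also only one reading of them.

```python
def isprime(n):
    """Stand-in implementation (an assumption): plain trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


# Each entry: (imagined bug, number that would expose it, correct answer).
FALSIFICATION_CASES = [
    ("treats every odd number as prime", 9, False),
    ("treats every even number as composite", 2, True),
    ("stops trial division before reaching sqrt(n)", 841, False),  # 29 * 29
    ("only recognises small primes", 883, True),
    ("thinks 1 is prime", 1, False),
    ("thinks 0 is prime", 0, False),
    ("mishandles negative numbers", -7, False),
]


def exhibitedBugs(fn):
    """Return the imagined bugs that fn was caught exhibiting."""
    return [bug for bug, n, expected in FALSIFICATION_CASES
            if fn(n) is not expected]


def testFalsifications(fn=isprime):
    bugs = exhibitedBugs(fn)
    if bugs:
        raise AssertionError("not falsified: " + "; ".join(bugs))
```

Running exhibitedBugs() on a hypothetical 'odd means prime' function such as lambda n: n % 2 == 1 reports several of these bugs at once, which is the point: each row is a separate attempt to catch the function doing something wrong.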

This is of course closely related to testing your error paths, and I've probably written about bits of it in passing in other entries that I can't find right now.

Update: corrected an embarrassing error in my test. You can read about it in the comments.

FalsifiableUnitTests written at 00:12:58

2011-09-20

Why I still comment out code even with a VCS

One of the bits of VCS doctrine that I've seen is that once you have a VCS you shouldn't comment out code (or #ifdef it out or the equivalent); instead you should just delete it because you can always get it back from the VCS.

Well, I have a VCS or two and I have to say that I continue to comment out my code for the most pragmatic of reasons: I find it an easier way to work. In theory the whole idea of using the VCS for this is great, but in practice, well, it's got some drawbacks. First off, at least in my straightforward environment I haven't seen a really convenient way to revert parts of changes; it's not as if I can open up an older version of the file in another editor buffer to cut and paste back and forth. Plus there's the problem of finding which old version of the file has the bit of code that I want to look at or get back. If I keep code commented out or disabled but still in the current version, it's simply much more accessible than if it were marooned back in the version history.

(This is where someone tells me that GNU Emacs can do this with a suitably superintelligent mode.)

Another thing I like to do is to take two different versions of the code and flip back and forth between which one of them I'm running. If I do this with comments, I can flip versions without leaving my editor (sometimes I can do it with plain undo). This is important, since I have my editor open on the files all the time when I'm working (whatever my editor of the moment is); the last thing I want is to have to close down file buffers and then reopen them every time I want to do this sort of back and forth shuffle. Even if I'm not actively testing different versions, not infrequently I'm considering which version I like better and so I really want to see them side by side (well, one after the other).

Having said all that, I do follow the VCS doctrine to the extent that I try to delete old commented-out code once I've reached a definitive stopping point and I'm confident that the old code really is dead and is never going to come back. But I consider this more or less like cleaning up any other sort of comments in the code (which is something I try to do every so often, especially at the end of a burst of development).

CommentingOutCode written at 01:22:08



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.