Unit testing by analogy to scientific hypotheses
In the popular and currently dominant view of what makes something a proper scientific hypothesis, an important criterion is falsifiability. To simplify a great deal, you test a scientific hypothesis not just by looking for what it says should be there but also by looking for what it says should not be there. If the hypothesis is 'all swans are white' you don't just look for white swans, you also look for ones that are not white.
Let us consider a theoretical function that returns True if a number is a prime (and False if it is not). We need to write a test for this function, so we fire up an editor:
    def testPrimeness():
        for i in 2, 3, 5, 7, 883:
            mustBeTrue(isprime(i))
We're done, right? (Ignoring that this is only a short list of primes.)
No, not at all. What we've done is the testing equivalent of only looking for white swans. We need to also see if there are any black swans around by testing to see if the function returns False for numbers that are not prime.
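As a rough sketch of what the missing half might look like: the isprime() here is only a stand-in trial-division version so that the example runs, mustBeTrue() and mustBeFalse() are assumed to be simple assertion helpers, and the optional check argument and the always_true() function are hypothetical additions for illustration.

```python
def isprime(n):
    """Stand-in implementation (an assumption): plain trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def mustBeTrue(value):
    # Assumed helper: fail unless the result is exactly True.
    assert value is True, "expected True, got %r" % (value,)


def mustBeFalse(value):
    # Assumed helper: fail unless the result is exactly False.
    assert value is False, "expected False, got %r" % (value,)


def testPrimeness(check=isprime):
    # The white swans: known primes must be reported as prime.
    for i in 2, 3, 5, 7, 883:
        mustBeTrue(check(i))


def testNonPrimeness(check=isprime):
    # The black swans: known composites must be reported as not prime.
    for i in 4, 6, 9, 15, 885:
        mustBeFalse(check(i))


def always_true(n):
    # A deliberately useless "primality test" that says yes to everything.
    return True
```

Calling testPrimeness(always_true) passes without complaint, which is the whole problem; only testNonPrimeness(always_true) raises an AssertionError.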
Another way to look at this is that we are implicitly testing the wrong
hypothesis. The hypothesis that this test checks is that isprime()
returns True for prime numbers, but this is not the correct hypothesis;
the actual specification is that it returns True only for prime
numbers. Although it's not literally the case, we have essentially
formed a nonfalsifiable hypothesis without noticing and are cheerfully
testing it.
It's my gut feeling that this is a relatively easy testing mistake to
fall into. It's human nature (or at least our cognitive biases) to
look for confirmation of what we think is the case, so we verify that
isprime() returns True for primes and forget the other half of the
specification.
There's a variant of this hypothesis falsification approach for test
planning. One way to form tests is to imagine a whole series of
hypotheses about how the function might work incorrectly and then
attempt to falsify each one of them with a test. For example, I have
two such falsification checks in the list of test primes (2 and 883),
and a test series for mustBeFalse(isprime(n)) would likely throw in
testing odd numbers as well as even ones.
(Checking the proper handling of corner cases is one common instance of this.)
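One possible way to write this planning approach down is as a table that pairs each imagined bug with an input that would expose it. Everything in this sketch is illustrative: the isprime() is again a stand-in, the buggy variants and the failed_checks() helper are hypothetical, and the bugs listed are only my guesses at what is worth guarding against.

```python
def isprime(n):
    """Stand-in implementation (an assumption): plain trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# Hypothetical buggy versions, one per imagined way of getting it wrong.
def odd_means_prime(n):
    return n == 2 or n % 2 == 1


def even_never_prime(n):
    return n % 2 == 1 and isprime(n)


def small_divisors_only(n):
    return n >= 2 and all(n % d for d in (2, 3, 5, 7) if d != n)


def one_is_prime(n):
    return n == 1 or isprime(n)


# (hypothesis about a bug, input that would expose it, correct answer)
FALSIFIERS = [
    ("all odd numbers are reported as prime", 9, False),
    ("2 is rejected because it is even", 2, True),
    ("only small divisors are tried", 899, False),  # 899 = 29 * 31
    ("large primes are rejected", 883, True),
    ("1 is reported as prime", 1, False),
    ("0 is reported as prime", 0, False),
]


def failed_checks(check):
    """Return the bug hypotheses that this implementation fails to refute."""
    return [why for why, n, expected in FALSIFIERS if check(n) is not expected]
```

The stand-in isprime() comes back with an empty list, while each buggy variant is caught by at least the check that was written with it in mind.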
This is of course closely related to testing your error paths, and I've probably written about bits of it in passing in other entries that I can't find right now.
Update: corrected an embarrassing error in my test. You can read about it in the comments.