My thoughts on the mockist versus classicalist testing approaches

December 31, 2011

To summarize aggressively, one of the quiet long-running disputes in the OO test-driven development community is between classical TDD, where you use real or stub support classes, and mockist TDD, where you use behavior-based mock objects. My guide on this is primarily Martin Fowler (via Jake Goulding). Jake Goulding summarizes the difference this way: stubs assert state, while mocks assert messages (or method calls). I'm mostly a classicalist as far as testing goes, for no greater reason than that I generally find it easier, but reading Martin Fowler's article started me rethinking my passive attitude on this. On the whole I think I'm going to remain a classicalist, but I want to run down why.

One way to look at the divide is to look at what is actually being tested. When you use stubs (or real objects), what you are really testing is the end result of invoking the code under test. When you use mocks, you're testing the path that the code under test took to get to its end result. So the real question is whether or not the path the code under test uses to derive its result is actually important.
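To make the distinction concrete, here is a minimal sketch in Python using the standard unittest.mock module. The names order_total, PriceStub and price_of are made up for illustration and do not come from any of the articles mentioned above.

```python
from unittest.mock import Mock, call

PRICES = {"apple": 2, "pear": 3}

class PriceStub:
    """Hypothetical hand-written stub: canned answers, records nothing."""
    def price_of(self, item):
        return PRICES[item]

def order_total(items, prices):
    """Hypothetical code under test: sums the price of each item."""
    return sum(prices.price_of(item) for item in items)

# Classical (state) style: only the end result is checked.
state_result = order_total(["apple", "pear"], PriceStub())
assert state_result == 5

# Mockist (behavior) style: the calls made along the way are checked too.
prices = Mock()
prices.price_of.side_effect = lambda item: PRICES[item]
mock_result = order_total(["apple", "pear"], prices)
assert mock_result == 5
prices.price_of.assert_has_calls([call("apple"), call("pear")])
assert prices.price_of.call_count == 2
```

The first test would still pass if order_total got its prices some other way; the second would not.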

Phrased this way, the answer is clearly 'sometimes'. The most obvious case is situations where the calls to other objects create real side effects; for example, exactly how you debit and credit accounts matters significantly, not just that every account winds up with the right totals in the end. This means that sometimes you should use mocks. If you're testing with stubs and the path is important, you're not actually testing the full specification of your code; the specification is not just 'it gets proper results', the specification is 'it gets proper results in the following way'.
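As a rough sketch of the accounts case, suppose the specification says a transfer must be exactly one debit followed by one credit. The transfer function and the ledger's debit and credit methods are hypothetical names chosen for this example.

```python
from unittest.mock import Mock, call

def transfer(ledger, src, dst, amount):
    """Hypothetical code under test: one debit, then one credit."""
    ledger.debit(src, amount)
    ledger.credit(dst, amount)

ledger = Mock()
transfer(ledger, "checking", "savings", 50)

# mock_calls records every call on the mock in order, so this checks
# that the debit came first, the credit second, and nothing else happened.
expected = [call.debit("checking", 50), call.credit("savings", 50)]
assert ledger.mock_calls == expected
```

A test that only compared final balances could not tell this apart from, say, a version that credited first or that split the debit into two smaller ones.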

At the same time I feel that you should not test with mocks unless the specific behavior actually is part of the specification of the thing under test. Otherwise what you are actually testing is not just that the code works but also that it has a specific implementation. I strongly dislike testing specific implementations unless necessary because I've found it to be a great way to unnecessarily break tests when you later change the implementation.
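Here is a small illustration of that breakage, again as a sketch with invented names (Prices, total_v1, total_v2). Both versions of the function return the same answer, but only the first one satisfies a test that insists on a particular call pattern.

```python
from unittest.mock import Mock

class Prices:
    """Hypothetical collaborator with two equivalent lookup methods."""
    _table = {"apple": 2, "pear": 3}

    def price_of(self, item):
        return self._table[item]

    def prices_of(self, items):
        return [self._table[i] for i in items]

def total_v1(items, prices):
    # Original implementation: one lookup per item.
    return sum(prices.price_of(i) for i in items)

def total_v2(items, prices):
    # Later implementation: one bulk lookup, same result.
    return sum(prices.prices_of(items))

def state_test(total):
    """Checks only the end result."""
    return total(["apple", "pear"], Prices()) == 5

def interaction_test(total):
    """Checks that price_of was called once per item."""
    prices = Mock(wraps=Prices())  # forwards calls to a real Prices
    total(["apple", "pear"], prices)
    return prices.price_of.call_count == 2

# Each entry is (state test passed, interaction test passed).
results = {
    "v1": (state_test(total_v1), interaction_test(total_v1)),
    "v2": (state_test(total_v2), interaction_test(total_v2)),
}
assert results == {"v1": (True, True), "v2": (True, False)}
```

Nothing is wrong with total_v2; the interaction test fails only because it encoded how total_v1 happened to work.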

This also ties into what sort of interfaces and interactions your objects have with each other; there's a whole spectrum from closely coupled objects through loosely coupled objects to deliberately isolated objects. Where you have deliberately isolated objects, objects used to create hard and limited interface boundaries, I think you should almost always test the behavior as well as the actual outcome for things that call them (because you want to make sure that the interface is being respected and used in the right way). Conversely, closely coupled objects (where you are only using multiple sorts of objects because it fits the problem best) are a case where I'd almost never test behavior because the split into different objects is essentially an implementation artifact.

(Possibly some or all of this is obvious to experienced practitioners. One of my weaknesses as a programmer is that I learned programming before both OO and the testing boom, and I have never really caught up with either.)

Comments on this page:

From at 2011-12-31 10:07:12:

Don’t forget the fact that there is a chance for mocks to get their mimicry wrong, either outright (i.e. behaving differently from how a real object would) or by omission (which is often the case when you mock out external dependencies like the network). So there is less confidence in the “proper” part of “it gets proper results in the following way” – a little or a lot, depending on how much complexity the mock is trying to stand in for.

What is actually going on then is that mocks test whether you implemented the logic correctly, whereas real dependencies test whether the logic is correct.

There is a parallel here to the arguments made about very powerful static type systems like Haskell’s: if the program compiles, it is essentially certain to implement its logic correctly. But the compiler cannot know whether this correctly implemented logic actually solves the right problem.

(Thank you for jogging my thoughts on this.)

Aristotle Pagaltzis

By cks at 2012-01-02 17:20:44:

"What is actually going on then is that mocks test whether you implemented the logic correctly, whereas real dependencies test whether the logic is correct."

I don't think that this is necessarily the case. Using real dependencies and testing the end result merely tests whether some logic path is correct and doesn't necessarily test whether the logic path is the right one; the difference matters where there are multiple logic paths that get the same end result. For instance, it won't necessarily tell you whether you made all changes to the database in a single transaction or in multiple transactions.

Possibly the answer is some sort of 'flight recorder' object interposed between the code under test and the real dependencies; this lets you validate all parts of what you care about. The real dependencies cover if the logic is correct; the flight recorder covers whether you're using them correctly.
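One way to approximate such a flight recorder in Python is Mock(wraps=...) from the standard unittest.mock module, which forwards each call to a real object while recording it. This is only a sketch: InMemoryStore and save_pair are invented for the example, with InMemoryStore playing the role of the real dependency.

```python
from unittest.mock import Mock, call

class InMemoryStore:
    """Hypothetical stand-in for the real dependency (e.g. a database)."""
    def __init__(self):
        self.rows = {}

    def begin(self):
        pass

    def set(self, key, value):
        self.rows[key] = value

    def commit(self):
        pass

def save_pair(db, a, b):
    """Hypothetical code under test: both writes share one transaction."""
    db.begin()
    db.set("a", a)
    db.set("b", b)
    db.commit()

real = InMemoryStore()
recorder = Mock(wraps=real)  # passes every call through to `real`
save_pair(recorder, 1, 2)

# The real dependency tells us the end result is right.
assert real.rows == {"a": 1, "b": 2}

# The recorder tells us the path: one begin/commit pair around both writes.
expected_path = [call.begin(), call.set("a", 1), call.set("b", 2), call.commit()]
assert recorder.mock_calls == expected_path
```

A version of save_pair that committed after each write would leave the same rows behind but would fail the second assertion.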

Written on 31 December 2011.

Last modified: Sat Dec 31 01:04:56 2011