Wandering Thoughts archives

2011-12-31

My thoughts on the mockist versus classicalist testing approaches

To summarize aggressively, one of the quiet long-running disputes in the OO test driven community is between classical TDD, where you use real or stub support classes, and mockist TDD, where you use behavior-based mock objects. My guide on this is primarily Martin Fowler (via Jake Goulding). Jake Goulding summarizes the difference as stubs assert state while mocks assert messages (or method calls). I'm mostly a classicalist as far as testing goes for no greater reason than I generally find it easier, but reading Martin Fowler's article started me rethinking my passive attitude on this. On the whole I think I'm going to remain a classicalist, but I want to run down why.

One way to look at the divide is to look at what is actually being tested. When you use stubs (or real objects), what you are really testing is the end result of invoking the code under test. When you use mocks, you're testing the path that code under test used to get to its end result. So the real question is whether or not the path the code under test uses to derive its result is actually important.
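(To make this concrete, here is a rough sketch in Python of what the two styles look like, using unittest.mock. The Order and Warehouse classes are made up purely for illustration; they're not from Fowler's article.)

    import unittest
    from unittest import mock

    class Warehouse:
        """A trivial real/stub collaborator that just tracks stock levels."""
        def __init__(self, stock):
            self.stock = stock

        def remove(self, item, count):
            self.stock[item] -= count

    class Order:
        """The code under test: filling an order removes stock from a warehouse."""
        def __init__(self, item, count):
            self.item, self.count = item, count

        def fill(self, warehouse):
            warehouse.remove(self.item, self.count)

    class ClassicalStyle(unittest.TestCase):
        def test_fill_updates_stock(self):
            # Classical/stub style: run the code, then assert on the resulting state.
            wh = Warehouse({"widget": 10})
            Order("widget", 3).fill(wh)
            self.assertEqual(wh.stock["widget"], 7)

    class MockistStyle(unittest.TestCase):
        def test_fill_sends_remove(self):
            # Mockist style: assert on the message (method call) that was sent.
            wh = mock.Mock()
            Order("widget", 3).fill(wh)
            wh.remove.assert_called_once_with("widget", 3)

    if __name__ == "__main__":
        unittest.main()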

Phrased this way, the answer is clearly 'sometimes'. The most obvious case is situations where the calls to other objects create real side effects; for example, exactly how you debit and credit accounts matters significantly, not just that every account winds up with the right totals in the end. This means that sometimes you should use mocks. If you're testing with stubs and the path is important, you're not actually testing the full specification of your code; the specification is not just 'it gets proper results', the specification is 'it gets proper results in the following way'.
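(For example, with a mock you can pin down not just the final balances but the exact sequence of debit and credit calls. The transfer() function and ledger object below are hypothetical, just to show the shape of such a test.)

    from unittest import mock

    def transfer(ledger, src, dst, amount):
        # Hypothetical code under test; the specification says the debit
        # must happen before the credit.
        ledger.debit(src, amount)
        ledger.credit(dst, amount)

    def test_transfer_debits_before_crediting():
        ledger = mock.Mock()
        transfer(ledger, "alice", "bob", 100)
        # assert_has_calls checks that these calls happened in this order;
        # a stub-based test could only look at the final balances.
        ledger.assert_has_calls([
            mock.call.debit("alice", 100),
            mock.call.credit("bob", 100),
        ])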

At the same time I feel that you should not test with mocks unless the specific behavior actually is part of the specification of the thing under test. Otherwise what you are actually testing is not just that the code works but also that it has a specific implementation. I strongly dislike testing specific implementations unless necessary because I've found it to be a great way to unnecessarily break tests when you later change the implementation.

This also ties into what sort of interfaces and interactions your objects have with each other; there's a whole spectrum of closely coupled objects to loosely coupled objects to deliberately isolated objects. Where you have deliberately isolated objects, objects used to create hard and limited interface boundaries, I think you should almost always test the behavior as well as the actual outcome for things that call them (because you want to make sure that the interface is being respected and used in the right way). Conversely, closely coupled objects (where you are only using multiple sorts of objects because it fits the problem best) are a case where I'd almost never test behavior because the split into different objects is essentially an implementation artifact.

(Possibly some or all of this is obvious to experienced practitioners. One of my weaknesses as a programmer is that I learned programming before both OO and the testing boom, and I have never really caught up with either.)

MockistVsClassicalist written at 01:04:56

2011-12-16

Avoiding the classic C quoting bug in your language

To summarize an earlier entry, the classic C quoting bug that I'm talking about here is writing 'printf(ustr)' or 'syslog(pri, ustr)' where 'ustr' is a string that comes from some form of external input. In that entry I mentioned in passing that it's possible for languages to avoid this entire class of bugs with the right design; it's time to elaborate on that aside.

To start with, let's turn the issue around: what is it about these functions that causes the bug? The answer is that all of these functions do two things: they do something useful and, in the process of doing it, they format their arguments for you. The bug happens when you (the programmer) just want to do the useful thing and overlook the fact that you're also getting the formatting for free. The conclusion is straightforward. To make this bug impossible, we need to make functions like this do only one thing; they should take a plain string and not format it. But people still need string formatting, so to make these single-purpose functions both feasible and convenient we need some easy, generic way to format strings, one that involves as close to no extra work as possible. In short, we need effortless generic string formatting.

(The need for no extra work is why we can't do this in C. We need a language where you don't need to explicitly manage string lifetimes, since the result of string formatting is another string.)
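(Here's a sketch of the difference in Python; syslog_like() and log_line() are invented names, standing in for any function with each design.)

    # The bug-prone two-in-one design: the function both formats and logs.
    def syslog_like(pri, fmt, *args):
        message = fmt % args        # formatting happens whether you wanted it or not
        print(pri, message)

    ustr = "50% of requests failed"   # some externally supplied string
    # syslog_like(3, ustr)            # raises an error or prints garbage, because
                                      # ustr gets interpreted as a format string

    # The single-purpose design: the function takes a plain string and only logs it.
    def log_line(pri, message):
        print(pri, message)

    # Formatting is now the caller's job, done with the generic '%' operator:
    log_line(3, "error from %s: %s" % ("host1", ustr))
    log_line(3, ustr)                 # and passing an unformatted string is safe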

In theory you can implement generic string formatting as a function call (ideally with a very short function name). In practice I think that it isn't going to work the way you want because of perception issues; if string formatting is just a function call, it's still tempting to create convenience functions that bundle the two function calls together for you (one to format the arguments then one to do the useful thing). Doing generic string formatting as an operator (such as Python's '%') has the pragmatic benefit of drawing a much more distinct line between regular function calls and formatting strings.
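(Roughly, the temptation looks like this in Python; fmt(), log_line(), and log_fmt() are all invented names.)

    def log_line(pri, message):       # the single-purpose logging function
        print(pri, message)

    def fmt(template, *args):         # generic formatting as a plain function call
        return template % args

    # The tempting convenience wrapper: it bundles the two calls back together,
    # which quietly recreates the original two-in-one interface and its bug.
    def log_fmt(pri, template, *args):
        log_line(pri, fmt(template, *args))

    # Now 'log_fmt(3, ustr)' has exactly the same problem as 'syslog(pri, ustr)'.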

(The third approach to effortless generic string formatting is string interpolation in certain sorts of strings. This has the benefit of sidestepping the entire problem, although it has problems of its own.)

PS: another approach in an OO language is to give strings an explicit formatting or interpolation method, so that you might write '"%s: %s".fmt(a, b)'. My gut thinks that this is closer to string formatting as an operator than anything else.
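(Python did in fact grow such a method in the form of str.format(), and it reads much the same as the '%' operator:)

    a, b = "disk", "out of space"

    msg1 = "%s: %s" % (a, b)          # formatting as an operator
    msg2 = "{0}: {1}".format(a, b)    # formatting as a method on the string
    assert msg1 == msg2 == "disk: out of space"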

AvoidingQuotingBug written at 00:36:14

2011-12-05

Debuggers and two sorts of bugs

In the middle of reading Go Isn't C I ran across some remarks about how it was bad that programmers ignore debuggers in favour of print statements. This immediately sparked my standard reaction, which is that debuggers are focused on telling you what happens next while I generally want to know how on earth my code got into its current state. Then I thought about it some more and realized that this reaction was too strong, because it isn't really accurate. In fact I do use debuggers periodically, but only on certain sorts of bugs.

Let us say that there are two sorts of bugs. For lack of better names, I will call them direct bugs and indirect bugs. A direct bug's cause can be determined immediately by looking at the call stack, the local variables, the code, and so on at the time when it happens. You can say 'oh, the caller forgot that this function couldn't be called with a NULL', or see that you forgot to handle a case and something fell through to code that should never have been reached in this situation. A decent debugger works very well on direct bugs; so do even simple features like automatic call stack backtraces on uncaught exceptions (as you get in languages like Python).

Indirect bugs are data structure corruption bugs (or sometimes flow of control bugs), where you are now in a 'this can't happen' situation (whether caught by an assert() or not). Finding the immediate problem in the code or diagnosing the source of corrupt results is only a starting point; the real challenge is discovering where and how things went off the rails so as to get you to where you are now. Indirect bugs are the bugs where you need to look back into the past to answer 'how did I get here?' questions.

(For practical examples, my recent Liferea issue was a direct bug; if I had read the code carefully, the first stack backtrace would have shown me the problem. My SIGCHLD signal handler race in Python was an indirect bug; I always knew what the direct problem was, but I had no idea how the program got into that state until I did some careful analysis.)
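(A compressed and entirely made-up illustration of the two sorts in Python: with the direct bug the backtrace tells you the whole story, while with the indirect bug it only tells you where the damage was finally noticed.)

    # Direct bug: the backtrace at the crash shows the whole story.
    def describe(user):
        return user.upper()      # crashes right here if a caller passes None;
                                 # the stack trace points straight at the guilty caller

    # Indirect bug: the crash site only shows where the damage was noticed.
    balances = {}

    def withdraw(account, amount):
        # The real bug: no overdraft check, so a balance can quietly go negative.
        balances[account] = balances.get(account, 0) - amount

    def audit():
        # This invariant fails here, possibly long after the bad withdrawal,
        # and the backtrace says nothing about which call corrupted the data.
        assert all(v >= 0 for v in balances.values()), "this can't happen"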

My unconscious bias until now has been that direct bugs are uninteresting because they are easy to solve from basic inspection, so I only really thought about what I wanted in order to deal with indirect bugs. But the main reason that direct bugs are easy to deal with is that I already have tools like stack backtraces and inspection of local variables, which make it easy to see what's wrong with the program's current state.

Sidebar: a hazard of dealing with indirect bugs

It's often popular to 'fix' an indirect bug that crashes the program or generates obviously bad results by making the code accept the impossible state; for example, by adding a NULL check to the low-level routine that's crashing with a NULL pointer exception. This is generally the wrong idea (you're treating the symptoms instead of the disease), but it's tempting as a quick fix and it's an easy approach to fall into if you don't understand the code well enough to know that you're dealing with an indirect bug, not a direct bug.
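(In Python terms the temptation looks something like this; the code is hypothetical.)

    from collections import namedtuple

    Entry = namedtuple("Entry", ["title"])

    def render_title(entry):
        # The crash site: raises AttributeError when entry.title is None.
        return entry.title.strip()

    def render_title_patched(entry):
        # The tempting symptomatic 'fix': tolerate the impossible value and move on.
        # The real question, how an Entry with no title got here at all, is still
        # unanswered, and the bad data is still loose elsewhere in the program.
        if entry.title is None:
            return ""
        return entry.title.strip()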

This is one of the issues that always makes me wary about fixing 'obvious' crash bugs, especially if I want to send the fix upstream. Before I add a NULL pointer check or the like I need to be sure that it's the real bug, and I need to understand what the code should do instead of crashing (which is not always obvious).

DebuggersAndBugTypes written at 00:38:56

