One of my test driven development imperfections: use driven testing
Recently I've been coding what I call a 'sinkhole SMTP server' in Go (an SMTP server that doesn't do anything with messages except perhaps save them to disk). In the process of this I've once again gotten to watch one of my (bad) habits with test driven development in action, something that I will call 'use driven testing'.
An SMTP server can conceptually be divided into at least two layers. One layer is a general layer that handles the SMTP protocol; it reads SMTP commands from the network, makes sure they are actually commands, enforces command ordering requirements, handles a bunch of fiddly stuff, and so on. A second layer sits atop this and handles higher level policy issues like what email addresses are acceptable (as source or destination addresses) and what to do with received email messages. The bottom layer is generic (you could in theory reuse it in any number of different SMTP servers) while the top layer is generally specific to your needs.
I started out writing the bottom layer as a Go package. Go has reasonably good support for testing, so I wrote tests for this layer's functionality of parsing SMTP command lines and handling the basic flow of the protocol. In other words, I did relatively conventional test focused development; I wrote code and then wrote tests to make sure it worked, and sometimes I mutated the code some to make it easier to test. But at a certain point the general base SMTP layer passed the point where I could start writing the top layer. At that point I switched to writing the top layer, mutating the base layer as necessary to make a better API or to make things work. I didn't write any new tests for the base layer's added functionality and I didn't write tests for the top layer; testing for the top layer consisted of actually using it. This is the switch to what I'm calling 'use driven testing', where actually using my code is how I test it.
This is flawed and imperfect, but it's hard for me to see how to get out of it. Writing top layer code, changing the bottom layer to match, and then going back to write bottom layer tests feels like make-work; the top layer code already exercises the new functionality, and the bottom layer tests would basically duplicate that work without telling me much extra. Of course this is wrong; writing tests will tell me not just if something works now but whether it keeps on working. But it's hard to feel motivated to do the extra work, and it's also hard to shape an API both for convenient testing and for the convenience of the higher layer.
(There's also the related question of how much stuff in the higher layer I want to test and what the best way to test it is. I think that Go will let me write tests for code in the main package that makes up your program, but I haven't actually verified that.)
Okay, let me admit something else about this: writing live code is a lot more fun than writing tests. When I write top layer code, my program does something real for me. When I write more tests, yay, more tests (which may break and have to be redone if I restructure what my actual productive code does). It's very hard to avoid the fun and do drudgery, especially when I'm doing this entirely for fun. At the same time I wind up feeling guilty for having minimal tests and chunks of code that are only tested through use by the higher level.
Complicating this is that some of the functionality I wound up putting in the lower layer is not straightforward to test. For example, how do I test that TLS negotiation actually works, or that network output is (optionally) being written at an artificially slow rate of one character every tenth of a second? There are probably clever ways but they're not obvious to me, and it's hard to feel hugely motivated when I can test these using the live program by inspection or by using swaks.
(I have considered the merits of automatedly hooking the Go SMTP client up to my server and verifying that it, for example, sees the expected SMTP extensions. Maybe this is actually the right answer.)
I don't have any answers here, just stuff that I'm thinking about aloud. Although perhaps my use driven testing is not completely crazy and at some point I should just accept that high level tests of functionality are fine (even if some of them are manual).
PS: part of the pain here is that testing the output of an SMTP server is kind of a pain in the rear. It's easy enough to test the literal output in response to a series of commands, but that's both verbose and brittle; it blows up any time you change the messages the server sends (which discourages changing those messages, which to me is bad). Doing better requires building some moderately complex testing infrastructure to extract, say, the sequence of SMTP response codes that you expect so you don't care about the literal text.