2015-12-13
I still believe in shimming modules for tests
A commenter on my 2008 entry about shimming modules for testing recently asked if I still liked this approach today. My answer is absolutely yes; I continue to love that Python lets me do this, and I feel that it (or a variation on it) is the best approach for mocking out parts of standard modules. There are two reasons for this.
The first is that I continue to believe in what I wrote in my old entry about why I do this; I would much rather have clean code and dirty tests than clean tests and dirty code. I consider code that is full of artificial dependency injection to be dirty. For instance, it's hard to think of a reason why you'd need to do DI for socket module functions apart from being able to inject fakes during testing. Artificially contorting my code to test it bugs me enough that I basically don't do it for my own programming.
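To make this concrete, here's a minimal sketch of one way the shim can work; the function names are invented for illustration, and my 2008 entry covers the specific variation that I actually use. The code under test calls the socket module directly, and the test temporarily swaps in a fake:

    import socket

    def host_fqdn(host):
        # Code under test: it uses the socket module directly,
        # with no dependency injection anywhere.
        return socket.getfqdn(host)

    def test_host_fqdn():
        # The shim: temporarily replace socket.getfqdn with a fake
        # that returns a canned answer, then restore it.
        saved = socket.getfqdn
        socket.getfqdn = lambda host: "test.example.com"
        try:
            assert host_fqdn("whatever") == "test.example.com"
        finally:
            socket.getfqdn = saved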
The second reason follows on from the first, and it is that monkey patching modules this way is an excellent way to exactly simulate or exactly replay the results you would get from them in the real world under various circumstances. If you discover that some tricky real world scenario gives your code problems, you can capture the low level results of interacting with the outside world and then use them in your future tests. You don't need a cooperative outside entity that fails in a specific, controlled way, because you can just recreate the failure internally.
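As an illustration, here's a sketch of what capture and replay can look like for a DNS style lookup; the capture file name and the helper functions here are hypothetical, and real test code would want to restore the original function afterward:

    import pickle
    import socket

    CAPTURE_FILE = "addrinfo.pickle"   # hypothetical capture file

    def capture(host):
        # Run this once against the real world to record exactly
        # what comes back, oddities included.
        result = socket.getaddrinfo(host, 80)
        with open(CAPTURE_FILE, "wb") as f:
            pickle.dump(result, f)

    def install_replay():
        # In tests, hand back the captured result with no network
        # access and no cooperation from any outside entity.
        with open(CAPTURE_FILE, "rb") as f:
            canned = pickle.load(f)
        socket.getaddrinfo = lambda *args, **kwargs: canned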
Without some way of doing this 'exact replay' style of injecting results, what I at least wind up with is tests that can have subtle failures. Synthetic high level data can be quietly wrong, and while synthetic low level data can be wrong too, my view is that I'm much more likely to notice because I know exactly what, eg, a DNS lookup should return.
(If I don't know exactly what a low level thing should return, I'm likely to actually test it and record the results. There are ways for this to go wrong, for example if I can't naturally create some malfunction that I want to test against, but I think it's at least somewhat less likely.)
Finally, I simply feel happier if the code I'm testing uses code paths that are as close as possible to what it will use outside of testing. With monkey patching modules for tests, the code paths are authentic right down until they hit my monkey patched modules. With dependency injection, some amount of code is not being tested, because it's the code involved with creating and injecting the real dependencies. Probably I will find out right away if this code has a problem, but I can imagine ways to subtly break things, and that makes me a bit nervous (somewhat like my issues with complex mocks).
2015-12-06
The (peculiar) freedom of having a slow language
Back in my entry on why speeding up (C)Python matters, I said in an aside that there was a peculiar freedom in having a slow language. Today I'm going to explain what I meant in that cryptic aside.
The peculiar freedom of having a slow language is the mirror image of the peculiar constraint of having a fast one, which is that in a fast language there is usually (social) pressure to write fast code. Maybe not the very fastest code that you could write (that's premature optimization), but at least code that is not glaringly under-performant. When the language provides you with a fast way to do what your code needs to do, you're supposed to use it. Usually this means using a 'narrow' feature, one that is not particularly more powerful than you need.
In a slow language like (C)Python, you are free of this constraint. You don't have to feel guilty about using an 'expensive' feature or operation to deal with a small problem instead of carefully writing some narrow, efficient code. The classic example of this is various sorts of simple parsing. In many languages, using a regular expression to do most of your parsing is seen as vastly indulgent because it's comparatively slow, even if it leads to simple and short code; there is great social pressure to write hand-rolled character inspection code and the like. In CPython you can use regexps here without any guilt; not only are they comparatively fast, they're probably faster than hand-written code that does things the hard way.
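As a hypothetical example, parsing simple 'name = value' configuration lines takes one short regexp, and since CPython's regexp engine runs in C it will likely beat a Python loop that inspects the string character by character:

    import re

    _line_re = re.compile(r'^\s*(\w+)\s*=\s*(.*?)\s*$')

    def parse_line(line):
        # Returns a (name, value) tuple, or raises on a bad line.
        m = _line_re.match(line)
        if m is None:
            raise ValueError("unparseable line: %r" % (line,))
        return m.group(1), m.group(2)

    print(parse_line("  timeout = 30 "))   # -> ('timeout', '30')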
The result of this is that in CPython I solve a lot of problems with simple brute force using builtins, regular expressions, and other broad, powerful features, while in languages like Go I wind up writing more complicated and more verbose code that is narrower and more efficient because it only does what's strictly necessary.
(I really came to be aware of this after recently writing some Go code to turn newlines into CR NL sequences as I was writing output to the network. In Python this is a one-liner; in Go, the 'right' Go-y way involves a carefully efficient hand-rolled loop, even though you could theoretically do it in exactly the same way that Python does.)
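For the record, in Python the whole job is one call to the replace method (on str or bytes as appropriate):

    text = "line one\nline two\n"
    crlf = text.replace("\n", "\r\n")

    # The same thing for bytes already encoded for the network:
    data = b"line one\nline two\n".replace(b"\n", b"\r\n")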