2012-08-31
A realization about one bit of test-driven development
One of the standard pieces of instruction for TDD is that when you are about to do some coding, you should not just write the tests before the code but also run the tests and see them fail before starting on the real code. You can find this cycle described in a lot of places: write test, run test, see the failure, write the code, run the test, see the test pass, feel good (monkey got a pellet, yay). Running tests that you knew were going to fail always struck me as stupidly robotic behavior, so even when I wrote tests before my code (eg, to try out my APIs) I skipped that step.
Recently I came to a realization about why this is actually a sensible thing to do (at least sometimes). The important thing about seeing your test fail first is that it verifies that your code change is what made the test pass.
(This is partly a very basic check that your test is at least somewhat correct and partly a check on the code change itself.)
Sometimes this is a stupid thing to verify because it's already clear
and obvious. If you're adding a new doWhatever() method, one that
didn't exist before, and calling it from the test, then your code change
is clearly responsible for the test succeeding (at least in pretty much
any sane codebase; your mileage may vary if you have complex inheritance
trees or magic metaprogramming that catches undefined methods and so
on).
But not all changes are like that. Sometimes you're making a subtle change deep in the depths of existing code. This is where you most want to verify that the code is behaving as you expect even before you make your modification; in other words, that the tests you expect to fail and that should fail do indeed fail. Because if a test already passes even before your code change, you don't understand the existing code as well as you thought and it's not clear what your change actually does. Maybe it does nothing and is redundant; maybe it does something entirely different from what you thought (if you have good test coverage, it's at least nothing visibly damaging).
(Alternately, your test itself has a problem and isn't actually testing what you think it is.)
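To make this concrete, here is a small contrived sketch in C; the function, the planned change, and the test are all invented for illustration. The test is written for a planned change to also strip a trailing '\r', so run before that change it should fail. If it passes anyway, either the existing code already does more than I thought or the test isn't checking what I think it is.

#include <assert.h>
#include <string.h>

/* Existing code: strips trailing newlines from a string, in place. */
static void trim_newline(char *s)
{
	size_t len = strlen(s);

	while (len > 0 && s[len - 1] == '\n')
		s[--len] = '\0';
}

int main(void)
{
	/* New test for a planned change: also strip a trailing '\r'.
	   Run before the change, this should fail. If it already
	   passes, trim_newline() is doing more than I thought (or
	   the test isn't checking what I think it is). */
	char buf[] = "hello\r\n";

	trim_newline(buf);
	assert(strcmp(buf, "hello") == 0);
	return 0;
}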
There's a spectrum between the two extremes, of course. I'm not sure where most of my code falls on it and I still don't like the robotic nature of routinely running tests that you expect to fail, but this realization has at least given me something to think about.
2012-08-24
The theoretical right way to check if an account is in a Unix group
If you are checking once to see if an account is in a group, there is a simple and obvious approach (omitting some details):
grp = getgrnam(GRPNAME);
ngroups = NGROUPS_MAX;   /* on entry, the size of the groups[] array */
getgrouplist(pwent->pw_name, pwent->pw_gid, groups, &ngroups);
for (i = 0; i < ngroups; i++) {
	if (groups[i] == grp->gr_gid)
		return 1;
}
return 0;
There is just one problem
with this: on many systems, getgrouplist() will re-scan and
re-parse /etc/group every time you call it. If you call it only
once or twice this usually doesn't matter. If you call it a lot,
this matters a lot (especially if /etc/group is big).
In theory, the better check is simple. Instead of getting the login's
group list and seeing if the group's GID is in there, you work the other
way around: you get the group's membership (as part of getting the group
entry itself) and then see if the login either has the group's GID as
its primary group or appears on the list of group members. This avoids
(repeatedly) parsing all of /etc/group, especially if you cache the
group entry for the group you care about.
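As a rough sketch (the function name and the caching assumption are mine for illustration, not from any particular program), the group-name version looks something like this in C:

#include <grp.h>
#include <pwd.h>
#include <string.h>

/* Is this login in the (cached) group entry, going by group name?
   Error handling is omitted; 'grp' is assumed to come from a single
   getgrnam() call whose result we cache and reuse. */
static int in_group(const struct passwd *pwent, const struct group *grp)
{
	char **mem;

	/* Primary group: the login's own GID is the group's GID. */
	if (pwent->pw_gid == grp->gr_gid)
		return 1;
	/* Supplementary membership: the login name is in gr_mem. */
	for (mem = grp->gr_mem; *mem != NULL; mem++) {
		if (strcmp(*mem, pwent->pw_name) == 0)
			return 1;
	}
	return 0;
}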
However, these two checks are not equivalent and now you have to decide what you care about. The first version checks to see if a login has the group ID of a particular group. The second version checks to see if a login has the group name of a particular group. To see the difference, consider the following group entries:
wheel:x:10:jane
cog:x:10:fred
Here is the question: is fred in group wheel? If we go by GID
the answer is clearly yes; fred will have GID 10 as one of his
supplemental groups when he logs in and be able to access files that are
only accessible to GID 10. But if we go by group name, the answer is no;
fred is in cog, not wheel, although they have the same GID. Which
version the software you use cares about is something that you may have
to investigate.
(If you are designing the software, you can decide to make it whichever is more convenient and useful to you.)
The corollary is that if you really do need the GID version and you want
to be fast for a large number of checks, in theory you need to build
some sort of full in-memory index of /etc/group. In practice, however,
duplicate GIDs are extremely rare and usually not intended, so you may be
able to ignore them. If not, you can at least scan /etc/group
once to see if the group you care about has duplicated GIDs.
(The fully general version is to scan /etc/group and accumulate all
of the group entries for groups with the same GID as the group you care
about. Then you check all of their group membership lists. This is going
to require caching in order to make it fast for a large number of logins
and groups.)
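Here's a hedged sketch of that fully general version (the function name is made up, and error handling plus the caching you'd want in practice are omitted): it walks /etc/group with getgrent(), looks at every group entry that shares the target GID, and checks their member lists.

#include <grp.h>
#include <pwd.h>
#include <string.h>

/* Does this login have this GID, counting every group entry that
   carries the GID?  A real version would cache the scan instead of
   walking /etc/group once per login. */
static int has_gid(const struct passwd *pwent, gid_t gid)
{
	struct group *gr;
	char **mem;
	int found = 0;

	if (pwent->pw_gid == gid)
		return 1;
	setgrent();
	while (!found && (gr = getgrent()) != NULL) {
		if (gr->gr_gid != gid)
			continue;
		for (mem = gr->gr_mem; *mem != NULL; mem++) {
			if (strcmp(*mem, pwent->pw_name) == 0) {
				found = 1;
				break;
			}
		}
	}
	endgrent();
	return found;
}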
2012-08-22
My view on the understandability of language idioms
In my entry on the periodic strangeness of idiomatic Python I noted that I found the C version of the 'repeat N times' loop much more immediately understandable than the Python version. In response, a commentator wrote:
I don't find this: "for (i = 0; i < times; i++) { .... }" idiomatic at all ... unless you are familiar with a language that writes its loops that way. [...]
Looking at my entry again I see that I was unclear about what I meant by immediately understandable. My impression from the comment is that the commentator expects idiomatic code to be understandable even if you don't know the language it's written in. I don't believe in this for idiomatic code any more than I believe in this for ordinary code (which I don't).
When I talked about the C code being more immediately understandable, I
meant to someone who knew C in general. If you know C, you know what
a for loop is and what the parts of it are; once you know that, it's
easy to see what this particular for loop does. In that sense the C
idiom for 'repeat N times' is obvious (in a way I feel that the Python
idiom is not).
(This is in a sense what 'idiomatic code' conventionally means; it is the natural way to solve the problem for someone who is familiar with the language.)
When an idiom is not obvious or immediately understandable, it's using some less common or relatively obscure feature of the language, or an odd convention that you have to know about, or doing something tricky but clever. There are perfectly good, well-regarded idioms that fall into this general category, for example the Schwartzian transform. What these idioms all have in common is that you either have to know them already or you need to carefully think them through before you can understand what they do; someone who merely knows the language without knowing the idioms cannot read code with them in it and immediately understand it. In a sense, such idioms are slang or jargon.
To me this is what makes them strange and worthy of note. If you know the idiom the code's perfectly natural; if you don't know the idiom (but do know the language) it's a 'huh? what?' moment.
2012-08-09
Learning something from not testing jQuery 1.8's beta releases
I was reading the jQuery 1.8 release announcement when I ran across this gem:
We don't expect to get any bug reports on this release, since there have been several betas and a release candidate that everyone has had plenty of opportunities to thoroughly test. Ha ha, that joke never gets old. We know that far too many of you wait for a final release before even trying it with your code. [...]
Since we use jQuery a bit and I'm one of those 'didn't test even the release candidate' people, I was immediately seized by an urge to justify my inaction. Then I had a realization.
First, the justification. We're not using jQuery in any demanding way, or in a situation where we'll notice the improvements in 1.8. Thus we'd be testing a beta or release candidate purely to validate that it's compatible with our code. Unfortunately testing beta code doesn't save us from having to re-test with the actual release; we can't assume that the changes between a beta and the release are harmless. Nor does reporting bugs against the beta really help us since we're not trying to upgrade to 1.8 as fast as possible. This makes testing betas and even release candidates basically a time sink as far as we're concerned.
(If you actively want or need to use the new release then reporting bugs early (against the betas or even the development version) increases the chances that the bugs will be gone in the released version and you can deploy it the moment it passes your tests.)
All of this sounds nice and some of you may be nodding your heads along with me. But as I was planning this entry out I had the realization that what this really reveals is that we have a testing problem. In an environment with good automated tests it should take almost no time and effort to drop a new version of jQuery into a development environment and then run our tests against it to make sure that everything still works. This would make testing betas, release candidates, or even current development snapshots something that could be done casually, more or less at the snap of our fingers. That it isn't this trivial and that I'm talking seriously about the time cost of testing a new version of jQuery is a bad sign.
What's really going on is that I haven't built any sort of automated testing for the browser view of the web app (well, actually, I haven't built tests for any of it, but especially the browser view of things). This means that testing a new version of jQuery requires going through a bunch of interactions and corner case tests in at least one browser, by hand. I effectively did this once, when I was coding all of these JS-using features, but I did it progressively (bit by bit, feature by feature) instead of all at once. And of course I was making sure that my code worked instead of testing that a new version of jQuery is as bug free and compatible as it's expected to be; the former is far more motivating than the latter (which is basically drudge work).
I'm sure there are ways of doing automated tests of client side JavaScript (including jQuery), but two things have kept me away from trying to explore it. First, all through the development of this web app I've been focused on getting the app running instead of building infrastructure like tests; among other things, I was learning as I was going and just learning how to do stuff is hard enough without also trying to learn how to build automated tests for it. Second, the entire thought of automated testing of things involving browsers gives me hives since I'm pretty sure it's going to be complex, require a bunch of infrastructure, and involve a pile of hacks, especially on Unix (I can't see how you can get away from driving a real browser by some form of remote control and I can't see how that can be done at all gracefully).
This is a weakness. I'm not sure it's enough of a weakness to be worth spending the time to fix it, though.
(If I was planning to do much more client side JS programming or if this web app was going to undergo significant more development, things might be different. But as it is I don't see much call for either in the moderate future and there's always a lot of claims on a sysadmin's time.)