2016-02-29
Sometimes, doing a bunch of programming can be the right answer
I like programming, and on top of that I can be a bit obsessive about it; for instance, if there are obvious features for a program to have, I want to add them even if they may not be strictly necessary. If left to myself, I would write plenty of programs for plenty of things and enjoy it a fair bit. The problem with this is that locally written programs are often an overhead and a long-term burden, as xkcd has famously pointed out. Sure, it's nice to write code to solve our problems, but often that's not the right answer. I'm very conscious of this every time I'm tempted to write a program, and as a result I wind up sitting on my hands a lot.
We have a long-standing local program to sort of deal with the pain of the Ubuntu package update process. It was a relatively minimal program, but it worked, and so for a long time I suppressed my urge to make it shinier and let it be. A couple of weeks ago I reached the limits of my tolerance after one too many extended end-of-day update runs. Never mind being sensible; I was going to change things because I couldn't take the current situation any more, and it didn't matter if this was objectively a bad use of my time.
I spent about a week working over most of the code, substantially growing the program in the process. The result is faster and more convenient, but it is also a lot more than that. The old update process had a lot of limitations; for example, it didn't really notice if updating one machine had problems, and if updating one machine hung there was no way to go see what was going on and maybe rescue the situation. The new program fixes these issues. This makes it substantially more complicated, but also much more useful (and less dangerous). There are a whole host of things we can do now because I got annoyed enough at the state of affairs to do something that wasn't strictly sensible (and then carry on further).
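(As a concrete illustration of the general shape, here is a minimal Go sketch of the core idea. This is emphatically not our actual program, and all of the host names and commands in it are invented; the point is simply to run updates on all machines in parallel, put a deadline on each one, and report failures per machine instead of losing them in scrolling output.)

    // A hypothetical, minimal sketch (not our actual update program)
    // of running updates on many machines at once while noticing
    // per-machine failures. All names here are invented.
    package main

    import (
        "context"
        "fmt"
        "os/exec"
        "sync"
        "time"
    )

    type result struct {
        host string
        out  []byte
        err  error
    }

    func main() {
        hosts := []string{"apps0", "apps1", "comps0"} // invented names

        results := make(chan result, len(hosts))
        var wg sync.WaitGroup
        for _, h := range hosts {
            wg.Add(1)
            go func(host string) {
                defer wg.Done()
                // A deadline keeps one hung machine from stalling
                // the whole run.
                ctx, cancel := context.WithTimeout(
                    context.Background(), 15*time.Minute)
                defer cancel()
                cmd := exec.CommandContext(ctx, "ssh", host,
                    "apt-get", "-y", "dist-upgrade")
                out, err := cmd.CombinedOutput()
                results <- result{host, out, err}
            }(h)
        }
        wg.Wait()
        close(results)

        // Report per-machine failures instead of burying them in
        // everyone's combined output.
        for r := range results {
            if r.err != nil {
                fmt.Printf("%s: update failed: %v\n%s", r.host, r.err, r.out)
            }
        }
    }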
There are two lessons I draw from this. The first is that sometimes writing the code is the right answer after all. To put it one way, not everything that feels good is a bad idea. The second is that I actually should have done this years ago. This problem and its parade of irritations and workarounds are not new; we've been annoyed at the hassles of Ubuntu updates probably for as long as we've been running Ubuntu machines, and there's nothing in my code that couldn't have been done years ago. Had I done this coding much earlier, well, we could have been enjoying its improvements for quite some time by now.
(The meta-lesson here is that the earlier you make a change or an improvement with a likely long lifetime, the higher your payoff is. From the start we were pretty certain we'd be running Ubuntu machines for a long time to come, so clearly we could have forecast that a good update handling program had a big potential long-term payoff.)
2016-02-24
I'm often an iterative and experimental programmer
I've been doing a significant amount of programming lately (for a good cause), and in the process I've been reminded that I'm often fundamentally an iterative and explorative programmer. By this I mean that I flail around a lot as I'm developing something.
In theory, the platonic ideal programmer plans ahead. They may not write more than they need now, but what they do write is considered and carefully structured. They think about the right data structures and code flow before they start typing (or at the latest as they start typing) and their code is reasonably solid from the start.
I can work this way when I understand the problem domain I'm tackling well enough to know in advance what's probably going to work and how I want to approach it. This works even (or especially) for what people sometimes consider relatively complicated cases, like recursive descent parsers. But put me in a situation where I don't know in advance what's going to work and roughly how it's all going to come out, and things get messy fast.
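(To illustrate what I mean, here is a generic Go sketch of a recursive descent parser for the classic expression grammar. It's not code from any program of mine, but it shows why this kind of thing can be written with the structure planned in advance: each grammar rule turns into one function, almost mechanically.)

    // A generic sketch of a recursive descent parser for:
    //   expr   -> term { '+' term }
    //   term   -> factor { '*' factor }
    //   factor -> digit | '(' expr ')'
    // Each grammar rule becomes one function, which is why the code
    // structure is known before you start typing. Error handling is
    // omitted to keep the sketch short.
    package main

    import "fmt"

    type parser struct {
        in  string
        pos int
    }

    func (p *parser) peek() byte {
        if p.pos < len(p.in) {
            return p.in[p.pos]
        }
        return 0 // end of input
    }

    func (p *parser) expr() int {
        v := p.term()
        for p.peek() == '+' {
            p.pos++
            v += p.term()
        }
        return v
    }

    func (p *parser) term() int {
        v := p.factor()
        for p.peek() == '*' {
            p.pos++
            v *= p.factor()
        }
        return v
    }

    func (p *parser) factor() int {
        if p.peek() == '(' {
            p.pos++ // consume '('
            v := p.expr()
            p.pos++ // consume ')'
            return v
        }
        c := p.peek()
        p.pos++
        return int(c - '0')
    }

    func main() {
        p := &parser{in: "2*(3+4)"}
        fmt.Println(p.expr()) // prints 14
    }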
In a situation of uncertainty my approach is not to proceed cautiously and carefully, but instead to bang something ramshackle together to get experience and figure out if I can get an idea to work at all. My first-pass code is often ugly and hacky and almost entirely the wrong structure (and often contains major duplication or badly split functionality). Bits and pieces of it evolve as I work away, with periodic cleanup passes that usually happen after I get some piece of functionality fully working and decide that now is a good time to deal with some of the mess. Entire approaches and user interfaces can be gutted and replaced with things that are clearly better ideas once I have a better understanding of the problem; entire features can sadly disappear because I realize that in retrospect they're bad ideas, or just unnecessary.
(It's very common for me to get something working and then immediately gut the working code to rebuild it in a much more sensible manner. I have an idea, I establish that the idea can actually be implemented, and then I slow down to figure out how the implementation should be structured and where bits and pieces of it actually belong.)
Eventually I'll wind up with a solid idea of what I want from my program (or code) and a solid understanding of what it takes to get there. This is the point where I feel I can actually write solid, good code. If I'm lucky I have the time to do so and it's not too difficult to transmogrify what remains of the first approach into this. If I'm unlucky, well, sometimes I've done a ground up rewrite and sometimes I've just waited for the next time I tackle a similar problem.
2016-02-16
Whether or not to use cgo for Go packages, illustrated in a dilemma
Recently Dave Cheney wrote 'cgo is not Go', where he very strongly advocates for not getting functionality into Go by using cgo to interface to existing C libraries. More directly, he writes:
I believe that when faced with reimplementing a large piece of C code in Go, programmers choose instead to use cgo to wrap the library, believing that it is a more tractable problem. I believe this is a false economy.
He goes on to run down a laundry list of problems that using cgo causes. All of them are real problems and they do bite, and I very much appreciate it when people write pure Go packages for what could otherwise easily be cgo-based interfaces to existing C libraries (eg, there are pure Go bindings for the X protocol).
But, well, as it happens I have a cgo using package, and today I want to talk about the problems of making it a pure Go package. My package is a Go interface to a Solaris/Illumos/etc facility for getting kernel statistics, called 'kstat'. Kstat is not officially provided as a documented kernel interface, just as a C library with a specific API. Of course the C library talks to the kernel under the hood, but this interface to the kernel is documented only in source code comments (and is not necessarily stable).
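(For the curious, here is a cut-down illustration of what the cgo side of such a binding looks like. This is not my package's actual code, but the C functions in it are the documented libkstat API; the pattern is simply to include kstat.h, link against libkstat, and call the library more or less directly, with error handling abbreviated.)

    // A cut-down illustration (not my package's real code) of
    // wrapping libkstat with cgo. The C functions called here are
    // the documented libkstat API on Solaris/Illumos.
    package kstat

    // #cgo LDFLAGS: -lkstat
    // #include <kstat.h>
    // #include <stdlib.h>
    import "C"

    import (
        "fmt"
        "unsafe"
    )

    // Snaptime returns the snaptime (the time of the last data
    // snapshot, in nanoseconds) of the module:instance:name kstat.
    func Snaptime(module string, instance int, name string) (int64, error) {
        kc, err := C.kstat_open()
        if kc == nil {
            return 0, fmt.Errorf("kstat_open: %v", err)
        }
        defer C.kstat_close(kc)

        cmod := C.CString(module)
        defer C.free(unsafe.Pointer(cmod))
        cname := C.CString(name)
        defer C.free(unsafe.Pointer(cname))

        ksp := C.kstat_lookup(kc, cmod, C.int(instance), cname)
        if ksp == nil {
            return 0, fmt.Errorf("no such kstat %s:%d:%s", module, instance, name)
        }
        if C.kstat_read(kc, ksp, nil) == -1 {
            return 0, fmt.Errorf("kstat_read failed")
        }
        return int64(ksp.ks_snaptime), nil
    }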
This could be reimplemented from scratch in pure Go (although using the unsafe package). Go can make ioctl() calls and it's actually not all that difficult to reverse engineer the kernel ioctls that libkstat is using (partly because the kernel interface is fairly close to the user level API). While in theory the kernel API might change, it seems unlikely to do so. A pure Go implementation wouldn't make this package any less OS-specific, but it would avoid the other problems with cgo.
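(The rough shape of such a reimplementation is sketched below. To be very clear, the struct layout here is an invented placeholder; the real kernel structures, along with the actual /dev/kstat ioctl request numbers you would need to fill a buffer in the first place, have to be dug out of the Illumos source.)

    // A deliberately hedged sketch of the pure Go approach: take raw
    // bytes obtained from the kernel (via ioctl()s on /dev/kstat,
    // whose request numbers live in the Illumos source and are not
    // reproduced here) and reinterpret them with the unsafe package.
    // rawHeader is an invented placeholder, not the real layout.
    package kstat

    import (
        "fmt"
        "unsafe"
    )

    // rawHeader stands in for the kernel's kstat header; a real
    // implementation must mirror the C structure field for field.
    type rawHeader struct {
        Crtime   int64 // creation time, in nanoseconds
        Snaptime int64 // time of last data snapshot, in nanoseconds
        // ... the rest of the kernel structure
    }

    // decodeHeader reinterprets kernel-supplied bytes in place. This
    // is where 'unsafe' is unavoidable: the layout is promised by
    // nothing except the kernel source, and real code must also care
    // about buffer alignment.
    func decodeHeader(buf []byte) (*rawHeader, error) {
        if uintptr(len(buf)) < unsafe.Sizeof(rawHeader{}) {
            return nil, fmt.Errorf("short buffer: %d bytes", len(buf))
        }
        return (*rawHeader)(unsafe.Pointer(&buf[0])), nil
    }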
On the other hand, a pure Go version has potential problems of its own. One is testing that it properly handles the various things that the kernel does with changing kstats, some of which only occur in odd corner cases, especially since I would be writing it from a necessarily imperfect understanding of how things work. Another is that this is more or less explicitly not supported by the platform; on Solaris and Illumos, libc and other OS-supplied shared libraries are the API, and the kernel interfaces are officially an implementation detail. If I have problems, I can pretty much count on all of the Illumos people telling me 'don't do that'.
(Go goes behind the back of this libc API already, which may hurt it at some point.)
So, I could rewrite this to be a pure Go package. But I'd get a collection of new problems in the process, and I'm not sure that the package would be as solid and (probably) as portable as it is now (as it stands, for example, it probably works on Solaris 11 too). Is this a good tradeoff? I don't know. I feel it's a genuine dilemma.
PS: Over the course of writing this entry, I think I've talked myself into believing that a pure Go rewrite is at least reasonably possible and thus worth attempting (in theory). In practice I have no idea if I'll ever have the time and energy to do one, since the current situation does work for me.
Sidebar: The two pragmatic reasons that this is a cgo-based package today
When I decided that I wanted to be able to get kstats in a Go program, I had neither the energy for a ground-up reverse engineering of the libkstat library (which I didn't even understand when I started out) nor the spare time a ground-up (re)write would have required. I was interested in solving a small problem, which meant not a lot of time and not very much frustration. Talking to libkstat in a Go program was only slightly more work than talking to it in a C program, and the Go program was nicer than a C program would have been.
(My other option was Perl, but I was even less interested in Perl than in C.)
I suspect a bunch of people are going to keep making this particular tradeoff and writing cgo-based packages. Some of them will throw the packages up on GitHub because, well, why not, that's the friendly way.