Whether or not to use cgo for Go packages, illustrated in a dilemma

February 16, 2016

Recently Dave Cheney wrote 'cgo is not Go', where he argues very strongly against getting functionality into Go by using cgo to interface to existing C libraries. More directly, he writes:

I believe that when faced with reimplementing a large piece of C code in Go, programmers choose instead to use cgo to wrap the library, believing that it is a more tractable problem. I believe this is a false economy.

He goes on to run down a laundry list of problems that using cgo causes. All of them are real problems and they do bite, and I very much appreciate it when people write pure Go packages for what could otherwise easily be cgo-based interfaces to existing C libraries (eg, there are pure Go bindings for the X protocol).

But, well, as it happens I have a cgo-using package, and today I want to talk about the problems of making it a pure Go package. My package is a Go interface to a Solaris/Illumos/etc facility for getting kernel statistics, called 'kstat'. Kstat is not officially provided as a documented kernel interface, just as a C library with a specific API. Of course the C library talks to the kernel under the hood, but this interface to the kernel is documented only in source code comments (and is not necessarily stable).

This could be reimplemented from scratch in pure Go (although using the unsafe package). Go can make ioctl() calls and it's actually not all that difficult to reverse engineer the kernel ioctls that libkstat is using (partly because the kernel interface is fairly close to the user level API). While in theory the kernel API might change, it seems unlikely to do so. A pure Go implementation wouldn't make this package any less OS-specific, but it would avoid the other problems with cgo.
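To make the shape of that work a bit more concrete, here is a minimal sketch of one piece a pure Go version would need: turning a buffer that the kernel filled in into a Go struct without cgo. Everything specific here is an assumption for illustration. The rawStat layout, its field names, and the sample values are made up and are not the real kstat structures, and the ioctl() call itself is left out so that the sketch runs anywhere. Real code would also have to match the C struct padding and the host byte order.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// rawStat is a HYPOTHETICAL fixed-size record standing in for whatever
// the kernel would copy out in response to an ioctl(). It is not the
// real kstat layout.
type rawStat struct {
	ID       int32
	Instance int32
	Name     [16]byte // NUL-padded C string
	Value    uint64
}

// cString returns the bytes of b up to the first NUL as a Go string.
func cString(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		b = b[:i]
	}
	return string(b)
}

// decodeStat decodes a little-endian buffer into a rawStat. It returns
// an error if the buffer is too short to hold a full record.
func decodeStat(buf []byte) (rawStat, error) {
	var rs rawStat
	err := binary.Read(bytes.NewReader(buf), binary.LittleEndian, &rs)
	return rs, err
}

// sampleBuffer builds the bytes that a kernel might hand back, so that
// the sketch does not need a real device or ioctl().
func sampleBuffer() []byte {
	rs := rawStat{ID: 7, Instance: 0, Value: 123456}
	copy(rs.Name[:], "cpu_ticks")
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, rs); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	rs, err := decodeStat(sampleBuffer())
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%s (id %d, instance %d) = %d\n",
		cString(rs.Name[:]), rs.ID, rs.Instance, rs.Value)
}
```

This uses encoding/binary instead of unsafe pointer casts, which is slower but keeps the layout assumptions in one visible place. The decoding is the easy part; the hard part is knowing that the layout is right and stays right.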

On the other hand, a pure Go version has potential problems of its own. One is testing that it properly handles various things that the kernel does with changing kstats, some of which only occur in odd corner cases, especially since I would be writing it from a necessarily imperfect understanding of how things work. Another is that this is more or less explicitly not supported by the platform; on Solaris and Illumos, libc and other OS-supplied shared libraries are the API, and the kernel interface is officially an implementation detail. If I have problems, I can pretty much count on all of the Illumos people telling me 'don't do that'.

(Go goes behind the back of this libc API already, which may hurt it at some point.)

So, I could rewrite this to be a pure Go package. But I'd get a collection of new problems in the process and I'm not sure that the package would be as solid and as portable as it is now (as it is, for example, it probably works on Solaris 11 too). Is this a good tradeoff? I don't know. I feel it's a genuine dilemma.

PS: Over the course of writing this entry, I think I've talked myself into believing that a pure Go rewrite is at least reasonably possible and thus worth attempting (in theory). In practice I have no idea if I'll ever have the time and energy to do one, since the current situation does work for me.

Sidebar: The two pragmatic reasons that this is a cgo based package today

When I decided that I wanted to be able to get kstats in a Go program, I had neither the energy for a ground-up reverse engineering of the libkstat library (which I didn't even understand when I started out) nor the spare time that a ground-up (re)write would have required. I was interested in solving a small problem, which meant not a lot of time and not very much frustration. Talking to libkstat in a Go program was only slightly more work than talking to it in a C program, and the Go program was nicer than a C program would have been.

(My other option was Perl, but I was even less interested in Perl than in C.)

I suspect a bunch of people are going to keep making this particular tradeoff and writing cgo-based packages. Some of them will throw the packages up on GitHub because, well, why not, that's the friendly way.

Comments on this page:

By James (trs80) at 2016-02-17 11:10:43:

Here is another view that says you should use C libraries, particularly when they are POSIX standards, instead of making system calls yourself.

Last modified: Tue Feb 16 01:28:44 2016