2015-09-09
Some notes on my experience using Go's cgo system
Cgo is the Go compiler suite's bridge to C. I recently used it to write a Go package that gives you access to Solaris kstat kernel statistics, so I want to write down some notes on the whole thing before they fall out of my memory. On the whole, using cgo was a pleasant enough experience. I don't have very much experience in FFIs so I can't say how cgo compares to others, but cgo makes C and Go seem like a reasonably natural fit. It often really does seem like you're just using another Go package.
(With that said, there's a lot more use of unsafe.Pointer() than
you'll find almost anywhere else.)
At the mechanical level, the most annoying thing to deal with was
C unions. Go has no equivalent and cgo basically leaves you on your
own to read or set union fields. I wound up just writing some trivial
C functions to extract union fields for me and then had my Go code
call them, rather than wrestle with casts and unsafe.Pointer()
and so on in Go code; the C functions were both short and less error
prone for me to write.
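For context, cgo's documented behaviour is to represent a C union as a Go byte array of the same size, which is why reading a field from Go means reinterpreting raw bytes. The following is a pure-Go sketch of that wrestling, written so that it runs without a C toolchain. The unionValue type and the setUint64 and getUint64 helpers are hypothetical stand-ins, not part of the kstat package or of cgo itself.

```go
package main

import (
	"fmt"
	"unsafe"
)

// unionValue is a hypothetical stand-in for how cgo presents an 8-byte
// C union such as 'union { int32_t i32; uint64_t ui64; }': as a plain
// byte array with no named fields.
type unionValue [8]byte

// setUint64 stores v in the union's bytes, like writing u.ui64 in C.
func setUint64(u *unionValue, v uint64) {
	// Viewing a uint64 as an equally sized byte array is a permitted
	// unsafe.Pointer conversion and has no alignment requirement.
	*u = *(*unionValue)(unsafe.Pointer(&v))
}

// getUint64 reads the union's bytes back as a uint64, like reading
// u.ui64 in C. Copying into a real uint64 avoids a misaligned access.
func getUint64(u *unionValue) uint64 {
	var v uint64
	*(*unionValue)(unsafe.Pointer(&v)) = *u
	return v
}

func main() {
	var u unionValue
	setUint64(&u, 1234567890123)
	fmt.Println(getUint64(&u))
}
```

Even in this tiny form the Go version needs an unsafe conversion for every access, which is the kind of noise that short C accessor functions avoid.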
A C function was also my solution to needing to do pointer arithmetic.
In C, a common approach is to define a field as 'struct whatever *ptr;'
and then say it actually points to an array of those structs, with the
length of the array given by some other field. You access the elements
of the array by doing things like incrementing ptr or indexing off it.
Well, in Go that doesn't work; if you want to increment ptr to the next
struct, you're going to have to throw in explicit C.sizeof invocations
and so on. I decided it was simpler to do it in C instead:
    kstat_named_t *get_nth_named(kstat_t *ks, uint_t n) {
        kstat_named_t *knp;

        if (!ks || !ks->ks_data || n >= ks->ks_ndata)
            return NULL;
        knp = (kstat_named_t *)ks->ks_data;
        return knp + n;
    }
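For comparison, here is roughly what the Go-side arithmetic would look like. This is a pure-Go sketch so that it runs without cgo: the named struct is a hypothetical stand-in for kstat_named_t, nthNamed is an invented name, and unsafe.Sizeof plays the role that a C.sizeof_ constant would play in real cgo code.

```go
package main

import (
	"fmt"
	"unsafe"
)

// named is a hypothetical stand-in for a C struct such as kstat_named_t.
type named struct {
	name  [8]byte
	value uint64
}

// nthNamed mirrors get_nth_named: base points at the first element of
// an array of ndata structs, and the result points at element n, or is
// nil if n is out of range.
func nthNamed(base *named, ndata, n uint) *named {
	if base == nil || n >= ndata {
		return nil
	}
	// Go has no 'base + n' for pointers, so the byte offset has to be
	// spelled out. The conversion to uintptr and back must stay in a
	// single expression to be a valid use of unsafe.Pointer.
	off := uintptr(n) * unsafe.Sizeof(*base)
	return (*named)(unsafe.Pointer(uintptr(unsafe.Pointer(base)) + off))
}

func main() {
	arr := [3]named{{value: 10}, {value: 20}, {value: 30}}
	fmt.Println(nthNamed(&arr[0], 3, 2).value)  // value of element 2
	fmt.Println(nthNamed(&arr[0], 3, 3) == nil) // index 3 is out of range
}
```

The bounds check is the part worth keeping whichever side the arithmetic lives on; nothing else stops an index from walking off the end of the array.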
Typecasts are another one of the irritations of cgo. Cgo makes every
C type into a Go type, and boy does a lot of C turn out to have a
lot of different integer types. In C they mostly convert into each
other without explicit casts; in Go they are all fully separate
types and you must explicitly cast them around in order to interact
with each other and with native Go integer types. This can get
especially annoying in things like for loop indexing, because if you
write 'for i := 0; i < CFIELD; i++' the compiler will object that i is
a different type than CFIELD. This resulted in a for loop that looks
like this:
    for i := C.uint_t(0); i < k.ksp.ks_ndata; i++ {
        ....
    }
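The same type mismatch can be reproduced without cgo. In this sketch, cUint is a hypothetical stand-in for a cgo type such as C.uint_t, and the two functions show the two usual ways around the compiler's objection: give the index the C-side type, or convert the count once and index with a plain int. Which one reads better probably depends on how the index is used inside the loop.

```go
package main

import "fmt"

// cUint is a hypothetical stand-in for a cgo type such as C.uint_t.
// Like the real thing, it is a distinct type from Go's int.
type cUint uint32

// sumTyped gives the loop index the C-side type, as in the loop above.
func sumTyped(vals []int, ndata cUint) int {
	total := 0
	for i := cUint(0); i < ndata; i++ {
		total += vals[i]
	}
	return total
}

// sumConverted converts the count once, then uses a plain int index.
func sumConverted(vals []int, ndata cUint) int {
	total := 0
	n := int(ndata)
	for i := 0; i < n; i++ {
		total += vals[i]
	}
	return total
}

func main() {
	vals := []int{1, 2, 3}
	fmt.Println(sumTyped(vals, 3), sumConverted(vals, 3))
}
```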
(I wrote more about the mechanics in "getting C-compatible structs in
Go", "copying memory into Go structs", and "cgo's string functions
explained".)
At the design level, my biggest problem was handling C memory lifetime issues correctly. Part of this was figuring out where the C library had to be using dynamic allocation (and when it got freed), and part of it was working out what it was safe for Go structures to hold references to and when those references might become invalid because of some call I made to the C library API. Working this out is vital because of the impact of coupling Go and C memory lifetimes together, plus these memory lifetime issues are likely to have an effect on your package API. What operations can callers do or not do after others? What precautions do you need to take inside your package to try to avoid dereferencing now-free C memory if callers get the lifetime rules wrong? What things can you not expose because there's no way to guard against 'use after free' errors? And so on.
(runtime.SetFinalizer() can help with this by letting you clean up C
memory when Go memory is going away, but it's not a complete cure.)
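One way to guard against callers getting the lifetime rules wrong is to track whether the C side has been freed and refuse to touch it afterwards. The following is a minimal sketch of that idea, not the kstat package's actual design: Handle, Open, Lookup, and ErrClosed are hypothetical names, and the "C memory" is simulated with a Go map so that the example runs without cgo.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync"
)

// ErrClosed is returned when a handle is used after Close.
var ErrClosed = errors.New("handle is closed")

// Handle is a hypothetical wrapper around memory owned by a C library.
// Here the "C memory" is simulated by a Go map; in real code Close
// would call the library's own release routine.
type Handle struct {
	mu     sync.Mutex
	data   map[string]uint64 // stands in for C-owned memory
	closed bool
}

// Open creates a Handle and registers a finalizer as a safety net for
// callers who forget to call Close.
func Open() *Handle {
	h := &Handle{data: map[string]uint64{"ncpus": 4}}
	runtime.SetFinalizer(h, func(h *Handle) { h.Close() })
	return h
}

// Lookup refuses to touch the underlying memory once it has been freed.
func (h *Handle) Lookup(name string) (uint64, error) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.closed {
		return 0, ErrClosed
	}
	v, ok := h.data[name]
	if !ok {
		return 0, errors.New("no such statistic: " + name)
	}
	return v, nil
}

// Close releases the simulated C memory. It is safe to call repeatedly.
func (h *Handle) Close() {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.closed {
		return
	}
	h.closed = true
	h.data = nil // real code would free the C allocation here
	runtime.SetFinalizer(h, nil)
}

func main() {
	h := Open()
	v, err := h.Lookup("ncpus")
	fmt.Println(v, err)
	h.Close()
	_, err = h.Lookup("ncpus")
	fmt.Println(err)
}
```

The finalizer here is only a backstop: Go does not guarantee when, or even whether, a finalizer runs, so an explicit Close remains the main path.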
Not all uses of cgo will run into memory lifetime problems. Many are probably self-contained, where all of your interaction with C code is inside one function and when it returns you're done and can free up everything.