2015-08-31
CGo's Go string functions explained
As plenty of its documentation will tell you, cgo provides four functions to convert between Go and C types by making copies of the data. They are tersely explained in the CGo documentation; too tersely, in my opinion, because the documentation only covers certain things by implication and omits two important, glaring cautions. Because I made some mistakes here I'm going to write out a longer explanation.
The four functions are:
func C.CString(string) *C.char
func C.GoString(*C.char) string
func C.GoStringN(*C.char, C.int) string
func C.GoBytes(unsafe.Pointer, C.int) []byte
C.CString() is the equivalent of C's strdup() and copies your
Go string to a C char * that you can pass to C functions, just as
documented. The one annoying thing is that because of how Go and CGo
types are defined, calling C.free will require a cast:
cs := C.CString("a string")
C.free(unsafe.Pointer(cs))
Note that Go strings may contain embedded 0 bytes and C strings may not. If your Go string contains one and you call C.CString(), C code will see your string truncated at that 0 byte. This is often not a concern, but sometimes text isn't guaranteed to be free of 0 bytes.
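As an illustration, here's a small self-contained sketch of the truncation (the C.strlen() call is there purely to show what C code sees):

package main

// #include <stdlib.h>
// #include <string.h>
import "C"

import (
    "fmt"
    "unsafe"
)

func main() {
    // "bar" gets copied by C.CString(), but C code stops at the 0 byte.
    cs := C.CString("foo\x00bar")
    defer C.free(unsafe.Pointer(cs))
    fmt.Println(C.strlen(cs)) // prints 3
}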
C.GoString() is also the equivalent of strdup(), but for going
the other way, from C strings to Go strings. You use it on struct
fields and other things that are declared as C char *'s, aka
*C.char in Go, and (as we'll see) pretty much nothing else.
C.GoStringN() is the equivalent of C's memmove(), not of any
normal C string function. It copies the entire length of the C
buffer into a Go string, and it pays no attention to null bytes.
More exactly, it copies them too. If you have a struct field that
is declared as, say, 'char field[64]' and you call C.GoStringN(&field[0],
64), the Go string you get will always be 64 characters long and
will probably have a bunch of 0 bytes at the end.
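To make that concrete, here's a hedged, self-contained example (struct rec is made up purely for illustration):

package main

// struct rec { char name[8]; };
import "C"

import "fmt"

func main() {
    var r C.struct_rec
    // Fill in a short name the way C code typically would; the rest
    // of the field stays 0 bytes.
    r.name[0], r.name[1] = 'a', 'b'
    s := C.GoStringN(&r.name[0], 8)
    fmt.Printf("%d %q\n", len(s), s) // 8 "ab\x00\x00\x00\x00\x00\x00"
}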
(In my opinion this is a bug in cgo's documentation. It claims that GoStringN takes a C string as the argument, but it manifestly does not, as C strings are null-terminated and GoStringN does not stop at null bytes.)
C.GoBytes() is a version of C.GoStringN() that returns a []byte
instead of a string. Since it doesn't claim to be taking a C string
as the argument, it's clearer that it is simply a memory copy of
the entire buffer.
If you are copying something that is not actually a null terminated
C string but is instead a memory buffer with a size, C.GoStringN()
is exactly what you want; it avoids the traditional C problem of
dealing with 'strings' that aren't actually C strings. However, none of these functions are what
you want if you are dealing with size-limited C strings in the
form of struct fields declared as 'char field[N]'.
The traditional semantics of a fixed size string field in structs,
fields that are declared as 'char field[N]' and described as
holding a string, is that the string is null terminated if and only
if there is room, ie if the string is at most N-1 characters long.
If the string is exactly N characters long, it is not null terminated.
This is a fruitful source of bugs even in C code
and is not a good API, but it is an API that we are generally stuck
with. Any time you see such a field and the documentation does not
expressly tell you that the field contents are always null terminated,
you have to assume that you have this sort of API.
Neither C.GoString() nor C.GoStringN() deals correctly with these
fields. Using GoStringN() is the less wrong option; it will merely
leave you with N-byte Go strings with plenty of trailing 0 bytes
(which you may not notice for some time if you usually just print
those fields out; yes, I've done this). Using the tempting GoString()
is actively dangerous, because it internally does a strlen() on
the argument; if the field lacks a terminating null byte, the
strlen() will run away into memory beyond it. If you're lucky you
will just wind up with some amount of trailing garbage in your Go
string. If you're unlucky, your Go program will take a segmentation
fault as strlen() hits unmapped memory.
(In general, trailing garbage in strings is the traditional sign that you have an unterminated C string somewhere.)
What you actually want is the Go equivalent of C's strndup(),
which guarantees to copy no more than N bytes of memory but will
stop before then if it finds a null byte. Here is my version of it,
with no guarantees:
func strndup(cs *C.char, len int) string {
    // Copy the full len bytes, then look for a terminating 0 byte.
    s := C.GoStringN(cs, C.int(len))
    i := strings.IndexByte(s, 0)
    if i == -1 {
        // No 0 byte: the buffer was completely full, so s is what we want.
        return s
    }
    // There is a 0 byte: re-copy with C.GoString(), which stops at it,
    // so the result doesn't keep the full len-byte buffer alive.
    return C.GoString(cs)
}
This code does some extra work in order to minimize extra memory
usage due to how Go strings can hold memory.
You may want to take the alternate approach of returning a slice
of the GoStringN() string. Really sophisticated code might decide
which of the two options to use based on the difference between i
and len.
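Here is a hedged sketch of that alternate approach (the name is hypothetical, and it needs the same "C" and strings imports as strndup() above):

func strndupSlice(cs *C.char, length int) string {
    s := C.GoStringN(cs, C.int(length))
    if i := strings.IndexByte(s, 0); i != -1 {
        // The substring shares s's full length-byte backing array,
        // which is the memory usage tradeoff mentioned above.
        return s[:i]
    }
    return s
}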
Update: Ian Lance Taylor showed me the better version:
func strndup(cs *C.char, len int) string {
    // C.strnlen() needs <string.h> in the cgo preamble.
    return C.GoStringN(cs, C.int(C.strnlen(cs, C.size_t(len))))
}
Yes, that's a lot of casts. That's the combination of Go and CGo typing for you.
Turning, well, copying blobs of memory into Go structures
As before, suppose (not entirely hypothetically) that you're writing a
package to connect Go up to something that will provide it with
blobs of memory that are actually C structs; these might be mmap()'d
files, information from a library, or whatever. Once you have a
compatible Go struct, you still have to
get the data from a C struct (or raw memory) to the Go struct.
One way to do this is to manually write your own struct copy function
that does it field by field (eg 'io.Field = ks_io.field' for
each field). As with defining the Go structs by hand, this is tedious
and potentially error prone. You can do it and you'll probably have
to if the C struct contains unions or other hard to deal with things,
but we'd like an easier approach. Fortunately there are two good
ones for two different cases. In both cases we will wind up copying
the C struct or the raw memory to a Go struct variable that is an
exact equivalent of the C struct (or at
least we hope it is).
The easy case is when we're dealing with a fixed struct that we
have a known Go type for. Assuming that we have a C void * pointer
to the original memory area called ks.ks_data, we can adopt the
C programmer approach and write:
var io IO
io = *((*IO)(ks.ks_data))
return &io
This casts ks.ks_data to a pointer to an IO struct and then
dereferences it to copy the struct itself into the Go variable we
made for this. Depending on the C type of ks_data, you may need
to use the hammer of unsafe.Pointer() here:
io = *((*IO)(unsafe.Pointer(ks.ks_data)))
At this point, some people will be tempted to skip the copying and
just return the 'casted-to-*IO' ks.ks_data pointer. You don't
want to do this, because if you return a Go pointer to C data,
you're coupling Go and C memory management lifetimes. The C
memory must not be freed or reused for something else for as long
as Go retains at least one pointer to it, and there is no way for
you to find out when the last Go reference goes away so that you
can free the C memory. It's much simpler to treat 'C memory' as
completely disjoint from 'Go memory'; any time you want to move
some information across the boundary, you must copy it. With copying
we know we can free ks.ks_data safely the moment the copy is
done and the Go runtime will handle the lifetime of the io variable
for us.
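Pulled together, the fixed-struct version is a sketch along these lines (the method name is hypothetical, I'm assuming ks_data is reachable from a KStat value, and how the C memory eventually gets released depends on your C API):

func (ks *KStat) IO() *IO {
    // Copy the C struct into Go-managed memory...
    io := *((*IO)(unsafe.Pointer(ks.ks_data)))
    // ...after which the C side is free to release or reuse ks.ks_data.
    return &io
}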
The more difficult case is when we don't know what structs we're
dealing with; we're providing the access package, but it's the
callers who actually know what the structs are. This situation might
come up in a package for accessing kernel stats, where drivers or
other kernel systems can export custom stats structs. Our access
package can provide specific support for known structs, but we
need an escape hatch for when the caller knows that some specific
kernel system is providing a 'struct whatever' and wants to
retrieve that (probably into an identical Go struct created through
cgo).
The C programmer approach to this problem is memmove(). You can
write memmove() in Go with sufficiently perverse use of the
unsafe package, but you don't want to. Instead we can use the
reflect package to create a generic version of the specific 'cast
and copy' code we used above. How to do this wasn't obvious to me
until I did a significant amount of flailing around with the package,
so I'm going to go through the logic of what we're doing in detail.
We'll start with our call signature:
func (k *KStat) CopyTo(ptri interface{}) error { ... }
CopyTo takes a pointer to a Go struct and copies our C memory in
ks.ks_data into the struct. I'm going to omit the reflect-based
code to check ptri to make sure it's actually a pointer to a
suitable struct in the interests of space, but you shouldn't in
real code. Also, there are a whole raft of qualifications you're
going to want to impose on what types of fields that struct can
contain if you want to at least pretend that your package is somewhat
memory safe.
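For completeness, here is a hedged sketch of the kind of checking being omitted (the helper name is made up; it needs the reflect and errors packages):

func checkTarget(ptri interface{}) error {
    ptr := reflect.ValueOf(ptri)
    if ptr.Kind() != reflect.Ptr || ptr.IsNil() {
        return errors.New("CopyTo: need a non-nil pointer to a struct")
    }
    if ptr.Elem().Kind() != reflect.Struct {
        return errors.New("CopyTo: need a pointer to a struct")
    }
    return nil
}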
To actually do the copy, we first need to turn this ptri interface
value into a reflect.Value that is the destination struct itself:
ptr := reflect.ValueOf(ptri)
dst := ptr.Elem()
We now need to cast ks.ks_data to a Value with the type 'pointer to
dst's type'. This is most easily done by creating a new pointer of the
right type with the address taken from ks.ks_data:
src := reflect.NewAt(dst.Type(), unsafe.Pointer(ks.ks_data))
This is the equivalent of 'src := ((*IO)(ks.ks_data))' in the
type-specific version. Reflect.NewAt is there for doing just
this; its purpose is to create pointers for 'type X at address Y',
which is exactly the operation we need.
Having created this pointer, we then dereference it to copy the
data into dst:
dst.Set(reflect.Indirect(src))
This is the equivalent of 'io = *src' in the type-specific
version. We're done.
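Assembled in one place, a hedged sketch of the whole method looks like this (minus the checking discussed above, and again with k.ks_data standing in for however your package actually reaches the C data):

func (k *KStat) CopyTo(ptri interface{}) error {
    ptr := reflect.ValueOf(ptri)
    dst := ptr.Elem()
    // Make a pointer of dst's type aimed at the C memory, then
    // dereference it to copy the data into dst.
    src := reflect.NewAt(dst.Type(), unsafe.Pointer(k.ks_data))
    dst.Set(reflect.Indirect(src))
    return nil
}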
In my testing, this approach is surprisingly robust; it will deal
with even structs that I didn't expect it to (such as ones with
unexported fields). But you probably don't want to count on that;
it's safest to give CopyTo() straightforward structs with only
exported fields.
On the whole I'm both happy and pleasantly surprised by how easy
it turned out to be to use the reflect package here; I expected it
to require a much more involved and bureaucratic process. Getting
to this final form involved a lot of missteps and unnecessarily
complicated approaches, but the final form itself is about as minimal
as I could expect. A lot of this is due to the existence of
reflect.NewAt(), but there's also that Value.Set() works fine
even on complex and nested types.
(Note that while you could use the reflect-based version even for the first, fixed struct type case, my understanding is that the reflect package has not insignificant overheads. By contrast the hard coded fixed struct type code is about as minimal and low overhead as you can get; it should normally compile down to basically a memory copy.)
Sidebar: preserving Go memory safety here
I'm not fully confident that I have this right, but I think that to
preserve memory safety in the face of this memory copying you must
ensure that the target struct type does not contain any embedded
pointers, either explicit ones or ones implicitly embedded into types
like maps, chans, interfaces, strings, slices, and so on. Fixed-size
arrays are safe because in Go those are just fixed size blocks of
memory.
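If you want to mechanically enforce that restriction, one hedged sketch is a recursive reflect-based check that rejects any type that can contain a pointer:

func pointerFree(t reflect.Type) bool {
    switch t.Kind() {
    case reflect.Ptr, reflect.UnsafePointer, reflect.Map, reflect.Chan,
        reflect.Interface, reflect.String, reflect.Slice, reflect.Func:
        return false
    case reflect.Array:
        return pointerFree(t.Elem())
    case reflect.Struct:
        for i := 0; i < t.NumField(); i++ {
            if !pointerFree(t.Field(i).Type) {
                return false
            }
        }
    }
    // Plain integers, floats, bools, and fixed-size arrays or structs
    // of them are fine.
    return true
}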
If you copy a C struct containing pointers into a Go struct containing
pointers, what you're doing is the equivalent of directly returning
the 'casted-to-*IO' ks.ks_data pointer. You've allowed the
creation of a Go object that points to C memory and you now have
the same C and Go memory lifetime issues. And if some of the pointers
are invalid or point to garbage memory, not only is normal Go code
at risk of bad things but it's possible that the Go garbage collector
will wind up trying to dereference them and take a fault.
(This makes it impossible to easily copy certain sorts of C structures into Go structures. Fortunately such structures rarely appear in this sort of C API because they often raise awkward memory lifetime issues even in C.)
2015-08-30
Getting C-compatible structs in Go with and for cgo
Suppose, not entirely hypothetically, that you're writing a
package to connect Go up to something that will provide it blobs
of memory that are C structs. These structs might be the results
of making system calls or they might be just informational things
that a library provides you. In either case you'd like to pass these
structs on to users of your package so they can do things with them.
Within your package you can use the cgo provided C.<whatever>
types directly. But this is a bit annoying (they don't have native
Go types for things like integers, which makes interacting with
regular Go code a mess of casts) and it doesn't help other code
that imports your package. So you need native Go structs, somehow.
One way is to manually define your own Go version of the C struct. This
has two drawbacks; it's tedious (and potentially error-prone),
and it doesn't guarantee that you'll wind up with exactly the
same memory layout that C has (the latter is often but not always
important). Fortunately there is a better approach, and that is to use
cgo's -godefs functionality to more or less automatically generate
struct declarations for you. The result isn't always perfect but it
will probably get you most of the way.
The starting point for -godefs is a cgo Go source file that
declares some Go types as being some C types. For example:
// +build ignore

package kstat

// #include <kstat.h>
import "C"

type IO C.kstat_io_t
type Sysinfo C.sysinfo_t

const Sizeof_IO = C.sizeof_kstat_io_t
const Sizeof_SI = C.sizeof_sysinfo_t
(The consts are useful for paranoid people so you can later
cross-check the unsafe.Sizeof() of your Go types against the size
of the C types.)
If you run 'go tool cgo -godefs <file>.go', it will print out to
standard output a bunch of standard Go type definitions with exported
fields and everything. You can then save this into a file and use
it. If you think the C types may change, you should leave the
generated file alone so you won't have a bunch of pain if you have
to regenerate it; if the C types are basically fixed, you can
annotate the generated output with eg godoc comments. Cgo worries
about matching types and it will also insert padding where it existed
in the original C struct.
(I don't know what it does if the original C struct is impossible to reconstruct in Go, for instance if Go requires padding where C doesn't. Hopefully it complains. This hope is one reason you may want to check those sizeofs afterwards.)
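A minimal sketch of such a cross-check, written as a test in the package (it needs the testing and unsafe imports):

func TestSizeof(t *testing.T) {
    if unsafe.Sizeof(IO{}) != Sizeof_IO {
        t.Errorf("Go IO is %d bytes, C kstat_io_t is %d bytes",
            unsafe.Sizeof(IO{}), Sizeof_IO)
    }
}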
The big -godefs limitation is the same limitation as cgo has in
general: it has no real support for C unions, since Go doesn't have
them. If your C struct has unions, you're on your own to figure out
how to deal with them; I believe cgo translates them as appropriately
sized uint8 arrays, which is not very useful for actually accessing
the contents.
There are two wrinkles here. Suppose you have one struct type that embeds another struct type:
struct cpu_stat {
    struct cpu_sysinfo cpu_sysinfo;
    struct cpu_syswait cpu_syswait;
    struct vminfo cpu_vminfo;
};
Here you have to give cgo some help, by creating Go level versions of the embedded struct types before the main struct type:
type Sysinfo C.struct_cpu_sysinfo
type Syswait C.struct_cpu_syswait
type Vminfo C.struct_cpu_vminfo
type CpuStat C.struct_cpu_stat
Cgo will then be able to generate a proper Go struct with embedded Go
structs in CpuStat. If you don't do this, you get a CpuStat struct type
that has incomplete type information; the 'Sysinfo' et al fields in it
will refer to types called _Ctype_... that aren't defined anywhere.
(By the way, I do mean 'Sysinfo' here, not 'Cpu_sysinfo'. Cgo is smart enough to take that sort of commonly seen prefix off of struct field names. I don't know what its algorithm is for doing this, but it's at least useful.)
The second wrinkle is embedded anonymous structs:
struct mntinfo_kstat {
    ....
    struct {
        uint32_t srtt;
        uint32_t deviate;
    } m_timers[4];
    ....
};
Unfortunately cgo can't deal with these at all. This is issue
5253, and you have two
options. The first is that at the moment, the proposed CL fix still applies to
src/cmd/cgo/gcc.go and works (for me). If you don't want to build
your own Go toolchain (or if the CL no longer applies and works),
the other solution is to create a new C header file that has a
variant of the overall struct that de-anonymizes the embedded struct
by creating a named type for it:
struct m_timer {
    uint32_t srtt;
    uint32_t deviate;
};

struct mntinfo_kstat_cgo {
    ....
    struct m_timer m_timers[4];
    ....
};
Then in your Go file:
...
// #include "myhacked.h"
...

type MTimer C.struct_m_timer
type Mntinfo C.struct_mntinfo_kstat_cgo
Unless you made a mistake, the two C structs should have the same
sizes and layouts and thus be totally compatible with each other.
Now you can use -godefs on your version, remembering to make an
explicit Go type for m_timer due to the first wrinkle. If you
feel bold (and you don't think you'll need to regenerate things),
you can then reverse this process in the generated Go file,
re-anonymizing the MTimer type into the overall struct (since
Go supports that perfectly well). Since you're not changing the
actual contents, just where types are declared, the result should
be layout-identical to the original.
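To make that last step concrete, the re-anonymized result might look something like this (field names are illustrative; use whatever -godefs actually generated for you):

type Mntinfo struct {
    // ... other generated fields, unchanged ...
    M_timers [4]struct {
        Srtt    uint32
        Deviate uint32
    }
    // ...
}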
PS: the file that's input to -godefs is set to not be built by
the normal 'go build' process because it is only used for this
godefs generation. If it gets included in the build, you'll get
complaints about multiple definitions of your (Go) types. The
corollary to this is that you don't need to have this file and any
supporting .h files in the same directory as your regular .go
files for the package. You can put them in a subdirectory, or keep
them somewhere entirely separate.
(I think the only thing the package line does in the godefs
.go file is set the package name that cgo will print in the
output.)
2015-08-17
Why languages like 'declare before use' for variables and functions
I've been reading my way through Lisp as the Maxwell's equations of software and ran into this 'problems for the author' note:
As a general point about programming language design it seems like it would often be helpful to be able to define procedures in terms of other procedures which have not yet been defined. Which languages make this possible, and which do not? What advantages does it bring for a programming language to be able to do this? Are there any disadvantages?
(I'm going to take 'defined' here as actually meaning 'declared'.)
To people with certain backgrounds (myself included), this question has a fairly straightforward set of answers. So here's my version of why many languages require you to declare things before you use them. We'll come at it from the other side, by asking what your language can't do if it allows you to use things before declaring them.
(As a digression, we're going to assume that we have what I'll call an unambiguous language, one where you don't need to know what things are declared as in order to know what a bit of code actually means. Not all languages are unambiguous; for example C is not (also). If you have an ambiguous language, it absolutely requires 'declare before use' because you can't understand things otherwise.)
To start off, you lose the ability to report a bunch of errors at the time you're looking at a piece of code. Consider:
lvar = ....
res = thang(a, b, lver, 0)
In basically all languages, we can't report the lver for lvar
typo (we have to assume that lver is an unknown global variable),
we don't know if thang is being called with the right number of
arguments, and we don't even know if thang is a function instead
of, say, a global variable. Or if it even exists; maybe it's a typo
for thing. We can only find these things out when all valid
identifiers must have been declared; in fully dynamic languages
like Lisp and Python, that's 'at the moment where we reach this
line of code during execution'. In other languages we might be able
to emit error messages only at the end of compiling the source file,
or even when we try to build the final program and find missing or
wrong-typed symbols.
In languages with typed variables and arguments, we don't know if
the arguments to thang() are the right types and if thang()
returns a type that is compatible with res. Again we'll only be
able to tell when we have all identifiers available. If we want to
do this checking before runtime, the compiler (or linker) will have
to keep track of the information involved for all of these pending
checks so that it can check things and report errors once thang()
is defined.
Some typed languages have features for what is called 'implicit
typing', where you don't have to explicitly declare the types of
some things if the language can deduce them from context. We've
been assuming that res is pre-declared as some type, but in an
implicit typing language you could write something like:
res := thang(a, b, lver, 0)
res = res + 20
At this point, if thang() is undeclared, the type of res is also
unknown. This will ripple through to any code that uses res, for
example the following line here; is that line valid, or is res perhaps
a complex structure that can in no way have 20 added to it? We can't
tell until later, perhaps much later.
In a language with typed variables and implicit conversions between
some types, we don't know what type conversions we might need in
either the call (to convert some of the arguments) or the return
(to convert thang()'s result into res's type). Note that in
particular we may not know what type the constant 0 is. Even
languages without implicit type conversions often treat constants
as being implicitly converted into whatever concrete numeric type
they need to be in any particular context. In other words, thang()'s
last argument might be a float, a double, a 64-bit unsigned integer,
a 32-bit signed integer, or whatever, and the language will convert
the 0 to it. But it can only know what conversion to do once
thang() is declared and the types of its arguments are known.
This means that a language with any implicit conversions at all
(even for constants like 0) can't actually generate machine code
for this section until thang() is declared even under the best
of circumstances.
However, life is usually much worse for code generation than this.
For a start, most modern architectures pass and return floating
point values in different ways than integer values, and they may
pass and return more complex values in a third way. Since we don't
know what type thang() returns (and we may not know what types
the arguments are either, cf lver), we basically can't generate
any concrete machine code for this function call at the time we
parse it even without implicit conversions. The best we can do is
generate something extremely abstract with lots of blanks to be
filled in later and then sit on it until we know more about
thang(), lver, and so on.
(And implicit typing for res will probably force a ripple effect
of abstraction on code generation for the rest of the function, if
it doesn't prevent it entirely.)
This 'extremely abstract' code generation is in fact what things like Python bytecode are. Unless the bytecode generator can prove certain things about the source code it's processing, what you get is quite generic and thus slow (because it must defer a lot of these decisions to runtime, along with checks like 'do we have the right number of arguments').
So far we've been talking about thang() as a simple function call.
But there are a bunch of more complicated cases, like:
res = obj.method(a, b, lver, 0)
res2 = obj1 + obj2
Here we have method calls and operator overloading. If obj, obj1,
and/or obj2 are undeclared or untyped at this point, we don't
know if these operations are valid (the actual obj might not have
a method() method) or what concrete code to generate. We need to
generate either abstract code with blanks to be filled in later or
code that will do all of the work at runtime via some sort of
introspection (or both, cf Python bytecode).
All of this prepares us to answer the question about what sort of languages require 'declare before use': languages that want to do good error reporting or (immediately) compile to machine code or both without large amounts of heartburn. As a pragmatic matter, most statically typed languages require declare before use because it's simpler; such languages either want to generate high quality machine code or at least have up-front assurances about type correctness, so they basically fall into one or both of those categories.
(You can technically have a statically typed language with up-front
assurances about type correctness but without declare before use;
the compiler just has to do a lot more work and it may well wind
up emitting a pile of errors at the end of compilation when it can
say for sure that lver isn't defined and you're calling thang()
with the wrong number and type of arguments and so on. In practice
language designers basically don't do that to compiler writers.)
Conversely, dynamic languages without static typing generally don't
require declare before use. Often the language is so dynamic that
there is no point. Carefully checking the call to thang() at the
time we encounter it in the source code is not entirely useful if
the thang function can be completely redefined (or deleted) by
the time that code gets run, which is the case in languages like
Lisp and Python.
(In fact, given that thang can be redefined by the time the code
is executed we can't even really error out if the arguments are
wrong at the time when we first see the code. Such a thing would
be perfectly legal Python, for example, although you really shouldn't
do that.)
2015-08-04
A lesson to myself: commit my local changes in little bits
For quixotic reasons, I recently updated my own local version of dmenu to the upstream version, which had moved on since I last did this (most importantly, it gained support for Xft fonts). Well, the upstream version plus my collection of bugfixes and improvements. In the process of doing this I have (re)learned a valuable lesson about how I want to organize my local changes to upstream software.
My modifications to dmenu predate my recent decision to commit local changes instead of just carrying them uncommitted on top of the repo. So the first thing I did was to just commit them all in a single all-in-one changeset, then fetch upstream and rebase. This had rebase conflicts, of course, so I resolved them and built the result. This didn't entirely work; some of my modifications clearly hadn't taken. Rather than try to patch the current state of my modifications, I decided to punt and do it the right way: starting with a clean copy of the current upstream, I carefully separated out each of my modifications and added them as separate changes and commits. This worked and wasn't particularly much effort (although there was a certain amount of tedium).
Now, a certain amount of the improvement here is simply that I was porting all of my changes into the current codebase instead of trying to do a rebase merge. This is always going to give you a better chance to evaluate and test things. But that actually kind of points to a problem; because I had my changes in a single giant commit, everything was tangled together and I couldn't see quite clearly enough to do the rebase merge right. Making each change independently made things much clearer and easier to follow, and I suspect that that would have been true even in a merge. The result is also easier for me to read in the future, since each change is now something I can inspect separately.
All of this is obvious to people who've been dealing with VCSes and local modifications, of course. And in theory I knew it too, because I've read all of those homilies to good organization of your changes. I just hadn't stubbed my toe on doing it the 'wrong' way until now (partly because I hadn't been committing changes at all until recently).
(Of course, this is another excellent reason to commit local changes instead of carrying them uncommitted. Uncommitted local changes are basically intrinsically all mingled together.)
Having come to my senses here, I have a few more programs with local hacks that I need to do some change surgery on.
(I've put my version of dmenu up on github, where you can see and cherry pick separate changes if desired. I expect to rebase this periodically, when upstream updates and I notice and care. As before, I have no plans to try to push even my bugfixes to the official release, but interested parties are welcome to try to push them upstream.)
Sidebar: 'git add -p' and this situation
In theory I could have initially committed my big ball of local
changes as separate commits with 'git add -p'. In practice this
would have required disentangling all of the changes from each
other, which would have required understanding code I hadn't touched
for two years or so. I was too impatient at the
start to do that; I hoped that 'commit and rebase' would be good
enough. When it wasn't, restarting from scratch was easier because
it let me test each modification separately as I made it.
Based on this, my personal view is that I'm only going to use 'git
add -p' when I've recently worked on the code and I'm confident
that I can accurately split changes up without needing to test the
split commits to make sure each is correct on its own.