Wandering Thoughts archives

2013-02-21

Go: using type assertions to safely reach through interface types

To start with, suppose that you have a Go net.Conn value, call it conn, that you want to call shutdown() on (for writing) if possible. Some but not all concrete network connection types make this available as a .CloseWrite() method (eg it's available for TCP sockets but not for UDP ones), but net.Conn is an interface type that doesn't include a .CloseWrite() method, so you can't directly call conn.CloseWrite().

(In Go's software engineering view of the world this is a sensible choice. net.Conn is the set of methods that all connections can support. If you included .CloseWrite() in the interface anyways you would force some connections, eg UDP sockets, to implement a do-nothing or always-error version of the method, and then people would write Go code that blindly called .CloseWrite() and expected it to always work.)

So sometimes conn will be of a concrete type that supports this (and sometimes it won't be). You want to somehow call .CloseWrite() if it's supported by your particular value (well, the particular concrete type of your particular value). In Python we would do this either with a hasattr() check or just by calling obj.CloseWrite() and catching AttributeError, but we're in Go and Go does things differently.

If you're a certain sort of beginning Go programmer coming from Python, you grind your teeth in irritation, look up just what concrete types support .CloseWrite(), and write the following brute force code using a type switch:

func shutdownWrite(conn net.Conn) {
    switch i := conn.(type) {
    case *net.TCPConn:
        i.CloseWrite()
    case *net.UnixConn:
        i.CloseWrite()
    }
}

(Then this code doesn't compile under Go 1.0 because net.UnixConn doesn't implement .CloseWrite() in Go 1.0.)

What this code is doing in its brute force way is changing the type of conn into something where we know that we can call .CloseWrite() and where the Go compiler will let us do so. The compiler won't let us directly call conn.CloseWrite() because .CloseWrite() is not part of the net.Conn interface, but it will let us call, say, net.TCPConn.CloseWrite(), because it is part of net.TCPConn's public methods. So if conn is actually a net.TCPConn value (well, a pointer to it) we can convert its type through this type switch and then make the call. Unfortunately this code has the great drawback that it has to specifically know which concrete types that sit behind net.Conn do and don't implement .CloseWrite(). This is bad for various reasons.

(I am mangling some Go details here in the interests of nominal clarity.)

The experienced Go programmers in the audience are shaking their heads sadly right now, because there is a more general and typesafe way to do this. We just need to say what we actually mean. First we need a type that will let us call .CloseWrite(); this has to be an interface type because we need to convert conn to it (somehow).

type Closer interface {
    CloseWrite() error
}

(It's important to get the argument and return types exactly right even if you're going to ignore the return value.)

Now we need to coerce conn to having that type if and only if this is possible; if we blindly coerce conn to this type (in one of a number of ways) we will get a runtime error when we're handed a net.Conn with a concrete type that lacks a .CloseWrite() method. In Go, this safe coercion is done with the two-result form of a type assertion:

func shutdownWrite(conn net.Conn) {
    v, ok := conn.(Closer)
    if ok {
        v.CloseWrite()
    }
}

(We can't just call conn.CloseWrite() after the coercion because we haven't changed the type of conn itself, we've just manufactured another variable, v, that has the right type.)

This is both typesafe and general. Any conn value of a concrete type that implements .CloseWrite() will work and it will work transparently, while if conn is of a concrete type that doesn't implement .CloseWrite() there are no runtime panics; all of this is exactly what we want. The same technique can be used in exactly the same way to reach through any interface type to get access to any (public) methods on the underlying concrete types; set up an interface type with the methods you want, try coercing, and then call things appropriately.

(I actually like this typesafe conversion and method access better than the Python equivalent because it feels less hacky and more a direct expression of what I want.)

I think that it follows that any type switch code of the first form, one where you just call the same routine (or a few routines) on the new types, is a danger sign of doing things the wrong way. You probably want to use interface type conversion instead.

(Had I read the right bit of Effective Go carefully I might have seen this right away, but Effective Go doesn't quite address this directly. All of this is probably obvious to experienced Go programmers.)

Update: there are several good ideas and improvements (and things I didn't know or realize) in the golang reddit comments on this entry.

GoInterfacePunning written at 14:19:15

Some notes on my first experience with Go

I've finally wound up writing my first Go program. The program is a Go version of what seems to have turned into my standard language test program, namely a netcat-like program that takes standard input, sends it off to somewhere over the network, and writes to standard out what it gets back from the network. Partly because Go made it easy and partly due to an excess of new thing enthusiasm the program grew far beyond my initial basic specifications.

(I'm somewhat bemused but a netcat-like program really has become a standard program I write in new languages and to try out things like new buffering libraries. It's actually not a bad test.)

On the whole the experience was quite pleasant. The specific need I had is something I normally would have handled with a Python program and writing my Go program was not particularly much more work and bookkeeping than the Python equivalent would have been (it took much longer to write because I was semi-learning Go as I went and I already know Python). The code has reasonably few variable declarations and most of them are non-annoying; Go's := idiom really helps with this since it means that in many circumstances you don't have to declare a variable or specifically name its type.

One important thing I wish I'd known at the start is that you should ignore most everything the Go documentation overview pages tell you about what to read. Effective Go is in practice the quick guide to Go for C programmers, or at least for C programmers who have some general idea about Go to start with, and is the closest thing Go has to Python's excellent tutorial. The language reference is overly detailed and too hard to read for learning and the interactivity of the beginning tutorial makes it completely unsuitable for quick starts.

One of the reasons that I got as far as I did as fast as I did is that Go's networking library has a relatively high-level view of the world. There is no Python equivalent of Go's net.Dial() or net.Listen() APIs, at least not in the standard library; the existence of both of them made handling an absurdly wide variety of network protocols basically trivial (along with a bunch of complexity of hostname and port number lookups). On the flipside this API is not complete (especially in Go 1.0) and has a number of really annoying omissions. This is especially frustrating since I have the (Go) source for the net package and can see perfectly well that what I want access to already exists in the package; it's just not exported and (unlike Python) you can't fish into a package to grab stuff yourself.

My code wound up using goroutines and channels, although in a relatively basic way. Designing program flow in terms of channels definitely took several attempts before I had everything sorted out cleanly; earlier versions of the code had all sorts of oddities before I sorted out exactly what I wanted and how to express that in channel data flows. My broad takeaway from this experience is that it's very important to think carefully about what you want to do before you start eagerly designing a complex network of channels and goroutines. It was easy for me to get distracted by the latter and miss an obvious, relatively simple solution that was under my nose.

My feelings about channels and goroutines are mixed. On the one hand I think that using them simplified the logic of my code (and made it much easier to support TLS), even if it took a while to sort out that logic. On the other hand having to use goroutines is responsible for a serious wart in one aspect of the program, a wart I see no way around; the wart arises because there's no way for outside code to force a goroutine blocked in IO to gracefully abort that IO (this is a fundamental issue with channels).

This is rambling long enough as it is, so I think that I will save my language disagreements for another day. Well, except to say that I think that the standard Go package for parsing arguments and argument flags handles command line options utterly the wrong way and I need to get a real argument parsing package before I write another Go command.

(Go's standard flag package apparently follows some argument parsing standard that Google likes. It is pretty non-Unixy while looking just enough like normal Unix argument handling to fool you.)
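A short sketch of the behaviours that are most likely to surprise someone used to getopt-style parsing. The flag names and the parse helper are invented for this example, and it assumes a Go version with io.Discard.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parse runs a small, hypothetical flag set over args and returns the
// parsed values plus whatever was left unparsed.
func parse(args []string) (verbose bool, count int, rest []string, err error) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage messages out of the demo output
	fs.BoolVar(&verbose, "v", false, "be verbose")
	fs.IntVar(&count, "n", 1, "repeat count")
	err = fs.Parse(args)
	return verbose, count, fs.Args(), err
}

func main() {
	// Options after the first non-flag argument are not parsed; here
	// "-n 3" is left in the remaining arguments and count stays at 1.
	fmt.Println(parse([]string{"-v", "host", "-n", "3"}))

	// "-vn" is a single (undefined) flag named "vn", not -v plus -n,
	// so this returns an error.
	fmt.Println(parse([]string{"-vn", "3"}))

	// One dash and two dashes mean the same thing.
	fmt.Println(parse([]string{"--n", "3", "-v"}))
}
```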

GoFirstExperience written at 02:50:20

2013-02-19

The source of C's dependency hell for linking

C famously has a dependency hell problem for linking (both static and dynamic, although the static linking one is often more tractable). This is the problem both of what libraries you need (including what libraries are needed by the libraries that you need) and in what order you need them; it often results in people cramming ever-increasing numbers of libraries into their compiler command lines in the hopes that one of those libraries satisfies things.

As I alluded to in a comment on this entry, the root of that dependency hell is that C has only a single global namespace. With a single global namespace there is no explicit 'import' operation; global names can come from anywhere and appear from everywhere (in fact this is abused as a feature, where you can override or preempt a library routine). One way to put it is that in C, all global names from outside the current file are late-binding and scopeless. They can only be fully resolved or declared invalid at link time when the final binary is built. This leads naturally to libraries that themselves depend on and use global names which come from, well, somewhere, no one knows exactly where until link time.

(Global names often must be declared but this declaration is itself without scope or origin. There are many unfortunate things that result from this, including the potential mismatches between declarations and actual reality.)

This is in stark contrast to a compiled language with a package system and explicit imports (such as Go). In those languages, names are always within the scope of a package and a competently implemented compiler environment reliably knows the dependencies (both direct and transitive) of a piece of code; it knows what packages the code has imported and used names from, and it knows what packages those packages need, and so on. It may not be able to find them on the filesystem, but it can at least tell you that this code needs the compiled forms of the following N packages. It can even throw in version numbers (or something more comprehensive) if it wants to.

My memory is that Plan 9 made some attempts to change this for C. If I remember right, Plan 9 basically moved to a model where there was one header file per library and each header file contained a pragma to tell the compiler what the library was. Of course this is not ANSI-compatible in the least but I don't think the Plan 9 people considered this much of a problem.

In theory the library dependency problem can be dealt with; at the time you build a library (static or dynamic) you can 'link' everything as far as resolving all of the global names that the library needs, then note down where they all came from. In practice traditional Unix static libraries have never had this information and aren't built in ways that create it (a traditional static library is just an archive of object files). I think that some dynamic library formats have attempted to include this sort of dependency information where available as a hint to various parties.

(And of course a C compiler environment could add support for a Plan 9 like pragma to say 'the stuff from this header file comes from this library' and then embed the resulting hint in the generated object files and so on. But I don't think anyone has. My cynical side suspects that it's just not considered an important problem.)

CDependencyHellSource written at 01:37:52

2013-02-10

A little irritating (but understandable) limitation on Go interfaces

I was all set to write an entry about how you could use Go-style interfaces to create what I called easy, type-safe conversions from a variety of types to something you wanted. The sketch of the problem and the idea is this: suppose you want to create an API that accepts arguments in multiple forms, for example something in uncompiled string form or in a compiled efficient form. The usual way to implement this is with a type switch in your functions ('if argument is a string, convert it to ...') but this is annoying and limited. Interfaces to the rescue, in theory: define a 'Converter' interface with a single 'ToMyThing()' method, make your API take Converter arguments (and your functions then call arg.ToMyThing()), and define a new ToMyThing() method on strings, integers, and whatever else you want to accept.

(The ToMyThing() method for your actual type does nothing and just returns itself.)

People who know Go are shaking their head sadly right about now. Here, let me tell you why:

prog.go:9: cannot define new methods on non-local type string

Well, oops. So much for that.
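For concreteness, here is a sketch of the idea with the rejected method left in as a comment. What does compile is a method on a type declared in your own package whose underlying type is string; that is a common way around the restriction, not something this entry set out to propose, and it means callers have to write the conversion explicitly. All of the names here (MyThing, Source, Count) are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// MyThing stands in for the "compiled efficient form".
type MyThing struct {
	fields []string
}

// Converter is the interface the API would accept.
type Converter interface {
	ToMyThing() MyThing
}

// ToMyThing on the real type just returns itself.
func (m MyThing) ToMyThing() MyThing { return m }

// func (s string) ToMyThing() MyThing { ... }
// The line above is what the compiler rejects: string is not a type
// defined in this package.

// Source is a local type whose underlying type is string, so it may
// have methods.
type Source string

func (s Source) ToMyThing() MyThing {
	return MyThing{fields: strings.Fields(string(s))}
}

// Count is an example API function that takes either form.
func Count(c Converter) int {
	return len(c.ToMyThing().fields)
}

func main() {
	fmt.Println(Count(Source("a b c")))
	fmt.Println(Count(MyThing{fields: []string{"x"}}))
}
```

Note that Count("a b c") with a plain string still does not compile; the caller has to say Source("a b c"), which loses some of the convenience the original idea was after.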

If you understand how Go's types and interfaces are implemented, this makes sense. One of the parts of the type description for every concrete Go type is a static, fixed array of methods (with various information including their name and a function pointer); this is built as the type is compiled. What this error message really means is 'the method array for string has already been built, you can't add entries to it now'.

It's not hard to see why allowing this would massively complicate Go's life. The big reason is that entries in the method array are sorted by name (for good reason). Adding a name to it after the type has been compiled means re-sorting the array, changing the index position of entries, and then finding and changing all references to now-invalid index positions in already compiled code. In practice you would want to defer the index position resolution until link time (as a form of link time relocation) and I'm not sure that's even possible in object formats like ELF. Certainly it would add a lot more complexity to the whole process.

(Actually it's even worse; you would have to defer building the method table entirely until link time, since you might have to merge together string method definitions from all over your code base.)
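The sorted-by-name ordering is visible from ordinary Go code through the reflect package, which documents that it reports methods in lexicographic order. The following sketch only shows that exported view; it is not a look at the runtime's internal layout. The type T and the helper methodNames are invented for the example.

```go
package main

import (
	"fmt"
	"reflect"
)

// T is a hypothetical type whose methods are declared out of
// alphabetical order on purpose.
type T struct{}

func (T) Zeta()  {}
func (T) Alpha() {}
func (T) Mid()   {}

// methodNames lists the exported methods of v's type in the order the
// reflect package reports them.
func methodNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	names := make([]string, 0, t.NumMethod())
	for i := 0; i < t.NumMethod(); i++ {
		names = append(names, t.Method(i).Name)
	}
	return names
}

func main() {
	fmt.Println(methodNames(T{}))        // sorted by name, not declaration order
	fmt.Println(methodNames("a string")) // string has no methods at all
}
```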

You would also open up the possibility of weird link-time errors. For example, suppose that two separate bits of Go code both independently decide to add (different) Convert() methods to string. This pretty much has to be an error, but it's something you can only detect and report at link time. Worse, those bits of Go code work independently; they only fail when you combine them into a single program. This is not a recipe for good software engineering and I'm not at all surprised that Go left this out.

GoInterfacesLimitation written at 02:00:16


