2014-07-27
Go is still a young language
Once upon a time, young languages showed their youth by having core incapabilities (important features not implemented, important platforms not supported, or the like). This is no longer really the case today; now languages generally show their youth through limitations in their standard library. The reality is that a standard library that deals with the world of the modern Internet is both a lot of work and the expression of a lot of (painful) experience with corner cases, how specifications work out in practice, and so on. This means that such a library takes time, time to write everything and then time to find all of the corner cases. When (and while) the language is young, its standard library will inevitably have omissions, partial implementations, and rough corners.
Go is a young language. Go 1.0 was only released two years ago, which is not really much time as these things go. It's unsurprising that even today portions of the standard library are under active development (I mostly notice the net packages because that's what I primarily use) and keep gaining additional important features in successive Go releases.
Because I've come around to this view, I now mostly don't get irritated when I run across deficiencies in the corners of Go's standard packages. Such deficiencies are the inevitable consequence of using a young language, and while they're obvious to me that's because I'm immersed in the particular area that exposes them. I can't expect the authors of standard libraries to know everything or to put their packages to the same uses that I do. And time will cure most of these issues.
(Sometimes the omissions are deliberate and done for good reason, or so I've read. I'm not going to cite my primary example yet until I've done some more research about its state.)
This does mean that development in Go can sometimes require a certain sort of self-sufficiency and a willingness to either go diving into the source of standard packages or deliberately find packages that duplicate the functionality you need but without the limitations you're running into. Sometimes this may mean duplicating some amount of functionality yourself, even if it seems annoying at the time.
(Not mentioning specific issues in, say, the net packages is entirely deliberate. This entry is a general thought, not a gripe session. In fact I've deliberately written this entry as a note to myself instead of writing another irritated grump, because the world does not particularly need another irritated grump about an obscure corner of any standard Go package.)
2014-07-11
Some notes on bisecting a modified Firefox source base with Mercurial
Suppose, not hypothetically, that you maintain your own copy of the Firefox master source (aka 'Nightly') with private modifications on top of the Mozilla version. Of course you don't commit your modifications, because that would lead to a huge tangle of merges over time. Now suppose that Mozilla breaks something and you want to use Mercurial bisection to find it.
The first thing you need is to figure out the last good version. What I
do is I don't run my modified Firefox version directly out of the build
directory; instead I periodically make an install tarball and unpack it
elsewhere (and then keep the last few ones when I update it, so I can
revert in case of problems). Among other things this tarball copy has an
application.ini file, which for my builds includes a SourceStamp=
value that gives the Mercurial commit identifier that the source was
built from.
So we start the procedure by setting the range bounds:
hg bisect --good 606848e8adfc
hg bisect --bad tip
Since I'm carrying local modifications this will generally report something like:
Testing changeset 192456:d2e7bd70dd95 (1663 changesets remaining, ~10 tests)
abort: uncommitted changes
So now I need to explicitly check out the named changeset. If I
skip this step Mercurial won't complain (and it will keep doing
future 'hg bisect' operations without any extra complaints), but
what I'm actually doing all of the time is building the tip of my
repo. This is, as they say, not too useful. So:
hg checkout d2e7bd70dd95
This may print messages about merging changes in my changed files,
which is expected. In general Mercurial is smart enough to get
merging my changes in right unless something goes terribly wrong.
Afterwards I build and test and do either 'hg bisect --good' or
'hg bisect --bad' followed by another 'hg checkout <ver>'.
(If I remember I can use the '-U' argument to 'hg bisect' so
it doesn't attempt the checkout and abort with an error, but enhh.
I actually think that having the error is handy because it reminds
me that I need extra magic and care.)
In some cases even the 'hg checkout' may fail with the uncommitted
changes error message. In this case I need to drop my changes and
perhaps re-establish them later. The simple way is:
hg shelve
hg checkout ...
Perhaps I should routinely shelve all of my changes at the start of the bisection process, unless I think some of them are important for the testing I'm doing. It would cut down the hassle (and shelving them at the start would make it completely easy to reapply them at the end, since they'd be taken from tip and reapplied to tip).
After the whole bisection process is done, I need to cancel it and return to the tip of the tree:
hg bisect --reset
hg checkout tip
# if required:
hg unshelve
# optional but customary:
hg pull -u
(This is the sort of notes that I write for myself because it prevents me from having to reverse engineer all of this the next time around.)
Sidebar: Some related Mercurial bits I want to remember
The command to tell me what checkout I am on is 'hg summary' aka
'hg sum'. 'hg status' doesn't report this information; it's
just for file status. This correctly reports that the current
checkout hasn't changed when a 'hg bisect ...' command aborts due
to uncommitted changes.
I don't think there's an easy command to report whether or not a bisection is in progress. The best way to check is probably:
hg log -r 'bisect(current)'
If there's no output, there's no bisection in flight.
(I believe I've left bisections sitting around in the past by
omitting the 'hg bisect --reset'. If I'm right, things like 'hg
pull -u' and so on won't warn me that theoretically there is a
bisection running.)
2014-07-06
Goroutines versus other concurrency handling options in Go
Go makes using goroutines and channels very attractive; they're consciously put forward as the language's primary way of doing concurrency and thus the default solution to any concurrency related issue you may have. However I'm not sure that they're the right approach for everything I've run into, although I'm still mulling over what the balance is.
The sort of problem that channels and goroutines don't seem an entirely smooth fit for is querying shared state (or otherwise getting something from it). Suppose that you're keeping track of the set of SMTP client IPs that have tried to start TLS with you but have failed; if a client has failed TLS setup, you don't want to offer it TLS again (or at least not within a given time). Most of the channel-based solution is straightforward; you have a master goroutine that maintains the set of IPs privately and you add IPs to it by sending a message down the channel to the master. But how do you ask the master goroutine if an IP is in the set? The problem is that you can't get a reply from the master on a common shared channel because there is no way for the master to reply specifically to you.
The channel based solution for this that I've seen is to send a reply channel as part of your query to the master (which is sent over a shared query channel). The downside of this approach is the churn in channels; every request allocates, initializes, uses once, and then destroys a channel (and I think they have to be garbage collected, instead of being stack allocated and quietly cleaned up). The other option is to have a shared data structure that is explicitly protected by locks or other facilities from the sync package. This is more low level and requires more bookkeeping but you avoid bouncing channels around.
But efficiency is probably not the right concern for most Go programs I'll ever write. The real question is which is easier to write and results in clearer code. I don't have a full conclusion but I do have a tentative one, and it's not entirely the one I expected: locks are easier if I'm dealing with more than one sort of query against the same shared state.
The problem with the channel approach in the face of multiple sorts of queries is that it requires a lot of what I'll call type bureaucracy. Because channels are typed, each different sort of reply needs a type (explicit or implicit) to define what is sent down the reply channel. Then basically each different query also needs its own type, because queries must contain their (typed) reply channel. A lock based implementation doesn't make these types disappear but it makes them less of a pain because they are just function arguments and return values and thus they don't have to be formally defined as Go types and/or structs. In practice this winds up feeling more lightweight to me, even with the need to do explicit manual locking.
(You can reduce the number of types needed in the channel case by merging them together in various ways but then you start losing type safety, especially compile time type safety. I like compile time type safety in Go because it's a reliable way of telling me if I got something obvious wrong and it helps speed up refactoring.)
In a way I think that channels and goroutines can be a form of Turing tarpit, in that they can be used to solve all of your problems if you're sufficiently clever and it's very tempting to work out how to be that clever.
(On the other hand sometimes channels are a brilliant solution to a problem that might look like it had nothing to do with them. Before I saw that presentation I would never have thought of using goroutines and channels in a lexer.)
Sidebar: the Go locking pattern I've adopted
This isn't original to me; I believe I got it from the Go blog entry on Go maps in action. Presented in illustrated form:
// actual entries in our shared data structure
type ipEnt struct {
when time.Time
count int
}
// the shared data structure and the lock
// protecting it, all wrapped up in one thing.
type ipMap struct {
sync.RWMutex
ips map[string]*ipEnt
}
var notls = &ipMap{ips: make(map[string]*ipEnt)}
// only method functions manipulate the shared
// data structure and they always take and release
// the lock. outside callers are oblivious to the
// actual implementation.
func (i *ipMap) Add(ip string) {
i.Lock()
... manipulate i.ips ...
i.Unlock()
}
Using method functions feels like the most natural way to manipulate the data structure, partly because how you manipulate it is very tightly bound to what it is due to the locking requirements. And I just plain like the syntax for doing things with it:
if res == TLSERROR {
notls.Add(remoteip)
....
}
The last bit is a personal thing, of course. Some people will prefer
standalone functions that are passed the ipMap as an explicit
argument.
The problem with filenames in IO exceptions and errors
These days a common pattern in many languages is to have errors or error exceptions be basically strings. They may not literally be strings but often the only thing people really do with them is print or otherwise report their string form. Python and Go are both examples of this pattern. In such languages it's relatively common for the standard library to helpfully embed the name of the file that you're operating on in the error message for operating system IO errors. For example, the literal text of the errors and exceptions you get for trying to open a file that you don't have access to in Go and Python are:
open /etc/shadow: permission denied
[Errno 13] Permission denied: '/etc/shadow'
This sounds like an attractive feature, but there is a problem with it: unless the standard library does it all the time and documents it, people can't count on it, and when they can't count on it you wind up with ugly error messages in practice unless people go quite out of their way.
This stems from one of the fundamental rules of good (Unix) error messages for programs, which is thou shalt always include the name of the file you had problems with. If you're writing a program and you need to produce an error message, it is ultimately your job to make sure that the filename is always there. If the standard library gives you errors that sometimes but not always include the filename, or that are not officially documented as including the filename, you have no real choice but to include the filename yourself. Then when the standard library's error or exception does include the filename, the whole error message emitted by your program winds up mentioning the filename twice:
sinksmtp: cannot open rules file /not/there: open /not/there: no such file or directory
It's tempting to say that the standard library should always include the filename in error messages (and explicitly guarantee this). Unfortunately this is very hard to do in general, at least on Unix and with a truly capable standard library. The problem is that you can be handed file descriptors from the outside world and required to turn them into standard file objects that you can do ordinary file operations on, and of course there is no (portable) way to find out the file name (if any) of these file descriptors.
(Many Unixes provide non-portable ways of doing this, sometimes
brute force ones; on Linux, for example, one approach is to look
at /proc/self/fd/<N>.)
2014-07-04
An interesting Go concurrency bug that I inflicted on myself
While working on my Go sinkhole SMTP server, I managed to stick myself with an interesting little concurrency bug that I feel like writing up today. The server takes a file of control rules, and to be simple it reloads and re-parses the control rules on every new connection. We do want to print a message if there's an error in the rules, but we don't want to print it on every connection; that could be a lot of duplicate messages, even concurrent duplicate messages (since each connection is handled in a separate goroutine). So I adopted the simple Go way of deduplicating messages in the face of concurrency: warning messages are sent down a channel to a single goroutine that receives them all, checks for repeated messages, and prints the message if it's not a repeat. I tested all of this and it worked fine; warning messages were printed, but only once.
Then I decided to be friendly and have the program immediately check the control rules during startup, so it could error out right away if there were problems. The code to do this looked like:
// start warn-once backend
go warnbackend()
_, isgood := setupRules(baserules)
if !isgood {
die("will not continue with rules problems.")
}
When I tested this with a rules file with a deliberate error, the
program printed the 'will not continue' message and exited but did
not print the actual parsing error message. I spent rather a while
scratching my head and trying things before I realized what was
going on: I had a scheduling race. While setupRules() had
dispatched its warning message down the channel and the warnbackend()
goroutine had picked it up (at least conceptually), the goroutine
hadn't gotten as far as actually printing out the message by the
time the main flow of code called die() and the whole program
exited.
(The Go runtime doesn't currently print any warning messages if your program exits with active goroutines.)
This is actually a slightly subtle Go scheduling race. Go guarantees that sending something into an unbuffered channel will block until there is a ready receiver, but as I discovered this is not the same thing as guaranteeing that the receiver will do anything before you continue from the send. If you need the receiver of a message to do anything definite before you do something yourself, you need to do more than just send one message into the channel.
The cure for this bug was to force a synchronization point by
sending a null warning message just before calling die():
_, isgood := setupRules(baserules)
if !isgood {
warnonce("")
die(....)
}
This forces us to wait until warnbackend() has processed and
printed any message (or messages) from setupRules() and returned
to the point where it's waiting to receive something from the channel
again. warnbackend() may or may not process our null message before
the program exits but we don't care about that.
(We know that warnbackend() will process all messages from
setupRules() before processing our null message because Go
guarantees that channel messages are delivered in order.)