== An interesting Go concurrency bug that I inflicted on myself

While working on [[my Go sinkhole SMTP server https://github.com/siebenmann/smtpd/]], I managed to stick myself with an interesting little concurrency bug that I feel like writing up today. The server takes a file of control rules, and to be simple it reloads and re-parses the control rules on every new connection. We do want to print a message if there's an error in the rules, but we don't want to print it on every connection; that could be a lot of duplicate messages, even concurrent duplicate messages (since each connection is handled in a separate goroutine).

So I adopted the simple [[Go http://golang.org/]] way of deduplicating messages in the face of concurrency: warning messages are sent down a channel to a single goroutine that receives them all, checks for repeated messages, and prints the message if it's not a repeat. I tested all of this and it worked fine; warning messages were printed, but only once.

Then I decided to be friendly and have the program immediately check the control rules during startup, so it could error out right away if there were problems. The code to do this looked like:

.pn prewrap on

> // start warn-once backend
> go warnbackend()
>
> _, isgood := setupRules(baserules)
> if !isgood {
> 	die("will not continue with rules problems.")
> }

When I tested this with a rules file with a deliberate error, the program printed the 'will not continue' message and exited but did not print the actual parsing error message. I spent rather a while scratching my head and trying things before I realized what was going on: *I had a scheduling race*. While _setupRules()_ had dispatched its warning message down the channel and the _warnbackend()_ goroutine had picked it up (at least conceptually), the goroutine hadn't gotten as far as actually printing out the message by the time the main flow of code called _die()_ and the whole program exited.

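
To make the shape of the race concrete, here is a minimal sketch of a warn-once setup like the one described above. It is not the server's actual code: the channel parameter, the _emit_ callback and the sample warning text are all assumptions made for illustration, and the single send in _main()_ stands in for whatever _setupRules()_ really does.

```go
package main

import (
	"fmt"
	"os"
)

// warnbackend is the single goroutine that owns deduplication. It calls
// emit for each message the first time it is seen and drops repeats.
// (Hypothetical signature; the real backend takes no arguments.)
func warnbackend(msgs <-chan string, emit func(string)) {
	seen := make(map[string]bool)
	for m := range msgs {
		if seen[m] {
			continue
		}
		seen[m] = true
		emit(m)
	}
}

func main() {
	// Unbuffered: a send completes once the backend has received the
	// value, which is before the backend has acted on it.
	msgs := make(chan string)
	go warnbackend(msgs, func(m string) {
		fmt.Fprintln(os.Stderr, "warning:", m)
	})

	// Hypothetical stand-in for setupRules() reporting a parse error.
	msgs <- "rules: line 3: unknown keyword"

	// Returning from main ends the program without waiting for other
	// goroutines, so the warning above may or may not get printed.
	fmt.Fprintln(os.Stderr, "will not continue with rules problems.")
}
```

Whether the warning shows up on any given run depends on how the two goroutines happen to be scheduled, which is exactly what makes this sort of thing confusing to debug.
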
(The Go runtime doesn't currently print any warning messages if your program exits with active goroutines.)

This is actually a slightly subtle Go scheduling race. Go guarantees that sending something into an unbuffered channel will block until there is a ready receiver, but as I discovered this is not the same thing as guaranteeing that the receiver will do anything before you continue from the send. If you need the receiver of a message to do anything definite before you do something yourself, you need to do more than just send one message into the channel.

The cure for this bug was to force a synchronization point by sending a null warning message just before calling _die()_:

> _, isgood := setupRules(baserules)
> if !isgood {
> 	warnonce("")
> 	die(....)
> }

This forces us to wait until _warnbackend()_ has processed and printed any message (or messages) from _setupRules()_ and returned to the point where it's waiting to receive something from the channel again. _warnbackend()_ may or may not process our null message before the program exits but we don't care about that.

(We know that _warnbackend()_ will process all messages from _setupRules()_ before processing our null message because Go guarantees that channel messages are delivered in order.)
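
As a self-contained illustration of the null message trick, here is a rough sketch under some assumptions: the backend takes its channel and an _emit_ callback as parameters, it silently ignores empty messages, and the helper name _flush()_ is made up for this example. The real code simply calls _warnonce("")_.

```go
package main

import (
	"fmt"
	"os"
)

// warnbackend emits each distinct non-empty message once. Skipping the
// empty message is an assumption of this sketch.
func warnbackend(msgs <-chan string, emit func(string)) {
	seen := make(map[string]bool)
	for m := range msgs {
		if m == "" || seen[m] {
			continue
		}
		seen[m] = true
		emit(m)
	}
}

// flush sends an empty message purely as a synchronization point. On an
// unbuffered channel the send only completes once the backend is back at
// its receive, so every earlier message has been fully handled by then.
func flush(msgs chan<- string) {
	msgs <- ""
}

func main() {
	msgs := make(chan string)
	go warnbackend(msgs, func(m string) {
		fmt.Fprintln(os.Stderr, "warning:", m)
	})

	// Hypothetical stand-in for setupRules() finding a problem.
	msgs <- "rules: line 3: unknown keyword"
	isgood := false

	if !isgood {
		flush(msgs) // the warning above has been printed once this returns
		// A real die() would exit with a non-zero status here.
		fmt.Fprintln(os.Stderr, "will not continue with rules problems.")
		return
	}
}
```

Note that this relies on the channel being unbuffered. With a buffered channel the empty message could simply sit in the buffer, and the send would return without the backend having caught up.
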