Wandering Thoughts archives

2010-12-30

Why you need select() even with communication channels

Go has re-popularized the idea of handling all of your blocking waiting-for-things operations with CSP-like communication channels (in Go, goroutines and channels) instead of select(). However, it's my firm belief that this isn't good enough; despite what some people think, channels cannot replace select() in most common CSP-like implementations.

The crucial ability that select() gives you is the ability to stop waiting for something in response to some external event or change in program state. In a select()-based environment you stop trying to read or write a file descriptor simply by omitting it from the set of file descriptors you give to select(), and you get a chance to do this every time IO happens (and you can make IO happen in response to other events, using long-standing tricks).

In a CSP-like environment, the traditional way to handle outside blocking operations is to perform them in a separate goroutine (in Go's terminology), forwarding the results to the rest of the program over a channel. The goroutine alternates between doing the blocking operation and talking to the channel (sending results to it, getting new requests from it, or both); the rest of the program can then wait for all of its IO, or continue processing, or whatever it wants.

It's relatively easy to interrupt such a goroutine if it's currently trying to talk to the channel; you send it a 'poison pill' message that tells it to shut down. However, sending a poison pill message does nothing until the goroutine can pick it up; if the goroutine is blocked in an outside operation such as read() or write(), it's not looking for messages over its channel. Unless you can either forcefully kill the goroutine or interrupt the blocking operation somehow, you're out of luck. Most of the time you can't interrupt the blocking operation itself (at least not without additional consequences that you don't want) and most CSP-like implementations don't give you a way of killing goroutines (because not allowing that simplifies the runtime environment).

Even without an explicit need to interrupt blocking operations, the resulting program can be more complex simply because you need to communicate decisions about what to do back and forth between multiple pieces, some of which sometimes block and don't generate status messages when you'd like them to. For instance, consider the buffering logic for a network copying program, where you want a maximum-size internal buffer that can be fed and drained asynchronously, with the reader side no longer reading from the network when the buffer is too full. I think that you wind up with an extra 'buffer' goroutine in the middle just to keep track of the buffer space remaining; you can't delegate the work to the write-out side, because the write-out side might be blocked when the reader needs to know if it should keep reading or stall.
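One possible shape for such a 'buffer' goroutine in Go is sketched below. Everything here is an assumption made for illustration: bufferManager is a hypothetical name, maxBytes must be greater than zero, and the limit is soft in that the buffer can overshoot it by one chunk. It relies on the fact that in Go's select statement a nil channel is never ready, so setting a channel variable to nil disables that case.

```go
package main

import "fmt"

// bufferManager (hypothetical) sits between the reader and the write-out
// goroutine and owns the bounded buffer. It accepts chunks only while there
// is room and offers chunks only while it holds some, so a blocked writer
// never stops the reader from finding out that it must stall.
// Assumption: maxBytes > 0.
func bufferManager(in <-chan []byte, out chan<- []byte, maxBytes int) {
	defer close(out)
	var queue [][]byte
	used := 0
	for in != nil || len(queue) > 0 {
		recv := in
		if used >= maxBytes {
			recv = nil // buffer full: stop accepting, the reader stalls on its send
		}
		var send chan<- []byte
		var head []byte
		if len(queue) > 0 {
			send, head = out, queue[0] // only offer data when there is some
		}
		select {
		case chunk, ok := <-recv:
			if !ok {
				in = nil // reader finished; keep draining what is left
				continue
			}
			queue = append(queue, chunk)
			used += len(chunk)
		case send <- head:
			queue = queue[1:]
			used -= len(head)
		}
	}
}

func main() {
	in := make(chan []byte)
	out := make(chan []byte)
	go bufferManager(in, out, 8)

	go func() { // reader side: stands in for reading from the network
		defer close(in)
		for _, s := range []string{"abcd", "efgh", "ijkl"} {
			in <- []byte(s) // blocks once the buffer is full
		}
	}()

	for chunk := range out { // write-out side
		fmt.Printf("%s\n", chunk)
	}
}
```

Note that the reader is stalled by blocking on its send to the buffer goroutine, not by being told to stop reading; any data it has already read from the network is held until the buffer has room.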

(Disclaimer: I could be missing some well-known way around this here since I don't have much experience with CSP-like environments.)

Sidebar: the two uses of select() here

There are two uses of select() in this situation: waiting for multiple IO sources at once, and allowing you to efficiently and accurately report how much data is still waiting to be written (which only requires waiting on a single IO source). What I'm writing about is the first use. In the network copying example, I'm sort of handwaving the second case by assuming that there is some way of doing it, possibly with support from the runtime.

programming/SelectVsChannels written at 01:29:08


