Waiting for both network IO and inter-thread notifications

December 12, 2005

People doing lots of network IO in threaded programs have historically had a problem: waiting for both network IO and for IPC from threads at the same time.

Generally you want to wait for network IO using select() or poll(); trying to use lots of threads, each doing one blocking IO operation, is usually catastrophic to your performance. But both of these only wait on file descriptors, and inter-thread communication in libraries is almost always implemented with mutexes, semaphores, queues, and so on, none of which are select()'able.

Communication between the threads is not the problem, since it's easy to implement a threaded queue if your thread library doesn't already have one. The trick is making the main thread wake up to check the queue.

The best answer on Unix is to use a pipe. The main IO loop selects (or polls) on the readable end's file descriptor as well as its network IO file descriptors; other threads signal the main thread by writing a byte to the writable end. While you can attach meaning to the byte's value, the simplest way is just to use it as a signal to the main thread to check workqueues and so on.
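As a rough illustration, here is a minimal Python sketch of the pipe trick. The names (`wake_r`, `wake_w`, `poke_main_thread`, `wait_for_events`) are made up for this example, and it assumes a Unix system where select() works on pipe file descriptors.

```python
import os
import select

# Hypothetical names; the read end goes into the main loop's select() set,
# the write end is shared with the other threads.
wake_r, wake_w = os.pipe()

def poke_main_thread():
    """Called from any other thread: write one byte to wake the IO loop."""
    os.write(wake_w, b"x")

def wait_for_events(network_fds, timeout=None):
    """Block until a network fd is readable or another thread pokes us.

    Returns (ready_network_fds, poked). The byte is deliberately left
    in the pipe here; consuming it is the caller's job.
    """
    watched = [wake_r] + list(network_fds)
    readable, _, _ = select.select(watched, [], [], timeout)
    poked = wake_r in readable
    ready = [fd for fd in readable if fd != wake_r]
    return ready, poked
```

The byte's value is ignored; it only means 'go look at your queues'.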

When the main thread processes the queues, it reads an appropriate number of bytes from the pipe and just discards them; the signal has been received. (If the main thread will always clear all of your queues, it can simply read() everything in the pipe in one shot.)

Typical pipes on Unix systems can accumulate at least 4K of pending data before writes to them block, so as long as the main thread is reasonably responsive your other threads can poke it asynchronously. (This is one reason to write only one byte per signal.)
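If you want to know what your own system does, you can measure it. The following is a small sketch (the function name `pipe_capacity` is invented here) that fills a fresh pipe through a non-blocking write end and counts how much went in; the number it reports will vary between systems.

```python
import os

def pipe_capacity():
    """Return how many bytes a fresh pipe accepts before a write would block."""
    r, w = os.pipe()
    os.set_blocking(w, False)  # a full pipe now raises instead of hanging
    total = 0
    try:
        while True:
            total += os.write(w, b"x" * 1024)
    except BlockingIOError:
        pass  # the pipe is full
    finally:
        os.close(r)
        os.close(w)
    return total
```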

If you spawn other processes, you will need to make sure that they don't inherit either end of the pipe.
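One way to do this is to mark both ends close-on-exec. Here is a sketch using fcntl; `set_cloexec` is a helper name invented for this example. Python 3.4 and later already create the descriptors from os.pipe() as non-inheritable, so there the explicit step is mostly a safety net for descriptors you got some other way.

```python
import fcntl
import os

def set_cloexec(fd):
    """Mark fd close-on-exec so programs we exec do not inherit it."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

wake_r, wake_w = os.pipe()
for fd in (wake_r, wake_w):
    set_cloexec(fd)
```

Close-on-exec only helps when the child execs something. A child that forks and keeps running your code still has both ends open and should close them itself.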

Sidebar: order of operations

Just to cover a basic point, the correct order of operations is:

  • other threads: queue first, then poke main thread.
  • main thread: read from the pipe, then remove from queue.

This ensures that wakeup signals never get lost and work is never missed.

If you have queues with a non-blocking 'remove from queue' operation, you can empty the pipe in one shot and then get everything from the queue. In some circumstances you'll wake up from a signal only to find nothing in the queue, but this is harmless.
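Put together, the ordering might look like the following Python sketch. The names `work`, `submit`, `drain`, and `main_loop_once` are hypothetical, and the 64K read size is an arbitrary choice for 'everything in the pipe'.

```python
import os
import queue
import select

work = queue.Queue()
wake_r, wake_w = os.pipe()
os.set_blocking(wake_r, False)  # so draining an empty pipe cannot hang

def submit(item):
    """Other threads: queue first, then poke the main thread."""
    work.put(item)
    os.write(wake_w, b"x")

def drain():
    """Main thread: empty the pipe in one shot, then empty the queue."""
    try:
        os.read(wake_r, 65536)
    except BlockingIOError:
        pass  # no pending wakeup bytes
    items = []
    while True:
        try:
            items.append(work.get_nowait())
        except queue.Empty:
            return items

def main_loop_once(timeout=None):
    """Wait for a poke (network fds omitted here) and return queued work."""
    readable, _, _ = select.select([wake_r], [], [], timeout)
    if wake_r in readable:
        return drain()
    return []
```

If a worker queues an item after the pipe has been read but before the queue is emptied, its byte stays in the pipe and causes one extra wakeup with an empty queue, which is the harmless case.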

Written on 12 December 2005.