Linux kernel asynchronous IO doesn't work on sockets

July 14, 2007

I've been considering writing a fully asynchronous ATA over Ethernet target driver (I'm not entirely happy with the current one and its performance). If Linux asynchronous IO worked on sockets (specifically raw network sockets), there is a nice simple design where you just set up a pool of buffers and then cycle each one through a little state machine (network in to disk IO to network out).

(This works especially well for AOE because the protocol is both based on raw packets and completely unordered, so you never have to do packet reassembly and can treat each request independently.)
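To make the buffer-pool idea concrete, here is a minimal sketch of the state machine in Python. It simulates completions instead of doing real network or disk IO, and all of the names in it (`State`, `Buffer`, `make_pool`, `run`) are made up for illustration; a real target would drive `complete()` from actual asynchronous IO completions.

```python
from enum import Enum, auto


class State(Enum):
    NET_IN = auto()    # waiting for a request packet to arrive
    DISK_IO = auto()   # waiting for the disk read or write to finish
    NET_OUT = auto()   # waiting for the reply packet to be sent


# Each buffer cycles through the same three states forever.
NEXT_STATE = {
    State.NET_IN: State.DISK_IO,
    State.DISK_IO: State.NET_OUT,
    State.NET_OUT: State.NET_IN,
}


class Buffer:
    """One pool entry; hypothetical, for illustration only."""

    def __init__(self, ident, size=1024):
        self.ident = ident
        self.data = bytearray(size)
        self.state = State.NET_IN
        self.requests_served = 0

    def complete(self):
        """Advance when the async operation for the current state finishes."""
        if self.state is State.NET_OUT:
            self.requests_served += 1
        self.state = NEXT_STATE[self.state]


def make_pool(count):
    return [Buffer(i) for i in range(count)]


def run(pool, completions):
    """Feed a sequence of (simulated) completion events to the pool.

    Each event is just the index of the buffer whose IO finished. Events
    can arrive in any order because every request is independent.
    """
    for ident in completions:
        pool[ident].complete()
```

For example, `run(make_pool(3), [0, 2, 0, 1, 0])` takes buffer 0 through one full cycle while buffers 1 and 2 are each left waiting on their disk IO. No buffer ever needs to know about any other, which is the property that makes the design simple.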

Unfortunately, a limitation of the current Linux kernel AIO support is that it doesn't cover sockets; an attempt to do async IO on one gets silently converted into a synchronous operation when you submit the request. Although I haven't tested this, my impression is that the only things that currently support asynchronous IO are block devices and O_DIRECT IO, probably only on local filesystems.

(I believe that within the kernel, the marker to look for is code that returns either -EIOCBQUEUED or -EIOCBRETRY when processing requests. In the relatively bleeding edge kernel source I happen to have handy, there doesn't seem to be very much that qualifies.)

Of course, the 'wait for aio IO to complete' system call has the traditional problem with new event systems in Unix: it only waits on aio events, which means that there's no good way to mix waiting for aio events with waiting for anything else. I'm kind of annoyed that people are still designing new system call interfaces with this problem.

(Technically, very recent kernels have an eventfd() system call that you can hook to aio, so that when aio IO completes you get something on the eventfd file descriptor and you can select() or poll() for it. In a year or three, I might actually be able to use that on the systems I'm interested in.)
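As a rough illustration of why an eventfd helps, here is a small Python sketch that waits on a completion notifier and a socket in one select() call. It does not do real aio; `signal_completion()` merely stands in for the kernel signalling that an aio request finished. It assumes `os.eventfd()` is available (Linux with Python 3.10 or later) and falls back to an ordinary pipe as a stand-in elsewhere. All of the function names are invented for this example.

```python
import os
import select
import socket
import sys


def make_notifier():
    """Return (read_fd, write_fd) for completion notifications."""
    if hasattr(os, "eventfd"):
        fd = os.eventfd(0)      # one fd is used for both reading and writing
        return fd, fd
    return os.pipe()            # stand-in where eventfd is not available


def signal_completion(write_fd, count=1):
    # An eventfd takes an 8-byte native-endian integer, which is added to
    # its counter. Here this stands in for the kernel reporting completions.
    os.write(write_fd, count.to_bytes(8, sys.byteorder))


def wait_for_both(notify_fd, sock, timeout=5.0):
    """Wait in one select() loop for a completion and a socket packet."""
    seen = {}
    pending = {notify_fd, sock.fileno()}
    while pending:
        readable, _, _ = select.select(list(pending), [], [], timeout)
        if not readable:
            raise TimeoutError("nothing became readable")
        for fd in readable:
            if fd == notify_fd:
                # Reading an eventfd returns the counter as 8 bytes.
                raw = os.read(fd, 8)
                seen["completions"] = int.from_bytes(raw, sys.byteorder)
            else:
                seen["packet"] = sock.recv(64)
            pending.discard(fd)
    return seen


def demo():
    read_fd, write_fd = make_notifier()
    ours, theirs = socket.socketpair()
    try:
        signal_completion(write_fd)   # pretend one aio request finished
        theirs.send(b"request")       # and a network packet showed up
        return wait_for_both(read_fd, ours)
    finally:
        ours.close()
        theirs.close()
        os.close(read_fd)
        if write_fd != read_fd:
            os.close(write_fd)
```

The point is that the completion notifier is just another file descriptor, so it can sit in the same select() or poll() set as sockets and everything else, instead of needing its own dedicated wait call.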


Last modified: Sat Jul 14 22:06:58 2007