What should it mean for a system call to time out?

March 13, 2017

I was just reading Evan Klitzke's Unix System Call Timeouts (via) and among a number of thoughts about it, one of the things that struck me is a simple question. Namely, what should it mean for a Unix system call to time out?

This question may sound pointlessly philosophical, but it's actually very important because what we expect a system call timeout to mean will make a significant difference in how easy it would be to add system calls with timeouts. So let's sketch out two extreme versions. The first extreme version is that if a timeout occurs, the operation done by the system call is entirely abandoned and undone. For example, if you call rename("a", "b") and the operation times out, the kernel guarantees that the file a has not been renamed to b. This is obviously going to be pretty hard, since the kernel may have to reverse partially complete operations. It's also not always possible, because some operations are genuinely irreversible. If you write() data to a pipe and time out partway through doing so (with some but not all data written), you cannot reach into the pipe and 'unwrite' all of the already sent data; after all, some of it may already have been read by a process on the other side of the pipe.

The second extreme version is that having a system call time out merely causes your process to stop waiting for it to complete, with no effects on the kernel side of things. Effectively, the system call is shunted to a separate thread of control and continues to run; it may complete some time, or it may error out, but you never have to wait for it to do either. If the system call would normally return a new file descriptor or the like, the new file descriptor will be closed immediately when the system call completes. In practice implementing a strict version of this would also be relatively hard; you'd need an entire infrastructure for transferring system calls to another kernel context (or more likely, transplanting your user-level process to another kernel context, although that has its own issues). This is also at odds with the existing system calls that take timeouts, which generally result in the operation being abandoned part way through with no guarantees either way about its completion.

(For example, if you make a non-blocking connect() call and then use select() to wait for it with a timeout, the kernel does not guarantee that if the timeout fires the connect() will not be completed. You are in fact in a race between your likely close() of the socket and the connection attempt actually completing.)

The easiest thing to implement would probably be a middle version. If a timeout happens, control returns to your user level with a timeout indication, but the operation may be partially complete and it may be either abandoned in the middle of things or completed for you behind your back. This satisfies a desire to be able to bound the time you wait for system calls to complete, but it does leave you with a messy situation where you don't know either what has happened or what will happen when a timeout occurs. If your mkdir() times out, the directory may or may not exist when you look for it, and it may or may not come into existence later on.

(Implementing timeouts in the kernel is difficult for the same reason that asynchronous IO is hard; there is a lot of kernel code that is much simpler if it's written in straight line form, where it doesn't have to worry about abandoning things part way through at essentially any point where it may have to wait for the outside world.)

Written on 13 March 2017.
« CSS, <pre>, and trailing whitespace lead to browser layout weirdness
OpenSSH's IdentityFile directive only ever adds identity files (as of 7.4) »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Mon Mar 13 01:03:40 2017
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.