User mode servers versus kernel mode servers

January 27, 2013

There at least used to be a broad belief that user mode servers for things like NFS, iSCSI, and so on were generally significantly worse than the same thing implemented in kernel mode. I've reflexively had this belief myself, although I no longer think it has much of a basis in fact. Today, sparked by a comment on this entry and my reply there, I feel like working through why (to the best of my ability).

As always, let's start with a fundamental question: what are the differences between a user mode server and a kernel mode version of the same thing? What extra work does a user mode server do that a kernel mode version avoids? My answer is that there are four possible extra costs imposed by user mode: extra context switches between user and kernel mode, the overhead added by making system calls, possibly extra memory copies between user and kernel space, and possible extra CPU time (and perhaps things like TLB misses) that running in user space requires. User space memory can also get swapped out (well, paged out), but I'm going to assume that your machine is set up so that this doesn't happen.
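As a rough illustration of the system call overhead mentioned above, here is a minimal sketch that pushes the same bytes through os.write() in chunks of different sizes. The helper name write_in_chunks, the 64 KiB payload, and the use of the null device are illustrative assumptions, not anything from a real server. Python's own interpreter overhead is part of the timings, so treat the numbers as a trend rather than a benchmark.

```python
import os
import time


def write_in_chunks(fd, data, chunk_size):
    """Write data to fd using chunk_size bytes per os.write() call.

    Returns (bytes_written, number_of_write_calls).
    """
    view = memoryview(data)
    written = 0
    calls = 0
    while written < len(view):
        # Each os.write() is one system call, so one user/kernel crossing.
        n = os.write(fd, view[written:written + chunk_size])
        written += n
        calls += 1
    return written, calls


if __name__ == "__main__":
    payload = b"x" * (64 * 1024)  # hypothetical payload size
    fd = os.open(os.devnull, os.O_WRONLY)
    try:
        for chunk_size in (16, 1024, len(payload)):
            start = time.perf_counter()
            written, calls = write_in_chunks(fd, payload, chunk_size)
            elapsed = time.perf_counter() - start
            print(f"chunk={chunk_size:6d} calls={calls:5d} "
                  f"bytes={written} time={elapsed * 1e6:.0f} us")
    finally:
        os.close(fd)
```

The amount of data is identical in every run; only the number of user/kernel crossings changes, which is the cost a kernel mode server does not pay.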

Let's assume that all of these add some overhead. The big question is when this added overhead matters and I think that the general answer has to be 'when a kernel mode server is already running on the edge of the hardware's performance limits'. If a kernel server has lots of extra room in terms of CPU performance, memory bandwidth, and response latency then adding extra overhead to all of them by moving to user space is not likely to make an observable performance difference to outside people. This is especially so if performance is dominated by outside factors such as the network speed, disk transfer speeds, and disk latencies (yes, I have iSCSI and NFS on my mind).

Or in short, if you can easily saturate things and reach the performance limits imposed by the hardware, user versus kernel mode isn't going to make a difference. Only when a kernel mode version is having trouble hitting the underlying performance limits is the extra overhead of a user mode server likely to be noticeable.
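The argument can be put as a back-of-the-envelope model. This is only a sketch under loud assumptions: a single core, a fixed CPU cost per request, and entirely made-up numbers (10 microseconds per request in kernel mode, twice that in user mode). The function name achieved_rate and both constants are hypothetical.

```python
def achieved_rate(hardware_limit_rps, cpu_seconds_per_request):
    """Requests per second actually delivered by a single-core server.

    The server can go no faster than the hardware (disks, network) allows
    and no faster than one core can process requests.
    """
    cpu_limit_rps = 1.0 / cpu_seconds_per_request
    return min(hardware_limit_rps, cpu_limit_rps)


# Hypothetical per-request CPU costs, in seconds.
KERNEL_CPU = 10e-6           # kernel mode server
USER_CPU = KERNEL_CPU * 2    # user mode server, assumed to cost twice as much

if __name__ == "__main__":
    for label, hardware_limit in (("slow disks or network", 20_000),
                                  ("very fast hardware", 200_000)):
        kernel = achieved_rate(hardware_limit, KERNEL_CPU)
        user = achieved_rate(hardware_limit, USER_CPU)
        print(f"{label}: kernel={kernel:.0f} req/s, user={user:.0f} req/s")
```

With the hardware capped at 20,000 requests per second, both versions deliver the same rate even though the user mode one burns twice the CPU. Only when the hardware could deliver more than either server can process does the doubled overhead show up as lost throughput.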

An equally interesting question is why user mode servers used to be such a bad thing and what changed between then and now. As it happens, I have some views on this which I am going to boil down to point form answers:

  • CPUs are much faster and generally lower-latency than in the old days, and in particular they've gotten much faster compared to everything else (except possibly if you're using 10G Ethernet instead of 1G Ethernet).

  • OSes have gotten better about system call speed, both at the context switch boundaries between user and kernel space and in the general system call code inside the kernel.
  • OSes have gotten better system call APIs that require fewer system calls and have more efficient implementations.
  • OSes have devoted some work to minimizing memory copies between kernel and user space for networking code, especially if you're willing to be OS-specific.

  • We now understand much more about writing highly efficient user level code for network servers; this is part of what has driven support for better kernel APIs.

The short, general version of this is simply that it has become much easier to hit the hardware's performance limits than it was in the past.
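To make the point about minimizing memory copies concrete, here is a small sketch contrasting the classic read-then-send loop with Python's socket.sendfile(), which uses os.sendfile() where the platform supports it and falls back to ordinary send() where it does not. The helper names, the temporary file, and the socketpair standing in for a real network connection are all assumptions for illustration.

```python
import os
import socket
import tempfile


def send_with_copies(sock, path, bufsize=64 * 1024):
    """Classic loop: read() copies into user space, sendall() copies back."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                return sent
            sock.sendall(chunk)
            sent += len(chunk)


def send_with_sendfile(sock, path):
    """Ask the kernel to move the file's data to the socket directly."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def recv_exactly(sock, nbytes):
    """Read nbytes from sock, or fewer if the peer closes early."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    payload = b"block of data\n" * 64  # small, so it fits in socket buffers
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    left, right = socket.socketpair()  # stand-in for a client connection
    try:
        for sender in (send_with_copies, send_with_sendfile):
            count = sender(left, tmp.name)
            received = recv_exactly(right, count)
            print(sender.__name__, count, received == payload)
    finally:
        left.close()
        right.close()
        os.unlink(tmp.name)
```

Both functions deliver the same bytes; the difference is that the second one lets the OS skip the round trip through a user space buffer when it can.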

Some of the old difference may have been (and still be) pragmatic, in that kernel developers generally have no choice but to write more careful and more efficient code than general user level code. Partly this is because the user level code can take various easy ways out that aren't available to the kernel code; by running in a constrained environment with various restrictions, the kernel forces developers to consider various important issues that user level code can just brush under the carpet.


Comments on this page:

From 89.243.103.41 at 2013-01-27 04:29:56:

The canonical example is surely TUX :). Which is out-of-tree and no longer widely discussed; people talk about nginx instead. A Wikipedia contributor agrees with your idea about the history.

"[TUX] served as a test bed (and motivator) for many features which were integrated separately. One major component was the Native POSIX Thread Library, which, with the right tuning parameters, allows userspace web servers to serve web pages at a speed very close to that of a kernelspace web server like TUX but without its limitations."

From 69.158.13.237 at 2013-01-27 10:57:53:

There's an interesting project called "netmap" that's going on at the University of Pisa that's really improved speed:

http://info.iet.unipi.it/~luigi/netmap/

I think going forward that any performance-related arguments are going to fall by the wayside, as being able to saturate a 10GigE card with only one core is now achievable (allowing other cores to do "useful" work), and keeping up with even 40GigE in userland seems plausible. There's a pretty good TechTalk on it available:

http://www.youtube.com/watch?v=SPtoXNW9yEQ

The code has been incorporated into FreeBSD (HEAD and 9.1+), and patches are available for Linux as well.

From 152.62.109.57 at 2013-01-28 07:07:01:

It is also now possible to do a user-space driver that takes complete control of the hardware driver and essentially runs as fast as the kernel without the extra risks of being inside the kernel, though with an added risk of messing up the hardware and causing problems elsewhere.

In the context of iSCSI there is uio-ixgbe, and if you combine that with lwip and a good SCSI target implementation (tgtd not being one optimized for speed) you could get a zero-copy implementation of iSCSI that is just as fast as a kernel one. Of course you lose the normal abstractions of the kernel, but you can probably add them as well if needed.

The tradeoff of such an approach is losing the kernel abstractions. The con is that we are used to them and would need to reimplement some utilities that already exist (ping, traceroute) to work on the new abstraction. The pro (or con, depending on your view) is that the hardware is usable only for the specific purpose it is implemented for. There will be no way to log in to the machine through the iSCSI devices in this way.

All of this though is something that has become easier and more usable in the last few years and would probably make no sense in earlier days.

-- BaruchEven

Written on 27 January 2013.
