User mode servers versus kernel mode servers
There at least used to be a broad belief that user mode servers for things like NFS, iSCSI, and so on were generally significantly worse than the same thing implemented in kernel mode. I've reflexively had this belief myself, although I no longer think it has much of a basis in fact. Today, sparked by a comment on this entry and my reply there, I feel like working through why (to the best of my ability).
As always, let's start with a fundamental question: what are the differences between a user mode server and a kernel mode version of the same thing? What extra work does a user mode server do that a kernel mode version avoids? My answer is that there are four possible extra costs imposed by user mode: extra context switches between user and kernel mode, the overhead added by making system calls, possibly extra memory copies between user and kernel space, and possible extra CPU time (and perhaps things like TLB misses) that running in user space requires. User space memory can also get swapped out (well, paged out), but I'm going to assume that your machine is set up so that this doesn't happen.
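As a rough illustration of the per-system-call cost mentioned above, here is a minimal sketch that reads the same file once with many small `os.read()` calls and once with large ones. The function names (`read_all`, `compare`), the 1 MiB file size, and the 512-byte chunk size are arbitrary choices for this example, and Python's own interpreter overhead is included in the timings, so treat the numbers as indicative only.

```python
import os
import tempfile
import time


def read_all(path, chunk_size):
    """Read a whole file with unbuffered os.read() calls.

    Returns (data, number_of_read_calls, elapsed_seconds).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        calls = 0
        start = time.perf_counter()
        while True:
            buf = os.read(fd, chunk_size)  # one read(2) system call per iteration
            calls += 1
            if not buf:  # empty result means end of file
                break
            chunks.append(buf)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return b"".join(chunks), calls, elapsed


def compare(total_size=1 << 20, small=512):
    """Read the same temporary file with small chunks, then with one big chunk size."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(total_size))
        path = f.name
    try:
        small_result = read_all(path, small)
        big_result = read_all(path, total_size)
    finally:
        os.unlink(path)
    return small_result, big_result


if __name__ == "__main__":
    (_, calls1, t1), (_, calls2, t2) = compare()
    print(f"{calls1} calls: {t1 * 1e3:.2f} ms; {calls2} calls: {t2 * 1e3:.2f} ms")
```

Both runs move the same bytes out of the page cache; the difference in elapsed time is mostly the cost of crossing into the kernel and back a couple of thousand extra times.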
Let's assume that all of these add some overhead. The big question is when this added overhead matters and I think that the general answer has to be 'when a kernel mode server is already running on the edge of the hardware's performance limits'. If a kernel server has lots of extra room in terms of CPU performance, memory bandwidth, and response latency then adding extra overhead to all of them by moving to user space is not likely to make an observable performance difference to outside people. This is especially so if performance is dominated by outside factors such as the network speed, disk transfer speeds, and disk latencies (yes, I have iSCSI and NFS on my mind).
Or in short if you can easily saturate things and reach the performance limits imposed by the hardware, user versus kernel mode isn't going to make a difference. Only when a kernel mode version is having trouble hitting the underlying performance limits is the extra overhead of a user mode server likely to be noticeable.
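The argument can be put as a back-of-the-envelope model: the rate a server delivers is the lower of what the hardware can sustain and what the CPU can process. The sketch below uses entirely hypothetical numbers (the hardware limit, the per-operation CPU costs, and the user mode overhead are all made up for illustration) and a hypothetical helper called `served_rate`.

```python
def served_rate(hw_limit_ops, cpu_seconds_per_op, cores=1):
    """Operations per second delivered: the lower of the hardware and CPU limits."""
    cpu_limit_ops = cores / cpu_seconds_per_op
    return min(hw_limit_ops, cpu_limit_ops)


# Hypothetical numbers, purely for illustration.
HW_LIMIT = 10_000       # ops/sec the disks and network can sustain
FAST_KERNEL = 20e-6     # CPU seconds per op, kernel mode server on a fast CPU
FAST_OVERHEAD = 10e-6   # extra CPU seconds per op from running in user mode
SLOW_KERNEL = 80e-6     # the same server on a much slower CPU
SLOW_OVERHEAD = 40e-6   # user mode overhead on that slower CPU

if __name__ == "__main__":
    # Fast CPU: both versions are capped by the hardware, so they look identical.
    print(served_rate(HW_LIMIT, FAST_KERNEL),
          served_rate(HW_LIMIT, FAST_KERNEL + FAST_OVERHEAD))
    # Slow CPU: the kernel version still reaches the hardware limit, but the
    # user mode version is now CPU-bound and falls short of it.
    print(served_rate(HW_LIMIT, SLOW_KERNEL),
          served_rate(HW_LIMIT, SLOW_KERNEL + SLOW_OVERHEAD))
```

In this toy model the user mode overhead is invisible from the outside until the CPU limit drops below the hardware limit, which is the situation described above. Real servers also care about latency, which this sketch ignores.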
An equally interesting question is why user mode servers used to be such a bad thing and what changed between then and now. As it happens, I have some views on this which I am going to boil down to point form answers:
- CPUs are much faster and generally lower-latency than in the old days, and especially they've gotten much faster compared to everything else (except possibly if you're using 10G Ethernet instead of 1G Ethernet).
- OSes have gotten better about system call speed, both at the context switch boundaries between user and kernel space and in the general system call code inside the kernel.
- OSes have gotten better system call APIs that require fewer system calls and have more efficient implementations.
- OSes have devoted some work to minimizing memory copies between kernel
and user space for networking code, especially if you're willing to be
- we now understand much more about writing highly efficient user level code for network servers; this is part of what has driven support for better kernel APIs.
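To make the points about system call APIs and memory copies a little more concrete, here is a minimal sketch of two such interfaces as exposed by Python on Unix systems: `os.writev()`, which sends several buffers in one system call, and `socket.sendfile()`, which uses the kernel's `sendfile()` where available so file data does not have to pass through a user space buffer. The helper names (`send_with_writev`, `send_file_zero_copy`, `demo`) and the payloads are made up for this example, and it assumes a Unix-like system with `socket.socketpair()`.

```python
import os
import socket
import tempfile


def send_with_writev(sock, header, body):
    """Send header and body with one writev(2) call instead of two send() calls.

    A real server would handle partial writes; these buffers are small enough
    that this sketch ignores them.
    """
    return os.writev(sock.fileno(), [header, body])


def send_file_zero_copy(sock, path):
    """Send a file with socket.sendfile().

    This uses os.sendfile() where the platform supports it and falls back to
    plain send() otherwise.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo():
    """Push a header, a body, and a file through a local socket pair."""
    a, b = socket.socketpair()
    with a, b, tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"file payload\n")
        tmp.flush()
        n1 = send_with_writev(a, b"HDR:", b"body\n")
        n2 = send_file_zero_copy(a, tmp.name)
        a.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
        received = b""
        while chunk := b.recv(4096):
            received += chunk
    return n1, n2, received


if __name__ == "__main__":
    print(demo())
```

Neither call does anything a loop of `read()` and `write()` could not, but each one removes user/kernel crossings or copies from the hot path, which is the kind of improvement these points are about.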
The short, general version of this is simply that it became much easier to hit the hardware's performance limits compared to how it was in the past.
Some of the old difference may have been (and still be) pragmatic, in that kernel developers generally have no choice but to write more careful and more efficient code than general user level code. Partly this is because the user level code can take various easy ways out that aren't available to the kernel code; by running in a constrained environment with various restrictions, the kernel forces developers to consider various important issues that user level code can just brush under the carpet.