Wandering Thoughts archives

2012-07-25

The kernel memory addressing problem

One of the engineering issues in writing an operating system kernel is how your kernel gets access to physical memory. This requires some explanation, since on the surface you might think that this is easy; after all, the kernel runs with full access to the machine so how could it have problems accessing memory?

The simple answer is that today's kernels almost always run with virtual memory, not just for user processes but for themselves as well. Although they run with full privileges, kernels still have a virtual address space and (kernel) page tables that map the virtual addresses that the kernel uses into real physical addresses (both for RAM and for memory-mapped devices). This may be required by the CPU (once you turn on virtual memory it may always use page tables) and even if it's not, it's almost always more convenient (for example, the kernel code doesn't have to be relocated depending on what physical address it was loaded at). Once the kernel is using virtual addresses, you get the question of how to map physical memory to (or into) the kernel's virtual address space.

I am not going to try to provide a complete inventory of the different techniques that have historically been used, but there are two general extremes. The easiest situation is if your kernel address space is large enough to include all of the physical memory and address space of the machine with room left over. This allows you to simply establish a direct linear mapping for all of physical memory and often it will let you use huge pages (page table entries that map large amounts of contiguous physical memory in a single entry).

(You need extra room because you need some amount of virtual address space for things like the kernel code and data itself, since you want these to be at a constant spot.)
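As an illustration, with a direct linear map, translating between physical and kernel virtual addresses is simple constant arithmetic. This is only a sketch; the base address and the names here are made up, and the real layout is architecture- and kernel-specific.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical base of the kernel's direct map of all physical memory;
 * the actual value depends on the architecture and the kernel. */
#define DIRECT_MAP_BASE 0xffff888000000000ULL

/* With a direct linear mapping, physical-to-virtual translation (and
 * back) is just adding or subtracting a constant offset. */
static inline uint64_t phys_to_virt(uint64_t pa) {
    return pa + DIRECT_MAP_BASE;
}

static inline uint64_t virt_to_phys(uint64_t va) {
    return va - DIRECT_MAP_BASE;
}
```

Because the mapping is a fixed offset, contiguous physical memory stays contiguous in virtual space, which is what makes huge pages usable for it.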

The polar opposite of this is to explicitly map chunks of physical memory into the kernel address space as you need them and then unmap them afterwards. This generally creates a kernel interface that looks something like mmap(), because this is basically what you're doing. The obvious drawback of this approach is that kernel code has to explicitly manage these mappings, especially removing them when it doesn't need them any more (otherwise you 'leak' kernel address space). However, if you don't have enough (kernel) address space you don't really have any choice.
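A sketch of what such an mmap()-like kernel interface can look like: a small fixed window of kernel virtual address space, handed out one page at a time. All of the names and numbers here are invented, and real code would also install and remove page table entries and flush TLBs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define MAP_SLOTS 8
#define MAP_WINDOW_BASE 0xffc00000u

static uint64_t slot_phys[MAP_SLOTS];  /* which physical page each slot maps */
static bool slot_used[MAP_SLOTS];

/* Map a physical page into the window; returns its virtual address, or 0
 * if the window is full (the caller must unmap something first). */
static uint32_t kmap_phys(uint64_t pa) {
    for (int i = 0; i < MAP_SLOTS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = true;
            slot_phys[i] = pa;
            /* real code: install a page table entry here */
            return MAP_WINDOW_BASE + (uint32_t)i * PAGE_SIZE;
        }
    }
    return 0;  /* address space gets 'leaked' if callers forget kunmap() */
}

static void kunmap(uint32_t va) {
    int i = (int)((va - MAP_WINDOW_BASE) / PAGE_SIZE);
    /* real code: remove the page table entry and flush the TLB */
    slot_used[i] = false;
}
```

The fixed-size window is what makes forgetting kunmap() so dangerous: once all the slots are used up, further mappings simply fail.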

There are a number of things that make an explicit mapping approach less painful:

  • when the kernel is getting (and releasing) memory for its own use, you generally need a memory allocator anyways. Such an allocator is a natural central place to establish and release mappings, hiding this entire issue from all of its callers.

  • device drivers, which need to map physical memory in order to get access to memory-mapped devices, are generally long-lived; they can often establish the mapping when they're loaded or the device started and then hold it until the device is closed down.

  • if what the kernel is really doing is accessing the memory of a process, you need to take special steps anyways (to map from the process's virtual address space to physical memory, to ensure that the access is legal, and perhaps to page things back in or allocate memory). Managing a kernel mapping for the eventual page of physical RAM is in many ways the least of the work involved.
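To make the first point concrete, here is a toy sketch of an allocator as the single place that establishes and releases mappings. Everything here is a stand-in (the 'mapping' is faked with an ordinary array, and the names are hypothetical); the point is only the interface that callers see, which never mentions mappings at all.

```c
#include <assert.h>
#include <stdint.h>

#define NPAGES 4
static uint8_t page_mem[NPAGES][4096];  /* stand-in for physical pages */
static int page_free[NPAGES] = {1, 1, 1, 1};

/* Allocate a page of kernel memory; callers get back a usable virtual
 * address and never deal with the mapping themselves. */
static void *kpage_alloc(void) {
    for (int i = 0; i < NPAGES; i++) {
        if (page_free[i]) {
            page_free[i] = 0;
            /* real code: pick a free physical page, map it, return the VA */
            return page_mem[i];
        }
    }
    return 0;
}

static void kpage_free(void *va) {
    for (int i = 0; i < NPAGES; i++) {
        if (va == page_mem[i])
            page_free[i] = 1;  /* real code: unmap here, then free the page */
    }
}
```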

A common element in all of these cases is that the kernel often wants to do additional bookkeeping and checking while it's setting up these mappings. For example, you might want to prevent two device drivers from claiming that they both own the same chunk of physical memory.
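For instance, the bookkeeping for claims on physical memory can be as simple as an interval-overlap check. This is a sketch with invented names; real kernels track considerably more, such as who owns each region and why.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_REGIONS 16

struct region { uint64_t start, end; bool used; };
static struct region regions[MAX_REGIONS];

/* Claim [start, start+len) of physical address space; fails if it
 * overlaps a region some other driver has already claimed. */
static bool claim_phys_region(uint64_t start, uint64_t len) {
    uint64_t end = start + len;
    for (int i = 0; i < MAX_REGIONS; i++) {
        if (regions[i].used &&
            start < regions[i].end && regions[i].start < end)
            return false;  /* two drivers fighting over the same memory */
    }
    for (int i = 0; i < MAX_REGIONS; i++) {
        if (!regions[i].used) {
            regions[i] = (struct region){ start, end, true };
            return true;
        }
    }
    return false;  /* claim table full */
}
```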

(Some people would even argue that directly mapping all of physical memory by default is an invitation for kernel programmers to write sloppy code that skips these sort of necessary steps and thus bypasses important safety checks. This is probably especially so for device drivers, which stereotypically are often written by people who are not expert kernel programmers.)

PS: I suspect that there have been CPUs with instructions that let you explicitly use physical addresses and bypass virtual address translation. I don't know if any current CPUs work that way; it seems at least a little bit at odds with current CPU design trends.

tech/KernelAddressingProblem written at 23:21:15

My dislike for what I call 'perverse Test Driven Development'

There is a particular style of TDD that I will call 'perverse TDD' for lack of a better name. In perverse TDD, you are supposed to very literally write the most minimal code possible to pass a new test, even if the code is completely useless and artificial. The ostensible justification and excuse for this is that this makes sure you have tests for all of your code (taken from this site on TDD Django development, because reading its writeup on this is what pushed me over the edge today).
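To illustrate with a made-up example (not taken from the site mentioned): suppose your first test asserts that add(2, 2) is 4. The literally minimal code that passes that test doesn't add at all.

```c
#include <assert.h>

/* The 'perverse' minimal implementation: the one existing test was
 * assert(add(2, 2) == 4), so just return the expected constant. */
static int add_perverse(int a, int b) {
    (void)a; (void)b;
    return 4;  /* passes the one existing test and nothing else */
}

/* What you knew from the start you were going to write. */
static int add(int a, int b) {
    return a + b;
}
```

You then have to write a second test, say add(3, 5) == 8, just to force the real implementation into existence.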

I hate this approach and it drives me up the wall. I think it's a stupid and wasteful way to work, and besides it doesn't really achieve its stated goals. It's very hard (and wasteful) to test enough to defeat serious perversity in the code under test, and it's inefficient to repeatedly rewrite your code (always moving towards what you knew from the start that you were going to write) as you write and rewrite a series of slightly smarter and more specific tests. In fact it's more than merely inefficient; it's artificial make-work.

I see the appeal of having relatively complete tests, but I don't think perverse TDD is the way to get to it. If you need thorough tests, just write a bunch of tests at once and then write the real code that should satisfy them. If you're writing the real code and you find important aspects of it that aren't being tested, add more tests. Tests and code do not exist in an adversarial relationship; attempting to pretend otherwise is fooling yourself.

(Indeed, as the TDD Django writeup shows, at some point even people who do perverse TDD deviate from their strict and narrow path in order to write real code instead of the minimal code that passes the tests. This is because programmers are ultimately not stupid and they understand what the real goal is, and it is not 'pass tests'.)

I feel that perverse testing is in the end one sign of the fallacy that everything can or should be (unit) tested. There are plenty of things about your code that are not best established through testing and one of them is determining if you are testing for enough things and the right things.

programming/PerverseTDDDislike written at 01:41:31

