Wandering Thoughts archives

2011-09-04

How some Unixes did shared libraries in the old days

Yesterday I wrote about how mmap() is the core of modern shared libraries. As it happens, some Unixes had shared libraries even before mmap() was created, which raises the question of how they did it.

As I mentioned yesterday, the real challenge with shared libraries is the relocation issue: how you deal with the same shared library having to be mapped at different addresses in different processes. The trick answer is not to do that. You may have heard of prelinking; the extreme version of prelinking is to 'prelink' all shared libraries by assigning each of them a static load address (and relocating them for that address), and then always load them at that address in every process. This completely eliminates the need to do any run-time relocation.

Figuring out the load address of each shared library is where it gets interesting. If you're only doing this on a handful of libraries, you can give each of them their own dedicated chunk of address space. If you have more than that, you have to start looking at executables to identify libraries that are never loaded together and so can have address space assignments that conflict with each other.
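(To make the bookkeeping concrete, here is a minimal Python sketch of one way you could hand out addresses. Everything in it is an assumption made up for illustration: the base address, the slot size, the library and program names, and the greedy strategy itself. I am not claiming any real system did it this way.)

```python
# Hypothetical sketch: give each library a fixed slot in a reserved region of
# address space, letting libraries that are never loaded together share a slot.
BASE = 0x300000       # assumed start of the region reserved for shared libraries
SLOT_SIZE = 0x80000   # assumed fixed size of each slot


def assign_slots(executables):
    """Map library name -> load address.

    executables maps an executable name to the set of libraries it uses.
    Two libraries may share a slot only if no executable uses both.
    """
    # Build the "loaded together" relation from the known executables.
    together = {}
    for libs in executables.values():
        for lib in libs:
            together.setdefault(lib, set()).update(libs - {lib})

    slots = {}
    # Greedy: most-constrained libraries first, lowest free slot wins.
    for lib in sorted(together, key=lambda name: (-len(together[name]), name)):
        taken = {slots[other] for other in together[lib] if other in slots}
        slot = 0
        while slot in taken:
            slot += 1
        slots[lib] = slot
    return {lib: BASE + slot * SLOT_SIZE for lib, slot in slots.items()}


# Made-up survey of which programs use which libraries.
executables = {
    "vi": {"libc", "libcurses"},
    "bc": {"libc", "libm"},
}
addresses = assign_slots(executables)
for lib in sorted(addresses):
    print(f"{lib}: {addresses[lib]:#x}")
```

In this made-up survey libc gets a slot to itself, while libm and libcurses end up at the same address because no program uses both.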

(If an executable later comes along that has a load address conflict because it uses two libraries that were previously never seen together, it loses; the kernel will refuse to run it or it will probably crash shortly after it starts running. This is one reason that this approach is something of a hack.)
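(The check that a loader or kernel would have to make here is just an interval overlap test. A hedged sketch in Python, where the table of load ranges and the library names are hypothetical and chosen so that two of them collide:)

```python
from itertools import combinations

# Hypothetical fixed load ranges: name -> (start address, size in bytes).
LOAD_RANGES = {
    "libc": (0x300000, 0x40000),
    "libm": (0x380000, 0x10000),
    "libcurses": (0x380000, 0x20000),  # deliberately overlaps libm
}


def find_conflicts(libs):
    """Return the pairs of libraries in libs whose load ranges overlap."""
    conflicts = []
    for a, b in combinations(sorted(libs), 2):
        a_start, a_size = LOAD_RANGES[a]
        b_start, b_size = LOAD_RANGES[b]
        # Two half-open ranges overlap if each starts before the other ends.
        if a_start < b_start + b_size and b_start < a_start + a_size:
            conflicts.append((a, b))
    return conflicts


print(find_conflicts({"libc", "libm"}))
print(find_conflicts({"libc", "libm", "libcurses"}))
```

The first program is fine; the second one is the unlucky executable that wants both libm and libcurses and so cannot be run with this address assignment.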

This is clearly not a very general or scalable solution. Typically it was adopted in a desperate attempt to reduce disk space and memory usage on small Unix systems, and so only really had to work for a small number of libraries (libc, libm, perhaps termcap and/or curses, and perhaps the X libraries on systems with X). If you could get it to work for your package's shared library, that was great; if not, well, you got to statically link your library into your programs and use up more disk space and memory.

(The Unix machine I remember seeing this on was the AT&T 3B1. I believe that similar hacks were done on other obscure early attempts to fit a full Unix setup on small personal computers.)

OldSharedLibraries written at 00:37:30

2011-09-03

The core of modern Unix shared libraries

The fundamental Unix development that enabled modern shared libraries is shared copy-on-write mmap(). That's not necessarily obvious, so let me walk through the logic of this.

There are two problems with shared libraries, with the second stemming from the first. The first problem is how to get them loaded into the process's address space at all, and it has many possible solutions. The easiest solution is actually to have the kernel do it, since after all the kernel is already mapping code and data from the executable; all it needs to do is map some additional things from another file (or several other files).

(Kernel loading was actually how some early shared library implementations worked, but that's another entry.)

The second problem comes from the first problem, and it is how to deal with shared libraries being mapped into different places in memory in different processes while still sharing as much physical memory between processes as possible. Here there are many fewer solutions and most of them are not very good (position-independent code has limits, for example). The only really good solution is to fix things up by applying runtime relocations to the shared library, using copy-on-write to de-share only the pages that need relocations.
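(How much sharing you lose depends on how many distinct pages the relocations touch, not on how many relocations there are. A small Python sketch of that arithmetic; the page size, the library length, and the relocation offsets are all invented numbers, and it assumes no single relocation straddles a page boundary.)

```python
PAGE_SIZE = 4096  # assumed page size; real systems vary


def pages_to_unshare(reloc_offsets, page_size=PAGE_SIZE):
    """Return the sorted page indexes that contain at least one relocation."""
    return sorted({offset // page_size for offset in reloc_offsets})


# Hypothetical library: 16 pages long, with relocations at these byte offsets.
library_pages = 16
relocs = [0x0010, 0x0FF8, 0x2004, 0x2100, 0x9ABC]
private = pages_to_unshare(relocs)
print(private)
print(f"{library_pages - len(private)} of {library_pages} pages stay shared")
```

Five relocations here dirty only three pages (0, 2 and 9), so the other thirteen pages can stay shared between every process using the library.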

In theory the kernel could do this relocation as it loaded each shared library into your process. In practice no one wants to have that much complicated code running in the kernel, so the relocation needs to happen in user space. The simplest way to do that is to have user space handle the entire job of loading shared libraries, and to do that user space needs a way to set up those shared copy-on-write mappings for shared libraries, i.e. it needs mmap() or something very close to it.
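(You can see the copy-on-write behaviour from user space without writing a dynamic linker. The following Python sketch uses the standard mmap module's ACCESS_COPY mode, which asks for a private copy-on-write mapping. The 'library' file, its layout, and the load base are all made up for illustration, and a real loader would of course be mapping real object files rather than patching one integer.)

```python
import mmap
import os
import struct
import tempfile

# Stand-in for a shared library file: 8 bytes of "code", then a 4-byte
# little-endian address slot that needs relocating. The layout is made up.
SLOT_OFFSET = 8
original = b"CODECODE" + struct.pack("<I", 0x1000)

fd, path = tempfile.mkstemp()
try:
    os.write(fd, original)
    # ACCESS_COPY gives a private copy-on-write mapping: writes change this
    # process's view of the pages but are never written back to the file.
    with mmap.mmap(fd, 0, access=mmap.ACCESS_COPY) as view:
        load_base = 0x40000000  # hypothetical address the library landed at
        (value,) = struct.unpack_from("<I", view, SLOT_OFFSET)
        # Apply the "relocation" by patching the slot in our private view.
        struct.pack_into("<I", view, SLOT_OFFSET, value + load_base)
        (relocated,) = struct.unpack_from("<I", view, SLOT_OFFSET)
    with open(path, "rb") as f:
        on_disk = f.read()
finally:
    os.close(fd)
    os.remove(path)

print(hex(relocated))        # the patched value this process sees
print(on_disk == original)   # the file itself is untouched
```

The process sees the relocated value while the file on disk is unchanged, which is exactly the property that lets other processes keep sharing every page that nobody has written to.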

(Doing all shared library loading in user space also allows for all sorts of useful flexibility and powerful features that would be awkward or out of place in a kernel implementation. For that matter, it allows the implementation itself to be replaced.)

SharedLibraryCore written at 00:36:05


