Why dynamic linking
Originally, shared libraries and dynamic linking were sold as the way to make (newly) bloated libraries tolerable: you'd only have one copy of the library on disk and in memory, no matter how many Motif applications you had running. (Motif specifically, and GUI libraries in general, were the original poster children for shared libraries.)
As R. Francis Smith notes in his comment on my previous entry, these reasons are not entirely relevant any more. Disk space is cheap, and (although I haven't measured) memory usage by libraries is not likely to be the big memory consumer for modern applications.
One big reason shared libraries and dynamic linking persists is simply that it's there. Having gone through all the effort to implement it, no one is going to stop using it. (There's a general lesson here.)
But there are three big things that shared libraries are good for these days:
- they are the underpinnings of dlopen(), which has significant uses in its own right.
- one step fixes for bugs and security problems in library routines.
- picking optimized CPU and system dependent routines at runtime.
The latter two are examples of runtime substitution, in these cases of
non-buggy functions and of optimized functions respectively. Runtime
substitution has also been used as a debugging aid, for example to
transparently add a tracking
malloc() to programs that are suspected
of having memory leaks, and there are probably applications using it
to pull off other tricks.
The dynamic linking tax
Most people will tell you that dynamic linking is an unalloyed good, or at least that any effects on performance are small (for simple programs that don't cascade to a huge set of shared libraries). This isn't necessarily so.
A long time ago, back when dynamic linking was new and people were
suspicious of it, I conducted some timings of how dynamic linking
affected the speed of
fork(). To my surprise, the impact was
significant, and on the hardware of the day it was actually worth
static-linking my shell.
Today, I dug up my test program from 1991 (which just repeatedly fork()s, has the child immediately exit(), and the parent wait()s for the child), and measured how much slower the dynamically linked version ran on various systems that I have convenient access to. The results are:
- Solaris 9 on an Ultra 10: 2.82 times worse.
- FreeBSD 4.10 on an SMP Pentium II: 2.12 times worse.
- FreeBSD 3.4 on an SMP Pentium III: 2.48 times worse.
Linux needs a table:
|Kernel|CPU|How much worse|
|------|---|--------------|
|2.6.x|SMP Pentium III|2.45 times|
|RHEL 4 2.6.9-derived|64-bit SMP Opteron|2.72 times|
|FC4 2.6.13-derived|Pentium 4|1.70 times|
|2.4.32-rc1|Pentium III|1.84 times|
As an experiment, I statically linked the program with dietlibc as well as glibc on the two x86 machines. On the Athlon the dietlibc version was 7.7% faster; on the SMP P3 it was 10% faster. (The ratios in the table are against the static glibc version.)
I'm surprised that the SMP machines didn't pay a worse penalty than uniprocessor machines. It's annoying that Linux still has about the same penalty as Solaris, but on the other hand Linux forks pretty fast to start with; it's in the hundred microsecond range even for dynamically linked code.
One anomaly is that odd things seem to start happening on the FreeBSD 4.10 machine as the number of forks gets higher and higher; the execution times don't scale the way they should. (It's possible that some sort of PID or resource wrapping issue is responsible.)