The other dynamic linking tax
I've already talked about one dynamic linking tax, but here's another one. Presented in illustrated form:
    ; cat true.c
    #include <stdlib.h>
    int main(int argc, char **argv)
    {
        exit(0);
    }
    ; cc -o true true.c
    ; cc -o true-s -static true.c
    ; diet cc -o true-d true.c
    ; ls -s true true-d true-s
    8 true  4 true-d  388 true-s
    ; strace ./true >[2=1] | wc -l
    21
    ; strace ./true-s >[2=1] | wc -l
    5
    ; strace ./true-d >[2=1] | wc -l
    2
This is on a Fedora Core 2 machine. On a Fedora Core 4 machine the dynamically linked version makes 22 syscalls and the statically linked glibc version makes nine.
strace's output always has the initial execve() that starts the program being traced and we're explicitly calling exit(), so the dietlibc version is doing the minimum number of system calls possible. Everyone else is adding overhead; in the case of dynamic linking, quite a lot.
This makes a difference in execution speed too. The dynamically linked glibc version runs 1.38 to 1.47 times slower than the dietlibc version, and the statically linked version 1.06 times slower. Admittedly this is sort of a micro-benchmark; most real programs do more work than this before exiting.
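The post does not say how these timings were taken. As a rough sketch of one way to get comparable numbers, the function below runs a program many times with fork() and execv() and reports the mean wall-clock time per run. The name time_runs and its parameters are hypothetical, and the cost of fork() and waitpid() is included in every run, so ratios measured this way will understate the difference between the binaries themselves.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `path` with no arguments `runs` times and
 * return the mean wall-clock seconds per run. Returns -1.0 if a run
 * cannot be started or the program does not exit with status 0.
 */
double time_runs(const char *path, int runs)
{
    struct timespec start, end;
    char *const argv[] = { (char *)path, NULL };

    if (runs <= 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    for (int i = 0; i < runs; i++) {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            return -1.0;
        if (pid == 0) {
            execv(path, argv);
            _exit(127);         /* only reached if execv() failed */
        }
        if (waitpid(pid, &status, 0) < 0)
            return -1.0;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            return -1.0;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    double elapsed = (double)(end.tv_sec - start.tv_sec)
                   + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return elapsed / runs;
}
```

Calling time_runs("./true", 10000) and time_runs("./true-d", 10000) from a small main() and dividing the two results gives a slowdown ratio of the kind quoted above, subject to the fork overhead caveat.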
I ran into this while trying to measure the overhead of a program that I wanted to be as lightweight and fast as feasible. strace turned up rather alarming numbers for the overhead involved in glibc (although I believe strace itself inflates the cost of system calls, so I'm not going to cite absolute numbers). So far I am being good and resisting the temptation to statically link it with dietlibc.
Sidebar: just what's going on with glibc?
The statically linked glibc version also calls uname() and brk() (twice). The dynamically linked version, well, let's let a table show the story:
calls | syscall |
5 | old_mmap |
3 | open |
2 | mprotect |
1 | munmap |
1 | read |
2 | fstat64 |
1 | uname |
2 | close |
1 | set_thread_area |
1 | brk |
19 | TOTAL |
This table does not count the initial execve() or the final exit_group() (which glibc calls instead of exit()).
(Again, this is on a Fedora Core 2 machine. Your mileage will differ on different glibc versions. On FC4 the statically linked glibc version does a uname, 4 brks, and a set_thread_area.)
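A table like the one above can be built by hand from strace's output, but it is easy to automate. The sketch below assumes each syscall line has the usual strace shape of name(args...) = result and simply tallies the text before the first parenthesis. The names syscall_name, tally_line and struct tally are hypothetical, and lines that do not start with an identifier followed by a parenthesis (such as signal or exit notices) are skipped.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYSCALLS 64
#define MAX_NAME     32

/* One row of the table: a syscall name and how often it appeared. */
struct tally {
    char name[MAX_NAME];
    int  calls;
};

/*
 * Copy the syscall name from one strace line of the assumed form
 * "name(args...) = result" into out. Returns 1 on success, 0 if the
 * line does not start with an identifier followed by '('.
 */
int syscall_name(const char *line, char *out, size_t outlen)
{
    const char *paren = strchr(line, '(');
    size_t len;

    if (paren == NULL || paren == line)
        return 0;
    len = (size_t)(paren - line);
    if (len >= outlen)
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)line[i];
        if (!isalnum(c) && c != '_')
            return 0;
    }
    memcpy(out, line, len);
    out[len] = '\0';
    return 1;
}

/*
 * Add one strace line to a table holding n distinct syscalls so far.
 * Returns the new number of distinct syscalls. New names are dropped
 * once the table already holds MAX_SYSCALLS entries.
 */
int tally_line(struct tally *table, int n, const char *line)
{
    char name[MAX_NAME];

    if (!syscall_name(line, name, sizeof name))
        return n;
    for (int i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            table[i].calls++;
            return n;
        }
    }
    if (n < MAX_SYSCALLS) {
        strcpy(table[n].name, name);
        table[n].calls = 1;
        n++;
    }
    return n;
}
```

A driver would read strace's output line by line with fgets(), pass each line to tally_line(), and print the rows at the end; the execve() and exit_group() rows can then be dropped to match the table above.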