The other dynamic linking tax

April 5, 2006

I've already talked about one dynamic linking tax, but here's another one. Presented in illustrated form:

; cat true.c
#include <stdlib.h>
int main(int argc, char **argv)
{
    exit(0);
}
; cc -o true true.c
; cc -o true-s -static true.c
; diet cc -o true-d true.c
; ls -s true true-d true-s
  8 true    4 true-d  388 true-s
; strace ./true >[2=1] | wc -l
21
; strace ./true-s >[2=1] | wc -l
5
; strace ./true-d >[2=1] | wc -l
2

This is on a Fedora Core 2 machine. On a Fedora Core 4 machine the dynamically linked version makes 22 syscalls and the statically linked glibc version makes nine.

strace's output always has the initial execve() that starts the program being traced and we're explicitly calling exit(), so the dietlibc version is doing the minimum number of system calls possible. Everyone else is adding overhead; in the case of dynamic linking, quite a lot.

This makes a difference in execution speed too. The dynamically linked glibc version runs 1.38 to 1.47 times slower than the dietlibc version, and the statically linked version 1.06 times slower. Admittedly this is sort of a micro-benchmark; most real programs do more work than this before exiting.

I ran into this while trying to measure the overhead of a program that I wanted to be as lightweight and fast as feasible. strace turned up rather alarming numbers for the overhead involved in glibc (although I believe strace itself inflates the cost of system calls, so I'm not going to cite absolute numbers). So far I am being good and resisting the temptation to statically link it with dietlibc.

Sidebar: just what's going on with glibc?

The statically linked glibc version also calls uname() and brk() (twice). The dynamically linked version, well, let's let a table show the story:

calls  syscall
    5  old_mmap
    3  open
    2  mprotect
    1  munmap
    1  read
    2  fstat64
    1  uname
    2  close
    1  set_thread_area
    1  brk
   19  TOTAL

This table does not count the initial execve() or the final exit_group() (which glibc calls instead of exit()).

(Again, this is on a Fedora Core 2 machine. Your mileage will differ on different glibc versions. On FC4 the statically linked glibc version does a uname, 4 brks, and a set_thread_area.)

Written on 05 April 2006.
