A bit about what life was like on Unix before shared libraries

August 15, 2011

If you look at it right, many MH commands are a relatively thin veneer over a large pool of shared functionality. For example, when you run show to display the current message there is a bunch of infrastructure that turns the concept 'the current message' into a filename for show to open, and of course this infrastructure is common across all MH commands that can work on 'the current message'. Similar bits of infrastructure, large and small, exist across a lot of what MH does (for example, clearly you want each MH command to accept the same arguments for specifying messages to operate on).

MH is written in C. The obvious way to implement all of this shared infrastructure in C is to put it in a library (or several libraries) and then have every MH command link to the infrastructure libraries that it needs (which is usually most of them). Since most of the functionality of MH has been factored out into these libraries, most of its code is library code. All of this is fine and works great on any system with shared libraries; the library code lives in shared libraries and the commands are tiny executables that use the shared libraries.

But Unix did not always have shared libraries, and people have written systems like this on pre-shared-library Unixes (in fact MH itself predates shared libraries). Back in those days, you had a problem: the total size of your system's executables was huge, because each executable was statically linked against these libraries and mostly consisted of duplicated code (wasting both disk space and memory at a time when both were precious).

There was no good solution to this, merely various unpleasant workarounds. One of them was used by the PBM system (at least as I remember it). Since the executable was the unit of sharing in a pre-shared-library world, PBM could be built so that it merged many of its separate commands into a single executable; the front end code in the executable figured out which command to run by inspecting argv[0]. My memory is that this did not involve refactoring the code, although it did involve contortions in the build process.
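As a rough illustration of the argv[0] trick, here is a minimal sketch in C. It is not PBM's actual code: the command names imgcat and imgflip, the applets table, and the base_name() and dispatch() helpers are all hypothetical names made up for the example. The idea is that each former program's main() is renamed to its own entry function, and a small front end picks one based on the name the executable was invoked under.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for what used to be each program's own main(). */
static int imgcat_main(int argc, char **argv)
{
    (void)argc; (void)argv;
    puts("running imgcat");
    return 0;
}

static int imgflip_main(int argc, char **argv)
{
    (void)argc; (void)argv;
    puts("running imgflip");
    return 0;
}

/* Table mapping an invocation name to its entry point (names are assumptions). */
struct applet {
    const char *name;
    int (*entry)(int argc, char **argv);
};

static const struct applet applets[] = {
    { "imgcat",  imgcat_main  },
    { "imgflip", imgflip_main },
};

/* Strip any leading directories, so "/usr/bin/imgcat" becomes "imgcat". */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Front end: look at argv[0] and run the matching command.
 * Returns 127 if the name is not one we know about. */
int dispatch(int argc, char **argv)
{
    const char *name = (argc > 0 && argv[0] != NULL) ? base_name(argv[0]) : "";
    size_t i;

    for (i = 0; i < sizeof applets / sizeof applets[0]; i++) {
        if (strcmp(name, applets[i].name) == 0)
            return applets[i].entry(argc, argv);
    }
    fprintf(stderr, "%s: unknown command name\n", name);
    return 127;
}

/* In the merged executable, main() would be nothing more than:
 *     int main(int argc, char **argv) { return dispatch(argc, argv); }
 */
```

The merged executable is then installed once and hard linked (or symlinked) under each command name, so all of the names share one copy of the library code on disk. The cost is the build contortion mentioned above: every command's main() has to be renamed at compile time so that the pieces can be linked together.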

(MH itself simply shrugged and used more disk space, perhaps partly because it was already using argv[0] for its own purposes.)

Disclaimer: I may be misremembering which package worked this way. I know that at least one well regarded Unix package did.


Comments on this page:

From 92.75.44.244 at 2011-08-15 09:43:00:

Having recently built a statically linked only Linux distribution, I noticed this of course. :) MH itself is "small" these days (5 MB total), but I now understand well why X11 pushed the development of dynamic linking. Every X11 app (xclock, xeyes, xterm...) had about 6 MB statically linked objects in there. (I guess back then, the archive files were organized a bit better. Also, stuff like libXft didn't exist. Still, it's a major source of overhead.)

From 216.16.225.194 at 2011-08-15 10:52:06:

There are still some things that use argv[0] to change what program they act as, although I don't know if they do it for the same reason. Eg, on OpenBSD cpio, pax and tar are the same file (at least as of 4.3; I'm a little behind on updating).

From 72.14.229.81 at 2011-08-15 12:57:03:

busybox is a single executable hardlinked to several names so that argv[0] determines which command to invoke.

From 12.94.77.210 at 2011-08-15 15:25:22:

MH of course is the example in the KornShell command and programming book http://www.prenhall.com/allbooks/ptr_0131827006.html as something that can be re-written as a series of shell scripts.

From 76.68.76.148 at 2011-08-15 22:38:44:

I remember at least one version of FreeBSD doing this in the mid-90s. They had to squeeze a runnable system image onto a 1.44 MB floppy disk so you could install the system. They took a dozen or so binaries in /bin (cat, ls, cp, chmod, mv, etc.) and built them as one static image which branched on argv[0]. Worked great, until I tried rebuilding the userland from source. Somehow I overwrote the common binary with a single program (probably /bin/cat) and I had to dance a bit to recover it.

Keith Browne

By cks at 2011-08-17 01:10:35:

That crack about MH is a common one to make (the Korn shell people are far from alone in doing so), but it's inaccurate for reasons that I decided to write about in MHComplexity.

Written on 15 August 2011.

Last modified: Mon Aug 15 00:18:43 2011