Wandering Thoughts archives

2009-03-12

The problem with /var today

When /var was created, people took everything in /usr that got written to and just threw it all into one filesystem. After that, /var became the place that you put anything (besides config files and the like) that needed to change or be written to, regardless of why.

The problem is that /var has wound up with two very distinct sorts of data in it: private program data and public data. Private program data is the entire collection of caches, databases, and other tracking information that various programs use to do their jobs. Public data is everything that users and sysadmins create and look at, such as /var/mail, /var/log, user crontabs, and so on. (On some systems this may include web pages, SQL databases, and more.)

This matters because the two differ greatly in importance and need very different handling for things like backups and operating system upgrades. Fundamentally, you don't care about private program data as long as the program works right, and you probably actively want to discard it when you do things like reinstall the system or roll back to a previous system snapshot. However, you absolutely must preserve public data when you do things like reinstall the system.
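(To make the distinction concrete, here is a minimal sketch of how a backup script might sort /var paths into the two categories. The prefix lists are purely illustrative assumptions on my part, not an authoritative layout for any particular Unix; in particular, /var/spool/cron as the home of user crontabs and /var/lib/mysql as a SQL database location are guesses that vary between systems. Checking the public list first is what lets a public directory sit inside an otherwise private tree, which is exactly the commingling problem.)

```python
from pathlib import PurePosixPath

# Illustrative assumptions only; real locations differ between systems.
PUBLIC_PREFIXES = ("/var/mail", "/var/log", "/var/spool/cron", "/var/lib/mysql")
PRIVATE_PREFIXES = ("/var/cache", "/var/lib", "/var/run")


def _under(path, prefix):
    """True if path is prefix itself or lies somewhere beneath it."""
    p = PurePosixPath(path)
    q = PurePosixPath(prefix)
    return p == q or q in p.parents


def classify(path):
    """Return 'public', 'private', or 'unknown' for a path in /var."""
    # Public prefixes win, so public data nested inside a private
    # tree (e.g. a database under /var/lib) is still preserved.
    if any(_under(path, prefix) for prefix in PUBLIC_PREFIXES):
        return "public"
    if any(_under(path, prefix) for prefix in PRIVATE_PREFIXES):
        return "private"
    return "unknown"


def backup_set(paths):
    """Keep everything not known to be private program data."""
    return [p for p in paths if classify(p) != "private"]
```

(Anything the sketch can't classify is kept rather than dropped, on the theory that losing public data is far worse than backing up a stray cache.)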

That the two sorts of data are aggressively commingled in /var causes all sorts of practical problems for system management. Effectively, /var has been turned into both a system filesystem and user filesystem, and the two generally require very different and conflicting treatment. Attempts to patch this up in software are awkward.

(For example, Sun's Live Upgrade stuff goes to all sorts of contortions to try to copy some bits of your public data between various copies and snapshots of your system's /var.)

The obvious solution is to split /var into two filesystems, one for each sort of data. Unfortunately, changing Unix filesystem habits is a lot of work (and work that really needs to be done by Unix vendors in order for it to stick).

sysadmin/TwoVarsProblem written at 23:22:34

The not so secret history of /var

Originally, Unix had no /var; what is currently put there went into /usr instead (with some of it going into /etc), so you had /usr/log, /usr/spool, /usr/tmp, and so on. Remnants of this era still linger on in /etc, where you still find a certain number of frequently updated data files like /etc/passwd.

(One might sensibly ask why Unix had both /tmp and /usr/tmp. My guess is that it goes back to the days before /usr, and so /tmp had to be retained when /usr was added but at the same time people wanted a bigger scratch space, so /usr/tmp was created.)

Then along came the idea of diskless workstations (I believe originally from Sun). Even back then, /usr was the biggest system filesystem, so no one was really enthused about the idea of each diskless system having its own copy. Since at this point symlinks had been introduced, people came up with the idea of moving everything writable from /usr into a new filesystem, /var, and leaving symlinks behind so that people and programs could continue to use old paths like /usr/tmp. This left /usr read-only and shareable among all of your diskless clients, which saved a lot of disk space.

(Indeed, a shared /usr and the accompanying disk space savings are probably what made diskless clients viable in the first place.)

Over the years since then, the symlinks have been progressively removed on many systems, but you can still find them today on some systems that especially value backwards compatibility, for example Solaris 10.

In addition to moving things from /usr to /var, a certain number of things were relocated from /etc to /var. Practically speaking this was much less important, since you needed a separate / filesystem for each diskless client anyways, but it did create a culture where system daemons shouldn't normally write to /etc to store PID files and so on.

unix/VarDirectoryOrigin written at 00:57:10


