Linux's FHS is not the right answer for where to put data directories

January 15, 2011

I think that the FHS is a good attempt, and any standard like it certainly has to say something about where programs like databases and web servers should put their data files by default. But its answer is the wrong one for serious production use. There you really do want to relocate the data directories of various programs to different places, even in the face of things like SELinux, which can make relocating them painful.

(Of course I argue that the solution is to turn SELinux off.)

For a start, the FHS puts all of these data directories under /var. This is sort of forced on the FHS by its desire to not invent lots of new directory hierarchies and by the general role assigned to /var, but that doesn't make it a good idea. If you are doing anything significant, you really want to put your various pieces of data into their own filesystems so that you can size and manage them separately (and so that /var filling up doesn't take out your database, and your database filesystem filling up doesn't take out your system logging).

(If you're worried about having to statically partition your systems, just use LVM and leave some unallocated space in your LVM volume group, so that you can grow whichever logical volume turns out to need it.)
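As a rough illustration of the "separate filesystems" point, here is a minimal Python sketch that checks whether two directories actually live on the same filesystem and how full that filesystem is. The function names are made up for this example, and comparing device numbers is only an approximation of "same filesystem" (some setups, such as bind mounts, can blur it).

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths report the same device number.

    This is an approximation: it compares st_dev from stat(), which is
    how tools usually detect filesystem boundaries.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def free_fraction(path):
    """Return the fraction of blocks still available to ordinary users
    on the filesystem that holds path (0.0 if it reports no blocks)."""
    st = os.statvfs(path)
    if st.f_blocks == 0:
        return 0.0
    return st.f_bavail / st.f_blocks


# Hypothetical usage; the database path is only an example:
#   same_filesystem("/var", "/var/lib/mysql")
#   free_fraction("/var")
```

If the first call comes back true for your database directory, then /var filling up and your database filling up are the same event.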

Once you're using separate filesystems, you don't want to nest them inside /var; it's harder to manage, more fragile, and generally more annoying. Putting your new filesystems closer to the root of the filesystem tree, generally in some local directory structure of your choice, is a lot easier in the long run. This is definitely the case if you think that you will ever want to NFS-mount any of these filesystems.

(Also, increasingly people are building systems that don't cope well with filesystem mount ordering dependencies more complicated than 'mount the root filesystem first'.)
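To make the nesting issue concrete, here is a small Python sketch that takes a list of mount points and reports which ones sit inside another non-root mount, and so depend on that other filesystem being mounted first. The function names are invented for this example, and the /proc/mounts reader assumes Linux and does not decode escaped characters (such as spaces written as \040) in mount point names.

```python
import posixpath


def nested_mounts(mount_points):
    """Map each mount point to the closest non-root mount point that
    contains it. Mount points directly under "/" are left out, since
    they only depend on the root filesystem."""
    points = sorted({posixpath.normpath(p) for p in mount_points})
    result = {}
    for p in points:
        parents = [
            q for q in points
            if q != "/" and q != p and p.startswith(q + "/")
        ]
        if parents:
            # The longest containing path is the immediate parent mount.
            result[p] = max(parents, key=len)
    return result


def read_mount_points(path="/proc/mounts"):
    """Read mount points (the second field of each line) from a
    Linux /proc/mounts style file."""
    with open(path) as f:
        return [fields[1] for fields in (line.split() for line in f)
                if len(fields) >= 2]


# Hypothetical usage on a Linux machine:
#   nested_mounts(read_mount_points())
```

A database filesystem mounted at /var/lib/mysql shows up here as depending on /var; the same filesystem mounted at something like /data/db does not show up at all.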

The final reason to avoid /var is precisely because it is the FHS standard. To summarize what could be a long entry, if you care about stability and change management you want to be in exclusive control of your data directories; you don't want to be sharing them with anything else, including the native package system. Standard data directories in /var/ are inevitably shared between what you're doing to them and what the package system is doing to them.

(Yes, perhaps your package system is not supposed to touch files in /var after you've started changing them. Quick, is it smart enough to not restore files that you've deliberately removed? Do you have to remember to do some magic dance when you remove a file just to be sure that it won't come back again in a package update? You can likely make the standard /var directory work, but using your own directory is much simpler and more resilient against the inevitable oversights.)

None of this applies if you're not going to change the contents of a program's /var directory yourself (either by directly editing things or by working through the program). Unless they're large enough to cause problems or need special performance, you might as well leave things like nameserver caches, mailer spool areas, and so on in /var.

