Why your program should have an actual configuration file

February 9, 2010

Every so often, someone says something like 'you know, our program has a configuration file but also supports runtime reconfiguration via some magic. Clearly this is wrong, so what we should do is get rid of our configuration file and just make sure the running state is persistent'. If they're feeling nice, they add that the running state will be saved as an XML file.

Every time people say this, sysadmins cry. Here is a very important thing for real deployments of your program in real environments: configuration files are a good thing because they are really easy to manage. Running state that is updated by applying changes (often non-idempotent changes) is much harder.
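
To make the idempotency point concrete, here is a minimal Python sketch. Everything in it is made up for illustration: `apply_file`, `apply_change`, and the `listeners`/`workers` settings are hypothetical and don't come from any real program. It only shows that installing a complete configuration gives the same result no matter how often you do it, while replaying a single change operation (say, because a deployment step got retried) may not.

```python
import copy

def apply_file(desired):
    """Install a complete configuration; the previous state is irrelevant."""
    return copy.deepcopy(desired)

def apply_change(state, op):
    """Apply one reconfiguration operation to the running state."""
    new_state = copy.deepcopy(state)
    if op[0] == "add_listener":
        new_state["listeners"].append(op[1])  # not idempotent: appends every time
    elif op[0] == "set":
        new_state[op[1]] = op[2]              # idempotent: same result on repeat
    else:
        raise ValueError(f"unknown operation: {op[0]}")
    return new_state

desired = {"listeners": [8080], "workers": 4}

# Pushing the same complete configuration twice is harmless.
once = apply_file(desired)
twice = apply_file(apply_file(desired))

# Replaying the same change operation is not.
running = {"listeners": [], "workers": 1}
running = apply_change(running, ("add_listener", 8080))
replayed = apply_change(running, ("add_listener", 8080))

print(once == twice)          # True
print(replayed["listeners"])  # [8080, 8080]
```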

First, let's get something out of the way: machine-generated, automatically updated XML files are not configuration files in any conventional sense that is useful to sysadmins. They are an internal persistence mechanism that may, perhaps, have vaguely useful and inspectable contents (but generally not). So XML or not, if you go down this route you do not have a configuration file; you have a program with configuration state that persists over reboots and restarts.

Let's inventory some of the things that you lose when you merely have persistent configuration state without actual configuration files:

  • you cannot configure the program unless it is actually running. Programs often have undesirable behavior when started in an unconfigured, misconfigured, or inaccurately configured state.

    Among other things, this means that you can't prepare alternate configurations in advance; you must build them on the fly.

    (Or you must build them on another machine or in another instance of the program, shut both down, and port the magic persistence database over in whatever form it is in, assuming that it does not have host or instance specific data buried in it that you must scrub out.)

  • you cannot atomically make a bunch of changes, having them all take effect at once by putting a new configuration file into place and restarting the program (well, unless there's an explicit 'batch changes together' mechanism). Instead you must make the changes one reconfiguration operation at a time. As with starting unconfigured, this can leave the program temporarily operating in a highly undesirable state. At a minimum, it's going to complicate planning changes.

  • corollary: you can't easily switch configurations or choose different configurations based on outside conditions.

  • automatically updating configuration files clash, potentially badly, with attempts to maintain configuration files through version control systems, automated deployment mechanisms, and so on.

  • it is (or should be) easier to understand a configuration that is written out in a configuration file than one that is the implicit result of applying a bunch of configuration change operations.

    (If it is not, let's be honest here: you need a better configuration file format.)

  • it is much easier to update configurations by providing new files than it is to update configurations by applying configuration changes. There are lots of mechanisms to put new files into place; there are very few to carefully run sequences of commands, keeping track of which ones have already been executed successfully.
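
As an illustration of the 'put a new file into place' approach from the list above, here is a rough Python sketch. It assumes a JSON configuration file and a POSIX-style filesystem where a rename within one directory is atomic; `install_config`, `load_config`, and the `server.json` settings are hypothetical names for this example, not any particular program's interface. A reader of the file sees either the complete old configuration or the complete new one, never a half-applied mix.

```python
import json
import os
import tempfile

def install_config(path, settings):
    """Write a complete config to a temp file, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".config-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(settings, f, indent=2, sort_keys=True)
            f.write("\n")
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # swaps in the whole new file in one step
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

def load_config(path):
    """Read the configuration back, as the program would at startup."""
    with open(path) as f:
        return json.load(f)

# Usage: install one configuration, then replace it wholesale with another.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "server.json")
    install_config(path, {"workers": 1, "listeners": [8080]})
    install_config(path, {"workers": 4, "listeners": [8080, 8443]})
    result = load_config(path)
    leftovers = sorted(os.listdir(d))

print(result)
print(leftovers)
```

The program still has to be told to reread or restart with the new file; the sketch covers only getting all of the changes into place at once.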

I could go on, but I think I'm going to stop now; I hope that you get the point. Configuration files don't exist merely because those other programmers are lazy people; they exist because they're actually a pretty good solution to a whole bunch of problems at once. Getting rid of them is almost never forward progress.

Comments on this page:

From at 2010-02-09 09:14:41:

This is one of the things that drove me nuts configuring our Pacemaker clusters. Heartbeat was insanely easy to template out from Puppet, while Pacemaker requires a long string of arcane "crm configure" commands to get anything done and the resulting CIB is completely unreadable.


By rdump at 2010-02-09 13:27:48:

The lack of a repeatable configuration file for setup is actually a security issue as well.

First, programs that have hidden state achieved using in-program option settings or the like are hard (if not impossible) to assure as running in the desired configuration. Errors (or deliberate misconfigurations) can go unnoticed until they are used to bite you in the rear.

In addition, when a change to program state is detected, it is hard to tell whether the cause was a legitimate-if-unexpected action or an adversary's action. Changes to configuration files controlled via cfengine or a source code management system present a clearer chain of control for assessing the inevitable alarms in any monitored system.

For those reasons, using virtualization technologies which rely upon in-program persistent state management for security-sensitive systems (e.g. authentication servers, DHCP servers, DNS servers) can give me stress-induced hives.

By James (trs80) at 2014-03-14 10:08:44:

I've recently seen the Dockerfile format, and that is basically entirely commands to run, which makes me want to run a mile from it. JuJu looks to be much the same. Then again, what is Puppet but a bunch of scripts?

Written on 09 February 2010.

Last modified: Tue Feb 9 00:34:14 2010