Why XML is terrible for configuration files

February 28, 2010

A lot of things get called 'configuration files', so I want to be specific: I mean the sort of configuration files that have three primary uses. They're written by people, used by programs, and later read by people who are trying to figure out what the programs are set up to do. These are the kind of configuration files that XML is terrible for.

(There's a whole ecology of 'configuration files' that are generated by one program and consumed by others, and are pretty much never touched by people. I don't care what format they're in, and there are perfectly sensible reasons to use XML for them.)

The reasons that XML is terrible for configuration files are right there in the description of what these sorts of configuration files are for. Of those three uses, the only one that XML makes easy is being used by programs. XML is famously difficult and annoying to write by hand, and it is equally hard to read (partly because it is so verbose; excess verbosity causes people to lose track of where they are). Common ways to structure data in XML files make this even worse, because they tend to be designed for the convenience of programs, not to be comprehensible to people.
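
To make the verbosity point concrete, here is a rough sketch in Python that loads the same three settings from an XML file and from a simple sectioned 'key = value' file. Everything in it is made up for illustration: the setting names, the values, and the particular section/property XML layout are assumptions, not taken from any real program. The layout is just one plausible example of a structure chosen for the convenience of the parser.

```python
import configparser
import xml.etree.ElementTree as ET

# Hypothetical settings for an imaginary server; none of these names
# come from a real program.
XML_TEXT = """\
<configuration>
  <section name="server">
    <property name="host" value="example.org"/>
    <property name="port" value="8080"/>
  </section>
  <section name="logging">
    <property name="level" value="warning"/>
  </section>
</configuration>
"""

# The same settings as a sectioned 'key = value' file.
INI_TEXT = """\
[server]
host = example.org
port = 8080

[logging]
level = warning
"""


def load_xml(text):
    """Parse the XML form into {section: {key: value}}."""
    root = ET.fromstring(text)
    return {
        section.get("name"): {
            prop.get("name"): prop.get("value")
            for prop in section.findall("property")
        }
        for section in root.findall("section")
    }


def load_ini(text):
    """Parse the 'key = value' form into {section: {key: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


xml_settings = load_xml(XML_TEXT)
ini_settings = load_ini(INI_TEXT)

# The program ends up with identical data either way ...
print(xml_settings == ini_settings)
# ... but the person has noticeably more text to write and read in the XML.
print(len(XML_TEXT), len(INI_TEXT))
```

Both loaders are only a few lines long, so the program gains little from the XML version. The extra cost lands entirely on the person who has to write the file and later read it.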

General XML editors improve the situation somewhat, but I don't think they do much to make it genuinely easy for people to write and read XML. This goes double if your XML format has data structuring issues.

XML is also prone to a particular disease, which is best illustrated by asking a question: is your XML format actually documented, in full detail, in a way that is at least as good as the Atom feed format specification or the description of your favorite program's regular configuration file? All too often the answer is that it is not, because people have the peculiar impression that using XML with verbose element and attribute names plus some sketchy documentation is sufficient.

(Please note that a DTD is not documentation. Try again.)

This issue cannot be solved by creating a nice user-friendly program to create and maintain the XML configuration file. If you do this, what you have really done is create a program without a real configuration file, one that is instead configured only through a GUI. And you still have the documentation problem; it's just that you now have to document the effects of the program instead of the configuration file.

(For bonus points, this configuration process is generally asynchronous so you can't immediately see the effects of your configuration changes.)

Comments on this page:

From at 2010-03-01 04:46:50:

I really agree with you, and I have been really annoyed ever since XML became fashionable and started being used more and more for configuration files!

From at 2010-03-01 13:47:44:

I find YAML an adequate compromise -- unambiguously machine readable (although sometimes annoying in specifics) but reasonably natural for people to read and write.

In truth, what I think of as "the Windows INI format" (also used by Samba, puppet, and probably plenty of other non-Windows systems) may win for this. YAML can be parsed into more interesting structures, but for most configuration files, that's unimportant, and just having a sectionable set of keyword = value settings is probably plenty.


Written on 28 February 2010.

Last modified: Sun Feb 28 22:28:30 2010