Configuration (and configuration files) is not and cannot be generic

August 19, 2021

In a comment on my entry on (not) using YAML for human-written configuration files, Wolfgang Richter asked what I thought of Dhall as a standard for configuration. I'm afraid that I have to rain on everyone's parade: I don't believe that any generic and general language can be used for the configuration files of programs that need complex configuration. (This is setting aside the separate problems with using a general language for configuration, such as the difficulty of reasoning about the result and the general lack of clarity.)

Programs with complex configuration needs are almost always expressing some sort of logic about some domain (and sometimes more than one), whether this is conditional logic or a description of transformations or something else. Both the logic of what can be expressed and done and the terms and elements of the domain are specific and custom to the program. This is what you see in firewall rules, whether OpenBSD PF or Linux's various generations of firewall systems; in Exim (in SMTP ACLs, routers, and other elements); in Prometheus for recording and alert rules, label rewriting, and more; and even in Apache once you start using its full capabilities.

Pretty much by definition, a generic, general language doesn't have either the specific logic and restrictions or the specific terms and elements of the program's domain. As such you cannot express this complex configuration directly in the language. Either you embed strings or data structures representing this logic in your general language (the YAML approach) or you use the language to verbosely fake the logic and terms with things like (apparent) subroutine calls (the configuring in a real language approach). In both cases the result lacks readability, clarity, and often error checking. You clearly get the most readable and clear configuration file from a (good) custom configuration language that lets you directly express the program's logic about its domain.
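To make the contrast concrete, here is a minimal sketch of the same rule written all three ways. The rule syntax is made up for illustration (loosely PF-flavoured, not real PF), and `generic_rule`, `block_rule`, `parse_rule`, and the field names are all hypothetical.

```python
import re

# 1. The "YAML approach": generic data, as a YAML or JSON loader would return it.
#    The actual logic hides in a string that the generic format cannot check.
generic_rule = {"action": "block", "match": "proto tcp and port 23"}

# 2. The "real language approach": the logic is faked with calls. Shown only as
#    a comment, because block_rule, TCP and port are hypothetical helpers.
#    rules.append(block_rule(proto=TCP, port=port(23)))

# 3. A made-up custom syntax that states the rule directly.
custom_rule = "block tcp port 23"

RULE_RE = re.compile(r"(pass|block)\s+(tcp|udp)\s+port\s+(\d+)")

def parse_rule(line):
    """Parse one rule in the hypothetical custom syntax, rejecting bad input."""
    m = RULE_RE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"cannot parse rule: {line!r}")
    port = int(m.group(3))
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return {"action": m.group(1), "proto": m.group(2), "port": port}

parsed = parse_rule(custom_rule)
```

The custom form is both the shortest to read and the only one where a typo such as `blok` is rejected by the format itself. In the generic form, someone still has to write a parser for the `match` string.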

(With that said, it's quite possible to create custom configuration languages that aren't very good at this. Language design is a skill and configuration files are far too often designed for the program to consume more than for people to read.)

Programmers love generality, and to some degree don't want to do language design, so I understand the eternal appeal of some universal language for program configurations. But no universal language like Dhall can be a genuinely good configuration language for a program with complex configuration needs.

(People who feel that their general configuration language of choice can be this are invited to write a moderately complicated Apache virtual host configuration (with several sets of permissions and proxying for different URLs and sub-URLs, and don't forget Let's Encrypt's HTTP authentication) or a set of OpenBSD PF rules in their language, and see how it comes out. It would make for some interesting blog posts for anyone so inclined.)


Comments on this page:

One way to look at it is that config is a UI. In terms of GUIs, there was some hope in the 90s that you could make a GUI toolkit that was good enough that you could do RAD with only standard widgets and not need anything custom. I think today most people will acknowledge that you can do 90% with standard widgets, but there will always be a leftover 10% that needs custom UI code.

OTOH, if you think about it, in terms of sheer serialization, you can store any data you want in an RDBMS. Even data that is better thought of as graph oriented or blob stores or whatever can be squeezed into an RDBMS. So too, you can probably express any config as an array of objects with string natural foreign keys to one another, i.e. blobs of YAML or JSON or TOML. It's just going to be a nightmare of unexpressed constraints like "field foo needs to be a regex if field bar is set to the string regex, otherwise it's a glob" etc.
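As a rough sketch of what such an unexpressed constraint looks like in practice, the check ends up in program code instead of in the format. The field names `foo` and `bar` come from the example above; `compile_matcher` and the dict layout are assumptions made for illustration.

```python
import fnmatch
import re

def compile_matcher(entry):
    """Return a predicate for one hypothetical config entry.

    `entry` is a plain dict, as a YAML, JSON or TOML loader would produce.
    Nothing in those formats can say that the meaning of `foo` depends on
    `bar`, so the rule has to be enforced here.
    """
    pattern = entry["foo"]
    if entry.get("bar") == "regex":
        try:
            rx = re.compile(pattern)
        except re.error as exc:
            raise ValueError(f"foo is not a valid regex: {exc}") from exc
        return lambda s: rx.search(s) is not None
    # Any other value of bar (or none at all): foo is treated as a glob.
    return lambda s: fnmatch.fnmatchcase(s, pattern)

is_log = compile_matcher({"foo": "*.log"})
is_numbered = compile_matcher({"bar": "regex", "foo": r"^log\.\d+$"})
```

A person reading the configuration file sees only two string fields. The relationship between them is visible only in this code, or in documentation if someone wrote it down.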

So the problem is a problem of UI. To square the circle, you could do things like Caddy does where the final format for all config is JSON, but there are various adaptors so that users can write their config in a nicer language or configure things through RPC calls. The trouble then is if you need to backpropagate a change from the JSON to the UI language, it can be difficult or impossible.
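A toy illustration of the adapter idea and of why going backwards is hard. This is not Caddy's real format or API: the `proxy PATH to UPSTREAM` syntax, the `adapt` function, and the JSON layout are all invented for the sketch.

```python
import json

def adapt(text):
    """Translate a made-up 'proxy PATH to UPSTREAM' language into JSON."""
    routes = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # comments are dropped here
        if not line:
            continue
        words = line.split()
        if len(words) != 4 or words[0] != "proxy" or words[2] != "to":
            raise ValueError(f"line {lineno}: expected 'proxy PATH to UPSTREAM'")
        routes.append({"path": words[1], "upstream": words[3]})
    return json.dumps({"routes": routes}, indent=2, sort_keys=True)

source = """
# internal API, ask ops before changing
proxy /api to localhost:8080
proxy /static to localhost:8081
"""
as_json = adapt(source)
```

The translation only goes one way. The comment, the blank lines, and the ordering choices a person made in `source` are gone from `as_json`, so an edit made to the JSON cannot be mapped back onto the original text without guessing.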

I agree with Carl that a separate tool should be used. To make my comment brief, look at configuring a Lisp program in Lisp. The configuration could describe itself on the same terms the program is described.


Last modified: Thu Aug 19 01:36:08 2021