The reasoning problem with describing things with a programming language

February 19, 2014

Every so often, systems need to have things described to them. Configuration files are one obvious example, but there are plenty of others: how to build software, what to do when packages are installed or removed, and how to start and stop services. When you're creating such a system it's tempting to make the description a program written in some programming language, partly because this is often the easy way to create a domain specific language and partly because it's easy to think of what you are doing less as describing things and more as making things happen.

Unfortunately there is a problem with this. The more you describe things through a programming language (and the more general the programming language is), the less your system can determine things about the description from the outside. Actually, let's call the description what it is: a script. Often you can't determine very much about what the script does without in some way executing the script itself. If you're unlucky, you can't determine much without the script actually running for real and doing whatever it does. And even if you have a good 'dry run' mode you generally can't capture why the script did what it did except at a very low level of 'the script made this series of conditional decisions before it acted'.

What this leaves you with is a relatively black box and the problem with black boxes is that there's not much you can do to reason about them. For example, let's take package management. Suppose you have a bunch of similar packages to install or update and they have a bunch of postinstall scripts between them. Are there common operations across these postinstall scripts that you can perform only once, such as an expensive regeneration of a system-wide MIME database? You don't know. You don't even necessarily know this if the postinstall scripts are all identical (depending in part on package manager semantics).
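As a minimal sketch of the alternative, suppose (hypothetically) that each package declared its postinstall needs as named triggers instead of burying them in a script. The `Package` class, the `triggers` field, and the trigger names below are assumptions made up for illustration; they are not any real package manager's format. Once the needs are declared, the system can see the overlap and plan each expensive step once:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    # Hypothetical package record, not any real package manager's format.
    name: str
    # Declared postinstall needs, visible without running anything.
    triggers: tuple = ()


def plan_triggers(packages):
    """Return each declared trigger once, in first-seen order."""
    seen = set()
    plan = []
    for pkg in packages:
        for trig in pkg.triggers:
            if trig not in seen:
                seen.add(trig)
                plan.append(trig)
    return plan


# Three similar packages that all want the MIME database regenerated.
packages = [
    Package("viewer-a", ("update-mime-database", "update-desktop-database")),
    Package("viewer-b", ("update-mime-database",)),
    Package("viewer-c", ("update-mime-database", "update-icon-cache")),
]

plan = plan_triggers(packages)
print(plan)
```

With opaque postinstall scripts there is nothing equivalent to `plan_triggers`; the only safe option is to run every script in full.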

While there are cases where you need the full arbitrary power of a programming language, you should try very hard to avoid resorting to a programming language for everything. White-box as much of your descriptions as possible and reserve the general programming language as a last ditch escape hatch. If people are using the programming language very much it's a sign that your descriptions lack enough power and you need to do something about that.
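One rough way to picture this is a description format with ordinary declared fields plus a single optional slot for arbitrary code. Everything in this sketch is hypothetical (the `ServiceDescription` class, its fields, the `custom_start` hook, and the example paths); it only illustrates that when the escape hatch is an explicit field, you can at least count how often people reach for it:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ServiceDescription:
    # Hypothetical description format, for illustration only.
    name: str
    command: str
    restart_on_failure: bool = False
    # Escape hatch: arbitrary code the system cannot reason about.
    custom_start: Optional[Callable[[], None]] = None


def escape_hatch_ratio(descriptions):
    """Fraction of descriptions that rely on the opaque escape hatch."""
    if not descriptions:
        return 0.0
    opaque = sum(1 for d in descriptions if d.custom_start is not None)
    return opaque / len(descriptions)


def legacy_start():
    # Stand-in for whatever arbitrary work a start script might do.
    pass


services = [
    ServiceDescription("web", "/usr/sbin/example-webd", restart_on_failure=True),
    ServiceDescription("mail", "/usr/sbin/example-maild"),
    ServiceDescription("db", "/usr/sbin/example-dbd"),
    ServiceDescription("legacy", "", custom_start=legacy_start),
]

ratio = escape_hatch_ratio(services)
print(f"{ratio:.0%} of services use the escape hatch")
```

If that ratio keeps climbing, it suggests the declared fields are missing something people need.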

(There is also a usability problem with using programming languages for configurations and descriptions.)

Sidebar: gradations of describing things

The best way of describing things is declaratively: you directly declare X, Y, and Z.

The second best way is a procedural language that winds up making declarative statements. You run a chunk of code and it ends up declaring X, Y, and Z. Configuration files written in programming languages often wind up doing this (where the 'declaring' may be in the form of, say, setting specific variables). Among other issues, this suffers from the problem that you know the end declarations but you don't know very much about why they wound up that way.

The worst way is a procedural language that takes actions; the code just does X, Y, and Z. Here you can only discover what is being described by running the code and watching what it does (if you even bother watching, as opposed to just standing back and letting it act).
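To make the three gradations concrete, here is a small sketch of the same description written each way. The setting names (`port`, `workers`), the `cpu_count` parameter, and the logged action strings are all invented for illustration, and a list stands in for real side effects.

```python
# 1. Declarative: the description is plain data you can read directly.
declarative = {"port": 8080, "workers": 4}


# 2. Procedural code that winds up making declarations.
def procedural_config(cpu_count):
    settings = {"port": 8080}
    # The reason for this choice is not recorded in the result.
    settings["workers"] = 4 if cpu_count >= 4 else 1
    return settings


# 3. Procedural code that takes actions (a list stands in for side effects).
def imperative(actions_log):
    actions_log.append("listen on 8080")
    actions_log.append("spawn 4 workers")


resulting = procedural_config(cpu_count=8)
log = []
imperative(log)

print(declarative)
print(resulting)
print(log)
```

In the second case you can inspect `resulting` after the fact but not why `workers` came out as it did; in the third, the only record of what was described is whatever you captured while the code ran.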

Comments on this page:

By dozzie at 2014-02-19 04:35:07:

this is often the easy way to create domain specific languages

Warning: what you describe is not a domain specific language. It's merely a domain specific API in a general purpose language. A DSL is when you create a language dedicated to use in a specific field. Matlab is a DSL, while Rake and SCons are not (a rakefile is just a bunch of Ruby code).

This is a typical mistake, mostly made by Rubyists. They mistake a clever library interface for a separate language.

By the way, CFEngine and Puppet are both DSLs, while Chef is a framework with no associated DSL.


Last modified: Wed Feb 19 00:34:30 2014