2022-12-16
A practical issue with YAML: your schema is not actually documentation
These days, YAML is used as the configuration file format for an increasing number of systems that I need to set up and operate for work. I have my issues with YAML in general (1, 2), but in the process of writing configuration files for programs that use YAML, I've found an entirely practical one, which I will summarize this way: a YAML schema description is not actually documentation for a system's configuration file.
It's entirely possible to write good documentation for a configuration system that happens to use YAML as its format. But while showing people the schema is often easy, it's not sufficient. Even a complete YAML schema merely tells the reader what values can go where in your configuration file's structure. It doesn't tell them what those values mean, what happens if they're left out, why you might want to set (or not set) certain values, or how settings interact with each other.
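To make the gap concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the setting names (poll_interval, poll_timeout), the defaults, and the program they would belong to are assumptions, not any real system's configuration. It puts a bare schema dump next to the questions that documentation has to answer for each setting, and reports which settings have a type but nothing else.

```python
from dataclasses import dataclass

# What a bare schema dump tells you: names and types, nothing else.
# (poll_interval and poll_timeout are invented for this sketch.)
SCHEMA_ONLY = {
    "poll_interval": "<duration>",
    "poll_timeout": "<duration>",
}


@dataclass
class SettingDoc:
    """The questions documentation has to answer for one setting."""
    name: str
    meaning: str         # what the value actually does
    if_omitted: str      # what happens when the key is left out
    when_to_set: str     # why you would (or would not) change it
    interactions: str    # how it relates to other settings


# Hypothetical documentation for one of the two settings.
DOCUMENTED = [
    SettingDoc(
        name="poll_interval",
        meaning="How often the hypothetical poller contacts each target.",
        if_omitted="Defaults to 60s.",
        when_to_set="Lower it for fast-changing targets; raise it to cut load.",
        interactions="Must be longer than poll_timeout.",
    ),
]


def undocumented(schema, docs):
    """Return schema keys that have a type but no real documentation."""
    documented = {d.name for d in docs}
    return [key for key in schema if key not in documented]


def render(doc, schema):
    """Format one setting the way a reference page might."""
    return "\n".join([
        f"{doc.name}: {schema[doc.name]}",
        f"  Meaning: {doc.meaning}",
        f"  If omitted: {doc.if_omitted}",
        f"  When to set: {doc.when_to_set}",
        f"  Interactions: {doc.interactions}",
    ])


if __name__ == "__main__":
    print(render(DOCUMENTED[0], SCHEMA_ONLY))
    print("Schema-only settings:", undocumented(SCHEMA_ONLY, DOCUMENTED))
```

The first line of each rendered entry is all that a schema dump gives the reader. The four lines under it are the part that has to be written by hand.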
(And people don't even always do complete YAML schemas. I've seen more than one where the value of some field is simply documented as '<string>', when in fact only a few specific strings have meaning and others will cause problems.)
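As a rough illustration of the '<string>' problem, the following Python sketch assumes a hypothetical log_format setting whose schema says only that it is a string, while the program really understands just three values. The setting name and the allowed values are made up for the example.

```python
# Hypothetical: the published schema says only "log_format: <string>",
# but the program actually understands just these values.
ALLOWED_LOG_FORMATS = ("json", "logfmt", "text")


def passes_schema(value):
    """The check implied by the '<string>' schema entry."""
    return isinstance(value, str)


def check_log_format(value):
    """The check the program effectively performs on the value."""
    if not isinstance(value, str):
        raise TypeError("log_format must be a string")
    if value not in ALLOWED_LOG_FORMATS:
        allowed = ", ".join(ALLOWED_LOG_FORMATS)
        raise ValueError(
            f"log_format {value!r} is not supported; use one of: {allowed}"
        )
    return value


if __name__ == "__main__":
    for candidate in ("json", "xml"):
        print(candidate, "passes schema:", passes_schema(candidate))
        try:
            check_log_format(candidate)
            print(candidate, "is accepted")
        except ValueError as err:
            print("rejected:", err)
```

Here 'xml' satisfies the documented schema and still fails, which is exactly the case a reader can't anticipate when the schema is all they have to go on.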
I don't know why just dumping out your YAML schema is so popular as a way to do configuration file 'documentation'. Perhaps it's because you have to do it as a first step, and once you've done that it's attractive to add a few additional notes and stop, especially if you think the names of things in your schema already make it clear what they mean. Good documentation is time-consuming and hard, after all.
I suspect that this approach of reporting the schema and stopping is used for YAML things other than configuration files, but I haven't encountered such things yet. (I've only really encountered YAML in the context of configuration files, where it's at least better than some of the alternatives I've had to deal with.)
(All of this assumes that your configuration file is as simple as a set of keys and simple values. Not all configuration files are so simple, but systems with more complex values tend to write better documentation. Possibly this is because a dump of the schema is obviously insufficient when the values can be complex.)