My problem with YAML's use of whitespace

April 30, 2019

Over on Mastodon, there was a little exchange:

@ceejbot:
current status: yaml

please send your thoughts & prayers to be with me at this difficult time

@cks:
It's impressive how yaml took Python's significant whitespace and somehow made it far worse.

I've been thinking about my remark since I made it, and I think I've finally put my finger on a large part of why I feel that way about the YAML I've written for Prometheus. Put as simply as I can, what makes Prometheus YAML such a bad experience here is that it uses deeply nested blocks and it's written without explicit de-indents. Making this worse, the standard indent is only two spaces, which blurs indent levels together.

Here, let me show you what I mean. This is the kind of structure I wind up working on all of the time in Prometheus's YAML configuration files:

groups:
- name: fred
  rules:

    # A big comment that in practice
    # is several lines long
    - ath: bth
      # Maybe another comment
      cth: a long string
      [...]
      dth:
        [...]
        eth: host
      gth:
        # This may have a comment too
        # even a multi-line one
        hth: a long string

If you can't really tell these indentation levels apart in the first place, well, that is one of the drawbacks of two-space indents being the cultural standard in YAML.
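
One way to make the levels visible is to print the leading-space count next to each line. What follows is only a rough sketch in Python; the function name annotate_indents is made up for illustration, and it assumes the file is indented with spaces rather than tabs.

```python
def annotate_indents(text: str) -> str:
    """Prefix each non-blank line with its count of leading spaces."""
    out = []
    for line in text.splitlines():
        if line.strip():
            width = len(line) - len(line.lstrip(" "))
            out.append(f"{width:2d}| {line}")
        else:
            # Keep blank lines, but with an empty gutter.
            out.append("  |")
    return "\n".join(out)


if __name__ == "__main__":
    # A tiny made-up fragment in the same shape as the example above.
    print(annotate_indents("groups:\n- name: fred\n  rules:\n    - ath: bth"))
```

This does not understand YAML at all; it only counts spaces, which is the thing that is hard to do by eye.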

All of this is almost always more than one screen long (unless I really stretch out my terminal window). Now, imagine that you're coming along and you want to add another rule after the first one. How do you figure out how much to indent it? You have to exactly match the indent level of '- ath: bth', but that's quite possibly off the top of your screen, so you wind up scrolling up and down trying to match it. Alternatively, you have to remember that the last line of the previous thing is N indent levels in from the top (for an N that varies depending on what you're writing) and de-indent that much relative to it.
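
The scrolling can in principle be done mechanically: walk upward from the last line of the previous rule and find the nearest less-indented line that starts a list item. The following is a minimal sketch of that idea in Python, not a real YAML parser. The function names are hypothetical, the sample is the example from above with the elided parts dropped, and it assumes space-only indentation and list items written as '- ' at the start of a line.

```python
from typing import List, Optional


def indent_of(line: str) -> int:
    """Count the leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def enclosing_item_indent(lines: List[str], lineno: int) -> Optional[int]:
    """Return the indent of the nearest '- ' list item enclosing lines[lineno].

    Walks upward from lineno, only considering lines that are strictly less
    indented than anything accepted so far, and stops at the first such line
    that starts a list item. Blank lines and comments are skipped.
    Returns None if no enclosing list item is found.
    """
    limit = indent_of(lines[lineno])
    for line in reversed(lines[:lineno]):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indent = indent_of(line)
        if indent < limit:
            if stripped.startswith("- "):
                return indent
            limit = indent
    return None


# The post's example structure, minus the elided parts.
SAMPLE = """\
groups:
- name: fred
  rules:

    # A big comment
    - ath: bth
      cth: a long string
      dth:
        eth: host
      gth:
        # This may have a comment too
        hth: a long string
""".splitlines()

if __name__ == "__main__":
    # Indent to use for a new rule added after the last line of the sample.
    print(enclosing_item_indent(SAMPLE, len(SAMPLE) - 1))
```

For the sample, the answer is the column of '- ath: bth', which is exactly the number you would otherwise be scrolling up to find.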

Although Python uses significant whitespace too, you don't usually write Python code in this deeply nested way (and the Python standard is four-space indents, which makes things much more visibly distinct). Python also has very predictable indent levels for most things that you're going to be adding (normally the def for a new function isn't indented at all, the def for a new class method is one indent level in, and so on). And stylistically, sprawling and deeply nested functions and code would often be considered a code smell. People deliberately avoid them and work to flatten deeply indented structures.
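
As a small illustration of how predictable those Python indent levels are, here is a sketch that uses the standard ast module to report the column where each def starts. The SOURCE snippet and the def_columns name are made up for this example.

```python
import ast

# A made-up snippet with one top-level function and one class method.
SOURCE = '''\
def top_level():
    pass


class Thing:
    def method(self):
        pass
'''


def def_columns(source: str) -> dict:
    """Map each function name in source to the column where its 'def' starts."""
    tree = ast.parse(source)
    return {
        node.name: node.col_offset
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }


if __name__ == "__main__":
    print(def_columns(SOURCE))
```

With four-space indents, the top-level def sits at column 0 and the method at column 4, and that stays true no matter how long the file gets.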

In the kind of YAML that Prometheus uses, sprawling and deeply nested structures are everywhere. Everything is a thing inside another thing inside a third thing and so on, so everything gets indented and indented and indented. There are almost no explicit de-indentation markers that you write, either intrinsically as part of the objects or culturally as, say, a '# end of <thing>' comment at the outer indent level at the end.

(I have other issues with YAML, but I think I will defer those to another entry. Also, the indentation I'm using here may have one unnecessary level to it, and it's certainly got an inconsistency; part of this is inherited, and part of it is because I do not really understand the rules of when YAML indentation is required and when it's optional.)


Comments on this page:

By Seth at 2019-05-01 11:25:08:

I've found "rainbow levels" highlighting to be a huge help once I'm more than three or so indents in.

https://github.com/thiagoalessio/rainbow_levels.vim

So I can generally know what color a new block is supposed to be.

Last modified: Tue Apr 30 23:34:10 2019