2025-02-23
JSON has become today's machine-readable output format (on Unix)
Recently, I needed to delete about 1,200 email messages to a
particular destination from the mail queue on one of our systems.
This turned out to be trivial, because this system was using Postfix
and modern versions of Postfix can output mail queue status information
in JSON format. So I could dump the mail queue status, select the
relevant messages and print the queue IDs with jq, and feed this to
Postfix to delete the
messages. This experience has left me with the definite view that
everything should have the option to output JSON for 'machine-readable'
output, rather than some bespoke format. For new programs, I think
that you should only bother producing JSON as your machine-readable
output format.
(If you strongly object to JSON, sure, create another machine-readable output format too. But if you don't care one way or another, outputting only JSON is probably the easiest approach for programs that don't already have such a format of their own.)
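As a rough illustration of the mail queue cleanup described above, here is a minimal Python sketch of the "select the relevant messages and print the queue IDs" step. It assumes the input is one JSON object per line, with a `queue_id` field and a `recipients` list whose entries carry an `address`, which is my understanding of what `postqueue -j` emits. The sample lines, the destination domain, and the `select_queue_ids` helper are all made up for this example.

```python
import json

def select_queue_ids(lines, destination):
    """Return the queue IDs of messages with a recipient at `destination`."""
    suffix = "@" + destination.lower()
    ids = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        recipients = entry.get("recipients", [])
        if any(r.get("address", "").lower().endswith(suffix) for r in recipients):
            ids.append(entry["queue_id"])
    return ids

# Hypothetical sample input, shaped like one-object-per-line queue output.
SAMPLE_LINES = [
    '{"queue_name": "deferred", "queue_id": "3A1B2C", "sender": "app@example.com",'
    ' "recipients": [{"address": "user@example.org", "delay_reason": "timed out"}]}',
    '{"queue_name": "active", "queue_id": "4D5E6F", "sender": "app@example.com",'
    ' "recipients": [{"address": "other@example.net"}]}',
]

for queue_id in select_queue_ids(SAMPLE_LINES, "example.org"):
    print(queue_id)
```

On a real system the input lines would come from the queue dump, and the printed IDs could be piped to something like `postsuper -d -`, which reads queue IDs from standard input. Note that this deletes whole messages, including any copies bound for other recipients.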
This isn't because JSON is the world's best format (JSON is at best
the least bad format). Instead it's
because JSON has a bunch of pragmatic virtues on a modern Unix
system. In general, JSON provides a clear and basically unambiguous
way to represent text data and much numeric data, even if it has
relatively strange characters in it (ie, JSON has escaping rules
that everyone knows and all tools can deal with); it's also generally
extensible to add additional data without causing heartburn in tools
that are dealing with older versions of a program's output. And
on Unix there's an increasingly rich collection of tools to deal
with and process JSON, starting with jq itself (and hopefully soon
GNU Awk in common configurations). Plus, JSON can generally
be transformed to various other formats if you need them.
(JSON can also be presented and consumed in either multi-line or single line formats. Multi-line output is often much more awkward to process in other possible formats.)
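A small Python sketch of these virtues, using only the standard `json` module. The record, its field names, and the `queue_id_of` helper are invented for illustration. It shows text with awkward characters surviving a round trip, the same data in single-line and multi-line form, and an older consumer coping with a field added by a newer producer.

```python
import json

# Text with a tab, double quotes, and a newline in it.
record = {"queue_id": "3A1B2C", "subject": 'tab\there, "quotes", and a\nnewline'}

# The same data in single-line and multi-line form.
single = json.dumps(record)
multi = json.dumps(record, indent=2)

# Both forms parse back to exactly the original data.
assert json.loads(single) == record
assert json.loads(multi) == record

# The embedded newline is escaped, so the single-line form really is one line.
assert "\n" not in single

def queue_id_of(text):
    """An 'old' consumer that only knows about the queue_id field."""
    return json.loads(text)["queue_id"]

# A 'newer' producer adds a field; the old consumer is unaffected.
newer = json.dumps({**record, "priority": 5})
print(queue_id_of(newer))
```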
There's nothing unique about JSON in all of this; it could have been any other format with similar virtues where everything lined up this way for the format. It just happens to be JSON at the moment (and probably well into the future), instead of (say) XML. For individual programs there are simpler 'machine readable' output formats, but they either have restrictions on what data they can represent (for example, no spaces or tabs in text), or require custom processing that goes well beyond basic grep and awk and other widely available Unix tools, or both. But JSON has become a "narrow waist" for Unix programs talking to each other, a common coordination point that means people don't have to invent another format.
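To make the "no spaces or tabs in text" restriction concrete, here is a minimal Python sketch with made-up field values. It compares a hypothetical whitespace-separated format against JSON when one field contains a space.

```python
import json

# Three fields; the second one contains a space.
fields = ["3A1B2C", "Jane Doe", "jane@example.org"]

# A simple whitespace-separated format cannot tell the space inside
# "Jane Doe" apart from a field separator.
naive = " ".join(fields)
parsed_naive = naive.split()
print(len(parsed_naive))

# JSON keeps the field boundaries intact.
as_json = json.dumps(fields)
parsed_json = json.loads(as_json)
print(len(parsed_json))
```

The simple format can be rescued with quoting or escaping rules, but at that point you are designing and documenting your own format, and every consumer needs custom parsing for it.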
(JSON is also partially self-documenting; you can probably look at a program's JSON output and figure out what various parts of it mean and how it's structured.)
PS: Using JSON also means that people writing programs don't have to design their own machine-readable output format. Designing a machine-readable output format is somewhat more complicated than it looks, so I feel that the less of it people need to do, the better.
(I say this as a system administrator who's had to deal with a certain number of output formats that have warts that make them unnecessarily hard to deal with.)