2015-05-06
Why keeping output to 80 columns (or less) is still sensible
When I talked about how monitoring tools should report timestamps and other identifying information, I mentioned that I felt that keeping output to 80 columns or less was still a good idea even if it meant sometimes optionally omitting timestamps. So let's talk about that, since it's basically received wisdom these days that the 80 column limit is old-fashioned, outdated, and unnecessary.
I think that there are still several reasons that short output is sensible, especially at 80 columns or less. First, 80 columns is still the default terminal window size in many environments; if you make a new one and do nothing special, 80 columns is what you get by default (often 80 by 24). This isn't just on Unix systems; I believe that eg Windows often defaults to this size for both SSH client windows and its own command line windows. This means that if your line spills over 80 columns, many people have to take an extra step to get readable results (by widening their default sized window) and they may mangle some existing output for the purposes of eg cut and paste (since many terminal windows still don't re-flow lines when the window widens or narrows).
Next, there's an increasingly popular class (or classes) of device with relatively constrained screen size, namely smartphones and small tablets. Even a large tablet might only be 80 columns wide in vertical orientation. Screen space is precious on those devices and there's often nothing the person using the device can really do to get any more of it. And yes, people are doing an increasing amount of work from such devices, especially in surprise situations where a tablet might be the best (or only) thing you have with you. Making command output useful in such situations is an increasingly good idea.
Finally, overall screen real estate can be a precious resource even on large-screen devices because you can have a lot of things competing for space. And there are still lots of situations where you don't necessarily need timestamps and they'll just add clutter to output that you're actively scanning. I won't pretend that my situation is an ordinary one; there are plenty of times where you're basically just glancing at the instantaneous figures every so often or looking at the recent past or the like.
(As far as screen space goes, often my screen winds up completely covered in status monitoring windows when I'm troubleshooting something complicated. Partly this is because it's often not clear what statistic will be interesting so I want to watch them all. Of course what this really means is that we should finally build that OS level stats gathering system I keep writing about. Then we'd always be collecting everything and I wouldn't have to worry about maybe missing something interesting.)
Unix's pipeline problem (okay, its problem with file redirection too)
In a comment on yesterday's entry, Mihai Cilidariu sensibly suggested that I not add timestamp support to my tools but instead outsource this to a separate program in a pipeline. In the process I would get general support for this and complete flexibility in the timestamp format. This is clearly and definitely the right Unix way to do this.
Unfortunately it's not a good way in practice, because of a fundamental pragmatic problem Unix has with pipelines. This is our old friend block buffering versus line buffering. A long time ago, Unix decided that many commands should change their behavior in the name of efficiency; if they wrote lines of output to a terminal you'd get each line as it was written, but if they wrote lines to anything else you'd only get them in blocks.
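To make the difference concrete, here is a small simulation using Python's io layers. It is only a sketch of the general behavior, not how any particular Unix command is implemented; CountingRaw and emit are hypothetical names made up for this illustration, and CountingRaw simply stands in for whatever is on the other end (a terminal or a pipe).

```python
import io
from typing import List, Tuple


class CountingRaw(io.RawIOBase):
    """Hypothetical stand-in for a terminal or pipe; records every write that reaches it."""

    def __init__(self) -> None:
        super().__init__()
        self.writes: List[bytes] = []

    def writable(self) -> bool:
        return True

    def write(self, data) -> int:
        # Each call here is one chunk of output actually delivered to the reader.
        self.writes.append(bytes(data))
        return len(data)


def emit(lines: List[str], line_buffering: bool) -> Tuple[List[int], List[bytes]]:
    """Write lines through a buffered text stream and report what got delivered when.

    Returns (number of delivered chunks seen after each line, all delivered chunks).
    """
    raw = CountingRaw()
    out = io.TextIOWrapper(io.BufferedWriter(raw), encoding="utf-8",
                           newline="\n", line_buffering=line_buffering)
    delivered = []
    for line in lines:
        out.write(line + "\n")
        delivered.append(len(raw.writes))
    out.flush()  # what normally happens at exit or when the buffer fills
    return delivered, list(raw.writes)
```

With line_buffering=True every line reaches the other end as it is written, which is the terminal case. With line_buffering=False nothing is delivered until the final flush, and then all of the lines arrive together as one block.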
This is a big problem here because obviously a pipeline like 'monitor | timestamp' basically requires the monitor process to produce output a line at a time in order to be useful; otherwise you'd get large blocks of lines that all had the same timestamp because they were written to the timestamp process in a block. The sudden conversion from line buffered to block buffered can also affect other sorts of pipeline usage.
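For illustration, the timestamp end of such a pipeline might look something like the following. This is a minimal sketch, not an existing tool; add_timestamps, default_stamp, and the timestamp format are all assumptions made for the example.

```python
import time
from typing import Callable, TextIO


def default_stamp() -> str:
    # Assumed format for this sketch; a real filter would make it configurable.
    return time.strftime("%Y-%m-%d %H:%M:%S")


def add_timestamps(src: TextIO, dst: TextIO,
                   stamp: Callable[[], str] = default_stamp) -> None:
    """Copy src to dst, prefixing each line with the time at which it was read."""
    # readline() hands back each line as soon as it is available, so the stamp
    # reflects when the line arrived here, not when the monitor generated it.
    for line in iter(src.readline, ""):
        dst.write(f"{stamp()} {line}")
        dst.flush()  # don't reintroduce block buffering on our own output
```

Run as a filter, this would be called as add_timestamps(sys.stdin, sys.stdout). Note that it can only stamp lines when they arrive; if the monitor process hands over forty lines in one block, all forty get essentially the same timestamp no matter how carefully the filter is written.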
It's certainly possible to create programs that don't have this problem, ones that always write a line at a time (or explicitly flush after every block of lines in a single report). But it is not the default, which means that if you write a program without thinking about it or being aware of the issue at all you wind up with a program that has this problem. In turn people like me can't assume that a random program we want to add timestamps to will do the right thing in a pipeline (or keep doing it).
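As a sketch of what "explicitly flush after every block of lines in a single report" can look like, assuming a monitoring program that emits one multi-line report per interval (write_report is a hypothetical helper, not from any real tool):

```python
import sys
from typing import Iterable, Optional, TextIO


def write_report(lines: Iterable[str], out: Optional[TextIO] = None) -> None:
    """Write one report's worth of lines, then push them out immediately."""
    out = sys.stdout if out is None else out
    for line in lines:
        out.write(line + "\n")
    # One explicit flush per report: downstream readers see each report as soon
    # as it is complete, whether out is a terminal, a pipe, or a file.
    out.flush()
```

If you want strict line-at-a-time behavior instead, Python's print() accepts flush=True. Either way the point stands that someone has to think of doing this; nothing in the default behavior does it for you.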
(Sometimes the buffering can be an accidental property of how a program was implemented. If you first write a simple shell script that runs external commands and then rewrite it as a much better and more efficient Perl script, well, you've probably just added block buffering without realizing it.)
In the end, what all of this really does is quietly chip away at the Unix ideal that you can do everything with pipelines and that pipelining is the right way to do lots of stuff. Instead pipelining becomes mostly something that you do for bulk processing. If you use pipelines outside of bulk processing, sometimes it works, sometimes you need to remember odd workarounds so that it's mostly okay, and sometimes it doesn't do what you want at all. And unless you know Unix programming, why things are failing is pretty opaque (which doesn't encourage you to try doing things via pipelines).
(This is equally a potential problem with redirecting program output to files, but it usually hits most acutely with pipelines.)