Wandering Thoughts archives

2015-05-04

Monitoring tools should report timestamps (and what they're monitoring)

This is a lesson learned, not quite the hard way but close to it. What is now a fairly long time ago I wrote some simple tools to report the network bandwidth (and packets per second) for a given interface on Linux and Solaris. The output looked (and looks) like this:

 40.33 MB/s RX  56.54 MB/s TX   packets/sec: 50331 RX 64482 TX

I've used these tools for monitoring and troubleshooting ever since, partly because they're simple and brute force and thus I have a great deal of trust in the numbers they show me.

Recently we've been looking at a NFS fileserver lockup problem, and as part of that I've spent quite some time gathering output from monitoring programs that run right up to the moment the system locks up and stops responding. When I did this, I discovered two little problems with that output format up there: it tells me neither the time it was for nor the interface I'm monitoring. If I wanted to see what happened thirty seconds or a minute before the lockup, well, I'd better count back 30 or 60 lines (and that was based on the knowledge that I was getting one report a second). As far as keeping track of which interface (out of four) that a particular set of output was from, well, I wound up having to rely on window titles.

So now I have a version of these tools with a somewhat different output format:

e1000g1 23:10:08  14.11 MB/s RX  77.40 MB/s TX   packets/sec: 37791 RX 66359 TX

Now this output is more or less self identifying. I can look at a line and know almost right away what I'm seeing, and I don't have to carefully preserve a lot of context somehow. And yes, this doesn't show how many seconds this report is aggregated over (although I can generally see it given two consecutive lines).

I was lucky here in that adding a timestamp plus typical interface names still keep output lines under 80 characters. But even in cases where adding this information would widen the output lines, well, I can widen my xterm windows and it's better to have this information than to have to reconstruct it afterwards. So in the future I think all of my monitoring tools are at least going to have an option to add a timestamp and similar information, and they might print it all the time if it fits (as it does here).

PS: I have strong feelings that timestamps et al should usually be optional if they push the output over 80 columns wide. There are a bunch of reasons for this that I'm not going to try to condense into this entry.

PPS: This idea is not a miracle invention of mine by any means. In fact I shamelessly copied it from how useful the timestamps printed out by tools like arcstat are. When I noticed how much I was using those timestamps and how nice it was to be able to scroll back, spot something odd, and say 'ah, this happened at ...' right away, I smacked myself in the forehead and did it for all of the monitoring commands I was using. Fortunately many OmniOS commands like vmstat already have an option to add timestamps, although it's sometimes kind of low-rent (eg vmstat prints the timestamp on a separate line, which doubles how many lines of output it produces and thus halves the effective size of my scrollback buffer).

sysadmin/ReportTimeAndId written at 23:58:52; Add Comment

What I want to have in shell (filename) completion

I've been using basic filename completion in my shell for a while now, and doing so has given me a perspective on what advanced features of this I'd find useful and which strike me as less useful. Unfortunately for me, the features that I'd find most useful are the ones that are the hardest to implement.

Put simply, the problem with basic filename completion is that any time you want to use even basic shell features like environment variables, you lose completion. Do you refer to some directories through convenience variables? Nope, not any more, because you can choose between completing a long name and not completing, say, '$ml/afilename'.

(I know, bash supports completing filenames that use environment variables. Probably zsh does as well. I'm not interested in switching to either right now.)

But environment variables are just one of the ways to shorten filenames. Two more cases are using wildcards to match unique or relatively unique subsets of a long filename and using various multi-match operators to specify several filenames in some directory or the like. Both of these would be handy to be able to do filename completion for. In fact, let's generalize that: what I'd really like is for my shell to be able to do filename completion in the face of any and all things that can appear in filenames and get expanded by the shell. Then I could combine the power of filename completion and the power of all of those handy shell operators together.

Of course this is easier said than done. I know that what I'm asking for is quite a challenging programming exercise and to some extent a design exercise once we get to the obscure features. But it sure would be handy (more handy, in my biased opinion, than a number of other completion features).

(I've never used eg bash's smart context aware autocompletion of program arguments in general, so I don't know for sure how useful I'd find it; however, my personal guess is 'not as much as full filename completion'. I'm so-so on autocompletion of program names; I suppose I do sometimes use programs with annoyingly long names, so my shell might as well support it. Again I'd rate it well below even autocompletion with environment variables, much less the full version.)

unix/MyShellCompletionDesire written at 01:16:52; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.