Wandering Thoughts archives

2018-11-30

Today I (re-)learned that top's output can be quietly system dependent

I'll start with the background story. A few days ago I tweeted:

Current status: zfs send | zfs recv at 33 Mbytes/sec. This will take a while, and the server with SSDs and 10G networking is rather bored.

(It's not CPU-limited at either end and I don't think it's disk-limited. Maybe too many synchronous reads or something.)

I was wrong about this being disk-limited, as it turned out, and then Allan Jude had the winning suggestion:

Try adding '-c aes128-gcm@openssh.com' to your SSH invocation.

See also: <pdf link>

(If you care about 10G+ SSH, you want to read that PDF.)

This made a huge difference, giving me basically 1G wire speeds for my ZFS transfers. But that difference made me scratch my head, because why was switching SSH ciphers making a difference when ssh wasn't CPU-limited in the first place? I came up with various theories and guesses, until today I had a sudden terrible suspicion. The result of testing and confirming that suspicion was another tweet:

Today I learned or re-learned a valuable lesson: in practice, top output is system dependent, in ways that are not necessarily obvious. For instance, CPU % on multi-CPU systems.

(On some systems, CPU % is the percent of a single CPU; on some it's a % of all CPUs.)

You see, the reason that I had confidently known that SSH wasn't CPU-limited on the sending machine, which was one of our OmniOS fileservers, is that I had run top and seen that the ssh process was only using 25% of the CPU. Case closed.

Except that OmniOS top and Linux's top report CPU usage percentages differently. On Linux, CPU percentage is relative to a single CPU, so 25% is a quarter of one CPU, 100% is all of it, and over 100% is a multi-threaded program that is using up more than one CPU's worth of CPU time. On OmniOS, the version of top we're using comes from pkgsrc (in what is by now a very old version), and that version reports CPU percentage relative to all CPUs in the machine. Our OmniOS fileservers are 4-CPU machines, so that '25% CPU' was actually 'all of a single CPU'. In other words, I was completely wrong about the sending ssh not being CPU-limited. Since ssh was CPU-limited after all, it's suddenly no surprise that switching ciphers sped things up to basically wire speed.
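To make the arithmetic concrete, here is a minimal Python sketch of converting between the two conventions. The function names are made up for illustration (they are not anything that top itself provides), and the only real numbers are the 25% reading and the 4-CPU count from the story above.

```python
def all_cpus_to_single_cpu(percent, ncpus):
    """Convert a CPU % that is relative to all CPUs in the machine
    (the convention of our OmniOS top) into a % of a single CPU
    (the Linux top convention)."""
    return percent * ncpus


def single_cpu_to_all_cpus(percent, ncpus):
    """Convert a CPU % that is relative to a single CPU into a %
    of all CPUs in the machine."""
    return percent / ncpus


if __name__ == "__main__":
    # The ssh process showed 25% on a 4-CPU fileserver.
    print(all_cpus_to_single_cpu(25.0, 4))   # 100.0, ie all of one CPU
    print(single_cpu_to_all_cpus(100.0, 4))  # 25.0
```

(The conversion is trivial once you know which convention you're looking at. The hard part is remembering to ask the question at all.)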

(Years ago I established that the old SunSSH that OmniOS was using back then was rather slow, but then later we upgraded to OpenSSH and I sort of thought that I could not worry about SSH speeds any more. Well, I was wrong. Of course, nothing can beat not doing SSH at all but instead using, say, mbuffer. Using mbuffer also means that you can deliberately limit your transfer bandwidth to leave some room for things like NFS fileservice.)

PS: There are apparently more versions than you might think. On the FreeBSD 10.4 machine I have access to, top reports CPU percentage in the same way Linux does (100% is a single-threaded process using all of one CPU). Although both the FreeBSD version and our OmniOS version say they're the William LeFebvre implementation and have similar version numbers, apparently they diverged significantly at some point, probably when people had to start figuring out how to make the original version of top deal with multi-CPU machines.

solaris/TopCPUPercentDifference written at 23:01:36

I've learned that sometimes the right way to show information is a simple one

When I started building some Grafana dashboards, I of course reached for everyone's favorite tool, the graph. And why not? Graphs are informative and beyond that, they're fun. It simply is cool to fiddle around for a bit and have a graph of your server's network bandwidth usage or disk bandwidth right there in front of you, to look at the peaks and valleys, to be able to watch a spike of activity, and so on.

For a while I made pretty much everything a graph; for things like bandwidth, this was obviously a graph of the rate. Then one day I was looking at a graph of filesystem read and write activity on one of our dashboards, with one filesystem bouncing up here and another one bouncing up there over Grafana's default six hour time window, and I found myself wondering which of these filesystems was the most active one over the entire time period. In theory the information was in the graph; in practice, it was inaccessible.

As an experiment, I added a simple bar graph of 'total volume over the time range'. It was a startling revelation. Not only did it answer my question, but suddenly things that had been buried in the graphs jumped out at me. Our web server turned out to use our old FTP area far more than I would have guessed, for example. The simple bar graph also made it much easier to confirm things that I thought I was seeing in the more complex and detailed graphs. When one filesystem looked like it was surprisingly active in the over-time graph, I could look down to the bar graph and confirm that yes, it was (and also see how much its periodic peaks of activity added up to).
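As an illustration of why the total can tell a different story than the peaks, here is a rough Python sketch that adds up a series of rate samples into a total volume. This is not how Grafana or our dashboards compute anything; the function names, the filesystem names, and the sample numbers are all invented for the example, and it assumes each sample is a (time in seconds, bytes per second) pair.

```python
def total_volume(samples):
    """Approximate the total bytes transferred from a list of
    (time_sec, bytes_per_sec) samples, using the trapezoid rule
    between adjacent samples."""
    total = 0.0
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        total += (r0 + r1) / 2.0 * (t1 - t0)
    return total


def rank_by_volume(series):
    """Return (name, total bytes) pairs, most active first."""
    totals = {name: total_volume(s) for name, s in series.items()}
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Invented data: one filesystem with a tall spike, one that is
# quietly busy the whole time.
series = {
    "fs-spiky": [(0, 0.0), (60, 100.0), (120, 0.0), (180, 0.0)],
    "fs-steady": [(0, 40.0), (60, 40.0), (120, 40.0), (180, 40.0)],
}

if __name__ == "__main__":
    for name, volume in rank_by_volume(series):
        print(name, volume)
```

With these made-up numbers the steady filesystem comes out ahead (7200 bytes against 6000), even though the spiky one is the one your eye is drawn to on a graph of the rate. Dividing a total by the length of the time range gives you the average rate over the range.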

Since that first experience I have become much more appreciative of the power of simple ways to show summary information. Highly detailed graphs have an important place and they're definitely showing us things we didn't know, but simple summaries reveal things too.

(I'd love the ability to get ad-hoc simple summaries from more complex graphs. I don't need 'average bandwidth over the graph's entire time range' very often, but sometimes I'd like to have it instead of having to guess by eyeball. It's sort of a pity that you can't give Grafana graphs alternate visualizations that you can cycle through, or otherwise have two (or more) panels share the same space so you can flip between them. As it stands, we have some giant dashboards.)

sysadmin/SimpleGraphsAdvantage written at 01:15:42

