Using Wireshark's Statistics menu to get per-host traffic volume

September 4, 2019

As part of my casual Internet browsing, I recently read '6 Lessons we learned when debugging a scaling problem on GitLab.com'. As something of an aside (although it's listed as a lesson), the article mentioned Wireshark's Statistics menu and how it can show you per-conversation information (and thus let you find specific sorts of conversations, such as short ones). I didn't think much about it at the time, but the mention stuck in the back of my mind (as such things often do, at least for a while).

Today I had a situation where we had a saturated OpenBSD firewall and I very much wanted to find out roughly what hosts were responsible for the traffic. OpenBSD has per-interface statistics (which let me see that the firewall's interface was saturated with incoming traffic), but it doesn't have anything more granular by default and we didn't have any traffic accounting stuff set up in our PF rules. I tried a plain tcpdump, but this firewall sits in front of enough hosts that the output was overwhelming. As I was thinking unhappy thoughts about trying to write some awk on the fly, a little light went on; perhaps Wireshark could help. So I used tcpdump to capture a minute or two of traffic to a file, copied the capture file over to my Linux machine, and fired up Wireshark.

(Since I only cared about packet sizes, not packet contents, I was able to let tcpdump truncate packets to keep the file size down.)
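For illustration, the capture step could be scripted roughly like this. This is only a sketch: the interface name em0, the output file name, and the 96-byte snapshot length are assumptions made up for the example, not what I actually typed. The -i, -s and -w options are tcpdump's usual ones for the interface, the snapshot length, and the output file.

```python
import signal
import subprocess
import time


def build_capture_command(interface, outfile, snaplen=96):
    """Return the tcpdump argument list for a truncated capture to a file.

    snaplen=96 is an assumed value; it is enough for the Ethernet, IP and
    TCP/UDP headers, which is all that per-host volume accounting needs.
    """
    return ["tcpdump", "-i", interface, "-s", str(snaplen), "-w", outfile]


def capture_for(seconds, interface, outfile, snaplen=96):
    """Run tcpdump for roughly `seconds` seconds, then stop it.

    This normally needs root (or equivalent) on the capturing machine.
    """
    proc = subprocess.Popen(build_capture_command(interface, outfile, snaplen))
    try:
        time.sleep(seconds)
    finally:
        # SIGINT is what Ctrl-C sends; tcpdump closes the capture file on it.
        proc.send_signal(signal.SIGINT)
        proc.wait()


# Hypothetical usage: two minutes of traffic on an assumed interface name.
# capture_for(120, "em0", "firewall.pcap")
```

Truncating this way is safe for volume accounting because each record in the capture file still notes the packet's original on-the-wire length alongside the shortened data.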

The answer is yes, Wireshark absolutely had something that could help. The 'Endpoints' option on the Statistics menu gives you a breakdown of the traffic by various endpoint categories, including IPv4 hosts (it will also break things down by host+port combination). This immediately pointed me to the hosts that were responsible for most of the traffic.
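To make concrete what a per-host breakdown involves, here is a minimal sketch of the same idea in plain Python. It is not how Wireshark does it. It assumes a classic pcap file (what tcpdump -w normally writes, not pcapng), Ethernet framing, and untagged IPv4; everything else is skipped. The function name is made up for the example. It charges each packet's original length to both its source and its destination address.

```python
import struct
from collections import Counter
from ipaddress import IPv4Address

# Classic pcap magic numbers as they appear on disk, mapped to byte order.
PCAP_MAGICS = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}
LINKTYPE_ETHERNET = 1


def bytes_per_ipv4_host(path):
    """Return a Counter mapping IPv4Address -> bytes sent plus received."""
    totals = Counter()
    with open(path, "rb") as f:
        header = f.read(24)  # the pcap global header is 24 bytes
        if len(header) < 24 or header[:4] not in PCAP_MAGICS:
            raise ValueError("not a classic pcap file")
        endian = PCAP_MAGICS[header[:4]]
        linktype = struct.unpack(endian + "I", header[20:24])[0]
        if linktype != LINKTYPE_ETHERNET:
            raise ValueError(f"unsupported link type {linktype}")
        while True:
            rec = f.read(16)  # per-packet record header
            if len(rec) < 16:
                break
            _sec, _frac, incl_len, orig_len = struct.unpack(endian + "IIII", rec)
            data = f.read(incl_len)
            if len(data) < incl_len:
                break  # file ends in the middle of a packet
            # Need 14 bytes of Ethernet plus 20 of IPv4, with ethertype 0x0800.
            if len(data) < 34 or data[12:14] != b"\x08\x00":
                continue
            src = IPv4Address(data[26:30])
            dst = IPv4Address(data[30:34])
            # orig_len is the on-the-wire size, even if the data was truncated.
            totals[src] += orig_len
            totals[dst] += orig_len
    return totals


# Hypothetical usage: show the ten busiest addresses in a capture.
# for addr, nbytes in bytes_per_ipv4_host("firewall.pcap").most_common(10):
#     print(addr, nbytes)
```

Using orig_len rather than the length of the captured data is what lets a truncated capture still give sensible byte counts.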

Using packet captures for this isn't necessarily as useful and precise as real traffic volume information that is measured directly and reliably by the host in some way, and it likely has more overhead. But it has the large virtue that we can use it in any situation where we can run tcpdump for a while, and almost everything has tcpdump. I can use it with our OpenBSD firewalls to find traffic sources, I can use it with our Linux fileservers to figure out which NFS clients are doing a high volume of read or write IO, and I'm sure I can use it in plenty of other situations too.

(One that just occurred to me is trying to find out who is doing an unusually large number of DNS queries to our DNS servers. We don't have query logging, but we can capture a couple of minutes of traffic to port 53.)
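As a rough sketch of that DNS idea, the following counts UDP packets sent to port 53 per source address, as a stand-in for "queries per client". It makes the same assumptions as any simple pcap reader: a classic pcap file with microsecond timestamps, Ethernet framing, and untagged IPv4. It ignores DNS over TCP, and the function names are invented for the example. The capture has to keep enough of each packet to include the UDP header.

```python
import struct
from collections import Counter
from ipaddress import IPv4Address


def read_frames(path):
    """Yield the captured bytes of each packet in a classic pcap file."""
    with open(path, "rb") as f:
        header = f.read(24)
        if header[:4] == b"\xd4\xc3\xb2\xa1":
            endian = "<"
        elif header[:4] == b"\xa1\xb2\xc3\xd4":
            endian = ">"
        else:
            raise ValueError("not a classic (microsecond) pcap file")
        while True:
            rec = f.read(16)
            if len(rec) < 16:
                return
            incl_len = struct.unpack(endian + "I", rec[8:12])[0]
            data = f.read(incl_len)
            if len(data) < incl_len:
                return
            yield data


def dns_queries_per_client(path):
    """Count UDP packets to destination port 53, keyed by source address."""
    counts = Counter()
    for data in read_frames(path):
        if len(data) < 34 or data[12:14] != b"\x08\x00":
            continue  # not IPv4 over Ethernet
        ihl = (data[14] & 0x0F) * 4  # IPv4 header length in bytes
        if ihl < 20 or data[23] != 17:  # protocol 17 is UDP
            continue
        # Only the first fragment of a packet carries the UDP header.
        if struct.unpack("!H", data[20:22])[0] & 0x1FFF:
            continue
        udp = 14 + ihl
        if len(data) < udp + 4:
            continue  # truncated before the port numbers
        dst_port = struct.unpack("!H", data[udp + 2:udp + 4])[0]
        if dst_port == 53:
            counts[IPv4Address(data[26:30])] += 1
    return counts


# Hypothetical usage:
# for addr, n in dns_queries_per_client("dns.pcap").most_common(10):
#     print(addr, n)
```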

Although I wish we hadn't had this problem today, I'm glad that I now have another tool for troubleshooting problems. And I'm glad that I read that article and that its mention of Wireshark stuck in my mind. I really never do know when this stuff will come in handy.


Comments on this page:

By Jesper at 2019-09-04 01:22:58:

Look up sFlow and NetFlow if streaming sampled flow / packet header information to another host for analysis (there are tools to parse and present this data too) sounds like something you want to do. It probably is; it's very useful for troubleshooting issues such as these and for finding anomalies, all from one place. Hardware boxes support it too.
