Chris's Wiki :: blog/sysadmin/NetworkCablesGoBad Commentshttps://utcc.utoronto.ca/~cks/space/blog/sysadmin/NetworkCablesGoBad?atomcommentsDWiki2021-09-16T02:31:41ZRecent comments in Chris's Wiki :: blog/sysadmin/NetworkCablesGoBad.By MikeP on /blog/sysadmin/NetworkCablesGoBadtag:CSpace:blog/sysadmin/NetworkCablesGoBad:f6db7d5d3995ed7fda061585fbd671df75fca09cMikePhttps://snowcrash.ca/<div class="wikitext"><p>Throwback to years ago: I went into the data centre for something and saw five or six other staff all gathered at one end, so I went over to see what was up. In one mixed lab, the PCs were DHCPing just fine but the Macs were not. They had checked and/or replaced nearly everything except the uplink cable from the lab's switch to the router. Why not that cable? Well, it's always been working, we've unplugged it and plugged it back in again, can't be the issue, but we've been here hours. So I replaced it and everything started working fine again. Don't know why, didn't care why.</p>
<p>Ever since, I try to do the "it's stupid, but it's easy" sort of fixes before I start digging into the real time-consuming stuff. It's almost never the easy stuff, but when it is, it sure feels good to have spent a couple of minutes on the "it's stupid, but" fix rather than spending hours and only then ending up on that one.</p>
</div>2021-09-16T02:31:41ZBy Chris Siebenmann on /blog/sysadmin/NetworkCablesGoBadtag:CSpace:blog/sysadmin/NetworkCablesGoBad:b9e4c94ec2ed327260bd36f3000691cdc4102e29Chris Siebenmann<div class="wikitext"><p>Ivan, we use our existing Prometheus and Alertmanager setup, mostly
because it was already there and so easy. I wrote up a description of
the metrics (and alerts) in <a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusCheckingNetworkInterfaces">PrometheusCheckingNetworkInterfaces</a>.</p>
<p>Although I didn't fully emphasize it in the entry, checking the network
interface state can't substitute for an end to end check through Blackbox
or whatever. There are system problems that just can't be picked up by
the existing interface metrics. But there are also problems that end
to end checks probably won't detect (as we saw with our cable going bad
and dropping down to 100M).</p>
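<p>As a rough illustration of the kind of interface-level check discussed here (not the actual Prometheus setup, which is described in the linked entry), here is a minimal Python sketch. It assumes a Linux host, where the kernel exposes the negotiated link speed in Mb/s under <code>/sys/class/net/&lt;interface&gt;/speed</code>. The function name <code>slow_interfaces</code> and the 1000 Mb/s expectation are hypothetical choices made for the example.</p>

```python
import os


def slow_interfaces(expected_mbps, sysfs_root="/sys/class/net"):
    """Return {interface: speed_mbps} for links that are up but slower than expected."""
    slow = {}
    for name in sorted(os.listdir(sysfs_root)):
        base = os.path.join(sysfs_root, name)
        try:
            with open(os.path.join(base, "operstate")) as f:
                state = f.read().strip()
            with open(os.path.join(base, "speed")) as f:
                speed = int(f.read().strip())
        except (OSError, ValueError):
            # Virtual or downed interfaces often have no readable speed; skip them.
            continue
        # A speed of -1 means "unknown", so only positive values count.
        if state == "up" and 0 < speed < expected_mbps:
            slow[name] = speed
    return slow


if __name__ == "__main__":
    # Assumption for this example: every physical link should run at 1G.
    for name, speed in slow_interfaces(1000).items():
        print(f"{name}: link negotiated at {speed} Mb/s, expected 1000")
```

<p>A check like this would notice a link that has dropped to 100M, but it says nothing about whether traffic actually gets through, which is why it complements an end to end check rather than replacing one.</p>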
</div>2021-06-30T03:59:51ZBy Greg Ruben on /blog/sysadmin/NetworkCablesGoBadtag:CSpace:blog/sysadmin/NetworkCablesGoBad:7dfbbf7bf80bae5db1b3de71278a25593b2b76c1Greg Ruben<div class="wikitext"><p>Thanks for the tip... I will bring up 'bonding' in all important servers (at least until homeoffice/COVID).</p>
</div>2021-06-29T08:33:54ZBy Ivan on /blog/sysadmin/NetworkCablesGoBadtag:CSpace:blog/sysadmin/NetworkCablesGoBad:75e04e5c78468e0051f30833c136dd5ec6403dfeIvanhttps://www.tomica.net<div class="wikitext"><p>Reminds me of a MySQL issue we once had... we had a high-traffic HA MySQL setup and one of the replicas was constantly falling behind, with replication lag rising.</p>
<p>This particular server was not slower than the other replicas in any regard. CPU and disks were on par with the other replica servers and there was basically no reason why it was behaving the way it was...</p>
<p>We first thought there was an issue with the SSD drives exhausting their write capacity or being close to full, but we soon observed the eth interface being capped at 100BASE-T. Switching the cable for a new one resolved all of the issues, of course, and the server caught up in a matter of 2-3 minutes.</p>
<p>Interested, how did you set up monitoring and alerting for this? Using alertmanager/prometheus or something else?</p>
</div>2021-06-29T03:32:16ZBy Hales on /blog/sysadmin/NetworkCablesGoBadtag:CSpace:blog/sysadmin/NetworkCablesGoBad:3433e29e8e824d15329f9a8847044461f06afc5aHaleshttps://halestrom.net<div class="wikitext"><p>I've found that it's not always the patch cable that's bad, sometimes it's the gold-plated spring fingers inside the ethernet jacks themselves. Multiple patch cable changes do not always fix dirty fingers.</p>
<p>A toothbrush can help. Keep in mind that ethernet jack fingers have variable sliding contact with the pins of the ethernet cable's plug; it's not just 'one spot' that is always used for electrical contact. This is why you can jiggle ethernet cables around a mm or so and the link (hopefully) doesn't go down.</p>
</div>2021-06-26T05:00:10Z