OpenSSH sshd's 'MaxStartups' setting and Internet-accessible machines
Last night, one of our compute servers briefly stopped accepting SSH connections, which set off an alert in our monitoring system. On compute servers, the usual cause for this is that some program (or set of them) has run the system out of memory, but on checking the logs I saw that this wasn't the case. Instead, sshd had logged the following (among other things):
sshd[649662]: error: beginning MaxStartups throttling
sshd[649662]: drop connection #11 from [...]:.. on [...]:22 past MaxStartups
I'm pretty sure I'd seen this error before, but this time I did some reading up on things.
MaxStartups is an sshd configuration setting that controls how many concurrent unauthenticated connections there can be. It can be either a flat number or a three-part 'start:rate:full' value that triggers random dropping of such connections with a certain probability. According to the manual page (and to comments in the current Ubuntu 22.04 /etc/ssh/sshd_config), the default value is '10:30:100', which drops 30% of new connections if there are already 10 unauthenticated connections and all of them if there are 100 such connections (with the drop probability scaled linearly between those two points).
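To make the scaling concrete, here is a minimal sketch of the drop probability as the manual page describes it, assuming a straight linear interpolation between the 'rate' and 100%. The function name and its parameter names are mine, not anything from OpenSSH, and sshd itself may round differently than this floating-point version.

```python
def drop_probability(startups, begin=10, rate=30, full=100):
    """Percent chance that a new unauthenticated connection is dropped.

    Hypothetical helper: 'begin', 'rate' and 'full' stand for the three
    fields of a MaxStartups value such as 10:30:100 (the defaults here).
    """
    if startups < begin:
        return 0.0      # below the starting point nothing is dropped
    if startups >= full:
        return 100.0    # at or past the hard limit everything is dropped
    # Assumed linear scaling from 'rate' at 'begin' up to 100% at 'full'.
    return rate + (100 - rate) * (startups - begin) / (full - begin)


if __name__ == "__main__":
    for n in (9, 10, 20, 55, 100):
        print(f"{n:3d} unauthenticated connections: {drop_probability(n):.1f}% dropped")
```

Under this model, raising only the first field (say to 30:30:100) means nothing is dropped until there are 30 unauthenticated connections, while the hard limit stays where it was.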
(OpenSSH sshd can also apply a per-'source' limit using PerSourceMaxStartups, where a source can be an individual IPv4 or IPv6 address or a netblock, based on PerSourceNetBlockSize.)
Normal systems probably don't have any issue with this setting and its default, but for our sins some of our systems are exposed to the Internet for SSH logins, and attackers probe them (and these attackers are back in action these days after a pause we noticed in February). Apparently enough attackers were making enough attempts early this morning to trigger this limit. Unfortunately this limit is a global setting, with no way to give internal IPs a higher limit than external ones (MaxStartups is not one of the directives that can be included in Match blocks).
Now that I've looked into this, I think that we may want to increase this setting in our environment. Ten unauthenticated connections is not all that many for an Internet-exposed system that's under constant SSH probes, and our Internet-accessible systems aren't short of resources; they could likely afford a lot more such connections. Our logs suggest we hit this limit periodically across a number of systems, which is more or less what I'd expect if the connections come from attackers randomly hitting our systems. Probably we want to keep the random drop bit instead of creating a hard wall, but increase the starting point of the random drops to 20 or 30 or so.
(Unfortunately I don't think sshd reports how many concurrent unauthenticated connections it has until it starts dropping them, so you can't see how often you're coming close to the edge.)
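Since sshd only speaks up once it starts dropping, about the best you can do after the fact is count how often that happens. The following is a rough sketch for tallying throttling events in log lines, assuming the messages look like the two quoted earlier; the function name and regular expressions are my own, and I'm assuming the number after '#' is the count of unauthenticated connections at the time of the drop.

```python
import re

# Patterns based only on the two messages quoted earlier; any syslog
# prefix (timestamp, hostname) in front of them is simply ignored.
THROTTLE_RE = re.compile(r"sshd\[\d+\]: .*beginning MaxStartups throttling")
DROP_RE = re.compile(r"sshd\[\d+\]: drop connection #(\d+) from .* past MaxStartups")


def summarize(lines):
    """Return (throttling_episodes, dropped_connections, highest_number_seen)."""
    episodes = 0
    drops = 0
    highest = 0
    for line in lines:
        if THROTTLE_RE.search(line):
            episodes += 1
            continue
        match = DROP_RE.search(line)
        if match:
            drops += 1
            # Assumption: the '#N' field is the unauthenticated connection count.
            highest = max(highest, int(match.group(1)))
    return episodes, drops, highest


if __name__ == "__main__":
    import sys
    # Feed it log text on standard input, however you extract it.
    print(summarize(sys.stdin))
```

Run across several machines' logs, the first number gives a feel for how often the limit is reached, and the last hints at how far past the starting point things got.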