OpenSSH sshd's 'MaxStartups' setting and Internet-accessible machines

May 5, 2024

Last night, one of our compute servers briefly stopped accepting SSH connections, which set off an alert in our monitoring system. On compute servers, the usual cause for this is that some program (or set of them) has run the system out of memory, but on checking the logs I saw that this wasn't the case. Instead, sshd had logged the following (among other things):

sshd[649662]: error: beginning MaxStartups throttling
sshd[649662]: drop connection #11 from [...]:.. on [...]:22 past MaxStartups

I'm pretty sure I'd seen this error before, but this time I did some reading up on things.

MaxStartups is an sshd configuration setting that controls how many concurrent unauthenticated connections there can be. This can either be a flat number or a 'start:rate:full' triple that triggers random dropping of such connections with a certain probability. According to the manual page (and to comments in the current Ubuntu 22.04 /etc/ssh/sshd_config), the default value is '10:30:100', which drops 30% of new connections if there are already 10 unauthenticated connections and all of them if there are 100 such connections (with the drop probability scaling linearly between those two points).
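
As a rough sketch of the arithmetic, here is the drop probability worked out in Python. It assumes the linear scaling that the manual page describes. The function names are mine, not anything in OpenSSH, and sshd's exact rounding may differ.

```python
def parse_max_startups(value: str) -> tuple[int, int, int]:
    """Parse a MaxStartups value into (start, rate, full).

    Accepts either 'start:rate:full' or a flat number. A flat number is
    treated here as a hard wall: nothing is dropped below it, everything
    is dropped at or above it.
    """
    parts = value.split(":")
    if len(parts) == 1:
        limit = int(parts[0])
        return limit, 100, limit
    if len(parts) != 3:
        raise ValueError(f"expected 'start:rate:full' or a number, got {value!r}")
    start, rate, full = (int(p) for p in parts)
    return start, rate, full


def drop_percent(unauthenticated: int, setting: str = "10:30:100") -> int:
    """Approximate percent chance that a new connection is dropped.

    Assumption: the probability rises linearly from 'rate' at 'start'
    unauthenticated connections to 100 at 'full'.
    """
    start, rate, full = parse_max_startups(setting)
    if unauthenticated < start:
        return 0
    if unauthenticated >= full:
        return 100
    # Linear interpolation between (start, rate) and (full, 100).
    return rate + (100 - rate) * (unauthenticated - start) // (full - start)


if __name__ == "__main__":
    for count in (9, 10, 55, 100):
        print(count, drop_percent(count))
```

With the default setting, 9 unauthenticated connections means no drops, 10 means a 30% chance, and 55 (halfway to 100) means roughly 65%.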

(OpenSSH sshd can also apply a per-'source' limit using PerSourceMaxStartups, where a source can be an individual IPv4 or IPv6 address or a netblock, based on PerSourceNetBlockSize.)

Normal systems probably don't have any issue with this setting and its default, but for our sins some of our systems are exposed to the Internet for SSH logins, and attackers probe them (and these attackers are back in action these days after a pause we noticed in February). Apparently enough attackers were making enough attempts early this morning to trigger this limit. Unfortunately this limit is a global setting, with no way to give internal IPs a higher limit than external ones (MaxStartups is not one of the directives that can be included in Match blocks).

Now that I've looked into this, I think that we may want to increase this setting in our environment. Ten unauthenticated connections is not all that many for an Internet-exposed system that's under constant SSH probes, and our Internet-accessible systems aren't short of resources; they could likely afford a lot more such connections. Our logs suggest we see this periodically across a number of systems, which is more or less what I'd expect if they come from attackers randomly hitting our systems. Probably we want to keep the random drop bit instead of creating a hard wall, but increase the starting point of the random drops to 20 or 30 or so.

(Unfortunately I don't think sshd reports how many concurrent unauthenticated connections it has until it starts dropping them, so you can't see how often you're coming close to the edge.)
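
What you can do is tally the messages after the fact. The following is a minimal sketch that assumes log lines shaped like the two quoted earlier, and that the number after '#' is sshd's count of unauthenticated connections at the time of the drop. The function name and the summary keys are made up for illustration, and other OpenSSH versions may word these messages differently.

```python
import re

# Patterns taken from the two log lines quoted in this entry.
THROTTLE_TEXT = "beginning MaxStartups throttling"
DROP_RE = re.compile(r"drop connection #(\d+) from .* past MaxStartups")


def summarize_throttling(lines):
    """Count MaxStartups throttling events in an iterable of log lines.

    Returns a dict with the number of times throttling began, the number
    of dropped connections, and the highest '#N' value seen (assumed to
    be the unauthenticated connection count when the drop happened).
    """
    summary = {"throttle_starts": 0, "drops": 0, "peak": 0}
    for line in lines:
        if THROTTLE_TEXT in line:
            summary["throttle_starts"] += 1
            continue
        match = DROP_RE.search(line)
        if match:
            summary["drops"] += 1
            summary["peak"] = max(summary["peak"], int(match.group(1)))
    return summary


if __name__ == "__main__":
    import sys

    # Feed it log text on standard input, however you extract your sshd logs.
    print(summarize_throttling(sys.stdin))
```

This only tells you about the times sshd actually hit the limit, not how close it came the rest of the time.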


Comments on this page:

It is a serious DoS issue. Putting SSH on a non-standard port doesn’t help much, and per-IP limits don’t protect against botnets. Some companies have taken to putting SSH behind a VPN like WireGuard that won’t respond to unauthenticated connection requests.

By Etienne Dechamps at 2024-05-07 14:47:23:

An effective way to mitigate this is to reduce the amount of time a connection can stay in the unauthenticated state using the LoginGraceTime option, which by default is too generous (2 minutes).

I used to have the same issue as the one described in the post, but since applying the following config I haven't seen it again:

MaxStartups 60:30:100
LoginGraceTime 20

Last modified: Sun May 5 22:34:43 2024