Some thoughts on when you can and can't lower OpenSSH's 'LoginGraceTime'

May 7, 2024

In a comment on my entry on sshd's 'MaxStartups' setting, Etienne Dechamps mentioned that they lowered LoginGraceTime, which defaults to two minutes (which is rather long). At first I was enthusiastic about making a similar change to lower it here, but then I started thinking it through and now I don't think it's so simple. Instead, I think there are three broad situations to consider when deciding how much time to give people to log in to your SSH server.

The best case for a quite short login grace time is if everyone connecting authenticates through an already unlocked and ready SSH keypair. If this is the case, the only thing slowing down logins is the need to bounce a certain number of packets back and forth between the client and you, possibly on slow networks. You're never waiting for people to do something, just for computers to do some calculations and for the traffic to get back and forth. Etienne Dechamps' 20 seconds ought to be long enough for this even under unfavourable network situations and in the face of host load.

(If you do only use keypairs, you can cut off a lot of SSH probes right away by configuring sshd to not even offer password authentication as an option.)

The intermediate case is if people have to unlock their keypair or hardware token, touch their hardware token to confirm key usage, say yes to an SSH agent prompt, or otherwise take manual action that is normally short. In addition to the network and host delays you had with unlocked and ready keypairs, now you have to give fallible people time to notice the need for action and carry it out accurately. Even if 20 seconds is often enough for this, it feels rushed to me and I think you're likely to see some number of people failing to log in; you really want something longer, although I don't know how much longer.

The worst case is if people authenticate with passwords. Here you have fallible humans carefully typing in their password, getting it wrong (because they have N passwords they've memorized and have to pick the right one, among other things), trying again, and so on. Sometimes this will be a reasonably fast process, much like in the intermediate case, but some of the time it will not be. Setting a mere 20 second timeout on this will definitely cut people off at the knees some of the time. Plus, my view is that you don't want people entering their passwords to feel that they're in a somewhat desperate race against time; that feels like it's going to cause various sorts of mistakes.

For our sins, we have plenty of people who authenticate to us today using passwords. As a result I think we're not in a good position to lower sshd's LoginGraceTime by very much, and so it's probably simpler to leave it at two minutes. Two minutes is fine and generous for people, and it doesn't really cost us anything when dealing with SSH probes (well, once we increase MaxStartups).
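As a rough back-of-the-envelope sketch of how LoginGraceTime and MaxStartups interact, the following assumes the worst case where every probe connects, never authenticates, and holds its unauthenticated slot for the entire grace time. Under that assumption, the steady number of occupied slots is roughly the arrival rate multiplied by the grace time. The function name and the figures used are illustrative assumptions, not measurements.

```python
def stalled_rate_to_fill(full_slots: int, grace_seconds: float) -> float:
    """Sustained rate (connections per second) of stalled, never-authenticating
    connections needed to keep `full_slots` unauthenticated slots occupied.

    Assumption: each stalled connection holds its slot for the whole grace
    time, so occupied slots ~= arrival rate * grace time.
    """
    if grace_seconds <= 0:
        raise ValueError("grace_seconds must be positive")
    return full_slots / grace_seconds


if __name__ == "__main__":
    # Hypothetical comparison: a 'full' limit of 100 with a two minute
    # grace time versus a 20 second one.
    for grace in (120, 20):
        rate = stalled_rate_to_fill(100, grace)
        print(f"LoginGraceTime {grace}s: about {rate:.2f} stalled connections/s")
```

Under these assumptions, a longer grace time means a lower sustained rate of stalled connections is enough to keep the slots full, which is why raising MaxStartups matters more if LoginGraceTime stays at two minutes.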

Comments on this page:

From at 2024-05-09 00:52:33:

(If you do only use keypairs, you can cut off a lot of SSH probes right away by configuring sshd to not even offer password authentication as an option.)

If only :(

For a long time all my personal servers had SSH exposed to Internet, as they had always been limited to keypair-or-Kerberos (GSSAPI, not the password-emulated one) and I disliked the idea of relying on VPNs – but while disabling password auth does indeed cut off login hammering attempts, it doesn't really prevent the probes themselves – and in particular, it does nothing against bots that make a ton of connections all of a sudden (or against those that send password auth requests despite it not being offered...)

A few times recently, I actually had a machine hit MaxStartups several times when a botnet opened all its connections in the same instant (I think it was 3-4 per address but a fairly large amount in total), shutting me out before they even got to the authentication stage.

I still dislike the idea of requiring a VPN to access even my "entry point" hosts, but I ended up using it for other reasons anyway (as I kept setting up more junk in what passes for my "lab" and of course it only had private IPv4), so at some point I went for it and blocked most inbound SSH from outside.

One semi-related thought was that I've set up a small Dropbear "jump host" (on port 2 because why not) in case of emergencies, and also configured an Apache httpd to act as an inbound HTTP CONNECT proxy for the few odd cases where I might want to connect from a place that blocks SSH (with OpenSSH that involves calling proxytunnel from ProxyCommand while PuTTY supports HTTP proxies natively). If people don't want to VPN and all you want is to block off the botnets, then hiding the servers behind an HTTP proxy might be an effective option – if a very ugly one...

By Mickey Black at 2024-05-09 12:44:30:

I'm looking at the sshd_config(5) manual page, and can't help but notice that the whole thing seems to be written very matter-of-factly: saying, fairly well, what exactly each option does, while not giving any hint about why the options exist or what they could be reasonably used for.

MaxStartups is a good example. One could easily infer the denial-of-service behavior that's not stated outright. But how was the default value of "10:30:100" chosen? What determines whether a value is reasonable in some environment? When should I use the "random early drop" feature? Why does an unauthenticated-connection limit exist at all? Likewise for LoginGraceTime, how would a too-low number affect user experience?
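For reference, the manual page describes the "start:rate:full" form as refusing new connections with probability rate/100 once there are "start" unauthenticated connections, rising linearly until everything is refused at "full". A minimal sketch of that description, using the "10:30:100" value quoted above as defaults (the function name is made up for illustration, and this models the documented behaviour rather than reproducing sshd's actual code):

```python
def drop_probability(unauthenticated: int, start: int = 10,
                     rate: int = 30, full: int = 100) -> float:
    """Approximate chance (0.0 to 1.0) that a new connection is refused,
    given the current number of unauthenticated connections."""
    if unauthenticated < start:
        return 0.0  # below 'start', nothing is dropped
    if unauthenticated >= full or rate >= 100:
        return 1.0  # at or beyond 'full', everything is dropped
    # Linear ramp from rate% at 'start' up to 100% at 'full'.
    ramp = (100 - rate) * (unauthenticated - start) / (full - start)
    return (rate + ramp) / 100


if __name__ == "__main__":
    for n in (5, 10, 55, 99, 100):
        print(f"{n:3d} unauthenticated -> drop chance {drop_probability(n):.0%}")
```

This only shows the shape of the "random early drop" curve; it says nothing about why these particular numbers were chosen.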

My guess is that these two options exist because there's significant cost to keeping an unauthenticated connection around. I suspect OpenSSH's famous "privilege separation" feature might be making things costly compared to the servers that could handle 10,000 clients ("the C10K problem"), 25 years ago, on a 1 Gbit/s connection. But Wikipedia says that a single server can now handle millions of connections. Without any sense as to what the cost of an unauthenticated OpenSSH client is, I really have no idea whether the default limit is still reasonable or has become terribly dated. Maybe I should multiply it by a thousand. Maybe OpenSSH should be re-architected till it can reasonably handle 10,000 or 100,000 ("openssl speed ed25519" shows that my server can verify upward of 10,000 signatures per second—but "top" suggests that's single-threaded, and this is a 16-core 32-thread system!).

Written on 07 May 2024.
