2021-04-26
The question of how to do non-annoying multi-factor authentication for SSH
Suppose, hypothetically, that you have access to a general multi-factor authentication (MFA) system such as Duo (with the choice of MFA system not being under your control), and that you would like to use this for secure SSH logins to your collection of (Ubuntu) servers. This is generally easy by itself, since pretty much any MFA system has a PAM module that adds a second-factor challenge to your regular SSH authentication. Unfortunately, the result of a straightforward MFA integration with SSH logins is going to be quite annoying for some people, because every time they log in to any machine they will have to pass an MFA challenge on top of their regular login authentication. If you only log in to a few machines every so often, this is okay. If you're frequently logging in and out of multiple machines, you're going to be irritated.
(One consequence of this is that it encourages everyone to stay logged in to all of your machines all of the time. This is not necessarily what you want for good security.)
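To make the mechanics of the straightforward integration concrete, here is a minimal sketch of the usual PAM shape, using Duo's pam_duo as the example module. The exact module name, options, and file locations come from your MFA vendor's documentation; this is illustrative, not a tested configuration.

    # /etc/pam.d/sshd (Ubuntu): run normal authentication, then the MFA module
    @include common-auth
    auth required pam_duo.so

    # /etc/ssh/sshd_config: let sshd run the PAM conversation for the challenge
    UsePAM yes
    ChallengeResponseAuthentication yes
    # (for public key logins you would also need something like
    #  'AuthenticationMethods publickey,keyboard-interactive' so PAM still runs)

The important property, and the source of the annoyance, is that PAM runs on every single SSH login, so every login gets the extra challenge.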
The general industry solution for this seems to be short term SSH user certificates that are issued through some MFA/SSO gateway; Smallstep has a writeup of this with links to other people who have done it. I have a number of questions about how well this works on Windows and how easy it is to teach people to use it (and to troubleshoot problems), plus you also have to run your own MFA/SSO gateway and signing infrastructure (which is critical both for security and for people being able to log in to any of your systems).
(I'm going to assume that Windows 10 has a decent SSH agent story these days. The SSH user certificate stories usually load your newly issued certificate into the agent, although that leaves me with other questions.)
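For illustration, the core of what such an MFA/SSO gateway does after a successful web login is to sign your public key with a CA key for a short validity period. A hand-run sketch of the same operation (the file names and principal here are made up) looks like:

    # sign id_ed25519.pub with the CA key, valid for 8 hours, for principal 'someuser'
    ssh-keygen -s user_ca_key -I "someuser@laptop" -n someuser -V +8h id_ed25519.pub
    # inspect the resulting certificate
    ssh-keygen -L -f id_ed25519-cert.pub
    # servers trust the CA with 'TrustedUserCAKeys /etc/ssh/user_ca.pub' in sshd_config

The gateway automates the signing (gated behind MFA) and hands the certificate back to your SSH client, which is where the Windows and SSH agent questions come in.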
The general simple solution seems to be putting the MFA somewhere other than SSH logins; common approaches are MFA VPNs and MFA remote desktop connections. You still require (single-factor) authentication on SSH logins as usual, but you only allow them from MFA-protected sources instead of from anywhere (whether 'anywhere' is the general Internet, as it is for people like us, or an organizational intranet). This solution also gets you MFA for SSH logins from outside a trusted area without forcing MFA on all SSH logins, something that's otherwise complicated to arrange with the MFA PAM module approach. The remote desktop solution has the drawback that people need a remote desktop at work to connect to.
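As a sketch of the 'only from MFA-protected sources' part, the restriction can be as simple as a firewall rule on each server (or at the network edge) that only accepts SSH from the VPN's address range; the subnet below is made up.

    # accept SSH only from the (hypothetical) MFA-protected VPN subnet
    iptables -A INPUT -p tcp --dport 22 -s 10.20.0.0/16 -j ACCEPT
    iptables -A INPUT -p tcp --dport 22 -j DROP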
The natural Unix and SSH expert solution is a variant of this. You have an MFA-protected SSH jumphost, and people establish a shared connection to it that they authenticate once and then leave sitting there. With an existing connection master, they can run ssh in other windows and sessions and it will piggy-back on the already authenticated master connection to the jumphost, then go onward to perform regular SSH authentication to the hosts past the jumphost. Your other hosts are not accessible via SSH except from the jumphost or the internal network. I'm not sure how well connection sharing works with current Windows SSH clients, and in general it's subject to problems if the single underlying TCP connection ever stalls or is broken.
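A sketch of the client side of this, as an OpenSSH ~/.ssh/config (the host names and the persistence time are assumptions on my part):

    # one authenticated, shared connection to the jumphost
    Host jump
        HostName jump.example.org
        ControlMaster auto
        ControlPath ~/.ssh/cm-%r@%h:%p
        ControlPersist 8h

    # internal hosts are reached through the jumphost
    Host *.internal.example.org
        ProxyJump jump

With this, the first connection to (or through) the jumphost answers the MFA challenge and leaves a master connection behind; later ssh sessions to internal hosts reuse that master connection and only have to do their normal SSH authentication to the destination host.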
A less nice but easier-to-explain version of this is for people to log in to the jumphost and start screen or tmux sessions there, which they then use to log in to whatever machines they want. This version is resilient against the person-to-jumphost connection being broken, but has the drawback that people who reasonably want to log in to different machines in different local terminal windows are faced with multiple MFA challenges (one for each new connection to the jumphost).
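In concrete terms, the workflow is roughly this (the session name is arbitrary):

    ssh jump                 # one MFA challenge here
    tmux new -s work         # or: screen -S work
    # open tmux windows and ssh to internal machines from inside them;
    # if your connection to the jumphost drops, 'ssh jump' again and
    # 'tmux attach -t work' picks your sessions back up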
It would be nice if there were some easy-to-use way to get SSH on the clients to interact nicely with web-based MFA SSO. How I'd like it to work is that if you already have an SSO-authenticated browser session (which you probably do), your SSH clients transparently inherit this single sign-on without you having to do anything special. However, I don't think any such systems exist without additional software.
PS: In some high-security environments, the answer is that every SSH login must be protected by MFA no matter what, and people will just have to live with that inconvenience. This doesn't describe our environment, and if we tried to treat it that way I'm pretty certain we'd find people bypassing us in various ways.
Maybe a local, on-machine caching DNS resolver should be standard (for us)
We have traditionally configured
our Ubuntu servers to have an /etc/resolv.conf
that points at our
central recursive DNS resolvers. People in research group sandbox
networks have generally done
likewise, partly because it's usually been the easiest thing to do.
Machines have to consult our local resolvers in order to correctly
look up other local machines, and once you're doing that you might
as well not add any extra layers (which have generally taken extra
work to add). But there's a downside to this configuration.
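Concretely, the traditional configuration is an /etc/resolv.conf along these lines (the domains and addresses here are placeholders, not our real ones):

    search cs.example.edu example.edu
    nameserver 192.0.2.11
    nameserver 192.0.2.12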
Every so often someone either writes or runs a program that does a lot of hostname lookups. Often this is part of making a lot of connections, for example to fetch a bunch of external resources. Very few programming languages and standard libraries cache the results of those lookups even when they are all for the same hostname (and for good reason, especially in a world where the IP address associated with a hostname can change rapidly). But in our environment, this results in a flood of requests to our local resolvers, a flood that would be drastically reduced by even a little bit of local caching. Local caching would also make the responses faster, since even on the same network, an over-the-network DNS query is slower than querying a daemon on your own machine.
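As a trivial illustration of the effect, something like the following shell loop sends every single lookup to the configured resolvers when there's no local cache, even though it's the same name each time:

    # with no local caching resolver, each getent call becomes a fresh
    # query (or several) to the central DNS resolvers
    for i in $(seq 1000); do
        getent hosts www.example.com > /dev/null
    done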
Adding an extra layer of DNS caching does create some operational issues, especially if it caches negative answers. These issues can be reduced if DNS answers are only cached for a very short amount of time, but that generally takes extra configuration (if it's even possible). It's also traditionally taken an extra setup step and extra configuration in general, which is part of our bias against doing it. However, systemd is on its way to changing that with systemd-resolved, although there are plenty of questions about how it will work in an environment like ours and whether Ubuntu will ever adopt it as a standard part of server installs.
So far, we've been aggressive about disabling systemd-resolved in our install system (and haven't set up any other local caching resolver). However, I'm starting to wonder if we should change that, especially if Ubuntu switches to normally wanting systemd-resolved to be on (so that, for example, netplan is unhappy with you if resolved isn't running).
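If we did turn systemd-resolved back on, my current understanding is that the configuration for our sort of environment would look roughly like this /etc/systemd/resolved.conf sketch (the resolver addresses are placeholders, and I haven't tested this):

    [Resolve]
    # our central recursive resolvers (placeholder addresses)
    DNS=192.0.2.11 192.0.2.12
    # send all lookups to them rather than splitting by interface
    Domains=~.
    # local caching is the point of the exercise; sufficiently new systemd
    # versions also accept Cache=no-negative to avoid caching negative answers
    Cache=yes
    # keep the stub listener on 127.0.0.53 that /etc/resolv.conf points at
    DNSStubListener=yes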
(To really answer this question we should probably get fine grained query statistics from our DNS servers, or at least packets per second statistics. But that's a longer term project for various reasons.)
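(A crude version of the packets-per-second number could be had with nothing more than tcpdump on a resolver, along these lines; the interface name is an assumption and this counts replies as well as queries:

    # count DNS packets seen over 60 seconds, then divide by 60 for a rough rate
    timeout 60 tcpdump -nn -i eno1 'udp port 53' 2>/dev/null | wc -l

Proper per-client, per-query-type statistics would take more work, which is part of why it's a longer term project.)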