Wandering Thoughts archives

2023-08-15

Maybe we shouldn't default to allowing logins on machines

We run what is now a relatively odd and different environment, where we provide a number of standard services (like IMAP email) along with a general multi-user Unix login experience on both general 'login' servers and compute servers (along with a closely related SLURM cluster). Historically, most of the Unix machines involved in all of this have needed our full set of logins, and along with that have been set up with everyone's home directories (and all of our other filesystems) NFS mounted from our fileservers. As part of this overall NFS environment, we have a long-standing system to propagate login and password information around.

Right from the beginning, this password propagation system has been able to filter out logins and rewrite passwd information on each machine. This can be used to correct home directory locations, edit shells to point to where they live on this particular machine, or, in its modern usage, sometimes zap passwords or force-change people's shells to various administrative ones.

(When we started with this system we had an environment with various different Unixes, so 'bash' might not be in the same place on all machines.)
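To make the shell rewriting concrete, here is a minimal sketch in Python of what 'edit shells to where they live on this machine' could look like. It is not our actual propagation system; the function name `fix_shell`, the `candidates` directory list, and the `exists` hook are all made up for illustration.

```python
import os


def fix_shell(passwd_line, candidates, exists=os.path.exists):
    """Point a passwd entry's shell at wherever that shell lives locally.

    Hypothetical helper: 'candidates' is a list of directories to search,
    and 'exists' is only a parameter so the check can be faked in tests.
    """
    fields = passwd_line.rstrip("\n").split(":")
    if len(fields) != 7:
        # Leave malformed lines alone rather than guessing.
        return passwd_line.rstrip("\n")
    shell_name = os.path.basename(fields[6])
    if not shell_name:
        # An empty shell field means 'system default'; don't touch it.
        return ":".join(fields)
    for directory in candidates:
        path = os.path.join(directory, shell_name)
        if exists(path):
            fields[6] = path
            break
    return ":".join(fields)
```

If the shell isn't found in any candidate directory, the entry is passed through unchanged, which seems like the least surprising behaviour for a sketch.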

Our historical habit and configuration default has been to not apply any real modification to our global password file on 'general NFS mounts' machines, regardless of what the purpose of the machine is. Making a machine be, for example, restricted so that only staff can log in takes an extra step or two, and historically we haven't bothered to do this. The end result is that by default, everyone with an account on our systems can log in to a surprising number of servers, including many that aren't 'login' or 'compute' servers and so aren't ones that people would normally touch.

For a long time this was relatively harmless. But these days people keep finding flaws in CPUs, like Zenbleed and Downfall, that allow one login to relatively readily steal information from other logins and processes, and these flaws now come with dramatic, working proofs of concept. This makes it directly dangerous to allow general user logins. Allowing general logins also makes it harder to handle mitigations like microcode updates, which require rebooting the server; a reboot is easier when you know that no random person will be trying to log in to the machine.

For this reason and others, we have been slowly making more and more servers be 'staff only' servers, where all non-staff logins get rewritten to have a 'you can't log in to this server' shell. At this point we have more staff-only servers than unrestricted servers, although an unrestricted server is still the default. Given how pervasive this restriction has become in our environment, quite possibly it should become the default, and the few unrestricted machines be specifically set up for that.
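The staff-only rewrite is simple enough to sketch. The following Python is an illustration, not our real code: `restrict_to_staff`, the `staff_logins` set, and the `/usr/sbin/nologin` default are assumptions (our actual 'you can't log in to this server' shell is a local program). It also assumes the input is only the propagated global password entries, not local system accounts like root.

```python
def restrict_to_staff(passwd_lines, staff_logins, nologin_shell="/usr/sbin/nologin"):
    """Return passwd lines with every non-staff login given a no-login shell.

    Hypothetical sketch: 'staff_logins' is a set of login names that keep
    their real shell; everyone else gets 'nologin_shell'.
    """
    result = []
    for line in passwd_lines:
        fields = line.rstrip("\n").split(":")
        # Only touch well-formed seven-field entries.
        if len(fields) == 7 and fields[0] not in staff_logins:
            fields[6] = nologin_shell
        result.append(":".join(fields))
    return result
```

Making this the default would amount to always applying a filter like this unless a machine is explicitly marked as unrestricted, instead of the other way around.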

One of the quiet advantages of defaulting to disallowing non-staff logins is that we'll find out right away if something the machine is for actually requires non-staff to have valid shells in /etc/passwd. This has historically been one of our concerns when we considered closing down access to machines; there was always the worry that we might break something that was currently working. If we default to closed, we'll find out before we put a machine into production (or at least no later than when it becomes 'production' but doesn't work for people).

PS: Before I started writing this entry I hadn't looked at the number of unrestricted machines compared to the number of staff-only ones, so I hadn't realized that the unrestricted machines were now a small minority. We've slowly and quietly gone a lot further on restrictions than I'd expected.

sysadmin/MaybeNotAllowLogins written at 23:07:50;


