How we propagate password information in our fileserver infrastructure

June 27, 2010

As mentioned earlier, we have a fileserver infrastructure and so we need some way of propagating account and password information around (and letting people actually update their passwords). The old traditional answer is NIS, the new traditional answer is LDAP, and we don't really like either so we wrote our own.

Given the Unix system UID problem, any such system has three parts: where each machine's account information lives, how global account information propagates around, and how you combine global accounts and system accounts together.

Our answer to the first question is that each machine has a complete local copy of /etc/passwd, /etc/shadow, and /etc/group. This is the simple approach, because everything is guaranteed to work with local files since that is how single, isolated machines work.

(We also feel nervous about adding another point of failure to our fileserver infrastructure in the form of a master account machine that must be up in order for anyone to be able to log in anywhere.)

We've also chosen a simple way to handle propagating the global account information around; we use our existing fileserver infrastructure. We have a central administrative filesystem where the global passwd, shadow, and group files live, and every client machine NFS mounts it under a standard name. The one complication of the NFS mount approach is that client machines must have root access to the filesystem in order to read the global shadow file, which means that we have to be very careful about which machines we allow to have write access to it.

(The use of an NFS filesystem is really a small implementation detail. Our Solaris fileservers use the same system and programs to keep their /etc/passwd in sync, but they don't NFS mount the administrative filesystem because we don't believe in NFS crossmounts on the fileservers. Instead they copy the files over with rsync.)

The update process is somewhat complicated. First, the global passwd et al is the authoritative source of global accounts, while each machine's /etc/passwd et al is the authoritative source of its local system accounts. We tell the two apart based on UID and GID; our global user logins and groups are always within a specific UID and GID range (one that is chosen to not clash with local system UIDs and GIDs).
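As a rough illustration of the UID-range test, here is a minimal Python sketch. The range boundaries and all of the function names are made up for the example (the actual range isn't given here), and the real program may well do this differently.

```python
# Hypothetical UID range for global accounts; the real range is site-specific.
GLOBAL_UID_MIN = 1000
GLOBAL_UID_MAX = 59999


def parse_passwd_line(line):
    """Split one passwd(5) line into its seven colon-separated fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"malformed passwd line: {line!r}")
    return fields


def is_global_uid(uid):
    """True if this UID falls inside the (assumed) global account range."""
    return GLOBAL_UID_MIN <= uid <= GLOBAL_UID_MAX


def split_accounts(lines):
    """Return (system_entries, global_entries) from passwd-format lines."""
    system, global_ = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = parse_passwd_line(line)
        uid = int(fields[2])
        (global_ if is_global_uid(uid) else system).append(fields)
    return system, global_
```

The same test works for groups, using the GID field of /etc/group entries and a GID range instead.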

We propagate updates to global accounts by periodically running an update script that extracts all of the system accounts from /etc/passwd, extracts all of the global accounts from our master passwd, merges the two together, and writes out an updated /etc/passwd et al if anything changed. Because it's convenient, this also updates the passwords of any system accounts from the global shadow file if they're present there.

(This avoids having to change the root password on every single system we have, which would be a great disincentive to changing it at all.)
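The merge step can be sketched roughly as follows. This is only an illustration under assumptions: the UID range, the function names, and the write-then-rename approach are mine, not a description of the real update script, and the sketch ignores file ownership, permissions, and locking.

```python
import os
import tempfile

# Hypothetical UID range for global accounts.
GLOBAL_UID_RANGE = range(1000, 60000)


def uid_of(line):
    """Extract the numeric UID from a passwd-format line."""
    return int(line.split(":")[2])


def merge_passwd(local_lines, master_lines):
    """System accounts come from the local file, global ones from the master."""
    system = [l for l in local_lines if uid_of(l) not in GLOBAL_UID_RANGE]
    global_ = [l for l in master_lines if uid_of(l) in GLOBAL_UID_RANGE]
    return system + global_


def refresh_system_hashes(local_shadow, master_shadow):
    """Copy password hashes from the master shadow for logins found there."""
    master_hash = {l.split(":")[0]: l.split(":")[1] for l in master_shadow}
    out = []
    for line in local_shadow:
        fields = line.split(":")
        if fields[0] in master_hash:
            fields[1] = master_hash[fields[0]]
        out.append(":".join(fields))
    return out


def write_if_changed(path, new_lines):
    """Rewrite path only if its contents differ; return True if it was written."""
    new_text = "".join(line + "\n" for line in new_lines)
    try:
        with open(path) as f:
            if f.read() == new_text:
                return False
    except FileNotFoundError:
        pass
    # Write a temporary file in the same directory, then rename it into place
    # so that readers never see a half-written file.
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname)
    with os.fdopen(fd, "w") as f:
        f.write(new_text)
    os.replace(tmp, path)
    return True
```

Note that a global account which has vanished from the master file simply drops out of the merged result, since global entries are never taken from the local file.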

In the process of handling global accounts, the program lets us selectively exclude (or only include) some accounts, and mangle either selected accounts or all of them in various ways. We can change shells (for example to give accounts a shell that just tells them they can't log in to this machine), remap where home directories are in various useful ways, and so on. Also, if there is a conflict between a system login or group name and a global login or group name, it renames the global login or group by sticking a prefix on it.
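To make the mangling concrete, here is a small Python sketch of the kinds of transformations described. The prefix, the parameter names, and the function itself are all hypothetical; the real program's options aren't shown here and are certainly richer than this.

```python
# Hypothetical prefix used to rename a global login that clashes with a
# local system login.
CONFLICT_PREFIX = "g-"


def mangle_global(entries, system_names, exclude=(), force_shell=None,
                  remap_home=None):
    """Return mangled copies of global passwd entries (lists of 7 fields).

    exclude     -- logins to leave out on this machine
    force_shell -- if set, every remaining account gets this shell
    remap_home  -- optional function mapping an old home path to a new one
    """
    out = []
    for fields in entries:
        fields = list(fields)  # copy so the caller's data is not modified
        name = fields[0]
        if name in exclude:
            continue
        if name in system_names:
            # Name conflict: the system account keeps its name and the
            # global one is renamed.
            fields[0] = CONFLICT_PREFIX + name
        if remap_home is not None:
            fields[5] = remap_home(fields[5])
        if force_shell is not None:
            fields[6] = force_shell
        out.append(fields)
    return out
```

For example, a machine that users may not log in to could call this with force_shell set to a program that prints an explanation and exits.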

(Naturally we put the update script itself in the central administrative filesystem too, because that makes maintaining it simpler. The only thing that lives on the client machines is the crontab entry that invokes the whole system every so often.)

Password changes are handled by a cover script for passwd that ssh's off to our password master machine and runs the real passwd program there. The global passwd et al are just straight copies of the password master machine's /etc/passwd et al, although they get run through a checking program before they get copied from /etc into the central administrative filesystem. This is an important safeguard against stupid mistakes when updating the master machine's /etc/passwd et al.
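As an illustration of the sort of sanity checking that could guard the copy step, here is a minimal Python sketch. The specific checks are my guesses at what is useful (the post doesn't list what the real checking program looks for), and the function name is invented.

```python
def check_passwd(lines):
    """Return a list of problems found in passwd-format lines (empty if OK)."""
    problems = []
    seen = set()
    for n, line in enumerate(lines, 1):
        fields = line.rstrip("\n").split(":")
        if len(fields) != 7:
            problems.append(f"line {n}: expected 7 fields, got {len(fields)}")
            continue
        name, _, uid, gid = fields[:4]
        if not name:
            problems.append(f"line {n}: empty login name")
        elif name in seen:
            problems.append(f"line {n}: duplicate login {name!r}")
        seen.add(name)
        # UID and GID must be plain ASCII decimal numbers.
        if not (uid.isascii() and uid.isdigit()
                and gid.isascii() and gid.isdigit()):
            problems.append(f"line {n}: non-numeric UID or GID for {name!r}")
    if "root" not in seen:
        problems.append("no root entry found")
    return problems
```

A wrapper would copy the files into the central administrative filesystem only when this returns an empty list, and otherwise leave the previous good copies in place.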

(The details of how things work on the password master machine are somewhat complicated, so I'm not going to put them here.)


Comments on this page:

From 195.26.247.141 at 2010-06-28 08:36:30:

(We also feel nervous about adding another point of failure to our fileserver infrastructure in the form of a master account machine that must be up in order for anyone to be able to log in anywhere.)

In which case, don't you also feel nervous about using NFS at all, assuming you use NFS for mounts on the clients and don't do something like rsync all the data to each machine?

By cks at 2010-06-28 11:55:37:

We're already dependent on NFS to start with (or some networked filesystem) for user home directories and other filesystems. Thus, adding another filesystem that we need working to have our environment working doesn't make us particularly nervous.

(We're actually less dependent on the administrative filesystem than on things like user home directories and /var/mail.)

