Wandering Thoughts archives

2014-03-09

Solaris gives us a lesson in how not to write documentation

Here are links to the manpages for reboot in Solaris 8, Solaris 9, and Solaris 10 (or the more readable Illumos version, which is probably closer to the Solaris 11 version). They are all, well, manual-pagey, and most system administrators have well-honed skills in how to read them. If you read any one of these, it probably looks basically harmless. If you read them in succession you'll probably wind up feeling that they're all basically the same, although Solaris 10 has grown some new x86-related stuff.

This is an illusion and a terrible mistake, because at the very bottom of the Solaris 9, Solaris 10, and Illumos versions you will find the following new section (presented in its entirety):

NOTES

The reboot utility does not execute the scripts in /etc/rcnum.d or execute shutdown actions in inittab(4). To ensure a complete shutdown of system services, use shutdown(1M) or init(1M) to reboot a Solaris system.

Let me translate this for you: since Solaris 9, reboot shoots your system in the head instead of doing an orderly shutdown. Despite the wording earlier in the manpage that 'the reboot utility performs a sync(1M) operation on the disks, and then a multi-user reboot is initiated. See init(1M) for details', SMF (or the System V init system in Solaris 9) is not involved in things at all (and thus no multi-user reboot happens). Reboot instead simply SIGTERMs all processes. That stuff I quoted from the DESCRIPTION section is now a flat out lie.

This is a drastic change in reboot's behavior. It is at odds with reboot's behavior in Solaris 8 (as far as I know), the traditional System V init behavior, and reboot's behavior on other systems (including but not limited to Linux). Sun decided to bury this drastic behavior change in a casual little note at the bottom of the manpage, so far down that almost no one reads that far (partly because it is after all of the really boring boilerplate).

This is truly an epic example of how not to write documentation. Vital changes go at the start of your manpages, not the very end, and they and their effects should be very clearly described instead of hidden behind what is basically obfuscation.

(The right way to do it would have been a complete rewrite of the DESCRIPTION section and perhaps an update to the SYNOPSIS as well.)

By the way, this phrasing for the NOTES section is especially dangerous in Solaris 10 and onwards where SMF services normally handle shutdown actions, not /etc/rcnum.d scripts (or inittab actions). In anything using SMF it's possible to read this section but still not realize what reboot really does because it doesn't explicitly say that SMF is bypassed too.
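For contrast, here is a minimal sketch of the orderly alternatives that the NOTES section names, wrapped in Python so that nothing runs by accident. The manpage only names shutdown(1M) and init(1M); the specific arguments below (`-y -g0 -i6` and `6`) are the commonly used Solaris forms and are my assumption, as are the function names. Check your own manpages before relying on them.

```python
import subprocess

# The quoted NOTES section names shutdown(1M) and init(1M) but gives no flags.
# These argument lists are an assumption (the commonly used Solaris forms).
ORDERLY_REBOOT_COMMANDS = {
    "shutdown": ("shutdown", "-y", "-g0", "-i6"),  # no prompt, no grace period, init state 6
    "init": ("init", "6"),                         # ask init to go to state 6 (reboot)
}


def orderly_reboot_argv(method="shutdown"):
    """Return the argument list for an orderly reboot (hypothetical helper)."""
    if method not in ORDERLY_REBOOT_COMMANDS:
        raise ValueError("unknown method: %r" % (method,))
    return list(ORDERLY_REBOOT_COMMANDS[method])


def orderly_reboot(method="shutdown", dry_run=True):
    """Run the orderly reboot command, or just return it when dry_run is true."""
    argv = orderly_reboot_argv(method)
    if not dry_run:
        # Needs root; this really does reboot the machine.
        subprocess.run(argv, check=True)
    return argv
```

The dry-run default is deliberate: a helper like this should have to be told explicitly to reboot anything.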

Update: As pointed out in comments by Ade, this appears to be historical Solaris behavior (contrary to what I thought). However, it is not the (documented) behavior of other System V R4 systems such as SGI Irix and it is very likely to surprise people coming from other Unixes.

solaris/RebootDangerousManpage written at 22:19:47

Why we don't change Unix login names for people

Every so often as system administrators we are a bit lazy. Or perhaps you could say that we are a bit sane. One of those cases here is that we do not, ever, change people's Unix login names. If you really want or need a change in login name, what we tell you to do is request a new account with the right login name, then transfer all your files to it and tell us to delete the old login.

(Users can set up their own email redirection from the old login to the new one, assuming they want to.)

In theory changing a Unix login name is easy; all you need to do is edit /etc/passwd to change it (both in the login name and in the home directory), then rename the home directory itself. Except we should probably change the login name in secondary groups in /etc/group. But we're not done, because users have a second home directory on our web server; we need to change that.
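To make the 'easy in theory' part concrete, here is a rough sketch of the passwd and group edits in Python. This is not our actual tooling. The function names are made up for illustration, and it assumes the standard seven-field passwd and four-field group line formats and a home directory whose last path component is the login name. It works on lists of lines and does not touch the real files.

```python
def rename_in_passwd(lines, old, new):
    """Rename a login in passwd-format lines (name:pw:uid:gid:gecos:home:shell)."""
    out = []
    for line in lines:
        fields = line.rstrip("\n").split(":")
        if len(fields) == 7 and fields[0] == old:
            fields[0] = new
            # Only rewrite the home directory if its last component is the old login.
            head, sep, tail = fields[5].rpartition("/")
            if tail == old:
                fields[5] = head + sep + new
        out.append(":".join(fields))
    return out


def rename_in_group(lines, old, new):
    """Rename a login in the member lists of group-format lines (name:pw:gid:members)."""
    out = []
    for line in lines:
        fields = line.rstrip("\n").split(":")
        if len(fields) == 4:
            # Compare whole member names so 'alice' does not match 'malice'.
            members = [new if m == old else m for m in fields[3].split(",")]
            fields[3] = ",".join(members)
        out.append(":".join(fields))
    return out
```

For example, `rename_in_group(["staff:x:10:bob,alice"], "alice", "asmith")` gives back `["staff:x:10:bob,asmith"]`. The home directory itself still has to be renamed on disk separately.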

Unfortunately we've only started. Right now we have six separate machines that run Samba, all with separate Samba password files. I'm not exactly sure how you rename a Samba login but we'd have to do it on all of those machines. We also have at least a dozen machines where users might have crontab files (but probably don't). If you rename a login you need to rename the crontab file (as far as I know), so we'd have to check all of them and fix anything we found. The login being renamed might also have a user-managed web server that uses a URL under the user's web pages; that would need to get renamed.
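The crontab check is the sort of thing you would have to run on every one of those machines. A minimal sketch, assuming per-user crontabs are files named after the login inside a spool directory; the two directory paths are common locations on various Unixes, not something taken from our setup, and the function name is hypothetical:

```python
import os

# Assumed spool locations; these vary between Unix flavours.
CANDIDATE_SPOOL_DIRS = ("/var/spool/cron/crontabs", "/var/spool/cron")


def find_per_user_files(login, directories=CANDIDATE_SPOOL_DIRS):
    """Return paths of files named exactly `login` in the given directories."""
    found = []
    for directory in directories:
        path = os.path.join(directory, login)
        # isfile() is False for missing or unreadable paths, so this generally
        # needs to run as root to see anything in real spool directories.
        if os.path.isfile(path):
            found.append(path)
    return found
```

An empty result only means that nothing was found in the directories you thought to check, which is the real problem with login renames.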

This is quite a list and I'm not even sure that I've thought of all of the places where the user's login name might be hiding in our environment (and yes, I'm ignoring at jobs for the moment). In theory we could try to do all of this and make sure not to miss a single thing. In practice it is much easier and much more reliable to get people to use our well-honed and frequently used procedures for creating and deleting accounts.

(We make accounts all the time and delete them periodically. We might 'rename' a login once a year.)

Can things still fall through the cracks, especially if the person getting the new login name doesn't notice? Certainly. But one subtle advantage here is that we aren't promising more than we can really deliver. If we promised to rename an account you might reasonably expect that all of this additional state would get transferred. Since we're merely making a new account it's clear (at least in theory) that additional state is something you have to worry about.

PS: A pragmatic side advantage of this approach is that we don't push back against people who want login name changes in the way we might if doing a login rename was a lot of manual work on our part. There actually used to be a policy that we just didn't do login renames short of acts of very high powers; that went away when we decided to do them the easy way. Nowadays it is more 'you want to change your login? well, sure, you'll be doing most of the work' (although we don't say this in our support documentation).

sysadmin/WhyNoLoginRenames written at 01:32:57

