Wandering Thoughts archives

2008-01-31

The sysadmin's life (again)

I made a bugfix to our mail server's configuration today.

Size of the bugfix: 25 characters added to an existing line.

Size of the comment sort of explaining why the bugfix is necessary: six lines.

(And a complete discussion of the issue would be much longer.)

Of course, this is not unique to system administration; programming also has this sort of bugfix. I think system administration may be more prone to it, because so few of our tools, especially the ones with complicated logic, have actual programming languages. Without languages, our logic, and thus the changes in that logic, become all the more dense and cryptic.

Another way to put it is that very few people create configuration systems with the goal of communicating with both the program and other people; instead they are almost always aimed just towards communicating with the program. This generally winds up leaving a great many things implicit, and then to try to make them explicit we write six lines of comments for 25-character changes.

(Let's not talk about the amount of effort to test and validate changes, either. The idea of testability is pretty much a foreign concept for most systems that I have to deal with; I can only dream of automated functional tests unless I feel like building a pretty large amount of infrastructure.)

TheSysadminLifeII written at 22:55:14

2008-01-27

Classic crontab syntax mistakes

The crontab file format is unfortunately prone to mistakes, since it requires you to carefully count words and, for bonus fun, requires a different number of words in different contexts. There are three classic crontab syntax mistakes that I have seen (and sometimes committed):

  • adding a username when it's not necessary

    This generally results in mail from cron complaining that it cannot run 'username'.

  • failing to specify a username when it's necessary

    This results in mail about 'no such user program', assuming that you gave the program arguments. (A good version of cron will complain about it even if you didn't give the program arguments, because then it is a crontab line without a program to run.)

  • putting too many *'s in.

    If you are lucky, this produces mysterious email about cron not being able to execute the first file in your home directory (the extra '*' lands at the start of the command, where the shell expands it to the files in your home directory). If you are not lucky, the first file in your home directory is executable, and it will get run with the rest of the files in your home directory as arguments.

    Alternately, if this is in a situation that needs a username, you get email about being unable to find the username '*'.

System crontab entries need usernames; on Linux systems this is /etc/crontab and the /etc/cron.d/* files. Per-user crontab entries, including root's, do not need or accept usernames.

(By 'per-user crontab entry' I mean what you get when you use 'crontab -e' and its kin.)
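As a rough illustration of the word counting involved, here is a minimal Python sketch that flags the three mistakes above in a single crontab line. The function name check_crontab_line, the known_users set, and the example paths are all hypothetical, and guessing whether a word is a username from a fixed set is only a heuristic. It also assumes plain five-field time specifications and ignores comments, environment settings, and '@reboot'-style shortcuts.

```python
def check_crontab_line(line, system, known_users):
    """Return a list of suspected field-count mistakes in one crontab entry.

    Hypothetical helper: `system` says whether the line comes from a system
    crontab (which needs a username field), and `known_users` is a set of
    login names used to guess whether a word is a username.
    """
    fields = line.split()
    if len(fields) < 6:
        return ["fewer than five time fields plus a command"]
    after_time = fields[5:]  # everything following the five time fields
    problems = []
    if after_time[0] == "*":
        problems.append("too many *'s: a sixth '*' follows the time fields")
    elif system and after_time[0] not in known_users:
        problems.append("system crontab entry is missing a username")
    elif system and len(after_time) < 2:
        problems.append("username given but no command to run")
    elif not system and after_time[0] in known_users:
        problems.append("per-user crontab entry should not have a username")
    return problems


users = {"root", "alice"}  # made-up account names for the example
# A correct system crontab entry, then one example of each mistake.
print(check_crontab_line("15 3 * * * root /usr/local/sbin/nightly", True, users))
print(check_crontab_line("15 3 * * * /usr/local/sbin/nightly --all", True, users))
print(check_crontab_line("15 3 * * * root /usr/local/sbin/nightly", False, users))
print(check_crontab_line("15 3 * * * * /usr/local/sbin/nightly", False, users))
```

The only thing that differs between the two contexts is whether the sixth word is a username or the start of the command, which is why the check has to be told which kind of crontab the line came from.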

Personally I strongly suggest that you do not give root a per-user crontab on a machine with system crontabs, because doing so is a great way to confuse people. In fact, if you have system crontabs I don't think that any system account should have a per-user crontab; just put everything into the system crontabs with the right usernames specified.

ClassicCrontabMistakes written at 23:59:44

2008-01-25

A modest suggestion about test accounts

Here is a modest suggestion that has recently occurred to me:

Don't give your test accounts the same password as your regular account.

It's not that I'm all that worried about security issues; it's that I want to avoid accidentally logging in as one account when I'm trying for the other. With separate passwords, I have to make an absent-minded mistake with both the username and the password, instead of just the username, and I figure this improves my odds.

I have to admit that I've never actually made this mistake, but I have had times when I looked at the username just to make sure. I suspect that slower typists have fewer problems here because they think more about what they're typing; I wind up typing a lot of things more by reflex than by conscious thought, often including my usernames and passwords.

And for all that I'm rather casual about them, there are real security issues, especially if you have to test systems whose password handling you don't entirely trust. And there's an awful lot of things these days that will 'helpfully' remember access passwords for you so they can do things automatically the next time around.

TestAccountSuggestion written at 23:55:09

2008-01-17

Lab notebooks are not changelogs

Here's something from my previous entry that I should clarify: lab notebooks are not changelogs.

Lab notebooks and changelogs are two different things. Lab notebooks are scribbled at the time for yourself, and should include everything. Changelogs are written after the fact for other people, and should include only the things that actually turned out to matter.

In other words, changelogs are the sysadmin equivalent of lab reports and scientific papers. You can no more use a lab notebook as a changelog than you could submit a lab notebook as a scientific paper, and for much the same reason; fundamentally, a lab notebook has observations while a changelog has conclusions.

(Changelog is perhaps not the right word; I'm using it by analogy to the changelogs that programmers write. A sysadmin changelog documents what you did and why: you did X to cure problem Y, or to achieve result Z.)

Another difference is the medium. Changelogs ought to be electronic for all the obvious reasons, while I think that your primary lab notebook ought to be paper, because it is more flexible and easier to use under any circumstances. At the same time, having a shared electronic lab notebook of some sort is so useful that reusing a good changelog system for it is awfully tempting (and I suspect that a lot of people do just that; certainly we do to some extent).

One corollary of all of this is that I consider changelogs to be (part of) documentation, but I do not consider lab notebooks to be. Lab notebooks are a memory aid.

LabbooksVsChangelogs written at 23:57:40

2008-01-16

Why sysadmins should keep a lab notebook

Yesterday, a coworker and I were working on a performance issue we're having with our new SAN RAID controller. We had a hypothesis about what might provoke the problem, so we sat down, fired up some tests, and watched our logs; nothing showed up. Later on in the day, we saw some odd indications in other logs and wanted to see if they correlated with the tests we'd done, but you can already guess the punchline: we hadn't recorded when we started and stopped the tests.

This wasn't because we were stupid idiots (although you may disagree); it happened because we were focused on what we were looking for at the time of the experiment, which was going to give us a yes or no answer right away.

The important thing about a lab notebook is not so much the physical object; it is the discipline of writing everything down, even if you don't think you need it at the time. Keeping a record, even a simple one, means that you do not have to rely on fallible memory and guesswork when you later want to look back to summarize what experiments you've done (especially the unsuccessful ones, especially the fine details), or what exactly you did in the process of fixing the mysterious problem, and so on.
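For the specific thing we failed to record, the start and stop times of a test, even a tiny helper would have been enough. The following is a minimal Python sketch, not something we actually use; the name logged_experiment and the idea of writing to any file-like notes object are assumptions made for illustration.

```python
import contextlib
import datetime
import io


@contextlib.contextmanager
def logged_experiment(notes, description):
    """Write timestamped START and END lines for one experiment to `notes`."""
    def stamp(event):
        # Local time with a UTC offset, so it can be matched against logs later.
        now = datetime.datetime.now().astimezone().isoformat(timespec="seconds")
        notes.write(f"{now} {event}: {description}\n")
        notes.flush()

    stamp("START")
    try:
        yield
    finally:
        # Record the end time even if the test blows up partway through.
        stamp("END")


# Usage: an in-memory buffer here; a real notes file opened for append works too.
notes = io.StringIO()
with logged_experiment(notes, "hypothetical sequential write test"):
    pass  # run the actual test here
print(notes.getvalue(), end="")
```

The point is the discipline rather than the tool; the same two timestamps scribbled on paper would serve just as well.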

(This is especially important for problem fixes, because humans have a great habit of assuming that the last thing we did has to be what worked and then brushing everything else out of the way. And we can do this without even consciously realizing what we're doing.)

As my experience handily demonstrates, keeping a lab notebook is especially important during problems and crises, when you have no real idea what is going on, what to do next, and how to fix things. When we know the least is exactly the time when we need to record the most, because we just don't know what's going to turn out to be important in the end (and we are prone to overconfident, hopeful guessing).

(As a side benefit, scrawling grumpy remarks in your lab notebook can be a good stress relief that does not involve ranting at your coworkers.)

SysadminLabNotebook written at 23:51:41

2008-01-11

The importance of killing processes with the right signal

Here is an important corollary to KillOrderImportance: when your system is overloaded, you should always kill processes with 'kill -9'.

There are a fair number of sources that will tell you that you should always kill processes with something besides kill -9, unless they won't die from lesser measures. In an overloaded situation this is terribly wrong: either the processes don't have a signal handler for the lesser signal, in which case the two are equivalent, or they do have a signal handler, in which case using the lesser signal simply causes them to wake up (if they were sleeping) and churn around more.

Even in general I tend to be somewhat dubious about the advice; usually, when I am killing a process with anything except a small list of signals, I want it gone. Using 'kill -9' makes completely sure of this, in one go, without any fuss and bother.

(If you really want to give a process a chance to clean up, you need to know what sort of program it is. User level programs tend to only catch SIGHUP, if that, while daemons probably don't catch SIGHUP but may catch SIGTERM, since many Unixes send everything SIGTERM as part of shutting down.)

KillSignalImportance written at 23:41:04

2008-01-01

An unpleasant thing about system administration

One of the unpleasant things about system administration is how it can turn me into a petty authoritarian, full of irritation with users who have the temerity to 'break the rules' and not follow directions, even if (or especially if) the rules are themselves impositions from outside that I myself dislike and object to. Instead of cheering on the users, I become angry with them, apparently merely because they did not do what I told them to. In some dark place in my mind, people not doing what I told them to is more important than anything else, more important than my own dislike of the imposed rules, enough so to turn me into a willing collaborator with rules that I normally dislike.

(As if I could actually tell them what to do, too, so in a sense I'm getting angry at them for ignoring me.)

Looking in the mirror and seeing a petty tyrant staring back is not something that I like very much. It leaves me with a hollow feeling, even if I catch it while the impulse is still percolating in the back of my mind or when it is just the first flare of irritation. (And I don't always; an uncomfortable number of times the realization has only struck me some time later.)

I won't say that thinking about this has led to me having more sympathy for the petty authoritarians one runs into in life, but perhaps it's given me some better understanding of them, even if it's not a comfortable one. (I don't find the idea that a petty tyrant may lurk not too far beneath the surface of many people to be exactly comforting.)

UnpleasantSysadminThing written at 23:24:01



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.