The big trick of running lots of systems

July 4, 2005

There are a lot of rules for running large-scale systems with lots of machines. I'll probably write up my own version of them at some point. But they all really come down to one big trick:

Don't administer individual machines.

That's it. Everything else is in the implementation details. (Of course, the devil is always in the details.)

But what does it mean? More or less what it says: you should never deal with machines one by one, ideally not even if one of them is exploding. Dealing with machines one by one is somewhat like trying to get through a swamp on foot; you can make progress, but oh so very slowly, and slogging through the mud is very tiring.

This deep principle underlies a lot of large-scale system administration tools, including things like LDAP, NIS, and automounters. (These are just ways of making it so that you don't have to maintain /etc/passwd, /etc/fstab, and so on separately on each machine.)
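To make the trick a bit more concrete, here is a minimal sketch in Python of the general shape: you describe what a group of machines should look like once, centrally, and every machine's configuration is derived from that description. Everything in it is made up for illustration (the group names, the hostnames, the 'fileserver' exports, and the helper functions), and it only generates fstab-style text; it is not how LDAP, NIS, or any automounter actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mount:
    """One filesystem that a group of machines should mount (hypothetical)."""
    source: str
    mountpoint: str
    fstype: str = "nfs"
    options: str = "defaults"


# The single, central description. All names here are invented examples.
GROUP_MOUNTS = {
    "compute": [Mount("fileserver:/export/home", "/home")],
    "web": [
        Mount("fileserver:/export/home", "/home"),
        Mount("fileserver:/export/www", "/var/www", options="ro"),
    ],
}

# Machines only ever appear as members of a group.
GROUP_HOSTS = {
    "compute": ["node01", "node02", "node03"],
    "web": ["www1", "www2"],
}


def fstab_lines(group):
    """Render the fstab-style lines that every host in a group gets."""
    return [
        f"{m.source} {m.mountpoint} {m.fstype} {m.options} 0 0"
        for m in GROUP_MOUNTS[group]
    ]


def render_all():
    """Return {hostname: fstab text}, derived purely from group membership."""
    result = {}
    for group, hosts in GROUP_HOSTS.items():
        text = "\n".join(fstab_lines(group)) + "\n"
        for host in hosts:
            result[host] = text
    return result


if __name__ == "__main__":
    for host, text in sorted(render_all().items()):
        print(f"# {host}")
        print(text, end="")
```

The point of the sketch is that there is no per-host code path at all. Adding a machine means adding its name to a group, and changing a mount means changing one entry that every machine in the group picks up.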

(Like the best big tricks, this is in some ways a very Zen thing, so it's hard to find much to say about it that doesn't feel like belaboring the obvious.)

Written on 04 July 2005.

Last modified: Mon Jul 4 23:59:17 2005