Scalable system management is based on principles

March 29, 2012

Here is something that I strongly believe:

Scalable system management is based on principles, not software.

If you get the important ideas that underlie scalable system management (even if this is just a gut-level understanding that you couldn't clearly articulate), the software ultimately doesn't matter and isn't necessary; you can build your own if and when you need it (although there are good reasons to use standard software). Conversely, if you do not understand the principles, all of the best-practices software in the world will not necessarily help you. There is very little software that will actively prevent you from managing systems in ways that turn out to be a bad idea.

(This is a variant of the old aphorism that you can write Fortran in any language.)

Or to put it directly: you do not get scalable system management just by using Cfengine, Puppet, Chef, or today's hotness. You get it by understanding what you need for scalable system management and then using whatever tools are necessary.

(Thinking otherwise is a cargo cult approach, where you believe that you can get the same results just by going through the same motions with enough fidelity to the originals.)

What makes this especially unfortunate in my view is that the actual principles of scalable system management are not really set out anywhere. Most people are left to fumble towards an understanding through (painful) experience or to pick things up by osmosis from the documentation for good management software and from how people use it. Perhaps it doesn't help that the standard style of system management writeups is long on descriptions of tools but short on the why of it all.

(Also, perhaps part of the issue is that people who've reached the point where they can write these things up wind up feeling that the principles of scalable system management are so obvious they don't really need to be mentioned.)

(It's my suspicion that the same thing is true of scalable software deployment and probably other things in the modern 'devops' arsenal, although I'm theorizing without actual experience here.)

Sidebar: some definitions

By system management I mean, well, managing systems; installing them and maintaining them and keeping them running. By scalable system management I mean a way of doing this where you can scale up the number of systems you run without having to scale up how many people you have managing them.

(System management is part of what gets called 'operations', but not all of it.)
