2005-06-29
Reconsidering network authentication delays
If you use ssh to connect to a machine and mistype your password, you'll probably have to sit through a couple of seconds of delay before you can try again. In theory this delay is supposed to slow down large-scale password guessing attempts. In practice it's pointless, as anyone who is getting mass-ssh-probed can attest.
Delaying after failed login attempts started in the world of physical terminals, where it works because the supply of physical terminals you can try to log in on is finite (especially if you aren't Reed Richards). However, a new 'terminal' for a network login is only another TCP connection away. Mass-scanning programs can (and do) open multiple connections in parallel, so trying to rate-limit any particular connection in isolation is mostly pointless. If you want to slow down mass ssh attacks, you need to rate-limit at least by IP address. (Since scanning often seems to come from zombie networks, even this may not help much.)
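As an illustration, here's one way to do per-IP rate limiting on Linux, using iptables' 'recent' match. This is only a sketch, not a complete ruleset; the 'SSH' list name and the four-connections-per-minute threshold are arbitrary choices for the example.

    # note each new ssh connection against its source IP
    iptables -A INPUT -p tcp --dport 22 -m state --state NEW \
        -m recent --name SSH --set
    # drop sources that have opened four or more new connections
    # in the last 60 seconds
    iptables -A INPUT -p tcp --dport 22 -m state --state NEW \
        -m recent --name SSH --update --seconds 60 --hitcount 4 -j DROP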
So what the OpenSSH server's delay on bad passwords really accomplishes is slowing down real users who make typos, while not particularly getting in the way of attackers. (Since it means processes hang around longer, it may even hurt you under heavy load.)
There's a use for a little bit of delay during password checking; you probably want a failed attempt to take at least as long as starting up a new connection to the daemon would, so that attackers gain nothing by dropping the connection and reconnecting. But more delay than that is probably not getting you anything except periodically annoyed users.
2005-06-28
Scripting and automation capture knowledge
NewsForge has been running a series of articles by Brian Warshawsky on the 'Ten Commandments of system administration'. His tenth commandment is 'Thou shalt not waste time doing repetitive and mundane tasks' (article here, with links to the other nine), where he tells people to automate such tasks through shell scripts.
There are a number of good reasons to follow this advice; for a start, writing shell scripts is probably less boring than doing these mundane tasks over and over. But there's a slightly non-obvious one: scripts document how to do things.
When you perform a task by hand, the knowledge of how to do that task may exist only in your head (which presents certain problems if you want, for example, an undisturbed vacation). When you automate the task through a script, the script itself serves as documentation for how the task is done (maybe not good documentation, but commenting the script can help with that).
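Even a trivial script shows this. Here's a sketch of the sort of thing I mean; the scratch area, its path, and the 14-day policy are all invented for the example:

    #!/bin/sh
    # expire-scratch: clean old files out of the group scratch area.
    # Policy: scratch files are kept for 14 days (which is what we
    # told people when the area was set up), then removed.
    SCRATCH=/var/scratch
    DAYS=14
    # -xdev keeps find from crossing into other filesystems in case
    # someone has mounted something odd under the scratch area.
    find "$SCRATCH" -xdev -type f -mtime +"$DAYS" -exec rm -f {} \;

Six months from now, this answers 'how long do we keep scratch files, and how do they get cleaned?' without anyone having to remember.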
Thus, when you automate something you're both getting out of a boring task and creating documentation at the same time. Since both are virtuous sysadmin activities, you're getting a nice two for one deal.
Scripts have an additional, highly useful property as documentation: although they may not be clear or easy to follow, they are guaranteed to be accurate documentation. Given the tendency for much documentation to go out of date at the drop of a hat, this can be very handy.
2005-06-24
An unchanging system should be stable
One of my principles of system administration is an unchanging system should be stable. Stable means more than 'doesn't crash'; it means 'doesn't need to be fiddled with all the time': no hand-holding, no cleaning up afterwards. It means a system that can be left to run quietly in the corner.
This does mean that you have to make it so that ordinary things can't cause the system to explode. The two big ones are making sure that system logs don't get too big and that temporary directories like /tmp and /usr/tmp get cleaned out regularly. (Fortunately modern systems mostly do this for you.)
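If your system doesn't do it for you, a cron job along these lines is a start (a sketch only; the 4:30am schedule and seven-day retention are arbitrary, and you'll want to think about what your users actually keep in /tmp):

    # root crontab entry: every night, remove /tmp files that haven't
    # been touched in seven days. -xdev keeps find from crossing
    # into other filesystems.
    30 4 * * * find /tmp -xdev -type f -atime +7 -exec rm -f {} \;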
After I've done this, I've found that things that keep demanding my attention are usually symptoms of some deeper issue that I need to deal with:
- I've failed to automate a necessary procedure
- I need to fix or work around some piece of buggy software
- there's a dying or broken piece of hardware somewhere
- the systems are underconfigured or overloaded
There is a rare and unfortunate final case:
- a vendor's stuck me with braindead software that demands I interact with it by hand.
And really, if the systems aren't overloaded, the users aren't asking for changes, and there's no buggy software or broken hardware, why should a system need attention? Everything left looks a lot like monkey make-work, whether self-inflicted or forced on you by vendors.
Sidebar: What about security?
New security issues are changes, and all changes require you to do things.
I feel that things that look for local security problems should only alert you if there is something you need to deal with, such as a reliable sign of a breakin.
(Keeping logs of other information is okay, but looking at them is usually just boredom-inducing monkey-work; what are you going to do with the information, and is it actually productive?)
2005-06-22
Future Sysadmin Jobs
I am fond of giving everyone who tells me they're thinking of becoming a system administrator a rant disguised as a question:
When things are automated in the future, there are going to be three sysadmin jobs left: the secretary filling in forms, the stock clerk swapping the backup tapes and refilling the printer paper, and the troubleshooter they call in when there are problems. Which one do you want to be?
I firmly believe that all routine system administration will be automated sooner or later (yes, despite vendors failing to really deliver on promises of this for at least ten years). Let's be honest here: a lot of day to day sysadmin work can be done by well-trained monkeys (and some of it is), and when you're in that situation sooner or later the monkeys get replaced by computers.
What's left when that happens is physical labour (changing tapes and printer paper), data entry (the secretary typing information into forms to create accounts or the like), and fixing things when something exceptional happens and the system goes wrong.
So: in that future, which one are you going to be? (And how much of your job could a well-trained monkey do today?)
I know where I'm aiming.
Sidebar: but what about a job creating the automation?
That's (systems) programming, not systems administration, and I think there are not going to be all that many jobs in it. Compare how many Windows admins there are with how many people Microsoft has writing Windows administration tools.
(And if you want a career in it, you're going to want to follow a much different path; for a start, at least as much programming as systems administration.)
2005-06-17
The problem with CPAN (and other similar systems)
At one level, CPAN is a great thing: people really like having a simple way of installing Perl packages (and a big archive of them). It's such a good idea that it's being copied over and over: Python's distutils, Ruby's RubyGems, the R statistics package's CRAN, and no doubt others.
But CPAN and things like it have a problem: they're a package management system. Or, to be more detailed, they're another package management system, on top of the one that our Unix systems already have.
Multiple package systems on a single computer means that no single package system has a full picture of the system. This causes various problems:
- to get a complete picture of what's on the system, I have to remember to use multiple tools (and remember how to use all of them); there's a concrete illustration of this after the list.
- two different tools can both think they own or exclusively manage certain files (for example, index files of all the packages installed). The extreme case is installing the same thing through the OS package manager and a program's own package manager.
- missing cross packaging system relationships; for example, things installed through CPAN likely depend on the version of Perl installed by the OS's package manager. Does the OS package manager know enough to tell me that upgrading Perl because of a security fix is going to orphan all of those CPAN packages I need?
- satisfying dependencies: when I try to install a core OS package BazOrp, which requires the Python package FooBar (version 1.6.1 to 1.7.8), how does the core OS package management system know that I installed FooBar 1.7.7 through Python's distutils and that it's OK to go ahead?
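You can see the blindness directly on an RPM-based system. Both the file path and the FooBar package here are invented for the example:

    # ask the OS package manager who owns a distutils-installed file
    $ rpm -qf /usr/lib/python2.3/site-packages/foobar/__init__.py
    file /usr/lib/python2.3/site-packages/foobar/__init__.py is not owned by any package
    # and the RPM database has no idea that FooBar exists at all,
    # never mind what version it is
    $ rpm -q FooBar
    package FooBar is not installed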
And this understates the problems, because most of these CPAN-like things are not actually package management systems; they are package installation systems. All they do is install things; they don't keep a package inventory (especially with version numbers), they usually don't have much of an idea of package dependencies, and often they can't even remove what they just installed.
The situation is worse when I work in large-scale environments, with tens to hundreds of systems. Environments that large can't be dealt with by hand; the machines have to be managed through automated systems.
In that sort of environment, every program with its own package system means that I would have to obtain or build an automated system to manage that package system. Since the package system itself is unlikely to provide the basic management tools (inventory, dependencies, etc), I would have to build those, too.
You may have guessed the punchline: as a result of all of this, we don't and can't use CPAN, distutils, RubyGems, CRAN, and so on. Of course this is sometimes difficult to explain to users, who are known to approach us to ask 'there is this CPAN module I need, can you please install it on the machines?' and then don't understand why I break down and twitch.
Solution: build real OS packages
I already have to deal with the OS's package management system, so the best way to make my life easier is to make your package installation system build OS packages for me, instead of directly installing files.
This shouldn't be too difficult, as your installation system already has most of the information necessary, such as what files are going to be installed and a package description. Don't worry too much about dependencies, as a decent OS packaging system will be capable of working them out for you.
On Linux systems, supporting building Debian .debs and RPMs will get you most of the way to making people entirely happy. (You don't have to decide which distributions to support; with generic building support, you support everyone using that packaging format.)
Existing support for this
Debian's dh-make-perl builds CPAN packages into Debian .debs.
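For example, something like this should fetch a module from CPAN and build a .deb in one go (a sketch; 'Some::Module' is a stand-in, and older versions of dh-make-perl may want you to unpack the source yourself):

    # fetch Some::Module from CPAN and build a Debian package from it
    dh-make-perl --build --cpan Some::Module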
The CPAN RPM::Specfile package and its cpanflute2 program will build RPMs from CPAN packages. (Getting it in RPM form to bootstrap this properly may be a pleasantly recursive exercise.) There's also the cpan-to-rpm.pl program from here, to do everything in one go. (I believe cpanflute2 has had some problems for us in the past, but I have blotted them out from my mind.)
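If I remember the basic cpanflute2 invocation correctly, it's along these lines (the tarball name is invented, and your version may want extra options):

    # build an RPM from a downloaded CPAN tarball
    cpanflute2 Some-Module-1.00.tar.gz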
Python distutils has a bdist_rpm command for building RPMs, but this doesn't work reliably for somewhat complicated packages in the versions I've tried. (Yes, I should file bug reports and produce patches to fix things. Someday, when I have enough time to fully investigate the situation.)
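For simple packages it really is just one command, run in the unpacked source distribution:

    # build a binary RPM (and the spec file used to make it)
    python setup.py bdist_rpm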