Wandering Thoughts archives

2008-12-21

Part of why managing firewalls is hard

Let me say it up front: managing a firewall of any decent complexity is hard. Sooner or later you start losing track of rules and what's actually going on, writing half-redundant rules, and so on; in short, your firewall ruleset descends into sysadmin superstition. I've recently realized that part of why this happens is that there are three views of your firewall's behavior that you need, and you can't get all of them just from reading your firewall rules; at most you can get one. (Often you don't get any.)

Firewalls are about certain sorts of sources being allowed to do certain sorts of traffic to certain destinations; in the abstract you say 'NFS clients are allowed to do NFS to NFS servers, and no one else is allowed to do NFS to anywhere'. You can want to look at your firewall from the perspective of any of those three things:

  • what traffic is allowed to this machine (or group thereof)?
  • what can this source of traffic do and reach?
  • what are all of the rules for NFS?

(It is tempting to think that you only have one source of traffic, that being the outside world, but I think this is wrong twice over. First, internal machines making outgoing connections are also a source, and you probably have them. Second, sooner or later you are going to be treating some outside machines specially.)

You can write a firewall rule system that makes any one of these three the central focus (and you can of course write rule systems that make none of them the central focus, because the rules are expressed at a lower level). But you cannot write a rule system that makes all of them the focus simultaneously, and so you are always going to have to slice up and analyze your firewall rules to get two out of these three views. Even if you can do this, it is quite difficult for people to keep track of all three views at once and synthesize an overall picture from the combination (and thus you get fragile complexity).
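
(To make the 'slicing up' concrete, here is a minimal sketch in Python. It assumes a toy ruleset made up purely of allow rules; the group and service names are hypothetical, and a real ruleset also has rule ordering, deny rules, and overlapping groups, which this ignores. The rules are written with the source as the focus, and the other two views have to be computed.)

```python
from collections import defaultdict
from typing import NamedTuple


class Rule(NamedTuple):
    source: str       # who originates the traffic
    service: str      # what sort of traffic it is
    destination: str  # where the traffic is going


# Hypothetical allow rules, grouped here by source.
RULES = [
    Rule("nfs-clients", "nfs", "nfs-servers"),
    Rule("nfs-clients", "ssh", "login-servers"),
    Rule("outside", "smtp", "mail-servers"),
    Rule("outside", "ssh", "login-servers"),
    Rule("internal", "smtp", "mail-servers"),
]


def view_by(rules, field):
    """Group rules by one of 'source', 'service' or 'destination'."""
    grouped = defaultdict(list)
    for rule in rules:
        grouped[getattr(rule, field)].append(rule)
    return dict(grouped)


# The three views; only one of them matches how RULES is written down.
by_source = view_by(RULES, "source")
by_destination = view_by(RULES, "destination")
by_service = view_by(RULES, "service")
```

Here `by_destination["login-servers"]` answers 'what traffic is allowed to this machine', and `by_service["nfs"]` answers 'what are all of the rules for NFS'. Even in this toy form you get three separate tables, and you still have to hold them together in your head.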

This insight makes me vaguely depressed because it means that I can't solve my firewall problems by coming up with the right clever way and the right high level language to specify the firewall rules in. No matter how clever I get, no single thing can give me the overall view of what's going on; it's always going to be hard to get that.

(In my opinion, common OS level firewall rule systems are best viewed as a kind of firewall assembly language (Linux more than OpenBSD); by themselves they are too low level to give you any of these views. You can no more easily understand your firewall by reading PF or iptables rules than you can easily understand anything but a tiny and trivial program by reading its assembly.)

FirewallViewComplexity written at 23:15:45

The role of superstition and folklore in system administration

Just like users have folklore, system administration does too. Our sort of superstition is a bit different, though (well, usually): it is the kind of thing where you say 'I don't know why that's there, but let's not remove it just in case'. When our system environments reach a certain level of fragile complexity and we start losing track of the fine details, of course our informed actions start descending into rote procedures.

(This really accelerates when new people come on board; they weren't around to build the systems, so they don't have the picture of how everything fits together in their head, and I think that even good documentation will never really build it.)

Once you lose track of exactly why something is done, the principles of change control come into play. When something works as it is and you're not certain why, you have only two real choices; you can take the time to see if it's still correct and necessary, or you can just keep on doing it until something explodes. It should be no surprise which choice busy sysadmins usually make, and thus you get the superstitions, all of those things that once had a reason (we hope) but we no longer know what it was.

(This growth of superstition shows up in any area of system administration where you can lose track of things, like expensive names.)

Descents into superstition are not fatal, but they are expensive to reverse; you have to actively make the time to reverse engineer how your system really works, and do it thoroughly enough that you're confident that you didn't miss anything. Sometimes you can only successfully get rid of the superstition when you replace the entire system (so you haven't so much fixed it as rendered it irrelevant).

Recognizing when a system is sliding into superstition is important, because it's a serious warning sign both that your system is too complex and that you do not understand it well enough. Continuing with things as they are is likely to result in more and bigger superstitions taking hold, with the attendant loss of real understanding and control of your system.

SysadminsAndSuperstitions written at 01:48:30

2008-12-09

How Amanda decides what restore program to use, a correction

In AmandaRestorePrograms I wrote, about what to do if Amanda didn't properly recognize which sort of dump program it had used to back up a filesystem:

  • put a 'restore' program (either a cover script or just a symlink to ufsrestore) somewhere in our $PATH when we do Amanda restores.

Allow me to correct myself: this doesn't actually work as I wrote it. (When I wrote the original article, we hadn't had to actually test this; we have since then.)

The problem is that Amanda does not actually search $PATH when it is executing the restore program (including when it is plain 'restore'); it simply executes the program directly by path (which is sensible, since it normally knows the exact path). When it tries to execute the default restore program it uses no path, and thus is actually trying to run './restore'.

(In Unix terms, Amanda uses execve() instead of execvp() or the like.)
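
(The difference can be sketched in Python. This is not Amanda's code, just an imitation of the two lookup behaviours under simplifying assumptions; the helper names are hypothetical, and real execvp() has extra corner cases, such as empty $PATH elements, that are left out.)

```python
import os
import tempfile


def lookup_like_execve(name, cwd):
    """execve() does no searching: a bare name is a path relative to cwd."""
    candidate = os.path.join(cwd, name)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None


def lookup_like_execvp(name, search_path, cwd):
    """execvp() searches $PATH, but only for names with no '/' in them."""
    if "/" in name:
        return lookup_like_execve(name, cwd)
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def make_executable(directory, name):
    """Create a trivial executable shell script (a stand-in for 'restore')."""
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\nexit 0\n")
    os.chmod(path, 0o755)
    return path


def demo():
    with tempfile.TemporaryDirectory() as bindir, \
            tempfile.TemporaryDirectory() as workdir:
        # 'restore' exists only in a directory that is on the search path.
        make_executable(bindir, "restore")
        found_via_path = lookup_like_execvp("restore", bindir, workdir) is not None
        found_directly = lookup_like_execve("restore", workdir) is not None
        # Put a copy in the current directory, as the fix in this entry does.
        make_executable(workdir, "restore")
        found_after_copy = lookup_like_execve("restore", workdir) is not None
    return found_via_path, found_directly, found_after_copy
```

With 'restore' only on the search path, the execvp()-style lookup finds it and the execve()-style lookup does not; the latter succeeds only once there is a 'restore' in the current directory.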

So: you have to put your 'restore' program in the current directory (possibly in the directory that Amanda will restore to, if you've changed that inside amrecover). This does work, although it's slightly inconvenient.

AmandaRestoreProgramsII written at 14:06:28

2008-12-06

On line endings and honesty

Dear software, I have a small and simple request, from a working system administrator:

Please stop pretending that \r\n and \n are the same thing.

That is, please stop pretending that MS-DOS line endings and Unix line endings are the same thing, because they are not. Pretending that they are is one of those collective hallucinations that only work if absolutely everyone is playing along. Sooner or later (usually sooner) you will run into a program that does not play along, and things go violently off the rails.

(By 'you' I actually mean 'I'.)

If you are a Unix program and you want to be helpful this way, you should have an explicit switch to turn on ignoring the difference, and this switch should not be on by default. (And if you are a Linux distribution, you should not turn this switch on for me to be helpful. Turning it on is not doing me any favours; rather the contrary.)

In fact, let me be pretty strong: programs that are helpful this way (including vim, less, and, as I found out today, kdiff3) are actually harmful, because their lies make it much harder to diagnose what is going on when some program does notice or object, transforming an ordinary problem into a frustrating, time-consuming mystery. If I have to drag out od to diagnose my problems, you are doing it wrong.

(The most frustrating problems are when programs notice in small ways, such as changing the value of the last field in all of the lines because they now have invisible ^Ms at the end. Programs that object are actually the easy case.)
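
(As an illustration of the 'small ways' case, here is a minimal Python sketch. The function names and the ':' separator are hypothetical choices for the example. Note that the data is handled as bytes or as explicit strings; Python's own default text mode for files translates \r\n to \n on reading, which is exactly the sort of helpfulness being complained about.)

```python
def count_line_endings(data: bytes) -> dict:
    """Count MS-DOS (\\r\\n) and bare Unix (\\n) line endings separately."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # every \r\n also contains one \n
    return {"crlf": crlf, "lf": lf}


def last_field(line: str, sep: str = ":") -> str:
    """Naive field splitting that strips only the Unix newline."""
    return line.rstrip("\n").split(sep)[-1]


unix_line = "host:42\n"
dos_line = "host:42\r\n"

# The last field of the DOS line quietly carries an invisible ^M with it.
unix_value = last_field(unix_line)  # '42'
dos_value = last_field(dos_line)    # '42\r'
```

Anything that then compares `dos_value` against '42' or uses it as a filename fails without an obvious reason, while something like `count_line_endings()` run over the raw bytes shows the mix right away.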

LineEndingHonesty written at 02:44:00


