The problem of testing firewall rule changes

June 16, 2010

In an earlier entry, I mentioned that firewalls are a classic case of difficult testing, where differences between your test and production environments can be vitally important. Let's elaborate on that.

Suppose that you have some firewall rule changes that you want to make. As a good developer-style sysadmin, you are not going to just dump them on your production firewall; instead, you have a test firewall that you push rules to first for testing. But here's the question: how is your test firewall's networking configured? Specifically, do you give it test IPs and networks, or do you configure it exactly like the production firewall, using the production firewall's own IPs and networks?
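
(For concreteness, here is a minimal sketch of what that two-stage push could look like, assuming Linux firewalls managed with iptables-restore over ssh; the host names and the rules file name are made up. The only point is that the exact same rules file goes to the test firewall first and reaches production only once you're satisfied.)

    #!/usr/bin/env python3
    # Minimal sketch of a two-stage rules push: test firewall first,
    # production only after explicit confirmation. The host names and the
    # rules file name are hypothetical.
    import subprocess
    import sys

    RULES_FILE = 'new-rules.v4'
    TEST_FW = 'test-firewall.example.com'
    PROD_FW = 'prod-firewall.example.com'

    def push_rules(host, rules_file):
        # Copy the candidate rules to the firewall and load them there.
        subprocess.check_call(['scp', rules_file, host + ':/tmp/candidate.v4'])
        subprocess.check_call(['ssh', host, 'iptables-restore < /tmp/candidate.v4'])

    def main():
        push_rules(TEST_FW, RULES_FILE)
        answer = input('Rules are loaded on the test firewall; push to production? [y/N] ')
        if answer.strip().lower() != 'y':
            sys.exit('stopping before the production push')
        push_rules(PROD_FW, RULES_FILE)

    if __name__ == '__main__':
        main()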

If you give it production IPs and networks, it obviously has to be completely isolated from your production environment. In turn this means that it needs to have its own supporting (and testing) network infrastructure (with multiple machines, network connections, etc), and you have to somehow push configuration updates into that test network infrastructure.

(I'm going to assume that our only concern is testing firewall rule changes; things like firewall monitoring systems are taken to keep working fine, so we don't have to build something to test them inside this isolated environment.)

If your test firewall uses test IPs and networks, it doesn't have to be completely isolated from your production environment and can reuse a bunch of your existing update and management infrastructure. This sounds good, but there's a problem: getting IP addresses and network blocks wrong is exactly one of the ways firewall changes go bad, yet you can't test for these errors if your test firewall uses test IPs and network blocks. Your test version of the change, using test IPs, can be entirely correct while you've still made a mistake writing out the production IPs; you'll only find out when you push the update to the production firewall and things start breaking.
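
(Transcription errors in the production IPs can't be caught inside the test environment, but you can at least cross-check them independently of it. Here's a minimal sketch of one such check, assuming your rules are plain text that an IPv4 regex can pick addresses out of, and that you keep a separate inventory file listing the addresses and networks you actually use; both file names and the rule format are made up. Anything mentioned in the production rules that isn't inside something in the inventory gets flagged.)

    #!/usr/bin/env python3
    # Minimal sketch: flag any IPv4 address or CIDR block in the production
    # rules that isn't contained in a network listed in a separate inventory
    # file. The file names and rule format are hypothetical.
    import ipaddress
    import re
    import sys

    ADDR_RE = re.compile(r'\b\d{1,3}(?:\.\d{1,3}){3}(?:/\d{1,2})?\b')

    def networks_in(path):
        # Every parseable IPv4 address or block mentioned in the file.
        nets = set()
        with open(path) as f:
            for line in f:
                for token in ADDR_RE.findall(line):
                    try:
                        nets.add(ipaddress.ip_network(token, strict=False))
                    except ValueError:
                        pass    # matched text that isn't a valid address
        return nets

    def main():
        inventory = networks_in('network-inventory.txt')    # hypothetical name
        rules = networks_in('production-rules.txt')         # hypothetical name
        unknown = [n for n in rules
                   if not any(n.subnet_of(inv) for inv in inventory)]
        for n in sorted(unknown):
            print('not in our inventory:', n)
        sys.exit(1 if unknown else 0)

    if __name__ == '__main__':
        main()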

(So what differences between your test and production environments are acceptable to have? My only thought right now is that differences in things that you don't change seem safe, because then you can verify all of those differences once and know that things are good from then onwards.)
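
(One way to put that into practice, sketched below purely as an illustration: keep an explicit test-to-production address mapping, verify that mapping by hand once, and then mechanically check that the test rules and the production rules are identical after the mapping is applied. Any difference the mapping doesn't explain is something to look at. The mapping entries and file names here are made up.)

    #!/usr/bin/env python3
    # Minimal sketch: apply a known test -> production address mapping to the
    # test rules, then insist the result matches the production rules exactly.
    # The mapping and the file names are hypothetical.
    import sys

    TEST_TO_PROD = {
        '192.0.2.0/24': '128.100.3.0/24',    # hypothetical mapping entries
        '192.0.2.10':   '128.100.3.10',
    }

    def translate(line):
        # Naive string replacement; good enough for a sketch.
        for test, prod in TEST_TO_PROD.items():
            line = line.replace(test, prod)
        return line

    def main():
        with open('test-rules.txt') as f:
            test = [translate(line.rstrip('\n')) for line in f]
        with open('production-rules.txt') as f:
            prod = [line.rstrip('\n') for line in f]

        ok = (len(test) == len(prod))
        if not ok:
            print('the rule files are different lengths')
        for num, (t, p) in enumerate(zip(test, prod), start=1):
            if t != p:
                print('line %d differs beyond the known mapping:' % num)
                print('  test (translated): %s' % t)
                print('  production:        %s' % p)
                ok = False
        sys.exit(0 if ok else 1)

    if __name__ == '__main__':
        main()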


Comments on this page:

From 69.113.211.148 at 2010-06-16 09:25:43:

The amount of work that goes into configuring a test environment should be proportional to the danger associated with the changes you're making.

If you're a normal institution, and the biggest danger associated with a rule change on a firewall is that you receive a couple of calls or Nagios alerts and have to take 30 seconds out of your day to roll back the config before you try again, then it's not a big deal. This is specifically the overbearing kind of change management I argue against when I discuss the failings of process frameworks like ITIL and COBIT with people.

My stance on testing with the team I work with has always been this: if you break something, and can fix it before you cost anyone money, and nobody loses any data, then don't even bother testing it somewhere else first because it's a waste of time.

On the other hand, if you're upgrading a large mail system, ERP application or other critical piece of software with no real downgrade path, you'd better believe your dev network is going to replicate your production network as closely as you can afford.

--Jeff
