2014-10-29
Quick notes on the Linux iptables 'ipset' extension
For a long time Linux's iptables firewall had an annoying lack in that it had no way to do efficient matching against a set of IP addresses. If you had a lot of IP addresses to match things against (for example if you were firewalling hundreds or thousands of IP addresses and IP address ranges off from your SMTP port), you needed one iptables rule for each entry and then they were all checked sequentially. This didn't make your life happy, to put it one way. In modern Linuxes, ipsets are finally the answer to this; they give you support for efficient sets of various things, including random CIDR netblocks.
(This entry suggests that ipsets only appeared in mainline Linux kernels as of 2.6.39. Ubuntu 12.04, 14.04, Fedora 20, and RHEL/CentOS 7 all have them while RHEL 5 appears to be too old.)
To work with ipsets, the first thing you need is the user level tool for
creating and manipulating them. For no particularly sensible reason your
Linux distribution probably doesn't install this when you install the
standard iptables stuff; instead you'll need to install an additional
package, usually called 'ipset'. Iptables itself contains the code to
use ipsets, but without 'ipset' to create the sets you can't actually
install any rules that use them.
(I wish I was kidding about this but I'm not.)
The basic use of ipsets is to make a set, populate it, and match against it. Let's take an example:
    ipset create smtpblocks hash:net counters
    ipset add smtpblocks 27.112.32.0/19
    ipset add smtpblocks 204.8.87.0/24
    iptables -A INPUT -p tcp --dport 25 -m set --match-set smtpblocks src -j DROP
(Both entries are currently on the Spamhaus EDROP list.)
Note that the set must exist before you can add iptables rules that
refer to it. The ipset manpage has a long discussion of the various
types of sets that you can use and the iptables-extensions manpage has
a discussion of '--match-set' and the 'SET' target for adding entries
to sets from iptables rules. The 'hash:net' I'm using here holds random
CIDR netblocks (including /32s, ie single hosts) and is set to have
counters.
It would be nice if there was a simple command to get just a listing of
the members of an ipset. Unfortunately there isn't, as plain 'ipset
list' insists on outputting a few lines of summary information before
it lists the members. Since I don't know if these are constant I'm using
'ipset list -o save | grep "^add "', which seems ugly but seems likely
to keep working forever.
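As a rough sketch of the same idea, here is a small filter that reduces save-format output to just the members of one set. The function name 'ipset_members' is made up for illustration, and it assumes that member lines look like 'add SETNAME ENTRY ...', which is what 'ipset save' appears to print.

```shell
#!/bin/sh
# Hypothetical helper: read "ipset save"-style output on stdin and print
# only the entries of the named set, one per line.
# Assumption: member lines have the form "add SETNAME ENTRY [options...]".
ipset_members() {
    # $1: name of the set whose members we want
    awk -v set="$1" '$1 == "add" && $2 == set { print $3 }'
}

# Intended use (needs root and an existing set, so it is left as a comment):
#   ipset save smtpblocks | ipset_members smtpblocks
```

Matching on the set name as well as the leading 'add' means the filter still works if you feed it a save of all sets at once.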
Unfortunately I don't think there's an officially supported and
documented ipset command for adding multiple entries into a set at
once in a single command invocation; instead you're apparently expected
to run 'ipset add ...' repeatedly. You can abuse the 'ipset restore'
command for this if you want to by creating appropriately formatted
input; check the output of 'ipset save' to see what it needs to look
like. This may even be considered a stable interface by the ipset
authors.
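A minimal sketch of generating that input, assuming you have a plain file of netblocks, one per line, with '#' comments (the function name 'gen_ipset_adds' and the file name 'blocks.txt' are both invented here). It only emits 'add' lines, so the set itself is assumed to already exist.

```shell
#!/bin/sh
# Hypothetical helper: turn a list of netblocks on stdin into "add" lines
# in the style that "ipset save" prints.
# Assumption: one netblock per line; '#' starts a comment; blank lines
# are ignored.
gen_ipset_adds() {
    # $1: name of an already-created set
    sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d' |
    while read -r net; do
        printf 'add %s %s\n' "$1" "$net"
    done
}

# Intended use (needs root, so it is left as a comment):
#   gen_ipset_adds smtpblocks < blocks.txt | ipset restore
```

It is worth comparing the generated lines against real 'ipset save' output on your system before trusting this, since the format is what 'ipset restore' actually cares about.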
Ipset syntax and usage appears to have changed over time, so old discussions of it that you find online may not work quite as written (and someday these notes may be out of date that way as well).
PS: I can sort of see a lot of clever uses for ipsets, but I've only
started exploring them right now and my iptables usage is fairly basic
in general. I encourage you to read the ipset manpage and go wild.
Sidebar: how I think you're supposed to use list sets
As an illustrated example:
    ipset create spamhaus-drop hash:net counters
    ipset create spamhaus-edrop hash:net counters
    [... populate both from spamhaus ...]
    ipset create spamhaus list:set
    ipset add spamhaus spamhaus-drop
    ipset add spamhaus spamhaus-edrop
    iptables -A INPUT -p tcp --dport 25 -m set --match-set spamhaus src -j DROP
This way your iptables rules can be indifferent about exactly what goes into the 'spamhaus' ipset, although of course this will be slightly less efficient than checking a single merged set.
Unnoticed nonportability in Bourne shell code (and elsewhere)
In response to my entry on how Bashisms in '#!/bin/sh' scripts aren't
necessarily bugs, FiL wrote:
If you gonna use bashism in your script why don't you make it clear in the header specifying #!/bin/bash instead [of] #!/bin/sh? [...]
One of the historical hard problems for Unix portability is people writing non-portable code without realizing it, and Bourne shell code is no exception. This is true for even well intentioned people writing code that they want to be portable.
One problem, perhaps the root problem, is that very little you do on Unix will come with explicit (non-)portability warnings and you almost never have to go out of your way to use non-portable features. This makes it very hard to know whether or not you're actually writing portable code without trying to run it on multiple environments. The other problem is that it's often both hard to remember and hard to discover what is non-portable versus what is portable. Bourne shell programming is an especially good example of both issues (partly because Bourne shell scripts often use a lot of external commands), but there have been plenty of others in Unix's past (including 'all the world's a VAX' and all sorts of 64-bit portability issues in C code).
So one answer to FiL's question is that a lot of people are using
bashisms in their scripts without realizing it, just as a lot of
people have historically written non-portable Unix C code without
intending to. They think they're writing portable Bourne shell scripts,
but because their '/bin/sh' is Bash and nothing in Bash warns about
things, the issues sail right by. Then one day you wind up changing
'/bin/sh' to be Dash and all sorts of bits of the world explode,
sometimes in really obscure ways.
All of this sounds abstract, so let me give you two examples of
accidental Bashisms I've committed. The first and probably quite
common one is using '==' instead of '=' in '[ ... ]' conditions.
Many other languages use '==' as their string equality check, so at some
point I slipped and started using it in 'Bourne' shell scripts. Nothing
complained, everything worked, and I thought my shell scripts were fine.
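For concreteness, a minimal sketch of the portable form. The helper name 'is_yes' is invented for illustration; the point is only the single '=' inside the brackets.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not from any real script.
# Succeeds if the first argument is exactly the string "yes".
is_yes() {
    # POSIX test(1) only specifies '=' for string equality.
    # '==' is accepted by Bash's '[' builtin but not by every /bin/sh.
    [ "$1" = "yes" ]
}
```

Writing '[ "$1" == "yes" ]' here would work exactly the same under Bash, which is the whole problem.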
The second I just discovered today. Bourne shell pattern matching allows
character classes, using the usual '[...]' notation, and it even has
negated character classes. This means that you can write something like
the following to see if an argument has any non-number characters in it:
    case "$arg" in
    *[^0-9]*) echo contains non-number; exit 1;;
    esac
Actually I lied in that code. Official POSIX Bourne shell doesn't
negate character classes with the usual '^' character that Unix
regular expressions use; instead it uses '!'. But Bash accepts
'^' as well. So I wrote code that used '^', tested it, had it
work, and again didn't realize that I was non-portable.
(Since having a '^' in your character class is not an error in
a POSIX Bourne shell, the failure mode for this one is not a
straightforward error.)
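The portable version of the check uses '!' for the negation. Here it is as a sketch, wrapped in a function whose name ('has_non_digit') is made up for this example:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the argument contains any character
# that is not a decimal digit.
has_non_digit() {
    case "$1" in
        # '!' is the negation character POSIX specifies for shell
        # patterns; '^' is the regular expression habit.
        *[!0-9]*) return 0;;
        *) return 1;;
    esac
}
```

Note that an empty argument has no non-digit characters, so this reports it as clean; a real script probably wants to check for that separately.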
This is also a good example of how hard it is to test for
non-portability, because even when you use 'set -o posix' Bash
still accepts and matches this character class in its way (with
'^' interpreted as class negation). The only way to test or find
this non-portability is to run the script under a different shell
entirely. In fact, the more theoretically POSIX compatible shells
you test on the better.
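One crude way to do this is a loop that runs the same script under every shell you happen to have installed. This is only a sketch: the function name and the list of shells are my assumptions, and it only catches problems that make the script exit with a non-zero status, so the script needs to check its own results.

```shell
#!/bin/sh
# Hypothetical helper: run a script under each installed candidate shell
# and report which ones exit non-zero. Returns 1 if any of them failed.
# The shell list is an assumption; adjust it for your systems.
# Note: POSIX sh has no local variables, so these names are global.
check_under_shells() {
    script=$1
    failed=0
    for shell in dash bash ksh mksh yash; do
        # Skip shells that are not installed here.
        command -v "$shell" >/dev/null 2>&1 || continue
        if "$shell" "$script" >/dev/null 2>&1; then
            echo "ok: $shell"
        else
            echo "FAIL: $shell"
            failed=1
        fi
    done
    return "$failed"
}
```

A silent behaviour difference like the '^' one will sail right through this unless the script being run contains a test case that notices the wrong answer and exits with an error.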
(In theory you could try to have a perfect memory for what is POSIX compliant and not need any testing at all, or cross-check absolutely everything against POSIX and never make a mistake. In practice humans can't do that any more than they can write or check perfect code all the time.)