Wandering Thoughts archives

2007-08-26

A limitation in Linux's policy based routing

One of the more advanced things you'd like to do with Linux's policy based routing and a dual identity scenario is to make more flexible decisions about what goes out which interface. Consider the case where you have two Internet connections, one slow but reliable and the other fast but currently flaky, with a different IP address on each. You would like to send less important traffic (such as web browsing) over the fast but flaky connection while still having important traffic like your ssh sessions go over the slow but reliable one.

(Why yes, my DSL is being flaky at the moment.)

In theory the way to do this is simple: you use iptables to put a mark on whatever packets you want routed explicitly, and then you use ip rule to set up rules that route explicitly marked packets out the corresponding interface.

(Alternately you use marks to classify packets, so port 80 traffic would get the 'http' mark, and then set up routing rules to declare which way any particular class of packets was supposed to go.)
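For concreteness, the theoretical setup might be sketched like this. The mark value, routing table number, interface name and gateway address are all invented for illustration, and the function only prints the commands unless RUN is set to the empty string.

```shell
# Sketch only: every value below is an illustrative assumption.
# Commands are printed by default; run with RUN= in the environment
# (as root) to actually execute them.
RUN=${RUN-echo}

FAST_DEV=eth1        # hypothetical interface for the fast but flaky link
FAST_GW=192.0.2.1    # hypothetical gateway on that link
MARK=1               # arbitrary mark meaning "send via the fast link"
TABLE=100            # arbitrary routing table number

mark_routing() {
    # Mark locally generated web traffic in the mangle OUTPUT chain.
    $RUN iptables -t mangle -A OUTPUT -p tcp --dport 80 -j MARK --set-mark "$MARK"
    # Give the fast link its own routing table with a default route.
    $RUN ip route add default via "$FAST_GW" dev "$FAST_DEV" table "$TABLE"
    # Route marked packets using that table.
    $RUN ip rule add fwmark "$MARK" table "$TABLE"
}

mark_routing
```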

However, this doesn't work, or at least doesn't work the way you want. The problem is that by the time the packet passes through iptables to get marked, the kernel has already decided what source address it will have. If you put your mark-checking ip rules after your explicit source address based ones, they won't do anything; if you put them before, they will cause packets to go out the wrong interface for their source address.

To fix this, you need to change the source IP address of the rerouted packets. Unfortunately the only way I know of doing this is to use source-NAT on appropriate outgoing packets, which strikes me as inefficient and ugly for various reasons, and possibly sometimes dangerous.
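A minimal sketch of such a source-NAT rule follows, assuming marked packets have already been steered out a hypothetical eth1 and that 192.0.2.10 is the local address on that link (both values are made up). As written it only prints the command; set RUN to the empty string to run it for real.

```shell
# Sketch only: interface, address and mark are illustrative assumptions.
RUN=${RUN-echo}      # prints by default; RUN= (as root) executes

FAST_DEV=eth1            # hypothetical interface for the fast link
FAST_ADDR=192.0.2.10     # hypothetical local address on that link
MARK=1                   # same arbitrary mark used to classify the traffic

fix_source() {
    # Rewrite the source address of marked packets leaving via the fast
    # link, so that it matches the interface they actually go out on.
    $RUN iptables -t nat -A POSTROUTING -o "$FAST_DEV" -m mark --mark "$MARK" \
        -j SNAT --to-source "$FAST_ADDR"
}

fix_source
```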

(I can see why iptables behaves this way, since rules in the mangle OUTPUT chain may want to inspect the packet's source address. But it's still contrary to the documentation, at least in theory. It also implies that packets are probably going through the IP rules twice: once before the mangle OUTPUT chain, to assign the source address and so on, and once afterwards.)

PolicyBasedRoutingLimitation written at 21:56:09

2007-08-24

Linux and accidentally multipathed disks

We have one Dell 2950 that has a PERC 5/? RAID controller, bought as somewhat of an experiment, and Ubuntu 6.06 has an interesting problem with it: the kernel sees both the real disks and the PERC RAID devices (in that order), with the RAID devices being slightly smaller. (The PERC controller seems to use a bit of space at the end of the disk, presumably to store its setup information.)

As you might expect, having the same disk visible through two paths gives you some interesting problems. Ironically, old-style Linux setups with hard-coded device names in places like /etc/fstab have the fewest issues, while programs that try to auto-recognize things generally go rather off the rails. There are a fair number of these places in a modern Linux system (not all of which Ubuntu uses):

  • auto-starting software RAID devices
  • auto-starting LVM volumes
  • mounting filesystems by label and enabling swap devices by label

(All of this assumes that you are exporting the disks from the PERC RAID controller as raw disks or at most mirrored disks; I suspect you would get even more fun if you had things in a RAID 5 disk.)
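One crude way to notice this kind of duplication is to look for filesystem labels that show up on more than one device. The helper below is only a sketch: dup_labels is a made-up name, it expects "DEVICE LABEL" pairs on standard input, and it will get confused by labels that contain spaces.

```shell
# Sketch only: dup_labels is a hypothetical helper, not an existing tool.
# Reads lines of "DEVICE LABEL" on stdin and prints every label that
# appears on more than one device, followed by those devices.
dup_labels() {
    awk '{ count[$2]++; devs[$2] = devs[$2] " " $1 }
         END { for (l in count) if (count[l] > 1) print l ":" devs[l] }'
}

# One possible way to feed it, assuming blkid is available (run as root):
#   for d in /dev/sd*; do
#       l=$(blkid -s LABEL -o value "$d") && [ -n "$l" ] && echo "$d $l"
#   done | dup_labels
```

Any label it reports is one that a mount-by-label setup could resolve to either path.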

There are similar issues in too-clever installers; if you weren't careful, it was possible to accidentally cause the installer to write partition tables to both a raw disk and the RAID version of the disk. (And if you installed on a hardware RAID 1 disk, you had to adjust GRUB after the install to tell it what disk was BIOS disk 0.)

It has since struck me that a driver that could do this deliberately would make an interesting test case for actual multipath support in programs and systems. Of course, these days you can just construct that with iSCSI, since it is easy enough to export the same disks through multiple network interfaces (or IP aliases).

PercUbuntuProblem written at 23:23:14

2007-08-22

Redirecting traffic to another machine with Linux's iptables

Let us suppose, as a not entirely hypothetical example, that you have added an A record for your subdomain name that points to one of your login servers, so that people can do 'ssh subdomain' and have it work. Let us further suppose that this login server is not your web server and you now want to make http://subdomain/ also do something useful, instead of giving connection refused errors.

One way to do this is to use iptables on the login server to redirect any connections to its port 80 off to the actual web server machine. Assuming that W is the IP address of your web server, what you need on your login server is:

  1. echo 1 >/proc/sys/net/ipv4/ip_forward

    Without IP forwarding enabled, the kernel will just drop our redirected packets instead of (re)routing them as we want.

  2. iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination W

    This sends the traffic off by rewriting the destination of attempts to connect to the login server's port 80 to be attempts to connect to the web server. Because it is in the PREROUTING chain it does not affect connections made on the login server itself.

  3. iptables -t nat -A POSTROUTING -p tcp -d W --dport 80 -j MASQUERADE

    This rewrites the source address of connections to the web server's port 80 so that they appear to come from the login server. (Using -j SNAT --to-source L where L is the IP address of the login server is equivalent but longer, and the difference is unimportant for machines with permanently up interfaces.)

The last step is necessary because we need to make reply packets from the web server go through the login server so that it can reverse the transformation. Otherwise when a client connects to port 80 on the login server, the web server will see a connection from the client to it and send reply packets directly back to the client, where the client will ignore them because as far as it is concerned it connected to the login server, not the web server.

(Well, technically the client sends RSTs to the web server instead of completely ignoring the packets, assuming that no firewalls intervene.)

The drawback of dealing with the situation this way is that the web server will see (and log) all of this traffic as coming from the login server instead of from its real origin. This may or may not matter to you.

If the packets were already going through the login server on the way back (perhaps it is also your PPP server), you wouldn't need the third step but you'd want to be more specific in the second step, so that only packets to port 80 on the login server itself are affected. (Otherwise you would be creating a not so transparent proxy, where all websites are your web server.)
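A sketch of that more specific variant might look like the following. The two addresses stand in for L and W and are invented for illustration, and the sysctl call is an assumed equivalent of writing 1 to /proc/sys/net/ipv4/ip_forward. The function only prints the commands unless RUN is set to the empty string.

```shell
# Sketch only: both addresses are illustrative assumptions.
RUN=${RUN-echo}      # prints by default; RUN= (as root) executes

LOGIN_IP=192.0.2.20  # hypothetical L, the login server's own address
WEB_IP=192.0.2.80    # hypothetical W, the web server's address

redirect_own_port80() {
    # Forwarding must still be on so the rewritten packets get routed.
    $RUN sysctl -w net.ipv4.ip_forward=1
    # Only rewrite connections addressed to the login server itself, so
    # that other port 80 traffic passing through is left alone.
    $RUN iptables -t nat -A PREROUTING -p tcp -d "$LOGIN_IP" --dport 80 \
        -j DNAT --to-destination "$WEB_IP"
}

redirect_own_port80
```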

(To answer an obvious question: one reason to not just do this on your firewall is if you want even internal attempts to use the URL http://subdomain/ to do something useful.)

IptablesRedirection written at 23:22:37

2007-08-07

A surprise with the Provides header in RPM

Normally, one of the things RPM bases dependencies on is package names. The problem with relying only on names is that some packages want to depend on a capability instead of a specific package; for example, on some mailer being installed, not specifically Sendmail. To deal with this, RPM introduced the Provides: directive, which lets an RPM package tell the overall RPM system that it is providing something that is not obvious from its name, its files, and so on.

In implementing this, RPM has chosen to not represent Provides separately in its internal databases. When a package Provides something, to RPM that package is that something, as much as it is its regular name; effectively the same RPM package is two or more RPMs. This turns out to have an interesting and surprising consequence, best illustrated with an anecdote.

A while ago when Fedora switched firmly to CUPS, we decided that while we would use CUPS on the clients we would keep using LPRng for our print server. Since Fedora no longer packaged LPRng, I built the RPM myself and tried to install it; as is my usual habit, I used rpm's -U option. Things promptly blew up screaming that I was trying to remove the CUPS libraries that half the known Gnomeiverse depended on.

The problem turned out to be that CUPS had (at that time) a Provides that said it provided 'LPRng = 3.8.15-3'. Thus as far as RPM was concerned, CUPS was LPRng version 3.8.15-3, and since I was doing an upgrade install of a more recent version, it needed to remove the old version, removing all of those shared libraries that programs needed.

(Using 'rpm -i' instead would not have really helped, because we would have run into trouble the first time we needed to upgrade our own LPRng package. My solution was to rebuild the CUPS RPM without that Provides: line in the specfile.)
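One way to spot this sort of thing ahead of time is to list what a package claims to provide and look for names other than its own. The filter below is a rough sketch: other_provides is a made-up helper name, and it simply drops lines whose first word is the package name from the output of rpm's --provides query.

```shell
# Sketch only: other_provides is a hypothetical helper, not part of rpm.
# Reads "rpm -q --provides PKG" style output on stdin and prints every
# provided capability whose name is not PKG itself.
other_provides() {
    awk -v pkg="$1" '$1 != pkg { print }'
}

# Intended use, on a machine with the package installed:
#   rpm -q --provides cups | other_provides cups
```

Going the other way, 'rpm -q --whatprovides LPRng' reports which installed packages RPM considers to be LPRng.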

ProvidesSurprise written at 23:34:58

