Wandering Thoughts archives

2008-02-22

Why I am not fond of Ubuntu's management of kernel updates

It's really simple: we installed the latest Ubuntu 6.06 kernel update last night, since it's a security fix. Today, our machines started panicking with 'kernel BUG at fs/nfs/inode.c:174' messages (three machines so far, one of them three times), so we wanted to revert back to the old kernel.

Guess what: it wasn't there any more. Apparently Ubuntu feels free to have (some) kernel updates overwrite your currently installed kernel, instead of supplementing it with a new version.

(For extra bonus points, this update carried with it a strong warning that the kernel ABI had changed and you would need to recompile any third-party modules. Gosh, I hope you already had any modules you'd need loaded before you overwrote all your old kernel's modules as part of this update, since the new kernel's modules are really unlikely to load in your running kernel.)

I don't really have words to explain how stupid this is. It is trivial to completely version kernels so that you can have even multiple package builds installed next to each other, so trivial that everyone does it (even Ubuntu). And when kernel updates can introduce explosive bugs, it is vital to do this so that people can revert to the previous, working version. Ubuntu does this with sufficiently major updates within a single kernel version; they just don't do it all the time.

Wrong. Broken. Worse, it shows that Ubuntu fundamentally does not get it.

(For a very special bonus, there is no simple way to find out what kernel package point release version you're currently running; the point release number is not part of uname -r or present in any kernel boot messages. The best you can do is to use the kernel's compilation date and cross-check it against the release date of packages and the 'Debian' changelog that Ubuntu supplies.)
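The cross-check described above can be sketched as a couple of commands; this is a hedged sketch, and the changelog path assumes Ubuntu's usual linux-image packaging layout:

```shell
# The build timestamp of the running kernel is embedded in 'uname -v':
uname -v

# Compare it against the Debian-style changelog Ubuntu ships with the
# kernel package (path is an assumption based on standard packaging):
zcat /usr/share/doc/linux-image-$(uname -r)/changelog.Debian.gz | head -20
```

Matching the build date against the changelog's release dates is the closest thing to identifying the point release.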

Sidebar: our kernel panics

Our panics are with the Ubuntu kernel version '2.6.15-51.66'; we've seen them on both x86 and x86_64 machines. The reported panic is kernel BUG at fs/nfs/inode.c:174, in nfs_clear_inode, with traces that run back to sys_umount and sys_close; the problem may be related to forced unmounts, especially forced unmounts that fail. We are doing NFS v3 mounts from Solaris 8 (SPARC) NFS servers.

UbuntuKernelManagement written at 23:57:21

2008-02-10

A basic introduction to prelinking on Linux

At least on the x86 architecture, shared libraries are not entirely made up of position independent code. This means that there is a certain amount of relocation that you have to do when you load a shared library into memory at run time. The basic idea behind prelinking is to try to do this relocation ahead of time; for each shared library, you pick a default location in memory and 'prelink' it so that if it is loaded at that location it doesn't need any run-time relocation. Then the dynamic loader tries to load prelinked libraries at their prelinked locations if at all possible.

(The exact details are explained in the prelink manpage.)

Prelinking has two advantages: because they need to do less relocation at runtime, programs both start faster and use less memory (they dirty fewer pages of shared libraries with per-process relocations). It has the downside that it changes shared libraries and binaries on disk for each system (and changes them again any time you upgrade a shared library), which makes various sorts of security verification harder.

Red Hat enables prelinking by default (in both Fedora and Red Hat Enterprise). Ubuntu and Debian do not seem to do so, although you can turn it on by installing the prelink package and configuring it appropriately.
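On Debian and Ubuntu, turning it on looks roughly like the following; this is a sketch assuming the Debian prelink package's /etc/default/prelink configuration file, and it needs root:

```shell
# Install the prelink package and enable it (Debian/Ubuntu convention):
apt-get install prelink
sed -i 's/^PRELINKING=.*/PRELINKING=yes/' /etc/default/prelink

# Run the first prelink pass by hand instead of waiting for the cron job:
prelink -a

# A prelinked binary carries a GNU_PRELINKED entry in its dynamic section:
readelf -d /bin/ls | grep PRELINK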

Prelinking is not a new idea. The first implementation I remember seeing was in SGI's Irix, but in a sense its ancestry goes back to some of the first shared library implementations, which had no dynamic relocation and just statically assigned addresses to shared libraries.

Sidebar: prelink and DT_GNU_HASH

The first time dynamically linked code wants to do something like call an external function, it has to look through all of the symbol tables in all of the various bits of code to find the function. DT_GNU_HASH is the name of a GNU extension to use efficient, fast to search hash tables for these symbol tables; it and related optimizations can significantly speed up practical program startup time.

Unlike prelinking, DT_GNU_HASH is done once when a shared library is built. Because these lookups have to be done whether or not the shared libraries involved have been prelinked, prelinking and DT_GNU_HASH are complementary and systems can do both.

Modern versions of Red Hat (both Fedora and Enterprise) use DT_GNU_HASH; Debian stable does not. Ubuntu 6.06 (their long term support release) does not, but I believe that current versions do.
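You can check whether a particular library or binary was built with DT_GNU_HASH by inspecting its dynamic section; a small sketch (the libc path varies by distribution and architecture):

```shell
# A library built with DT_GNU_HASH has a GNU_HASH entry in its dynamic
# section; one built only with the classic scheme has just HASH:
readelf -d /bin/ls | grep -i hash
```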

PrelinkingIntro written at 23:21:42

2008-02-01

Isolating network interfaces on Linux

Consider a not entirely hypothetical situation: you have an office machine that serves as one end of a GRE tunnel, and, in addition to its official network interface, has a fluctuating number of secondary interfaces on various internal VLANs for testing, debugging, and so on. The simple approach for such a machine is to just turn on global IP forwarding and cross your fingers that no one will decide to make the machine their gateway (apart from the GRE link). But this is not ideal; if nothing else, it may alarm coworkers that you have an unofficial router on the network.

What we really want to do is to isolate the secondary interfaces, making it so that we won't forward their packets and we won't forward packets to them for other people. The first part is selective IP forwarding; just turn forwarding on only for eth0 and the GRE tunnel. The easiest way to do the second part is to use some policy based routing.
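The selective-forwarding half can be done with per-interface sysctls; a sketch, where 'gre1' is a placeholder for your actual GRE tunnel device (note that writing net.ipv4.ip_forward resets all of the per-interface flags, so the order matters):

```shell
# Turn global forwarding off first, then enable it only on the
# interfaces we actually want to forward from:
sysctl -w net.ipv4.ip_forward=0
sysctl -w net.ipv4.conf.eth0.forwarding=1
sysctl -w net.ipv4.conf.gre1.forwarding=1
```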

For my office machine, I decided to simplify things by declaring that the GRE tunnel was allowed to reach everything and thus only traffic from eth0 needed to be restricted. First we need to add a routing table for the non-local routes that eth0 is allowed to use, ie the target of the GRE link:

ip route add R dev GRE table 10

(Here R is the remote IP and GRE is the GRE tunnel device. You may want to add a 'src LOCAL-IP' as well.)

Next we need some rules to restrict eth0 traffic:

ip rule add iif eth0 priority 5000 table 10
ip rule add type blackhole iif eth0 priority 5001

Translated, this drops any traffic from eth0 that isn't going to the remote end of the GRE tunnel, exactly as if that interface didn't do IP forwarding. (Packets to the machine itself are dealt with by an earlier, default ip rule.)
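Put together, the whole setup is only a few lines; this is a sketch with placeholder values (192.0.2.1 for the remote IP, gre1 for the tunnel device), and it needs root:

```shell
#!/bin/sh
R=192.0.2.1    # placeholder: remote end of the GRE tunnel

# Routing table 10 holds the only non-local destination eth0 may reach:
ip route add $R dev gre1 table 10

# eth0 traffic consults table 10; anything that misses is blackholed:
ip rule add iif eth0 priority 5000 table 10
ip rule add type blackhole iif eth0 priority 5001

# Verify what we ended up with:
ip rule show
ip route show table 10
```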

This is not complete isolation, because we have not given the machine a dual identity for its own traffic. In my situation this is basically harmless, so I haven't gone to the extra effort.

IsolatingInterfaces written at 23:54:30

