2008-08-08
How to exploit unsigned repository metadata
Courtesy of Dan Kaminsky's slides for his Black Hat DNS talk, I wound up reading Attacks on Package Managers, which identifies a number of attacks that can be mounted if you can set yourself up as a package repository mirror for your favorite Linux distribution. A number of the attacks are straightforward, but judging from the Slashdot discussion the dangers of unsigned repository metadata are not as clear. So here is an attempt to explain them simply.
Suppose that you are an attacker with a repository mirror (the paper shows that this is not a challenge), and further suppose that there is a package, call it 'frobnitz', that has a security hole that you can exploit (this too is rarely a challenge). Unfortunately (for you) it is an uncommon package that's only rarely installed (perhaps it is an obscure browser plugin for some weird format such as VRML). Your goal is to force frobnitz to be installed on as many machines as possible so that you can exploit it.
With unsigned repository metadata and an incautious package manager, you have two approaches:
- for every newly updated package that gets added to the repository,
edit its dependency information in the repository metadata to
claim that the new version now depends on frobnitz. Many package
managers will then download and install frobnitz as part of
updating the other packages without noticing that the repository
dependency metadata disagrees with the actual (signed) package
metadata.
- generate a version of the repository metadata that claims that frobnitz provides a huge number of things (especially virtual packages and dependencies), and then take steps to ensure that frobnitz will be the preferred thing to satisfy any of these dependencies (possible for many package managers). This will cause many people who install a new package to pick up frobnitz in the process, although the overall install may be broken (since the actual thing they need isn't there, as you fooled the package manager into installing frobnitz instead).
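To make the first approach concrete, here is a toy model of an incautious resolver that believes whatever dependency list the repository metadata gives it. Everything in it is an assumption made up for illustration (the `resolve` function, the dictionary layout, and the package names other than frobnitz); it is not how any real package manager is written.

```python
def resolve(wanted, repo_metadata):
    """Return the set of packages a naive resolver would install for `wanted`.

    The resolver follows the "depends" lists in the repository metadata
    and never compares them against the packages themselves.
    """
    to_install = set()
    pending = [wanted]
    while pending:
        name = pending.pop()
        if name in to_install:
            continue
        to_install.add(name)
        pending.extend(repo_metadata.get(name, {}).get("depends", []))
    return to_install


# Hypothetical honest metadata: 'editor' only needs 'libc'.
honest = {
    "editor": {"depends": ["libc"]},
    "libc": {"depends": []},
    "frobnitz": {"depends": []},
}

# The same metadata after a hostile mirror has edited one dependency list.
tampered = {
    "editor": {"depends": ["libc", "frobnitz"]},
    "libc": {"depends": []},
    "frobnitz": {"depends": []},
}

print(sorted(resolve("editor", honest)))
print(sorted(resolve("editor", tampered)))
```

With the honest metadata the toy resolver picks only editor and libc; with the tampered metadata it also pulls in frobnitz, even though nothing about the editor package itself has changed.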
Both of these attacks work fine with signed packages (as do all of the attacks in the paper), and they will also work if only some version of frobnitz is vulnerable, since you can always make that the current version in your repository metadata.
(If you have to use an old version of frobnitz you have only a limited time window to exploit the vulnerability; sooner or later the user's system will contact an honest mirror and fetch a fixed version of frobnitz. Of course, this is where Kaminsky's DNS exploit can come into the picture.)
The good news is that even without signed repository metadata, both attacks can be prevented by a cautious package manager that re-checks dependency and provides information against the actual downloaded packages. (I suspect that a number of package managers will be getting updates soon, since making package managers cautious is a lot easier than changing your repository metadata format.)
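As a rough sketch of what that re-checking could look like, assume the package manager can get the dependency and provides lists both from the repository metadata and from the downloaded (signed) package. The function name and the dictionary fields below are hypothetical, chosen only for this example.

```python
def metadata_mismatches(repo_entry, package_entry):
    """Return the fields where repository metadata disagrees with the package.

    Both arguments are plain dicts with optional "depends" and "provides"
    lists; an empty result means the two sources agree.
    """
    mismatches = {}
    for field in ("depends", "provides"):
        claimed = set(repo_entry.get(field, ()))
        actual = set(package_entry.get(field, ()))
        if claimed != actual:
            mismatches[field] = {
                "only_in_repo": sorted(claimed - actual),
                "only_in_package": sorted(actual - claimed),
            }
    return mismatches


# Hypothetical data: the mirror claims an extra dependency on frobnitz.
repo_entry = {"depends": ["libc", "frobnitz"], "provides": ["editor"]}
package_entry = {"depends": ["libc"], "provides": ["editor"]}

problems = metadata_mismatches(repo_entry, package_entry)
if problems:
    print("refusing to install, metadata mismatch:", problems)
```

A cautious package manager would run a check like this after downloading and verifying each package but before installing anything, and would abort the transaction on any mismatch.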
A workaround for the Python module search path issue on Unix
One of the little challenges with writing Unix programs in Python is
the search path problem. The natural structure of Python programs is to
split functionality up into a bunch of modules and then import them
all in the main program, but the natural structure of a Unix program is
to put its binary into one directory (such as /usr/bin) but all of
its helper bits into a second, completely different directory. So the
problem is: how is a Unix Python program supposed to find its modules?
(The normal Python search path for import is the directory that the
program is being run from, such as /usr/bin or $HOME/bin, and the
system Python package areas.)
The obvious solution is to have your program start off by adding its
module directory to sys.path. The problem here is that this is an
installation dependent location, which means that you're customizing
your main program each time it gets installed. I dislike this and find
it ugly, plus I maintain that overriding sys.path gets in the way of a
number of things and can cause subtle problems.
It turns out that there is a simple workaround, hinted at by the parenthetical aside above: put the Python main program into its library directory along with the rest of its modules, and turn the command that gets installed into the binary directory into a tiny shell script that is just:
#! /bin/sh
exec /where/ever/prog.py "$@"
This results in the library directory being added to the search path, because it is the directory that the actual Python program is being run from. (And you can be confident that Python will know what that is, since the program is being started by absolute path.)
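If you want to convince yourself of this behavior, a small experiment along the following lines should do it. It builds a throwaway library directory holding a prog.py and a helper module, then runs prog.py by absolute path from an unrelated working directory. The file names and the `run_prog_from_libdir` function are made up for the demonstration, and it assumes a Python 3 where `subprocess.run` is available and the script directory has not been removed from the search path (for example by the `-P` option).

```python
import os
import subprocess
import sys
import tempfile


def run_prog_from_libdir():
    """Run a throwaway prog.py by absolute path and return what it prints."""
    with tempfile.TemporaryDirectory() as libdir, \
            tempfile.TemporaryDirectory() as elsewhere:
        # A helper module that lives only in the library directory.
        with open(os.path.join(libdir, "helper.py"), "w") as f:
            f.write("MESSAGE = 'helper imported'\n")
        # The main program, sitting next to its module.
        prog = os.path.join(libdir, "prog.py")
        with open(prog, "w") as f:
            f.write("import helper\nprint(helper.MESSAGE)\n")
        # Run it by absolute path with an unrelated current directory,
        # which is roughly what the shell wrapper's exec does.
        result = subprocess.run(
            [sys.executable, prog],
            cwd=elsewhere,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


print(run_prog_from_libdir())
```

The import of helper succeeds even though the current directory is somewhere else entirely, because Python puts the directory of the script it was asked to run at the front of the module search path.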
It is easy to create this tiny shell script as part of your installation
process (when you're sure to know where the library directory is, since
you are about to put things in it). As a bonus, your main program can
still have a .py extension so that you can easily do things like check
it with pychecker or import it into an interpreter to poke at something.
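The installation step might look something like the following sketch. The `write_wrapper` function and its arguments are hypothetical, and it assumes the main program is named after the command with a .py extension; a real installer would use its own idea of the binary and library directories instead of the temporary ones in the demonstration.

```python
import os
import shlex
import stat
import tempfile


def write_wrapper(bindir, libdir, name):
    """Write bindir/name as a shell script that execs libdir/name.py."""
    target = os.path.join(os.path.abspath(libdir), name + ".py")
    wrapper = os.path.join(bindir, name)
    with open(wrapper, "w") as f:
        f.write("#! /bin/sh\n")
        # Quote the path in case the library directory has odd characters.
        f.write('exec %s "$@"\n' % shlex.quote(target))
    # Add execute permission on top of whatever mode the file was created with.
    mode = os.stat(wrapper).st_mode
    os.chmod(wrapper, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return wrapper


# Demonstration with throwaway directories standing in for the real ones.
with tempfile.TemporaryDirectory() as bindir, \
        tempfile.TemporaryDirectory() as libdir:
    wrapper = write_wrapper(bindir, libdir, "prog")
    with open(wrapper) as f:
        print(f.read(), end="")
```

Because the wrapper is generated at install time, the main program itself never has to be edited to know where it has been installed.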
(Credit where credit is due: I didn't invent this trick. I believe I first saw it being used by a Python program on Fedora or one of the pre-Fedora Red Hat versions.)