How packaging systems should handle kernel updates

July 24, 2008

As a system administrator, I have some relatively strong opinions on how Linux distributions and packaging systems should handle kernel updates. Since I have just gone through the experience of yet another mass kernel update of our servers, I feel like writing them up:

  • you should be able to have multiple kernels installed at once.
  • it should be easy to tell which package's kernel you are running, and thus whether or not you are running the most current kernel.
  • installing a kernel update should never overwrite an existing kernel image; it should always install a new kernel.

    (The one time when it is marginally acceptable to overwrite an existing kernel image is when the package update has literally no changes in the kernel; all it does is fix some packaging mistake. But that should be vanishingly rare anyways.)

  • old kernels should get removed sooner or later, at a rate that is configurable but that has a sensible default. I do not need every kernel ever released for my distribution sitting around on my disks (especially if they have security holes).

    (Ideally there would be some sort of time-based minimum expiry, so I can say 'never remove a kernel until I haven't run it for a month'.)

  • you should be able to 'pin' any particular kernel so that it is never removed. You really want this if you have a 'last known good' kernel and you want to keep experimenting with kernel updates to see if your distribution got it fixed this time around.

  • the running kernel and its modules should never, ever get removed (or overwritten).
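
To make the list above concrete, here is a minimal sketch of the removal policy it describes. It is not how any real package manager does it: the `Kernel` record, the `removable` function, the `keep` and `min_unused` parameters, and the idea that a 'last run' timestamp is available at all are assumptions made up for illustration. The sketch also assumes that a kernel which has never been booted may be expired once it falls outside the newest `keep` kernels.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Kernel:
    version: Tuple[int, ...]              # e.g. (2, 6, 25, 10); newer compares greater
    last_run: Optional[datetime] = None   # hypothetical record; None means never booted
    pinned: bool = False                  # 'last known good' kernels the admin has pinned
    running: bool = False                 # the kernel that is currently booted


def removable(kernels: List[Kernel], keep: int = 2,
              min_unused: timedelta = timedelta(days=30),
              now: Optional[datetime] = None) -> List[Kernel]:
    """Return the kernels this (hypothetical) policy would allow removing."""
    now = now or datetime.now()
    # The running kernel and pinned kernels are never candidates for removal.
    candidates = sorted(
        (k for k in kernels if not (k.running or k.pinned)),
        key=lambda k: k.version,
        reverse=True,
    )
    expired = []
    # The newest `keep` candidates always stay; older ones must also have
    # gone unused for at least `min_unused` before they may be removed.
    for k in candidates[keep:]:
        if k.last_run is None or now - k.last_run >= min_unused:
            expired.append(k)
    return expired
```

With `keep=2` and a 30-day minimum, an old kernel that was last booted 20 days ago stays installed even though it is outside the newest two, while one last booted 60 days ago becomes removable.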

I don't know of any distribution that gets all of these right. Ubuntu fails spectacularly at never overwriting existing kernels; Red Hat has no good way to pin a particular kernel version so that it won't get removed (unless the yum versionlock plugin has very recently fixed its interaction with the 'keep the most recent N' plugin). No one has good kernel expiry.

Oh, and I prefer as much of this as possible to be in the kernel packages themselves, instead of in tools that are used to manage them, so that I cannot accidentally cause problems by using the wrong package management tool. This means that I somewhat prefer the Debian/Ubuntu approach of making different kernel versions have different package names to the Red Hat approach of making different kernel versions be different versions of the same package and putting the smarts in yum and other tools.


Comments on this page:

From 65.172.155.230 at 2008-07-25 16:22:02:

So as someone with the power to fix this for you (in Fedora/RHEL/CentOS at least)...

Doing time-based expiry based on when you last ran a package is probably infeasible, esp. for the kernel. Doing it based on install time is doable but I don't think you'd want that. One of the problems with time-based expiry is that it implies a soft limit, whereas the last N is a hard limit ... so assuming the kernel package doesn't change a huge amount in size you can know how much space is being taken up by kernels (the main user of install_only). So I feel I have to ask, why don't you just make your "keep last N" value be higher ... and manually "yum rm" the kernels that don't work? Or just set it to never and make /boot big enough?

Pinning a particular version of something is an interesting idea, and probably doable without too much pain. I'm assuming just putting a "keep_install_only = kernel-0:2.6.25.10-86.fc9.x86_64" in yum.conf would be enough? The only real problem here is what happens in some edge conditions, for instance when we first did install_only we set it to 2 ... the idea being you can have the currently running kernel and the next update. The problem is if you are running kernel-2, "yum update" to kernel-3 (but don't reboot to it) and then do "yum install kernel" yum does exactly what you told it and removes kernel-3 and installs kernel-1. keep_install_only would just give SAs even more rope disguised as a feature.

Moving this down into rpm would maybe be possible longer term ... but I wouldn't hold your breath (it seems like a higher level feature to me). And the "failure" case is just that "rpm -ivh" doesn't remove things, so not exactly a problem IMO. I don't understand the last comment ... are you using "rpm -Uvh" on kernels? IMNSHO the weird way .deb's embed the versions into the package names just seems wrong on so many levels ... but, meh.

From 65.172.155.230 at 2008-07-25 16:32:38:

Also, what happens with versionlock and installonly ... AFAICS versionlock acts way before installonly can do anything, so it would work as intended (although versionlock would be just a fancy way of saying exclude=kernel* ... and you wouldn't have the --disableexcludes option).

By cks at 2008-07-27 00:59:42:

Describing the versionlock problem got long enough that I put it in its own entry, YumVersionlockIssue.

Time based minimum expiry is useful to me because I may not fully trust a new kernel until I have run it for a certain amount of time (not all of the kernel problems that we've run into happened immediately). So up until that point I want to keep the old kernel around just in case we turn up an issue.

(This would be less of an issue if distributions preserved old updates in their update repositories for a good long time, but they don't. And for some distributions you can't even make your own mirror to keep copies yourself.)

For pinning kernels, I think that a pinned kernel should be excluded from yum's calculations of how many kernel versions to keep, so it would be 'N versions, plus any pinned kernels'. This is only slightly surprising, is easy to explain, and avoids accidents.
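
As a small illustration of why the counting rule matters, here is a sketch comparing two ways a 'keep N' limit could treat pinned kernels. Neither function is yum's actual behaviour; the names `removed_if_pins_count` and `removed_if_pins_extra` and the version tuples are made up for the example.

```python
from typing import Iterable, List, Set, Tuple

Version = Tuple[int, ...]


def removed_if_pins_count(installed: Iterable[Version], pinned: Set[Version],
                          keep: int) -> List[Version]:
    """Pinned kernels use up slots in the 'keep N' budget."""
    installed = list(installed)
    unpinned = sorted((v for v in installed if v not in pinned), reverse=True)
    # Every installed pinned kernel leaves one less slot for ordinary kernels.
    slots = max(keep - len(pinned & set(installed)), 0)
    return unpinned[slots:]


def removed_if_pins_extra(installed: Iterable[Version], pinned: Set[Version],
                          keep: int) -> List[Version]:
    """'N versions, plus any pinned kernels': pins sit outside the budget."""
    unpinned = sorted((v for v in installed if v not in pinned), reverse=True)
    return unpinned[keep:]
```

With five kernels installed, the oldest one pinned, and a limit of 2, the first rule leaves room for only one unpinned kernel, so there is no slot for both the running kernel and the next update. The second rule keeps the two newest unpinned kernels and the pinned one.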

For my last comment: while I don't use rpm -U on kernels (or for much of anything any more), I would indeed feel better if running 'rpm -U' on a big collection of RPMs couldn't accidentally override all of the clever work on managing kernel updates. This is probably a marginal concern since I doubt very many people use rpm directly any more, especially for package updates.

From 65.172.155.228 at 2008-07-27 16:35:25:

Time based minimum expiry is useful to me because I may not fully trust a new kernel until I have run it for a certain amount of time (not all of the kernel problems that we've run into happened immediately). So up until that point I want to keep the old kernel around just in case we turn up an issue.

This seems like something better suited to versionlock (see your other post for my comment). Also, esp. given the kernel developers' hatred of atime, I can't think of any good way to find out "time last X booted".

(This would be less of an issue if distributions preserved old updates in their update repositories for a good long time, but they don't. And for some distributions you can't even make your own mirror to keep copies yourself.)

RHEL does this; we keep every version of everything since GA in the metadata. You can see this on the yum 3.2.X branch by adding --showduplicates to your list/search command (on the older versions there is a configuration option to do the same for just list). Of course, just 'yum update kernel' won't ever go backwards; you need to do 'yum install kernel-<version>'.

For my last comment: while I don't use rpm -U on kernels (or for much of anything any more), I would indeed feel better if running 'rpm -U' on a big collection of RPMs couldn't accidentally override all of the clever work on managing kernel updates.

Note that with the latest yum-tmprepo you can do yum --tmprepo=path/to/dir/ so hopefully there is even less reason to do things like that.

From 65.172.155.228 at 2008-07-27 17:09:36:

(This would be less of an issue if distributions preserved old updates in their update repositories for a good long time, but they don't. And for some distributions you can't even make your own mirror to keep copies yourself.)

RHEL does this, we keep every version of everything since GA in the metadata.

I should also point out that the major reason the non-paying distros like CentOS and Fedora don't do this is bandwidth and storage constraints (so if the University of Toronto was willing to host all the data things might change quickly :).

However I am arguing internally that Fedora should keep one or two older versions (and yum-presto might well help the argument), and by Fedora 10 we will very likely keep the last security update version, as well as the latest version, for each package.

We (yum developers) are also working on some changes for reposync so that you'll get a "merge mode" so you can create a local repo with every version in it.

By cks at 2008-07-27 23:32:13:

It's excellent news that RHEL preserves all updates, since RHEL is the hard-to-mirror distribution that I was thinking about. I'm not as concerned about Fedora, since for Fedora one can just stage all updates into a private repository before applying them and then never remove things from that repository.
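
A rough sketch of the 'stage and never remove' idea, assuming updates arrive as `.rpm` files in some incoming directory: the `stage_updates` function and both directory arguments are hypothetical, and the repository metadata would still have to be regenerated afterwards (for example with createrepo) before yum could use the archive.

```python
import shutil
from pathlib import Path
from typing import List


def stage_updates(incoming: Path, archive: Path) -> List[str]:
    """Copy new RPMs into the archive; never overwrite or delete anything there."""
    archive.mkdir(parents=True, exist_ok=True)
    copied = []
    for rpm in sorted(incoming.glob("*.rpm")):
        dest = archive / rpm.name
        # An RPM that is already archived is left untouched, so old
        # versions accumulate instead of being replaced.
        if not dest.exists():
            shutil.copy2(rpm, dest)
            copied.append(rpm.name)
    return copied
```

Because nothing is ever deleted from the archive, every kernel that was once staged stays available for reinstalling later.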

(PS: in the interests of honesty I will note that I used magic site admin powers to make a formatting change in the previous comment so that it no longer has a really long line.)

From 204.52.215.2 at 2008-07-31 16:35:25:

Ahhh... how wonderful the world would be if your suggestions were snapped up!

Personally, I'm an OpenSuSE user, and after having some really nasty issues with distro upgrades (specifically, trying a 10.1 to 11.0 upgrade), I've started to set up two primary OS partitions on all of my new boxes at the first install, one of them being left blank, so that I can upgrade on the unused (or oldest) partition, and fall back to my last OS version if needed...

-J. Antman

By cks at 2008-07-31 23:34:05:

Snapshots or some other way of doing easily rolled back updates would solve a whole lot of issues in one stroke, but it's a hard problem and I don't know if anyone is doing really well at tackling it.

Written on 24 July 2008.

Last modified: Thu Jul 24 22:31:51 2008