Wandering Thoughts archives

2012-09-23

How we handle Ubuntu LTS versions

In a recent entry I noted that we have production machines running all three currently supported versions of Ubuntu LTS (8.04, 10.04, and 12.04). This might strike people as a little bit peculiar, so I'll describe how we handle Ubuntu LTS versions.

Machines that people actively log in to, such as our login and compute servers, always run the current LTS version so that people can get the latest software versions and shiny bits. We don't upgrade right away when a new LTS release comes out, because it takes time to update our install system for the new release, test it, fix the inevitable problems and glitches, and then schedule the actual machine upgrades (well, reinstalls). But we do try to upgrade before too long and we usually manage it; 12.04 came out in April and we were upgrading our login and compute servers by mid August, which is pretty fast by our standards.

(Here's an embarrassing admission: I only recently realized what the numbering scheme for Ubuntu releases is. Now that I know, it's much easier to see when they came out and how old they are.)
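
To spell the scheme out, Ubuntu version numbers are just the two-digit year and month of release, so 8.04, 10.04, and 12.04 are the April releases of 2008, 2010, and 2012. Here's a tiny Python sketch of the decoding; the helper function is purely illustrative.

    import datetime

    # Ubuntu versions are 'YY.MM', the two-digit year and month of release,
    # so 12.04 is the April 2012 release.
    def ubuntu_release_date(version):
        year, month = version.split(".")
        return datetime.date(2000 + int(year), int(month), 1)

    for v in ("8.04", "10.04", "12.04"):
        print(v, ubuntu_release_date(v))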

Other server machines are often not worth constantly upgrading to the latest and greatest LTS release. If what they're running works and there aren't any additional features or improvements that we want from a new LTS release, we generally don't bother upgrading them every time; it's a bunch of work for no real gain (especially given the testing we usually need to do). Instead we only upgrade when we're forced to, and what forces us is when an LTS release is going to stop being supported. With an LTS release every two years and a five year support period for LTS releases, this means that servers generally skip every second LTS release.
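
To spell out the arithmetic behind that, here's a small Python sketch (the numbers come straight from the release cadence; the function is just for illustration). With five years of support, a machine installed on 8.04 can skip 10.04 entirely and be reinstalled on 12.04 before 8.04 runs out of support in 2013.

    # An LTS release every two years plus five years of support means a
    # server can skip every second LTS and still be reinstalled in time.
    def lts_support_ends(version):
        release_year = 2000 + int(version.split(".")[0])
        return release_year + 5

    for v in ("8.04", "10.04", "12.04"):
        print(v, "is supported until roughly", lts_support_ends(v))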

Given this, you might wonder why we have 10.04 machines at all. There are several reasons. First, some of our servers were initially 6.06 machines; they skipped 8.04 and were upgraded to 10.04. Second, some of the servers were built for the first time during the 10.04 era (just as some servers were built for the first time at 8.04). Third, sometimes we do feel that there are advantages to upgrading machines that don't strictly need it; for example, new versions of Samba and our IMAP server can be useful enough to prompt an upgrade by themselves.

(There are also advantages to making sure that all machines that run a given piece of software are running the same version of it; for example, we might upgrade all machines that run Exim to 12.04 just to synchronize versions.)

OurUbuntuLTSVersions written at 00:57:48

2012-09-13

Why you need mass package rebuilds in some circumstances

In the previous entry, I mentioned that sometimes you need to rebuild a package for reasons unrelated to any changes to it; the example I gave was rebuilding packages due to fixing a code (mis)generation bug in your compiler. This may sound obscure, but in fact situations like this are more common than you might think. You really should be rebuilding packages after any significant change to the basic toolchain and compilation environment, such as a significant new version of GCC. It's also a good policy to build all of your packages for a distribution with the version of the compiler and basic environment that will ship with the distribution instead of carrying over binary packages from past versions of your distribution.

The core reason for this is not to pick up fixes and improvements in the compiler et al, but to make sure that your packaging is reproducible. As an extreme example, suppose that you move from GCC 4.x to GCC 5.x in the new version of your distribution and that GCC 5.x is pickier about some things in people's code that GCC 4.x silently accepted. If you carry over old binary packages built on previous versions of your distribution instead of rebuilding everything, you may be shipping packages for your new version that can't be rebuilt on it because their source contains something that GCC 5.x no longer accepts.

Rebuilding when the compilation environment changes means that you know that you actually can build all packages from source on a current system. Shipping the result lowers the chance that previously unseen code generation problems are lying there waiting to surface the moment you have to rebuild and ship the package for some other reason.
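
Here's a minimal sketch of what checking this could look like, wrapped in Python for illustration; the source RPM directory is hypothetical, and a real distribution would drive this through a proper build system such as mock or koji rather than bare rpmbuild. The idea is simply 'try to rebuild everything from source with the current toolchain and record what fails'.

    import glob
    import subprocess

    # Try to rebuild every source RPM in a (hypothetical) directory and
    # collect the ones that no longer build with the current toolchain.
    failures = []
    for srpm in sorted(glob.glob("/srv/srpms/*.src.rpm")):
        result = subprocess.run(["rpmbuild", "--rebuild", srpm])
        if result.returncode != 0:
            failures.append(srpm)

    print("packages that no longer rebuild from source:")
    for srpm in failures:
        print(" ", srpm)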

(Here's a scenario that should sound like fun. Suppose that you have an old binary package compiled with an old compiler version. You get a bug report for the package, patch the source to make what you think is a fix, rebuild the package (which uses the current compiler), and ship it. Unknown to you, the current version of the compiler has a bug where it mis-compiles one part of the program. Shortly after you ship the update, you start getting reports that your update broke things; in fact, it seems to have broken things that look totally unrelated to your change. Good luck tracking that problem down, especially if it only affects a small subset of users.)

WhyDoMassPackageRebuilds written at 01:27:05

2012-09-12

The core problem with developers doing their own packaging

As I've discussed, Linux package systems generally have some degree of support for developers embedding all of the things you need for source packages directly in the master source and distribution tarballs or the like. However, pretty much everyone now discourages developers from actually doing this; these days, distributions would generally much rather do the packaging themselves (and many of them will ignore your work if you try to include it).

There are a number of pragmatic reasons for this; one of the larger ones is that it's not very common for developers to also be packaging experts for a Linux distribution. But let's set that aside and assume that, say, you're an expert Debian Developer and you're going to build the Debian packages for your program. There's still a problem with such 'native' packages.

The core problem with including the packaging data in the master source distribution is that this can only be packaging for one distribution. Debian packages are not Ubuntu packages and Ubuntu 12.04 packages are not Ubuntu 10.04 packages; your in-tree debian/ directory can only be right for one of them. What this really means is that packaging metadata is tied to the distribution, not to the source code. Putting the packaging metadata together with the source code is yoking together two very separate things in a way that artificially constrains both of them.

(Among other things, notice that this sort of native package requires changing the distribution source in order to make a package update. There are many reasons for doing this that have nothing to do with changes to the source, including things like 'there's a bug in GCC that might have affected this program, we need to rebuild it with the fixed GCC just to be sure'.)
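
One concrete place where this shows up is in the version numbers themselves. Debian versions are an upstream version plus an optional Debian revision; a native package has no revision, so a packaging-only update has to change the source's own version instead of just bumping the revision. A tiny Python sketch of the split (the function is only for illustration):

    # Non-native: '1.2-1' -> a packaging-only rebuild just becomes '1.2-2'.
    # Native:     '1.2'   -> there is no revision, so the source version
    #                        itself has to change for a packaging update.
    def split_debian_version(version):
        if "-" in version:
            upstream, revision = version.rsplit("-", 1)
            return upstream, revision
        return version, None

    print(split_debian_version("1.2-1"))   # ('1.2', '1')
    print(split_debian_version("1.2"))     # ('1.2', None)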

The practical best case result of embedding packaging control files in the master source is that one lucky distribution version gets to use the master source as a 'native' package, at least for a while, and everyone else puts together a non-native package with some alterations (and then ignores or erases your bundled control files).

In my view it's easier overall to always have a 'non-native' source package, one with packaging control files separate from the upstream source distribution. This creates uniformity and avoids the problem of having to convert a native package to a non-native package during things like mass rebuilds.

UpstreamPackagingProblem written at 02:01:43

2012-09-09

The core difference between Debian source packages and RPMs

At least from my perspective, the two big source package formats in the Linux world are Debian's and (source) RPMs. I've worked with both (although far more with RPMs than with debs) and recently I've formed an opinion on what the core difference between them is and what each is better (or best) at.

The Debian source format is optimized for the case where the 'upstream' developer is also effectively the Debian packager (in what Debian calls a 'native' package). The Debian control files live in the general distribution tarball and you can build Debian packages right from the development tree with no fuss and bother. You don't need to have any extra bureaucracy or keep things outside the source tree.

The RPM source format is optimized for packaging (and changing) other people's packages. Everything lives outside the source tree (indeed in a completely separate area) and from the start all modifications were supposed to be made as a sequence of patches. In theory RPM has support for 'native' packages, packages with a spec file integrated into their source tarball, but I don't think many people really use this and it's certainly not the natural way to work with RPM packages.

Even though RPM has some 'native' support, it has no way to build a package from an unpacked source tree the way that Debian does. By contrast, building from an unpacked tree is the fundamental operation in Debian packaging. If you're developing your program and want to repeatedly build the package, the Debian approach is much more convenient. The flip side is true (in my opinion) if you're packaging and possibly modifying an upstream package; there the RPM approach is cleaner and easier to work with, as I've sort of grumbled about before.
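
To make the contrast concrete, here's a small sketch of the two workflows, wrapped in Python purely for illustration; the package name and paths are made up, and the options are kept to the basics.

    import subprocess

    # Debian: build directly from the unpacked source tree, which carries
    # its own debian/ directory ('-us -uc' skips signing).
    def build_debian(tree):
        subprocess.run(["dpkg-buildpackage", "-us", "-uc"], cwd=tree, check=True)

    # RPM: build from a spec file; the tarball and any patches live outside
    # the source tree, in the rpmbuild SOURCES directory.
    def build_rpm(specfile):
        subprocess.run(["rpmbuild", "-ba", specfile], check=True)

    build_debian("/home/me/src/frobnicator-1.2")
    build_rpm("/home/me/rpmbuild/SPECS/frobnicator.spec")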

This doesn't quite make arguments about which source format is better into arguments about editors, but in my opinion it does push the question one step back. The right question is not which format is better but which situation is more common.

(In my biased opinion I believe that the answer is 'packaging other people's programs' and in fact it's proven to be a mistake to have the upstream developer try to also package the program, but the latter is a topic for another entry.)

DebianVsRPMSourcePackages written at 23:40:27

