Why I don't like the Debian source package format

August 19, 2012

I'm about to start changing an Ubuntu package to build a modified version of it, so I've been reminded of all of the reasons that I don't really like the way Debian source packages work (Ubuntu uses the Debian package format). To explain why, I need to start with a brief description of how Debian source packages are put together.

Like all source package formats, Debian source packages need to contain the four essential things, but they do it in an odd way. Source packages generally come in two different forms; let us call these the distribution form, which is what you download, and the working form, which is what you expand the distribution form into. The distribution form of a Debian source package has three components: a .dsc file of basic metadata, one or more bundles of the upstream source code (usually just one, as a compressed tar archive), and a bundle of Debian modifications that are applied on top of the upstream source. The working form of Debian source packages is a directory tree of the source code, which is created by unpacking the upstream source code and then applying the Debian modifications to it; after this, the source tree will have a debian/ subdirectory with control files that describe how to build and package the source.

The largest problem with this is that the working form of Debian source packages has the upstream source already modified. The upstream source is not patched as part of building the binary packages (well, mostly); the upstream source is patched the moment that you even look at the source package. A direct consequence of this is that the control files do not exist until after you have patched the upstream source, or at least they do not exist in any readily accessible format.

Now we get to the mess that is the bundle of Debian changes. Debian source packages have two formats for this (or at least two dominant, non-experimental formats); the changes can be either a single giant patch file (which must include creating the debian/ subdirectory) or a tarball with the Debian control files and a set of quilt-based patch files. The quilt-based format is sane, but it's the newer one. The giant patch file format is the older one and still quite common, but there are two things wrong with it.
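As a rough illustration of the pieces of the distribution form, here is a minimal Python sketch that guesses the role of each file from its name. It assumes the conventional suffixes (a .diff.gz for the single giant patch, a .debian.tar.* for the tarball of control files and quilt patches, and .orig.tar.* for upstream source); the function names and the 'hello' file names are made up for the example, and a real tool should read the file list out of the .dsc instead of trusting names.

```python
import re

# Conventional file name suffixes; the compression type can vary.
ORIG_RE = re.compile(r"\.orig(-[A-Za-z0-9-]+)?\.tar\.(gz|bz2|xz|lzma)$")
DEBIAN_TAR_RE = re.compile(r"\.debian\.tar\.(gz|bz2|xz|lzma)$")


def classify(filename):
    """Guess the role a file plays in a Debian source package (hypothetical helper)."""
    if filename.endswith(".dsc"):
        return "metadata"
    if ORIG_RE.search(filename):
        return "upstream source"
    if filename.endswith(".diff.gz"):
        return "debian changes: single giant patch"
    if DEBIAN_TAR_RE.search(filename):
        return "debian changes: control files plus quilt patches"
    return "unknown"


def changes_style(filenames):
    """Report which style of Debian changes bundle a set of files uses."""
    for name in filenames:
        role = classify(name)
        if role.startswith("debian changes"):
            return role
    # No separate bundle of Debian changes was found in the list.
    return None


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    old_style = ["hello_2.7-2.dsc", "hello_2.7.orig.tar.gz", "hello_2.7-2.diff.gz"]
    new_style = ["hello_2.7-2.dsc", "hello_2.7.orig.tar.gz", "hello_2.7-2.debian.tar.gz"]
    for files in (old_style, new_style):
        for name in files:
            print(f"{name}: {classify(name)}")
        print()
```

This only tells you which of the two styles you have downloaded; it says nothing about what the changes inside the bundle actually do.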

The first thing wrong with it is that it's terrible as a format for modifications, especially for sysadmins who want to add a change to an existing Debian or Ubuntu package. Let me count the ways:

  • a single giant diff is hard to read, especially when it also includes creating the control files in the debian/ directory.

  • a single giant diff smashes logically separate changes together into one big mess.

  • it's impossible to separate out your own changes so that you can easily apply them on top of the next Debian or Ubuntu package update; you're better off keeping diffs of your changes outside of the source package system and then (re)applying them by hand.

The second thing wrong with it is that all of its problems spawned a bunch of workarounds, which basically use quilt or something like it in the debian control files to apply changes as the software is built. The large scale consequence of this is unpredictability. When you get and set up a Debian source package as a non-specialist, the resulting source tree may or may not reflect the Debian modifications and there is no single procedure that you can follow to make changes to it (especially maintainable changes).
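To make the unpredictability a bit more concrete, here is a small Python sketch of the sort of check a non-specialist might run on an unpacked source tree. It rests on a few assumptions that are not spelled out above: that a missing debian/source/format file means the old single-patch format ('1.0'), that quilt-style setups list their patches in debian/patches/series, and that dpatch-style setups use debian/patches/00list. The function name and the keys of the result are invented for the example, and the answer is only a guess; the debian/rules file has the final say on what happens at build time.

```python
from pathlib import Path


def describe_tree(tree):
    """Guess how Debian changes are carried in an unpacked source tree."""
    debian = Path(tree) / "debian"

    # Assumption: no debian/source/format file means the old '1.0' format.
    format_file = debian / "source" / "format"
    if format_file.is_file():
        source_format = format_file.read_text().strip()
    else:
        source_format = "1.0"

    # Look for a list of patches left by one of the usual patch systems.
    patches = debian / "patches"
    if (patches / "series").is_file():
        patch_list = "debian/patches/series"
    elif (patches / "00list").is_file():
        patch_list = "debian/patches/00list"
    else:
        patch_list = None

    # In the old format, a patch list suggests a build-time workaround,
    # so the tree you are looking at probably lacks those changes.
    probably_unapplied = source_format == "1.0" and patch_list is not None

    return {
        "format": source_format,
        "patch_list": patch_list,
        "patches_probably_unapplied": probably_unapplied,
    }


if __name__ == "__main__":
    import sys

    # Usage: python describe_tree.py /path/to/unpacked/source
    print(describe_tree(sys.argv[1] if len(sys.argv) > 1 else "."))
```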

(A meta-consequence of the introduction of a new and radically different format is that there are plenty of guides on 'how to do things with Debian source packages' out there on the web that haven't been updated to discuss the new format. This is again a problem for non-specialists, who don't know enough to know that the guide is incomplete until something goes wrong. Note that I am barely more than a non-specialist (and most of that is due to writing this entry).)

A smaller irritation, but one that is symptomatic of how Debian does things, is how you change the build version of generated packages. A normal, sane packaging system would have an explicit field for this in metadata somewhere. In the Debian package format, the build version is derived by implication: each entry in the Debian changelog for the package (found in the debian/ subdirectory) has a build version attached to it, so the build system 'simply' finds the build version of the top (most recent) changelog entry and uses that.
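Here is a minimal Python sketch of that derivation, assuming the usual changelog entry header of the form 'package (version) distribution; urgency=...'. The sample changelog, the 'local1' version suffix, and the function name are all invented for illustration; the real build tools do their own, more careful parsing.

```python
import re

# Header line of a changelog entry: "package (version) distribution; ..."
# Body and trailer lines start with whitespace, so they never match.
HEADER_RE = re.compile(
    r"^(?P<package>[a-z0-9][a-z0-9+.-]*) \((?P<version>[^()\s]+)\) "
)


def top_version(changelog_text):
    """Return (package, version) from the first entry header found."""
    for line in changelog_text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            return match.group("package"), match.group("version")
    raise ValueError("no changelog entry header found")


# A made-up changelog with a hypothetical local entry on top.
SAMPLE = """\
hello (2.7-2local1) unstable; urgency=low

  * Hypothetical local change.

 -- Local Admin <admin@example.com>  Sun, 19 Aug 2012 12:00:00 -0400

hello (2.7-2) unstable; urgency=low

  * Earlier entry.

 -- Some Maintainer <maint@example.org>  Sat, 18 Aug 2012 12:00:00 -0400
"""

if __name__ == "__main__":
    print(top_version(SAMPLE))
```

The practical upshot is that changing the version of your rebuilt package means adding a new entry at the top of debian/changelog, not editing a version field somewhere.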

(On reflection, I've decided not to say anything about the debian/rules file. I'm a non-expert in working with Debian source packages and so I suspect that everything about it is much clearer and more obvious to people who are Debian package building experts. Every source package system involves a certain amount of arcana and doing the actual compiling and so on is often where it's concentrated.)

Written on 19 August 2012.

Last modified: Sun Aug 19 02:21:49 2012