The problem with CPAN (and other similar systems)

June 17, 2005

At one level, CPAN is a great thing: people really like having a simple way of installing Perl packages (and a big archive of them). It's such a good idea that it's being copied over and over: Python's distutils, Ruby's RubyGems, the R statistics package's CRAN, and no doubt others.

But CPAN and things like it have a problem: they're a package management system. Or, to be more detailed, they're another package management system, on top of the one that our Unix systems already have.

Multiple package systems on a single computer means that no single package system has a full picture of the system. This causes various problems:

  1. to get a complete picture of what's on the system, I have to remember to use multiple tools (and remember how to use all of them).
  2. two different tools can both think they own or exclusively manage certain files (for example, index files of all the packages installed). The extreme case is installing the same thing through the OS package manager and a program's own package manager.
  3. relationships that cross packaging systems go missing; for example, things installed through CPAN likely depend on the version of Perl installed by the OS's package manager. Does the OS package manager know enough to tell me that upgrading Perl because of a security fix is going to orphan all of those CPAN packages I need?
  4. satisfying dependencies: when I try to install a core OS package BazOrp, which requires Python package FooBar (version 1.6.1 to 1.7.8), how does the core OS package management system know that I installed FooBar 1.7.7 through Python's distutils and it's OK to go ahead?
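To make the last point concrete, here is a minimal Python sketch of the check the OS package manager would need to make. Everything in it is an assumption for illustration: the two inventories are hypothetical dictionaries (no real package system exposes them this way), the Perl version is made up, and versions are treated as plain dotted integers, which is simpler than what real packaging systems use.

```python
def parse_version(text):
    """Turn a dotted version such as '1.7.7' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, lowest, highest):
    """True if the installed version lies in the inclusive range."""
    return parse_version(lowest) <= parse_version(installed) <= parse_version(highest)


def find_installed(name, *inventories):
    """Return the version of name from the first inventory that lists it."""
    for inventory in inventories:
        if name in inventory:
            return inventory[name]
    return None


# Hypothetical inventories: what the OS package manager knows about,
# and what was installed through a language's own installer.
os_inventory = {"perl": "5.8.6"}
language_inventory = {"FooBar": "1.7.7"}

# The OS package manager only sees its own inventory, so FooBar looks missing.
seen_by_os = find_installed("FooBar", os_inventory)

# With a combined view the dependency is found and is in range.
seen_by_both = find_installed("FooBar", os_inventory, language_inventory)
ok_to_install = seen_by_both is not None and satisfies(seen_by_both, "1.6.1", "1.7.8")
```

The version comparison is the easy part. The hard part is that in practice nothing provides the combined view, so the OS package manager is stuck with the first answer.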

And this understates the problems, because most of these CPAN-like things are not actually package management systems; they are package installation systems. All they do is install things. They don't keep a package inventory (especially with version numbers), they usually don't have much of an idea of package dependencies, and often they can't even remove what they just installed.

The situation is worse when I work in large-scale environments, with tens to hundreds of systems. Environments that large can't be managed by hand; the machines have to be managed through automated systems.

In that sort of environment, every program with its own package system means that I would have to obtain or build an automated system to manage that package system. Since the package system itself is unlikely to provide the basic management tools (inventory, dependencies, etc), I would have to build those, too.

You may have guessed the punchline: as a result of all of this, we don't and can't use CPAN, distutils, RubyGems, CRAN, and so on. Of course this is sometimes difficult to explain to users, who are known to approach us to ask 'there is this CPAN module I need, can you please install it on the machines?' and then don't understand why I break down and twitch.

Solution: build real OS packages

I already have to deal with the OS's package management system, so the best way to make my life easier is to make your package installation system build OS packages for me, instead of directly installing files.

This shouldn't be too difficult, as your installation system already has most of the information necessary, such as what files are going to be installed and a package description. Don't worry too much about dependencies, as a decent OS packaging system will be capable of working them out for you.
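As a rough illustration of how little is needed, here is a Python sketch that turns the information an installer already has into the metadata and file-list parts of an RPM spec file. The package name, summary, license, and file paths are hypothetical, and the output is only a fragment: a spec that can actually be built also needs sections such as %prep, %build, and %install, which this sketch leaves out.

```python
def render_spec_fragment(meta, files):
    """Render the header, %description and %files parts of an RPM spec.

    meta is a dict of what the installer already knows about the package;
    files is the list of absolute paths it is going to install.
    """
    lines = [
        f"Name: {meta['name']}",
        f"Version: {meta['version']}",
        f"Release: {meta.get('release', '1')}",
        f"Summary: {meta['summary']}",
        f"License: {meta['license']}",
        "",
        "%description",
        meta["description"],
        "",
        "%files",
    ]
    # Sort the file list so the output is stable from run to run.
    lines.extend(sorted(files))
    return "\n".join(lines) + "\n"


# Hypothetical package information, as an installer might hold it.
example_meta = {
    "name": "perl-Foo-Bar",
    "version": "0.42",
    "summary": "Example module",
    "license": "Artistic",
    "description": "An example module, packaged for illustration.",
}
example_files = [
    "/usr/lib/perl5/site_perl/Foo/Bar.pm",
    "/usr/share/man/man3/Foo::Bar.3pm",
]

spec_text = render_spec_fragment(example_meta, example_files)
```

The point of the sketch is that the installer is the one place where the file list and description are already known, so it is the natural place to produce the package.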

On Linux systems, supporting building Debian .debs and RPMs will get you most of the way to making people entirely happy. (You don't have to decide which distributions to support; with generic building support, you support everyone using that packaging format.)

Existing support for this

Debian's dh-make-perl builds CPAN packages into Debian .debs.

The CPAN RPM::Specfile package and its cpanflute2 program will build RPMs from CPAN packages. (Getting it in RPM form to bootstrap this properly may be a pleasantly recursive exercise.) There's also the cpan-to-rpm.pl program from here, to do everything in one go. (I believe cpanflute2 has had some problems for us in the past, but I have blotted them out from my mind.)

Python distutils has a bdist_rpm command for building RPMs, but this doesn't work reliably for somewhat complicated packages in the versions I've tried. (Yes, I should file bug reports and produce patches to fix things. Someday, when I have enough time to fully investigate the situation.)

Written on 17 June 2005.

Last modified: Fri Jun 17 22:05:47 2005