Wandering Thoughts archives

2010-09-15

Why I like modules to be in the Python standard library

Even when there may be perfectly good third party modules for something, I really want there to be a module for it in Python's standard library. Part of the reason is that I find third party modules awkward to use, but another part is what I call the selection problem.

The selection problem is the problem of picking a third party module (sometimes even finding it, although PyPI helps with that) and figuring out if it's any good. Simply figuring out the quality of a module is a bunch of work, and the amount of work multiplies drastically if there are several third party modules that all do what I want. Often, the only way I can really tell if a module is going to work well is to actually try using it. Generally this has to be in a real program (I find toy examples both frustrating to write and uninformative), which means that if I have picked poorly I may have wasted a bunch of time and effort. Even if I can rule out a module relatively early, I still had to spend time reading documentation or skimming code or the like, and that time is all wasted.

(And frankly it's frustrating to run into near misses, modules that almost do what I need and almost work. Faced with this, it often at least feels easier to write something from scratch myself if what I want isn't too big.)

When a module has made it into the standard library, I don't have to go through all of this; I can just use the module, confident that it is a good implementation of whatever it is that I want to do. Someone else has already done all of this quality assurance work. If there were multiple implementations, the Python people have probably either picked the best one or at least determined that they are more or less equivalent, so I am not missing anything very important by not looking at the other options.

(Yes, sometimes this confidence is misplaced. But generally it's at least close.)

Update: see also WhyInStandardLibraryII for additional comments on the time drain of the selection problem.

python/WhyInStandardLibrary written at 23:52:01

An overview of the Debian and RPM source package formats

This is a brief and jaundiced overview of the format of Debian and RPM source packages, which are what the Debian and RPM package systems theoretically use to generate the compiled binary packages that people actually install. As usual, this applies to all distributions that use the Debian .deb package format or the Red Hat .rpm package format, although specific details vary. Also, I'm going to simplify to the common case.

A source RPM contains a specfile, a source tarball, and some number of patches. The specfile describes the package, names the source tarball and the patches, and contains a script that configures and compiles the binaries (I simplify). It can also contain scripts that will be run when the binary package is installed, removed, or upgraded, or on a number of other events. Specfiles support a complicated system of text macros, macro substitution, conditional 'execution' of portions of the specfile (which may wind up omitting or including some patches), and even more peculiar things; these are used to automate a lot of standard parts of the package build process, such as configuring a program that uses standard GNU autoconf.
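As a rough illustration of the 'names the source tarball and the patches' part, here is a minimal Python sketch that pulls the Source and Patch tags out of specfile text. The sample specfile and the function name are made up for illustration. The sketch deliberately ignores macros and conditionals, so it only shows what a specfile mentions, not what a real build would actually use.

```python
import re

# Matches tag lines such as "Source0: foo.tar.gz" or "Patch12: fix.patch".
# The number after the tag name is optional.
TAG_RE = re.compile(r"^(Source|Patch)(\d*)\s*:\s*(\S+)", re.IGNORECASE)


def list_sources_and_patches(spec_text):
    """Return (sources, patches) named in a specfile, in file order.

    Simplified on purpose: no macro expansion, no %if handling.
    """
    sources, patches = [], []
    for line in spec_text.splitlines():
        m = TAG_RE.match(line.strip())
        if not m:
            continue
        kind, value = m.group(1).lower(), m.group(3)
        (sources if kind == "source" else patches).append(value)
    return sources, patches


# Hypothetical specfile fragment, for illustration only.
EXAMPLE_SPEC = """\
Name: example
Version: 1.0
Source0: example-1.0.tar.gz
Patch0: example-1.0-paths.patch
Patch1: example-1.0-buildfix.patch

%prep
%setup -q
%patch0 -p1
%patch1 -p1
"""

if __name__ == "__main__":
    srcs, pats = list_sources_and_patches(EXAMPLE_SPEC)
    print("sources:", srcs)
    print("patches:", pats)
```

Because conditionals can include or omit patches, a list like this is at best an upper bound on what gets applied in any particular build.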

There is no fixed layout of where all of these pieces go when a source RPM is unpacked and built; it depends on your local configuration, although some arrangements are more sensible than others.

(Note that those RPM settings have probably gotten slightly broken since 2006, since they seem to now be doing slightly odd things for me. RPM macros have a lot of magic in them.)

A Debian source package contains a description file, a source tarball, and a patch. After unpacking the source tarball and applying the patch, there must be a top level subdirectory called debian. Files in this subdirectory are used to control the rest of the build and packaging process; although a number are required, the most important one is debian/rules, which is the Makefile used to build the package.

(Note that this subdirectory can contain lots of things besides the Debian package building control files. For instance, if the Debian package wants to run scripts when it's installed, removed, or so on, it will usually store the scripts in debian/.)
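To make the layout a bit more concrete, here is a small Python sketch that looks at an unpacked source tree and reports whether it has a debian subdirectory, whether debian/rules is there, and which install and removal scripts it can spot. The function name is hypothetical, and the script names (preinst, postinst, prerm, postrm) are the usual Debian maintainer script names rather than anything this overview spells out; treat it as a sketch, not a validator.

```python
import os
import tempfile

# Usual Debian maintainer script names; in debian/ they may appear
# bare or prefixed with a binary package name ("foo.postinst").
MAINTAINER_SCRIPTS = ("preinst", "postinst", "prerm", "postrm")


def inspect_debian_dir(tree):
    """Report on the debian/ subdirectory of an unpacked source tree."""
    debdir = os.path.join(tree, "debian")
    if not os.path.isdir(debdir):
        return {"has_debian": False, "has_rules": False, "scripts": []}
    scripts = sorted(
        name for name in os.listdir(debdir)
        if name.rsplit(".", 1)[-1] in MAINTAINER_SCRIPTS
    )
    return {
        "has_debian": True,
        "has_rules": os.path.isfile(os.path.join(debdir, "rules")),
        "scripts": scripts,
    }


if __name__ == "__main__":
    # Build a throwaway tree so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tree:
        os.mkdir(os.path.join(tree, "debian"))
        for name in ("rules", "control", "postinst", "example.prerm"):
            open(os.path.join(tree, "debian", name), "w").close()
        print(inspect_debian_dir(tree))
```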

Much like RPM specfiles and their macros, Debian rules files support a complicated system of helper programs to do most of the actual work. A typical Debian rules file cannot be fully understood without knowing what these programs do (some of this can be deduced from their names). Debian being Debian, I believe that there are several generations and versions of these helper programs (and no doubt epic flamewars have been fought over which ones to use when).

(Debian helper programs are better documented than RPM macros, for various reasons. Or at least more conveniently documented, since they have manpages.)

A Debian rules file may or may not further patch the source in the process of building it. One style of Debian package rolls everything into the initial patch, both the necessary modifications to the package source code and the contents of the debian directory. Another style uses the initial patch only to create the debian directory and then, RPM-like, applies a series of source patches from the debian directory during the build process. Determining which approach any particular Debian package uses may require close attention to the rules file, although if there is a debian/patches directory the odds are good that this source package uses some version of RPM-like two-stage patching.

(In the Debian way, there appear to be at least three different systems for doing such patching, each somewhat different.)
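The debian/patches rule of thumb is easy to turn into code. This is a minimal Python sketch of just that heuristic, with a hypothetical function name; it does not read debian/rules, so it cannot tell the different patch systems apart or catch packages that patch in some other way.

```python
import os
import tempfile


def guess_patch_style(tree):
    """Guess how an unpacked Debian source tree handles source patches.

    Only the rule of thumb: a debian/patches directory suggests
    patches that are applied during the build.
    """
    patches_dir = os.path.join(tree, "debian", "patches")
    if os.path.isdir(patches_dir):
        return "two-stage (probably)"
    return "single initial patch (or unknown)"


if __name__ == "__main__":
    # Throwaway trees so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tree:
        os.makedirs(os.path.join(tree, "debian"))
        print(guess_patch_style(tree))
        os.makedirs(os.path.join(tree, "debian", "patches"))
        print(guess_patch_style(tree))
```

A 'probably' is the best this can do; the rules file is still the only real authority on what happens during the build.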

linux/DebianAndRPMSourcePackages written at 01:09:45


