How to fail at versioning
Today, Ubuntu released a PAM security update and we applied it. On Ubuntu 10.04, this upgraded the Ubuntu PAM package from version 1.1.1-2ubuntu5 to 1.1.1-2ubuntu5.2 (other Ubuntu LTS releases had similar version numbers); as you can see, this is a very minor version bump. As you'd expect, this did not change, eg, the libpam shared library soname version (not even the minor version).
Our logs promptly exploded with error messages like:
CRON: PAM unable to dlopen(/lib/security/pam_env.so): /lib/libpam.so.0: version `LIBPAM_MODUTIL_1.1.3' not found (required by /lib/security/pam_env.so)
(This appears to have affected nearly any daemon or persistent process that uses PAM; cron is just the most obvious one.)
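If you want to see how widespread this is in your own logs, a small parser for this style of message can help. The following is only a minimal sketch in Python; the regular expression is modeled on the single log line quoted above, and the function name `parse_pam_dlopen_error` is made up for illustration. Other PAM builds may word the message differently.

```python
import re

# Pattern modeled on the one log line quoted above (an assumption; other
# versions of PAM or the dynamic linker may format this differently).
PATTERN = re.compile(
    r"PAM unable to dlopen\((?P<module>[^)]+)\): "
    r"(?P<library>[^:]+): version `(?P<version>[^']+)' not found"
)

def parse_pam_dlopen_error(line):
    """Return the module, library and missing symbol version, or None."""
    m = PATTERN.search(line)
    return m.groupdict() if m else None

if __name__ == "__main__":
    import sys
    # Read log lines on stdin and report each distinct failure once.
    seen = set()
    for line in sys.stdin:
        info = parse_pam_dlopen_error(line)
        if info:
            key = (info["module"], info["library"], info["version"])
            if key not in seen:
                seen.add(key)
                print(*key)
```

Feeding it your syslog on standard input should list each distinct (module, library, missing version) combination once.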
You can make an entire list of ways that this was a versioning fail, and in fact it's a fail on several levels. First, what was labeled as a minor packaging update introduced an ABI incompatibility, one that basically forces a system reboot at that. Second, despite having various versioning mechanisms available to it, PAM made no use of them; for example, it did not bump sonames (not even by a minor number). Finally, PAM claims to have versioning (eg, its library soname is versioned) but it has unversioned components with version dependencies. Here pam_env is clearly versioned; a specific version is ABI-compatible only with a very narrow and specific PAM library. But there is no way to have two pam_env shared objects for two different versions of the PAM library (even if the PAM library made use of version numbering), because pam_env has no version number of its own.
(In light of this last issue, it's kind of unsurprising that the libpam soname version did not change; it probably wouldn't do any good even if it did.)
Note that it's not clear who is responsible for all of the failures here. At a minimum Ubuntu is at fault; 'break your system' ABI incompatibilities should never ship under a version number change that says this is just a minor package update (not that Ubuntu is all that good at this). If Ubuntu created the ABI incompatibility through one of their patches, they are also at fault for not versioning it properly.
Update: Ubuntu has accepted this as a bug, bug #790538. I suppose the good news is that they consider this a serious issue.
PS: what I can best describe as an extreme reluctance to ever change library soname version despite major ABI changes is not exactly unique to PAM. As far as I can tell it's common behavior for a great many Linux projects, most prominently glibc (which seems to have invented its own additional versioning system because soname versions weren't good enough). I have no idea why people like doing this, although I'm sure there's a reason.
(Possibly changing library soname version numbers on ABI changes was found to not work very well in practice.)
Sidebar: what happens and how to fix it
When this happens to a daemon such as cron or xdm, the daemon basically stops doing much of anything useful; cron did not run cron jobs, and xdm did not let anyone log in. You can cure the problem by restarting the daemon, but note that restarting xdm has the small side effect of immediately terminating the session of everyone who logged in through xdm.
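To find out which processes need restarting, one approach is to look for processes that still have the old library mapped. The following is a minimal sketch in Python, resting on two assumptions that the text above does not confirm: that the failure comes from long-running processes still using the pre-update libpam, and that the package upgrade replaced the old file so that it appears in `/proc/<pid>/maps` with a ` (deleted)` suffix. The names `stale_libs` and `scan_proc` are made up for illustration, and you generally need to be root to read other users' maps files.

```python
import os

DELETED = " (deleted)"

def stale_libs(maps_text, name="libpam"):
    """Return sorted paths of deleted mapped files whose basename contains name."""
    stale = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5]
        if not path.endswith(DELETED):
            continue
        path = path[:-len(DELETED)]
        if name in os.path.basename(path):
            stale.add(path)
    return sorted(stale)

def scan_proc(name="libpam"):
    """Yield (pid, stale paths) for every process we are allowed to inspect."""
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/maps") as f:
                hits = stale_libs(f.read(), name)
        except OSError:
            continue  # process exited, or we lack permission
        if hits:
            yield int(pid), hits

if __name__ == "__main__":
    for pid, hits in scan_proc():
        print(pid, ", ".join(hits))
```

Any PID this prints is a candidate for a restart. An empty result does not prove that everything is fine, since it only catches libraries that were replaced on disk.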
Ultimately you're going to want to reboot the computer. This is kind of troublesome if it is a heavily used login server or a compute server used for long-running jobs. This is still Unix, even though developers seem more and more intent on turning it into 'reboot after doing any changes' Windows.
(Yes, I'm bitter right now.)
Written on 31 May 2011.