An illustrated example of how not to do package updates

May 18, 2010

I spent the greater part of today discovering how and why smartd was not monitoring the disks on several of our Red Hat Enterprise Linux 5 based iSCSI backends. The quick summary is that if your RHEL 5 machine was installed some time ago (or installed recently from an old installer disk) and upgraded since then, smartd may not be monitoring all of your disks; this is all but certain if you've added new disks since the upgrade.

What caused this is a badly considered package update. At the start of RHEL 5, Red Hat's version of the smartmontools RPM shipped with an /etc/smartd.conf and a system that defaulted to auto-generating the list of disks to monitor every time you started smartd. Later, Red Hat updated smartmontools and stopped doing this; instead the new version of /etc/smartd.conf told smartd to do the scan for disks itself. However, applying the update does not replace existing /etc/smartd.conf files, which are now left with a static list of disks to monitor (whatever was last auto-generated) that never gets updated. If your list of disks changes after the package update is applied, you lose.

We normally update our iSCSI backends immediately when they're installed, before we connect them to the enclosure with their data disks. This means the list of disks was frozen at a point when only the system disks were visible. The net result is that several backends were left silently monitoring only the system disks, which is what you could call not desirable.

(The problem does not occur with recent RHEL 5 install images, which have the updated smartmontools RPM rolled into the base OS.)
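If you want to check a machine for this, here is a rough Python sketch of the idea. It assumes the usual smartd.conf conventions ('#' starts a comment, a DEVICESCAN entry means smartd scans for disks itself, and explicit entries start with a /dev/ path). The function names and the sample configuration lines are made up for illustration, and the parsing is deliberately simplified (it ignores backslash line continuations, for example).

```python
def parse_smartd_conf(text):
    """Return (uses_devicescan, static_devices) for smartd.conf contents.

    Simplified on purpose: comments are stripped, backslash
    continuations are not handled, and a DEVICESCAN entry anywhere
    is treated as 'smartd scans for disks itself'.
    """
    uses_devicescan = False
    static_devices = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        first = line.split()[0]
        if first == "DEVICESCAN":
            uses_devicescan = True
            break  # smartd ignores the rest of the file after DEVICESCAN
        if first.startswith("/dev/"):
            static_devices.append(first)
    return uses_devicescan, static_devices


def unmonitored_disks(conf_text, present_disks):
    """List disks that exist but that a static smartd.conf never mentions."""
    uses_devicescan, static_devices = parse_smartd_conf(conf_text)
    if uses_devicescan:
        return []  # smartd finds disks on its own; nothing is frozen
    return sorted(set(present_disks) - set(static_devices))


# Hypothetical leftover configuration from before the data disks existed.
old_conf = """\
# illustrative static list, not real generated output
/dev/sda -H -m root
/dev/sdb -H -m root
"""
present = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]
print(unmonitored_disks(old_conf, present))
```

In real use you would read /etc/smartd.conf and build the list of present disks from the system (for instance from /sys/block on Linux); I have left that out so the sketch stays a pure function that is easy to try on sample text.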

To be blunt, this is a badly done package update, especially for a theoretically 'enterprise' operating system. You should never create a situation where a sysadmin installs a package without changing its configuration files, installs an upgrade, and the package stops working, partly because creating such situations is a great way to persuade sysadmins to never install package updates.

(What RHEL should have done is keep the auto-generation system but change the default /etc/smartd.conf to not use it. Then everyone would have been happy; people with the old configuration would have had it keep updating their list of disks, and people with the new configuration would have their disks auto-detected by smartd itself.)


Comments on this page:

From 69.113.211.148 at 2010-05-18 08:55:02:

I know for a fact that %{_sysconfdir}/smartd.conf is defined in that RPM as a %config(noreplace), which means that you should have seen a notification from your package manager that the file was created as /etc/smartd.conf.rpmnew. Were you not reviewing the output of your package manager during the update process? I don't think you really have a leg to stand on with this complaint -- you should be aware of your distribution's conventions for handling configuration files, and you should read what yum tells you is happening when you update.

--Jeff

By rdump at 2010-05-18 11:43:25:

@Jeff

Nonsense. First, figuring out the dangerous behavior change required more than looking at gobs of non-humanly-parseable package manager logging. Second, the change was not fail-safe in the face of normal use. Someone didn't think things through before committing the change to all their customers.

The careless attitude behind that kind of fail-dangerous change is one general reason I dislike trying to run production resources on Linux distributions. It's endemic among too many packagers (some of whom just blindly pass on upstream changes without even thinking about testing; another sin), and I really don't like having to second-guess every installation.

By cks at 2010-05-19 01:46:44:

@Jeff: the short answer is that .rpmnew files don't work in practice, only in theory. This especially applies when you are administering a lot of machines and applying package updates immediately after installing the base OS.

(In fact I believe that an increasing number of RPM-based distributions now let you apply pending updates in the installer itself, which means in the installer GUI, which means that you probably won't even see notices about .rpmnew files. (Note that I think that applying updates at install time is a good feature that everyone should support.))

Written on 18 May 2010.


Last modified: Tue May 18 01:10:25 2010