An illustrated example of how not to do package updates

May 18, 2010

I spent the greater part of today discovering how and why smartd was not monitoring our disks on several of our Red Hat Enterprise Linux 5 based iSCSI backends. The quick summary is that if your RHEL 5 machine was installed some time ago (or installed recently from an old installer disk) and upgraded since then, smartd may not be monitoring all of your disks; this is all but certain if you've added new disks since the upgrade.

What caused this is a badly considered package update. In the beginning of RHEL 5, Red Hat's version of the smartmontools RPM shipped with an /etc/smartd.conf and a system that defaulted to auto-generating the list of disks to monitor every time you started smartd. Later, Red Hat updated smartmontools and stopped doing this; instead the new version of /etc/smartd.conf told smartd to do the scan for disks itself. However, applying the update does not replace existing /etc/smartd.conf files, which are left holding a static list of disks to monitor that never gets updated. If your list of disks changes after the package update is applied, you lose.
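
As a rough way to tell which of the two situations a machine is in, here is a minimal Python sketch that classifies the text of a smartd.conf. It assumes the usual smartd.conf conventions ('#' starts a comment, a trailing backslash continues an entry, and the DEVICESCAN directive asks smartd to find disks itself). The function name and the sample file contents are made up for illustration, and the sketch treats DEVICESCAN as enabling scanning wherever it appears, which is a simplification.

```python
def parse_smartd_conf(text):
    """Return (uses_devicescan, devices) for the text of a smartd.conf.

    Hypothetical helper: 'devices' is the static list of device paths
    named in the file before any DEVICESCAN entry.
    """
    uses_devicescan = False
    devices = []
    pending = ""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments
        if line.endswith("\\"):               # entry continues on next line
            pending += line[:-1] + " "
            continue
        fields = (pending + line).split()
        pending = ""
        if not fields:
            continue
        if fields[0] == "DEVICESCAN":
            # Simplification: stop reading at the first DEVICESCAN entry.
            uses_devicescan = True
            break
        devices.append(fields[0])
    return uses_devicescan, devices


# Made-up examples of the two styles of configuration file.
old_style = "# generated list\n/dev/sda -H -m root\n/dev/sdb -H -m root\n"
new_style = "# let smartd find the disks\nDEVICESCAN -H -m root\n"

print(parse_smartd_conf(old_style))
print(parse_smartd_conf(new_style))
```

A file that comes back with no DEVICESCAN and a short device list is the one to worry about, since nothing will ever add to that list.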

We normally update our iSCSI backends immediately when they're installed, before we connect them to the enclosure with their data disks. This means that the last time /etc/smartd.conf was auto-generated, only the system disks were visible. The net result is that several backends were left silently monitoring only the system disks, which is what you could call not desirable.
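
To spot this sort of silent gap, you can compare the static list against the disks the kernel currently sees. The following is only a sketch: the function name is hypothetical, it assumes the disks of interest show up as sd* block devices, and it assumes a reasonably modern Python 3 rather than whatever a RHEL 5 machine ships with.

```python
import re


def unmonitored_disks(monitored, block_names):
    """Return the /dev/sd* disks that are present but not monitored.

    monitored:   device paths listed in smartd.conf, e.g. ["/dev/sda"]
    block_names: block device names, e.g. the entries of /sys/block
    """
    # Assumption: only whole SCSI-style disks (sda, sdb, ...) matter here.
    present = {"/dev/" + name for name in block_names
               if re.fullmatch(r"sd[a-z]+", name)}
    return sorted(present - set(monitored))


# Made-up example: two system disks monitored, two data disks added later.
print(unmonitored_disks(["/dev/sda", "/dev/sdb"],
                        ["sda", "sdb", "sdc", "sdd", "loop0", "sr0"]))
```

On a live Linux machine the second argument could come from os.listdir("/sys/block"); anything the function returns is a disk that a static smartd.conf is not watching.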

(The problem does not occur with recent RHEL 5 install images, which have the updated smartmontools RPM rolled into the base OS.)

To be blunt, this is a badly done package update, especially for a theoretically 'enterprise' operating system. You should never create a situation where a sysadmin installs a package without changing its configuration files, installs an upgrade, and the package stops working, partly because creating such situations is a great way to persuade sysadmins to never install package updates.

(What Red Hat should have done is keep the auto-generation system but change the default /etc/smartd.conf to not use it. Then everyone would have been happy; people with the old configuration would have had it keep updating their list of disks, and people with the new configuration would have their disks auto-detected by smartd itself.)



Last modified: Tue May 18 01:10:25 2010