2015-08-13
Enabling services on package updates is a terrible mistake
Let's start with my tweet:
The #CentOS 6 iptables package update unconditionally (re-)enables the service and thus turns on firewalls. BRB, setting things on fire now.
Really, it does. It's right there in the RPM postinstall script:
    ; rpm -q --scripts iptables
    [...]
    postinstall scriptlet (using /bin/sh):
    /sbin/ldconfig
    /sbin/chkconfig --add iptables
    [...]
Of course iptables by itself does nothing, or rather it just applies
whatever rules you already have, but your CentOS 6 machine almost
certainly has a restrictive set of rules in /etc/sysconfig/iptables
that were written there by system-config-firewall during the system
install. Turning on the iptables service will cause them to be
applied at the next reboot, and in our case this took out incoming
external email for more than twelve hours because of course those
rules blocked incoming connections to the necessary TCP ports.
(Yes, there were two problems there. We know.)
At one level this is a straightforward total failure of good packaging; a package update should never enable a currently disabled service. Automatically enabling a service on the initial package install may or may not be a good idea, but changing the system state on a mere package upgrade is clearly utterly wrong. A deliberately disabled service suddenly turning on is generally going to do something bad to the overall state of the system; the only question is just how bad. Iptables is well placed to make this really bad.
(In turn this means that there was a major process failure here. This issue is almost certainly present in the original RHEL 6 update that CentOS 6 built its package from, and Red Hat Enterprise Linux of all distributions should have better update validation than that. This should not have gotten through code review.)
At another level this is also a joint failure between RPM and
chkconfig, because both make it extra hard to do the right thing.
RPM has only a single 'postinstall' script which is run after both
installs and upgrades, which means that you have to remember to
have your shell code explicitly check for which case you're in (and
it's not at all easy to test the upgrade case). Chkconfig and in
general the whole Red Hat init.d symlink system don't draw any
distinction between 'what the package wants to do by default' and
'what the local sysadmin has specifically set up', which leaves
packages easily able to make mistakes that override sysadmin decisions
like this. Put the two together and you have an explosive mixture
where any failure can blow your foot off. This is not a resilient
system.
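As a minimal sketch of the check a postinstall script needs (this is not what the actual iptables package does): RPM passes each scriptlet a first argument that counts how many instances of the package will be installed once the operation finishes, so in a postinstall script it is 1 on an initial install and 2 or more on an upgrade. The function name post_scriptlet and the CHKCONFIG override are made up for illustration so the logic can be run by hand; in a real package the body would sit inline in the spec file's %post section.

```shell
#!/bin/sh
# Sketch of a %post scriptlet body, wrapped in a function so that it
# can be exercised outside of RPM. post_scriptlet and CHKCONFIG are
# illustrative names, not anything RPM or the iptables package defines.
CHKCONFIG="${CHKCONFIG:-/sbin/chkconfig}"

post_scriptlet() {
    # $1 is the instance count RPM hands to the scriptlet:
    # 1 on an initial install, 2 or more on an upgrade.
    if [ "$1" -eq 1 ]; then
        # Initial install: register the service with its defaults.
        "$CHKCONFIG" --add iptables
    fi
    # Upgrade: fall through and leave the sysadmin's setup alone.
}
```

Setting CHKCONFIG=echo and then running 'post_scriptlet 1' and 'post_scriptlet 2' is a cheap way to see both branches without touching the real init.d symlinks; only the first should print anything.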
(Systemd does much better than the init.d stuff here precisely because it has a clear distinction between these two things.)