'Conditional restart' in init.d scripts can be dangerous

November 28, 2009

Yesterday, the lighttpd instance that I run on my workstation was effectively down for about twelve hours; although the daemon was running, it was using the wrong configuration file and so it wasn't really serving anything. This happened because I installed a lighttpd package update, and as part of its post-update actions the package ran '/etc/init.d/lighttpd condrestart'.

In theory, a conditional restart in an init.d script restarts the daemon only if that init script started it in the first place. This is subtly different from 'if the daemon is running', which is what many init.d scripts actually implement, and what happened to me illustrates why the difference matters. I don't start lighttpd with /etc/init.d/lighttpd; I start it with a different init.d script that points it at my local configuration file. So when the normal init.d script 'restarted' lighttpd, the new copy ran with the system configuration file and thus didn't do much.
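To make the distinction concrete, here is a minimal sketch of the stricter check, written in Python for brevity rather than as real init.d shell code. It assumes the init script records the PID it started in a pidfile of its own; the pidfile path and function names are hypothetical, and this is not how Fedora's init functions are actually written.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def started_by_this_script(pidfile: str) -> bool:
    """Strict test: our own pidfile exists and names a live process.

    A looser test ("is any lighttpd running at all?") would also match
    a copy that someone started another way, which is the failure mode.
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return False
    return pid_is_alive(pid)
```

Even this is not airtight, since a stale pidfile can name a PID that has been reused by some unrelated process, which is part of why the problem is hard to get completely right.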

I can't blame lighttpd and its init script for this problem; the script is relying on standard functions provided by the Fedora init.d environment. And I can't really blame Fedora's init.d environment, because the problem is subtle and hard to get completely right (and I've seen the same problem on other Linuxes). But regardless of where any fault is or isn't, the underlying issue is that 'condrestart' and similar features are dangerously fragile.

The only way to fix this and make conditional restart reliable is to make the daemons restart themselves; on some signal, any running copy of the daemon arranges to re-exec itself with the appropriate command line arguments, environment, and so on. Then the init.d condrestart action simply sends this signal to all copies of the daemon that are currently running and lets them sort it all out.

(As a bonus you will have arranged to fix any copies of the daemon that are running, regardless of how they got started, which is probably what you really want to do.)

If you do not do this, please create an officially supported and documented way of changing all of the command line parameters that your init.d script uses to start the daemon, or at a minimum of changing the configuration file.

(Note that this being official is important, because that means that I can count on it not breaking over updates.)
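One common shape for such a mechanism is a small file of NAME=value overrides that the init script reads before it builds the daemon's command line. The sketch below shows the idea in Python; real init scripts are shell and would simply source the file. The variable name CONF_PATH is hypothetical, and the default paths are assumptions about a typical lighttpd install.

```python
import shlex


def read_overrides(text):
    """Parse simple NAME=value lines, honouring shell-style quotes and # comments."""
    overrides = {}
    for line in text.splitlines():
        for token in shlex.split(line, comments=True):
            name, sep, value = token.partition("=")
            if sep and name.isidentifier():
                overrides[name] = value
    return overrides


def daemon_command(overrides, default_conf="/etc/lighttpd/lighttpd.conf"):
    """Build the start command, letting the override file choose the config file."""
    conf = overrides.get("CONF_PATH", default_conf)  # CONF_PATH: hypothetical name
    return ["/usr/sbin/lighttpd", "-f", conf]
```

The point of documenting the variable names is that they then become an interface that package updates are expected not to break.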

Comments on this page:

By Dan.Astoorian at 2009-11-28 12:44:22:

I disagree with your preferred solution of making the services know how to restart themselves upon receipt of a signal.

For one thing, daemons which expect to be launched by root but drop their privileges after starting may not easily be able to restart themselves via re-exec. At best, they may need specialized code to pass resources to themselves across the re-exec (e.g., "file descriptors 12 and 13 are your log files, which you don't have sufficient privileges to re-open; file descriptors 17 and 18 are the privileged TCP and UDP ports you're listening on"); at worst, it may not even be possible to do the re-exec (e.g., because the daemon has chroot()'d to some place that doesn't include the executable or its libraries, or because the patched version of the daemon or its modified configuration requires new resources which the old one didn't). In any case, that's way too much work, with far too much risk of getting something wrong, to solve this problem.

Furthermore, I think it would be asking for trouble to have "/etc/init.d/whatever condrestart" have a different effect on the restarted service than "/etc/init.d/whatever restart" or "/etc/init.d/whatever stop; /etc/init.d/whatever start" would have had.

Most Red Hat Enterprise Linux init scripts I've looked at (which, I presume, derive from Fedora's) can be customized to some degree by modifying a file in /etc/sysconfig/ which gets sourced by the init script; I've always found this to be the most fruitful approach to making changes that will survive patches. If I were you I'd probably file a bug against lighttpd requesting that the init script be changed to use the same mechanism to set command line options.


From at 2009-11-29 00:37:39:

As a corollary, in Debian, these parameters which affect the behavior of the init scripts are typically placed into the directory /etc/default.


From at 2009-11-29 09:51:10:

If I have a service that I'm starting by hand (or by a non-init.d script), I always run 'chkconfig <service> off' to ensure that it won't be started by overzealous automatic things, such as updates.

Matt Simmons

By cks at 2009-11-30 11:21:37:

Something I forgot to mention: in at least Fedora, condrestart pays no attention to whether or not the service is chkconfig'd on or off; it runs and 'works' no matter what.

It turns out that the Fedora lighttpd init.d script does have an undocumented way to use an /etc/default file to change the configuration file name. The problem is that 'undocumented' bit; since it is undocumented and it fiddles with internal things in the script, I would rather make a copy of the init.d script and modify it than rely on the undocumented customization continuing to work.

I think that a daemon being able to restart itself is clearly the most desirable case, and thus you should design for it if possible. If that isn't possible, then you can and should write an intelligent init.d script that only restarts your own instance of the daemon.
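As a sketch of what 'only restarts your own instance' could mean in practice, an init script can compare a running process's command line against the one it would have used itself. This Python version assumes a Linux-style /proc/<pid>/cmdline (arguments separated by NUL bytes); the function names are hypothetical.

```python
def parse_cmdline(raw: bytes):
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    if not raw:
        return []
    return [arg.decode(errors="replace") for arg in raw.rstrip(b"\0").split(b"\0")]


def read_cmdline(pid: int) -> bytes:
    """Read the raw command line of a running process (Linux only)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return f.read()


def is_our_instance(raw: bytes, expected_argv) -> bool:
    """True only if the process was started with exactly our arguments."""
    return parse_cmdline(raw) == list(expected_argv)
```

An exact match is deliberately strict: a copy started with a different configuration file is left alone. It can also be too strict, for instance if the daemon was started through a relative path or rewrites its own argv.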

