Limiting how much load Exim puts on your system

November 21, 2008

One thing you usually want from an MTA is some limit on how many things it will try to do at once. This is especially important if, like us, you allow users to run programs from their .forwards; every so often people have runaway programs, or just programs that sit there endlessly (trying to get a lockfile, for example).

Unfortunately, Exim only has limited support for load limiting. What you really want is to limit the number of simultaneous deliveries of particular sorts, so that you can have limits like 'only twenty pipes at once, and only four at once per user'. Exim can't do that directly; all you can do is try to limit the total number of simultaneous deliveries, and there is no direct limit on that either, so you have to construct one indirectly out of other settings.

Exim can start delivery processes either immediately during an SMTP conversation or later, during a queue run. Each queue run is handled by one process, and for local transports each delivery process does only one delivery at a time. So if all you are dealing with is queued mail, you can be doing at most queue_run_max local deliveries at once.

(We mostly care about local deliveries, because they are the ones that can explode the most and are the most likely to use up a lot of memory and CPU. For remote SMTP, each top-level Exim delivery process can do up to remote_max_parallel deliveries at once.)

Once you have at least smtp_accept_queue SMTP connections (more or less; concurrency issues can create a bit of slop), the new connections queue all of their messages and do not create more delivery processes. Before then, each SMTP connection can create at most smtp_accept_queue_per_connection delivery processes; after a connection has processed that many messages, it starts queuing them instead of immediately delivering them.
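As a rough illustration of the rule just described, here is a small Python model of whether a given message gets an immediate delivery process or is simply queued. This is only a sketch of the behaviour as this entry describes it, not Exim's actual code; the class and function names are made up for the example, and the exact boundary conditions ('at least' versus 'more than') should be checked against the Exim specification.

```python
from dataclasses import dataclass


@dataclass
class SmtpLimits:
    # Field names mirror the Exim options; the class itself is hypothetical.
    smtp_accept_queue: int
    smtp_accept_queue_per_connection: int


def delivers_immediately(limits, other_connections, message_number):
    """Model whether a message received over SMTP starts a delivery process.

    other_connections: SMTP connections already open when this one arrived.
    message_number: 1-based count of this message within its connection.
    """
    # Assumption: a connection that arrives when the daemon is already at
    # the smtp_accept_queue threshold queues everything it receives.
    if other_connections >= limits.smtp_accept_queue:
        return False
    # Otherwise only the first smtp_accept_queue_per_connection messages
    # get an immediate delivery process; later ones are queued.
    return message_number <= limits.smtp_accept_queue_per_connection


if __name__ == "__main__":
    limits = SmtpLimits(smtp_accept_queue=20, smtp_accept_queue_per_connection=10)
    print(delivers_immediately(limits, other_connections=5, message_number=3))
    print(delivers_immediately(limits, other_connections=20, message_number=1))
```

The values 20 and 10 are illustrative only, not recommendations.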

So the worst-case number of simultaneous local delivery processes is the number of queue runners you allow, plus the maximum number of non-queueing SMTP connections times the number of non-queued messages per connection. This is the worst case, but unfortunately reducing any of these settings to limit it will slow down ordinary processing in some situations, either by forcing things to be queued unnecessarily or by delaying how soon queued messages get processed.
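To make the arithmetic concrete, the worst case can be written out as a tiny Python calculation. This is a back-of-the-envelope sketch: the function name is invented for the example, the parameters are named after the Exim options they stand for, and the result is an upper bound that assumes every non-queueing SMTP connection has all of its allowed delivery processes running at the same moment.

```python
def worst_case_local_deliveries(queue_run_max,
                                smtp_accept_queue,
                                smtp_accept_queue_per_connection):
    """Upper bound on simultaneous local delivery processes.

    Queue runners each do one local delivery at a time, and each
    non-queueing SMTP connection can start up to
    smtp_accept_queue_per_connection delivery processes.
    """
    from_queue_runners = queue_run_max
    from_smtp = smtp_accept_queue * smtp_accept_queue_per_connection
    return from_queue_runners + from_smtp


if __name__ == "__main__":
    # Illustrative numbers only, not recommended settings.
    print(worst_case_local_deliveries(5, 20, 10))  # prints 205
```

Even fairly modest-looking settings multiply out to a couple of hundred possible processes, which is why the SMTP side dominates the worst case.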

(Exim really wants to do immediate deliveries from SMTP sessions in order to process email promptly. The problem with relying on queue runners for delivery is that each time a message is retried, it has to be re-routed from scratch. This means that even a modest number of messages to places with DNS problems will probably clog your queue up significantly.)

The other possible approach is to limit things based on the load average, through queue_only_load and deliver_queue_load_max. My concern with these is that the load average is a lagging indicator (it is a one-minute moving average, after all). Under a significant load burst you can get into trouble well before the load average rises high enough to trigger these limits.
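The lag can be estimated with a small simulation. The sketch below assumes the one-minute load average is an exponentially damped average sampled every five seconds, which is how Linux computes it; other systems may differ, and the function name and parameters are made up for this example. It steps the average forward after the number of runnable processes suddenly jumps and reports how long it takes before the load average exceeds a limit such as queue_only_load.

```python
import math


def seconds_until_limit(runnable, limit, start_load=0.0,
                        sample_interval=5.0, window=60.0):
    """Seconds until a damped load average exceeds `limit`.

    Assumes the load jumps to `runnable` at time zero and stays there.
    Returns None if the average can never exceed the limit.
    """
    if start_load > limit:
        return 0.0
    if runnable <= limit:
        # The average converges towards `runnable`, so it never gets past.
        return None
    decay = math.exp(-sample_interval / window)
    load = start_load
    elapsed = 0.0
    while load <= limit:
        # One sampling step of the exponentially damped average.
        load = load * decay + runnable * (1.0 - decay)
        elapsed += sample_interval
    return elapsed


if __name__ == "__main__":
    # Hypothetical burst: 40 runnable processes against a limit of 8.
    print(seconds_until_limit(runnable=40, limit=8))
```

Under this model, a jump from an idle system to 40 runnable processes takes about 15 seconds to push the load average past 8, and a smaller burst takes much longer. During that window, SMTP connections keep starting new delivery processes.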

Last modified: Fri Nov 21 23:41:33 2008