Load is a whole system phenomenon

September 19, 2013

Here's something obvious: load and its companion overload are created by everything that's going on on your system at once. Oh, sure, some subset of the activity can be saturating a particular resource, but in general (and without quota-based things like Linux's cgroups) it is the sum of all activity (or all relevant activity) that matters.

So far this probably all sounds very obvious, and it is. But there's a big corollary: if you want to limit load, you must take a global perspective on activity. If you have ten things that each could create load, you can't limit overall system load just by limiting those ten things individually and in isolation from each other. A 'reasonable load' for one thing by itself is not necessarily reasonable when all ten are loaded at once. If you have no dynamic global system the best you can do is to assign static quotas such that each thing gets a limit of (say) 1/10th of the machine and can't use more even when the system is otherwise idle.
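To put rough numbers on this, here is a small sketch of the arithmetic. The capacity and per-app limits are made-up figures for illustration, not measurements of any real machine, and the function names are hypothetical.

```python
def worst_case_total(per_app_limits):
    """Concurrent work if every app hits its own limit at the same time."""
    return sum(per_app_limits)


def static_quota(machine_capacity, n_apps):
    """Largest equal per-app limit whose sum can never exceed the machine."""
    return machine_capacity // n_apps


# Assumed numbers: a machine that copes with 40 concurrent requests, and
# ten apps that each get a limit of 20, which looks reasonable in isolation.
MACHINE_CAPACITY = 40
limits = [20] * 10

total = worst_case_total(limits)
safe = static_quota(MACHINE_CAPACITY, len(limits))
print(total, safe)  # prints "200 4"
```

With these assumed figures the individually 'reasonable' limits add up to five times what the machine can handle, while the safe static quota is only 4 per app, even when the other nine apps are idle.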

Now this comes with an exception: if all activity funnels through one central point at some point in processing, you can (sometimes) put load limits on that single point and be done. That's because the single point implicitly has a global view of the load; it 'knows' what the global total load is because it sees all traffic.

All of this sounds hopelessly abstract, so let's talk web servers and web applications. Suppose you have a web server serving ten web apps, each of which is handled by its own separate daemon. You want your machine to not explode under load, no matter what load it is. Can you get this by just putting individual limits on each web app (eg 'only so much concurrency at once')? My answer is 'not unless you're going to use low limits', at least if demand for the apps is unpredictable. To do this properly you need some central point to apply a whole system view and whole system limits. One such spot might be the front-end web server; another might be a daemon that handles or at least monitors all web apps at once.
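As a rough illustration of what such a central point could look like, here is a minimal sketch of a single shared limiter that every app's requests pass through. The `GlobalLimiter` class, the `handle` function and the '503 busy' response are all hypothetical names invented for this example; a real front end would do this in its own configuration or code.

```python
import threading


class GlobalLimiter:
    """One shared admission point for all apps (illustrative sketch only)."""

    def __init__(self, max_in_flight):
        # A single pool of slots for the whole machine, not one pool per app.
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_admit(self):
        # Non-blocking: refuse the request instead of queueing it when full.
        return self._slots.acquire(blocking=False)

    def release(self):
        self._slots.release()


def handle(limiter, app_handler, request):
    """Run app_handler only if the machine-wide limit has a free slot."""
    if not limiter.try_admit():
        return "503 busy"
    try:
        return app_handler(request)
    finally:
        limiter.release()
```

Because every app shares the same pool of slots, a single busy app can use the whole machine when the others are idle, but the total across all ten apps can never go over the one limit.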

In short, now you know why I feel that separate standalone daemons are the wrong approach for scalable app deployment. Separate daemons mean separate limits, and you can't configure those sensibly without risking blowing up your machine under load. The more apps you have the worse this gets (because each app's 'safe' share of the machine gets smaller).

Comments on this page:

By Perry Lorier at 2013-09-19 05:21:10:

Unless, as you point out, you group them together and rate limit them as a whole using cgroups. That's what it's there for :) You can have one part of the system saturated, and in another cgroup be fine if you set things up properly ;)


Last modified: Thu Sep 19 00:40:04 2013