There's a relationship between server utilization and server lifetime

March 25, 2016

In yesterday's entry on how old our servers are, I said that one reason we can keep servers for so long is that our CPU needs are modest, so old servers are perfectly fine. There is another way to put this, namely that almost all of our servers have low CPU utilization. It is this low CPU utilization that makes it practical and even sensible to keep using old, slow servers. Similarly, almost all of our servers have low RAM utilization, low disk IO utilization, and so on.

This leads me to the undoubtedly unoriginal observation that there's an obvious relationship between capacity utilization and how much pressure there is to upgrade your servers. If your servers are at low utilization in all of their dimensions (CPU, RAM, IO bandwidth, network bandwidth, etc), any old machine will do and there is little need to upgrade servers. But the more your servers are approaching their capacity limits, the more potential benefit there is of upgrading to new servers that will let you do more with them (especially if you have other constraints that make it hard to just add more servers, like power or space limits or a need for more capacity in a single machine).
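(As a concrete, hypothetical illustration of what 'utilization in all of their dimensions' might look like in practice, here is a minimal sketch that takes a one-shot snapshot of those dimensions on a Linux box. It assumes the third-party psutil module and is purely an illustration of the idea, not anything we actually run.)

    import psutil

    def utilization_snapshot(interval=5):
        # CPU usage averaged over the sampling interval, as a percentage.
        cpu = psutil.cpu_percent(interval=interval)
        # Fraction of RAM currently in use, as a percentage.
        mem = psutil.virtual_memory().percent
        # Cumulative disk and network byte counters; sampling these twice
        # and diffing would give actual bandwidth, omitted here for brevity.
        disk = psutil.disk_io_counters()
        net = psutil.net_io_counters()
        return {
            "cpu_percent": cpu,
            "mem_percent": mem,
            "disk_read_bytes": disk.read_bytes,
            "disk_write_bytes": disk.write_bytes,
            "net_bytes_sent": net.bytes_sent,
            "net_bytes_recv": net.bytes_recv,
        }

    if __name__ == "__main__":
        print(utilization_snapshot())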

It follows that anything that increases server utilization can drive upgrades. For example, containers and other forms of virtualization can take N separate low-utilization services, consolidate them all onto the same physical server, and wind up collectively utilizing much more of the hardware.
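(A toy back-of-the-envelope version of that consolidation arithmetic, with entirely made-up numbers: ten services that each tick along at 5% CPU and 2 GB of RAM add up to roughly half the CPU and 20 GB of RAM of a single host.)

    # Hypothetical numbers, purely for illustration.
    services = [{"cpu_percent": 5, "ram_gb": 2}] * 10

    total_cpu = sum(s["cpu_percent"] for s in services)  # 50% of one host's CPU
    total_ram = sum(s["ram_gb"] for s in services)       # 20 GB of RAM
    print(total_cpu, total_ram)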

(And the inverse is true; if you eschew containers et al, as we do, you're probably going to have a lot of services with low machine utilizations that can thus live happily on old servers as long as the hardware stays reliable.)

In one way all of this seems obvious: of course you put demanding services on new hardware because it's the fastest and best you have. But I think there's a use for taking a system utilization perspective, not just a service performance one; certainly it's a perspective on effective server lifetimes that hadn't really occurred to me before now.

(Concretely, if I see a heavily used server with high overall utilization, it's probably a server with a comparatively short lifetime.)


Comments on this page:

When I was involved in hosting and system administration and had to deal with server lifetimes, there were a handful of servers that actually had to be upgraded because they were running out of capacity. These were shell servers, database servers and web servers. And then there was the rest of the supporting fleet for mail, DNS, logging and the lot, which had to be upgraded because the support contracts for the servers were getting too expensive.

How do the support contract pricing policies fit your budget? They must be a factor?

By cks at 2016-03-25 16:33:45:

Support contract durations aren't an issue for us because we don't have the money for them in the first place. We treat the hardware warranty mostly as an insurance against dead on arrival units and infant mortality.

(Occasionally we have the funding to get extended hardware support, but even then we'll keep running hardware once that runs out.)

The other consideration for upgrading low-utilization servers is power (and thus cooling) efficiency. I suspect your university is like my former employer and that your department doesn't have to worry about paying for electricity. In other environments, though, lowering the power bill might be worth a pre-mortality upgrade. This is even more true if rack space/weight is a consideration. I once ran a 4U monstrosity as a syslog server because it was free, but once I needed that extra space, it got shoved off to the VM farm.

From 88.192.212.75 at 2016-03-31 13:43:45:

Has your view on virtual servers changed? I have the impression from earlier posts that you don't use them much.

At work we currently administer over 350 Linux servers running mostly on VMware, using about 10 physical blade servers.

I love the convenience of not being tied to physical hardware and the cheapness of them. Since a basic 1 CPU, 2 GB RAM server is practically free, we can throw up a new one pretty much without a second thought.

By cks at 2016-03-31 14:36:52:

We're still non-virtual and non-containerized for our servers and services (which leads to low utilization of physical servers, which is one reason they last so long for us). I understand the appeal in the abstract, but in the concrete, real use of virtualization clearly has significant setup costs (cf). So far there isn't a good enough reason to make the big commitment and investment that virtualization would require. On top of that, a large fleet of inexpensive small things may well be the right decision for us due to how this interacts with funding issues.
