How old our servers are (as of 2016)

March 23, 2016

I was recently asked how old our servers are. This is a good question because, as noted, universities are cheap; as a result we probably run servers much longer than many people do. In fact we can basically wind up running servers into the ground, and to some extent we're doing that right now.

The first necessary disclaimer is that my group here only handles general departmental infrastructure on the non-undergraduate side of things (this split has its roots way back in history and is another story entirely). Also, while we try to have some reasonably powerful machines for people who need compute servers, we don't have the budget to have modern, big, up to date ones. Most professors who need heavy duty computation for their research programs buy machines themselves out of grant funding, and my understanding is that they turn over such compute servers much faster than we do (sometimes they then donate the old servers to us).

Over my time here, we've effectively had three main generations of general servers. When I started in 2006, we were in the middle of a Dell PowerEdge era; we had primarily 750s, 1650s, and 2950s. Most of these are now two generations out of production, but we still have some in-production 2950s, as well as some 1950s that were passed on to us later.

(It's worth mentioning that basically all of these Dells have suffered from capacitor plague on their addon disk controller boards. They've survived in production here only because one of my co-workers is the kind of person who can and will unsolder bad capacitors to replace them with good ones.)

Starting in 2007 and running over the next couple of years we switched to SunFire X2100s and X2200s as our servers of choice, continuing more or less through when Oracle discontinued them. A number of these machines are still in production and not really planned for replacement soon (and we have a few totally unused ones for reasons beyond the scope of this entry). Since they're all at least five or so years old we kind of would like to turn them over to new hardware, but we don't feel any big rush right now.

(Many of these remaining SunFire based servers will probably get rolled over to new hardware once Ubuntu 16.04 comes out and we start rebuilding our remaining 12.04 based machines on 16.04. Generally OS upgrades are where we change hardware, since we're (re)building the machine anyways, and we like to not deploy a newly rebuilt server on ancient hardware that has already been running for N years.)

Our most recent server generation is Dell R210 IIs and R310 IIs (which, yes, are now sufficiently outdated that they're no longer for sale). This is what we consider our current server hardware and what we're using when we upgrade existing servers or deploy new ones. We still have a reasonable supply of unused and spare ones for new deployments, so we're not looking to figure out a fourth server generation yet; however, we'll probably need to in no more than a couple of years.

(We buy servers in batches when we have the money, instead of buying them one or two at a time on an 'as needed' basis.)

In terms of general lifetime, I'd say that we expect to get at least five years out of our servers and often we wind up getting more, sometimes substantially more (some of our 2950s have probably been in production for ten years). Server hardware seems to have been pretty reliable for us so this stuff mostly keeps running and running, and our CPU needs are usually relatively modest so old servers aren't a bottleneck there. Unlike with our disks, our old servers have not been suffering from increasing mortality over time; we just don't feel really happy running production servers on six or eight or ten year old hardware when we have a choice.

(Where the old servers actually tend to suffer is their RAM capacity; how much RAM we want to put in servers has been rising steadily.)

Some of our servers do need computing power, such as our departmental compute servers. There, the old servers look increasingly embarrassing, but there isn't much we can do about it until (and unless) we get the budget to buy relatively expensive new ones. In general, keeping up with compute servers is always going to be expensive, because the state of the art in CPUs, memory, and now GPUs keeps turning over relatively rapidly. In the meantime, our view is that we're at least providing some general compute resources to our professors, graduate students, and researchers; it's probably better than nothing, and people do use the servers.

PS: Our first generation fileservers and their backends were SunFires (in fact they kicked off our SunFire server era), but the current ones are a completely different set of hardware. I like the hardware they use a lot, but it's currently too expensive to become our generic commodity 1U server model.

Sidebar: What we do with our old servers

The short answer right now is 'put them on the shelf'. Some of them get used for scratch machines, in times when we just want to build a test server. I'm not sure what we'll do with them in the long run, especially the SunFire machines (which have nice ILOMs that I'm going to miss). We recently passed a number of our out of service SunFires on to the department's undergraduate computing people, who needed some more servers.

Written on 23 March 2016.
