Thinking about two different models of virtualization hosts

November 8, 2020

The conventional way to make a significant commitment to on-premise virtualization is to buy a few big host machines that will each run a lot of virtual machines (possibly with some form of shared storage and virtual machine motion from one server to another). In this model you get advantages of scale and you also benefit from the fluctuating usage of your virtual machines over time; probably not all of them are active at once, so you can to some degree over-subscribe your host.

It has recently struck me that there is another approach, where you have a significant number of (much) smaller host machines, each supporting only a few virtual machines (likely without shared storage). This approach has some drawbacks but also has a number of advantages, especially over not doing virtualization at all. The first advantage is that you can deploy hardware without deciding what it's going to be used for; it is just a generic VM host, and you'll set up actual VMs on it later. The second advantage is that it is much easier to deploy actual 'real' servers, since they're now virtual instead of physical; you get things like KVM over IP and remote power cycling for free. On top of this you may make better use of even moderately sized servers, since these days even basic 1U servers can easily be too big for many server jobs.

(You may also be able to make your hardware more uniform without wasting resources on servers that don't 'need' it. If you put 32 GB into every server, you can either run one service that needs 32 GB or several VMs that only need 4 GB or 8 GB. You're not stuck with wasting 32 GB on a server that only really needs 4 GB.)
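As a rough illustration of the arithmetic here, the following sketch counts how many equal-sized VMs fit on one uniform host. The function name and the 2 GB reserved for the host itself are assumptions made up for this example, not figures from any particular hypervisor.

```python
def vms_that_fit(host_ram_gb, vm_ram_gb, host_reserved_gb=2):
    """Count how many VMs of one size fit in a host's RAM.

    host_reserved_gb is a hypothetical allowance for the host OS and
    hypervisor; the real overhead depends on your setup.
    """
    usable = host_ram_gb - host_reserved_gb
    if usable < vm_ram_gb:
        return 0
    return usable // vm_ram_gb


if __name__ == "__main__":
    for vm_size in (4, 8):
        print(f"{vm_size} GB VMs on a 32 GB host: {vms_that_fit(32, vm_size)}")
```

Under these assumptions a 32 GB host holds seven 4 GB VMs or three 8 GB VMs, instead of one physical server that leaves most of its memory idle.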

Having only a few virtual machines on each host machine reduces the blast radius when a host machine fails or has to be taken down for some sort of maintenance (although a big host machine may be more resilient to failure). It also makes it easier to progressively upgrade host machines; you can buy a few new ones at a time, spreading out the costs. And you spend less money on spares as a ratio of your total spending; one or two spare or un-deployed machines cover your entire fleet.
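To make the blast radius and spares points concrete, here is a small sketch that compares two fleet shapes. All of the numbers (40 VMs, 20 or 4 VMs per host, one spare) are invented for illustration, and it assumes every host in a given fleet costs the same, so that the share of hosts that are spares stands in for the share of spending.

```python
import math


def fleet_summary(total_vms, vms_per_host, spare_hosts):
    """Summarize a fleet of identical hosts (hypothetical model).

    blast_radius: fraction of all VMs lost when one full host goes down.
    spare_ratio:  fraction of purchased hosts that are spares.
    """
    hosts = math.ceil(total_vms / vms_per_host)
    return {
        "hosts": hosts,
        "blast_radius": min(vms_per_host, total_vms) / total_vms,
        "spare_ratio": spare_hosts / (hosts + spare_hosts),
    }


if __name__ == "__main__":
    print("big hosts:  ", fleet_summary(40, 20, 1))
    print("small hosts:", fleet_summary(40, 4, 1))
```

With these made-up figures, losing one big host takes out half of the VMs and the single spare is a third of the hardware, while losing one small host takes out a tenth of the VMs and the single spare is one machine in eleven.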

(This also makes it easier to get into virtualization to start with, since you don't need to make a big commitment to expensive hardware that is only really useful for virtualization. You just buy more or less regular servers, although perhaps somewhat bigger than you'd otherwise have done.)

However, this alternate approach has a number of drawbacks. Obviously you have more machines and more hardware to manage, which is more work. You will likely spend more time planning out and managing what VM goes on what host, since there is less spare capacity on each machine and you may need to carefully limit the effects of any given host machine going down. Without troublesome or expensive shared storage, you can't rapidly move VMs from host to host (although you can probably migrate them slowly, by taking them down and copying data over).
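The placement planning mentioned above can be sketched as a simple bin-packing exercise. This is only a minimal first-fit-decreasing sketch that considers RAM alone; the VM names and sizes are hypothetical, and real planning would also weigh CPU, disk, and which VMs must not share a host.

```python
def place_vms(vms, host_count, host_ram_gb):
    """Assign VMs to hosts with first-fit decreasing, by RAM only.

    vms maps a VM name to its RAM in GB. Returns (placement, free),
    where placement[i] lists the VMs on host i and free[i] is the RAM
    left on host i. Raises ValueError if some VM does not fit anywhere.
    """
    free = [host_ram_gb] * host_count
    placement = [[] for _ in range(host_count)]
    # Place the largest VMs first; break ties by name for stable output.
    for name, ram in sorted(vms.items(), key=lambda kv: (-kv[1], kv[0])):
        for i in range(host_count):
            if free[i] >= ram:
                free[i] -= ram
                placement[i].append(name)
                break
        else:
            raise ValueError(f"no host has {ram} GB free for {name}")
    return placement, free


if __name__ == "__main__":
    vms = {"web": 8, "db": 16, "dns": 4, "mail": 8, "build": 16, "mon": 4}
    placement, free = place_vms(vms, host_count=2, host_ram_gb=32)
    for i, (names, left) in enumerate(zip(placement, free)):
        print(f"host {i}: {names} ({left} GB free)")
```

The small amount of RAM left over in a run like this is the point: with small hosts there is little slack, so absorbing the VMs of a failed host takes deliberate planning.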

In a way you're going to a lot of effort to get KVM over IP and remote management capabilities, and to buy somewhat fewer machines (or have less unused server capacity). The smaller the host machines and the fewer VMs you can put on them, the more pronounced this is.

(But at the same time I suspect that there is a sweet spot in the cost versus server capacity in CPU, RAM, and disk space that could be exploited here.)
