Disk IO is what shatters the VM illusion for me right now

May 10, 2013

I use VMs on my office workstation as a far more convenient substitute for real hardware. In theory I could assemble a physical test machine or a group of them, hook them all up, install things on them, and so on; in practice I virtualize all of that. This means that what I want is the illusion of separate machines and for the most part that's what I get.

However, there's one area where the illusion breaks down and exposes that all of these machines are really just programs on my workstation, and that's disk IO. Because everything is on spinning rust right now (and worse, most of it is on a common set of spinning rust), disk IO in a VM has a clear and visible impact on me trying to do things on my workstation (and vice versa but I generally don't care as much about that). Unfortunately doing things like (re)installing operating systems and performing package updates do a lot of disk IO, often random disk IO.
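(As a rough illustration of what I mean by a clear and visible impact: the contention is easy to watch from the host side by sampling the shared disk's counters in /proc/diskstats, which is more or less what iostat -x does for you. What follows is only a sketch, assuming Linux, with 'sda' standing in for whatever disk the VM images actually live on.)

  #!/usr/bin/python
  # Sketch: sample /proc/diskstats once a second and report how busy one
  # shared disk is, so host and VM disk IO fighting each other shows up.
  # Assumes Linux; 'sda' is a hypothetical stand-in for the real disk.
  import time

  DEVICE = "sda"    # the shared spinning-rust disk (hypothetical name)
  SECTOR = 512      # /proc/diskstats counts in 512-byte sectors

  def snapshot(dev):
      with open("/proc/diskstats") as f:
          for line in f:
              fields = line.split()
              if fields[2] == dev:
                  # reads completed, sectors read, writes completed,
                  # sectors written, milliseconds spent doing IO
                  return (int(fields[3]), int(fields[5]), int(fields[7]),
                          int(fields[9]), int(fields[12]))
      raise ValueError("no such device: %s" % dev)

  prev = snapshot(DEVICE)
  while True:
      time.sleep(1)
      cur = snapshot(DEVICE)
      reads, rsect, writes, wsect, busy = [c - p for c, p in zip(cur, prev)]
      print("%s: %4d reads/s %8.1f KB/s read, %4d writes/s %8.1f KB/s "
            "written, ~%3d%% busy" %
            (DEVICE, reads, rsect * SECTOR / 1024.0,
             writes, wsect * SECTOR / 1024.0, min(busy / 10, 100)))
      prev = cur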

(In practice neither RAM nor CPU usage breaks the illusion, partly because I have a lot of both and the VMs don't claim all that much of either. It also helps that the RAM is essentially precommitted the moment I start a VM.)

The practical effect is that I generally have to restrict myself to one disk IO intensive thing at once, regardless of where it's happening. This is not exactly a fatal problem, but it is both irritating and a definite crack in the otherwise pretty good illusion that those VMs are separate machines.

(The illusion is reinforced because I don't interact with them through their nominal 'hardware' console; I do basically everything by ssh'ing in to them. This always seems a little bit Ouroboros-recursive, especially since they have an independent network presence.)


Comments on this page:

From 89.27.121.149 at 2013-05-10 03:59:34:

The most Ouroboros-y thing I have done with VMs was to connect the host OS to the Internet through a firewall running in a VM. The VM talked PPPoE through one of the host's network interfaces and provided NAT'd Internet access to the home network through another, plus to the host :)

Also, SSDs ease the VM disk IO pain substantially.

From 216.105.40.123 at 2013-05-10 18:14:18:

I'm here to second the comment on SSD. I have deployed VMs in production and often run into IO issues. SSD solved them.

From 71.80.128.33 at 2013-05-12 04:00:20:

Yes, if you're setting up virtual servers on your "workstation", you're going to run into performance problems. Disk may be FIRST, but it won't be the only one.

On 2U servers you can get 8x 15,000 RPM SAS drives in a RAID-10 array... or a very fast iSCSI SAN (whether dedicated hardware or OpenFiler/FreeNAS/OpenSolaris, whether high-end like NetApp/EMC or a $500 chassis like a PROMISE box with dirt-cheap consumer-level SATA drives)... or some SSDs. If you want to use server-local storage, reads will be faster, but writes will still be limited by network I/O once you replicate them across nodes (something like Google's Ganeti will work). If you want to use iSCSI, both reads and writes are limited by network I/O, but manageability is simpler and more flexible (just about any commercial offering like VMware or Red Hat Virtualization + Cluster Manager should work).

It's true that if you're going to max out your hardware much of the time... whether CPU, RAM, disk, network, interrupts, or something else... then virtualization is going to be significant unnecessary overhead rather than a money-saver. But if you're just occasionally running into I/O slowdowns, getting this stuff set up properly on servers will let that I/O be oversubscribed and aggregated across more servers, in the end giving you FASTER I/O for each virtual server for less money than if they remained on lower-end but isolated islands of hardware.

By cks at 2013-05-13 11:06:11:

This is not VMs used for production (I would never run production VMs from my workstation); this is VMs for testing. Since we have neither the money nor the interest to set up a production grade VM infrastructure, my practical choices for the testing I need to do are either to use virtualization on my desktop or to scavenge up spare physical hardware. In that situation I use virtualization on my desktop, flaws and all.

(Even with a production grade VM infrastructure I might still use the VM environment on my desktop, due to my virtualization priorities.)

SSDs would be nice. I'm sure they'd speed up my desktop a lot (for both virtualization and otherwise). But again, we have no money for providing sysadmin machines with SSDs.

(I wish we did, I'd certainly like to ditch the spinning rust.)
