Wandering Thoughts archives

2012-12-05

One way to break down how people use virtualization

Virtualization is such a general technology that even when people are using it for whole machine virtualization there are any number of different ways that they can use it. Today I want to present a number of usage dimensions for this (when I started I thought it was going to be a simple four-quadrant breakdown, but the more I thought the more dimensions showed up).

So, here are some axes that you can chart virtualization usage along:

  • when you power up VMs, do you leave them running for a long time or only run them for relatively short periods and then shut them down again?

  • are running VMs totally headless, in need of only basic text-mode interactions with the console, or do you need full console graphics support (perhaps even with 3d acceleration)?

    (The former is typical of long-running servers, the latter is typical of virtualized desktops.)

  • regardless of how long they run when powered up, are your VM images long-lived or do you use them and then throw them away?

    (Long-running VMs are necessarily long-lived, but short-running ones can still be long-lived too.)

  • is the basic setup for a new VM more or less constant (perhaps to the extreme of always starting with a cloned image) or highly variable? (Clearly there's a continuum.)

  • is VM management going to be automated or will it be done by hand at small scale?

  • how much does interacting with a VM need to be like interacting with real hardware?

(Some of these axes will be more or less forced if you operate at large scale with a lot of VMs.)

One of the reasons that thinking about this can be important is that different virtualization systems have different strengths and weaknesses in these areas. Understanding your own usage means understanding what's important to you and thus what virtualization systems are a good fit for you and what ones are a terrible fit.

(For one extreme hypothetical example, if your goal is small scale virtualized desktops it would be a terrible idea to pick a package that's aimed at headless ephemeral servers, full of support for things like fast cloning and rollback, command line management, mass starts and shutdowns, and with console access as an afterthought.)
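As a rough illustration, the axes above can be written down as a small record so that your own usage and a package's focus can be compared axis by axis. This is only a sketch: the class name, the field names and the `mismatches` helper are made up for this example, and reducing each axis to a yes/no answer is a simplification (several of them are really continuums). The two example profiles are guesses at how the hypothetical desktop case and the hypothetical headless ephemeral server package might score, not descriptions of any real product.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VMUsageProfile:
    """One yes/no answer per usage axis (a simplification of the real continuums)."""
    long_running: bool               # left running for a long time vs short sessions
    needs_graphical_console: bool    # full console graphics vs headless or text-mode
    long_lived_images: bool          # images kept around vs used and thrown away
    constant_setup: bool             # new VMs start from a more or less constant setup
    automated_management: bool       # automated vs done by hand at small scale
    hardware_like_interaction: bool  # interaction must feel like real hardware

    def __post_init__(self):
        # From the list above: long-running VMs are necessarily long-lived.
        if self.long_running and not self.long_lived_images:
            raise ValueError("a long-running VM implies a long-lived image")


def mismatches(usage, package_focus):
    """Return the names of the axes where the two profiles disagree."""
    return [f.name for f in fields(usage)
            if getattr(usage, f.name) != getattr(package_focus, f.name)]


# Assumed scores for the hypothetical example: small scale virtualized desktops
# versus a package aimed at headless ephemeral servers.
desktops = VMUsageProfile(
    long_running=False, needs_graphical_console=True, long_lived_images=True,
    constant_setup=False, automated_management=False,
    hardware_like_interaction=True)
ephemeral_servers = VMUsageProfile(
    long_running=False, needs_graphical_console=False, long_lived_images=False,
    constant_setup=True, automated_management=True,
    hardware_like_interaction=False)

print(mismatches(desktops, ephemeral_servers))
```

The more axes that come back as mismatches, the worse the fit is likely to be. In practice you would also want to weight the axes by how much each one matters to you.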

VirtualizationUsageTypes written at 23:18:29

2012-12-04

You can't assume that your performance problems will be obvious

One of the things underlying our failure to check that our harmless change actually was harmless is that we made an assumption: we assumed that if we had a performance problem it would be obvious, and therefore that the lack of any obvious performance problem meant that everything was fine. In fact we assumed this all through our disk performance issues. For example, because only the mail spool had any particularly obvious performance issues we assumed that only the mail spool had issues (and that we understood them).

What this drives home is something that I've kind of seen before:

Not all performance problems are obvious.

The easy performance problems are the obvious ones, where something simply explodes under load or at least scrapes along being clearly unacceptably slow. But those aren't the only problems you can have. You can also have performance problems that degrade your system in a subtle way, where nothing is obviously broken and everything just kind of goes slowly. If you don't look closely, and especially if you have a complex environment, you may simply say 'well, this is pretty much the best performance we can expect'. This is exactly what happened with us. We assumed that the performance we were seeing was what we could reasonably expect and that any slow increase in problems was simply because of growing usage and activity. In the end we only found our performance problem because we were smacked in the nose and actively went looking.

(We were lucky in that when we started looking in detail we could see that things were clearly broken at a low level.)

On the one hand it's important to recognize that the absence of evidence of performance problems is not very strong evidence for the absence of performance problems. On the other hand we can't be constantly suspicious of our systems because most of the time we aren't going to find anything if we go looking (and pretty soon we'll stop wasting our time by doing so). This is another case where metrics come in; metrics are constantly suspicious on our behalf.

One important corollary of this is that performance can degrade quietly. If performance problems are not obvious, you can go from good performance to performance problems without any obvious sign; this is especially so if the performance problems developed gradually. Once again this is where metrics come in, not just because they're suspicious on your behalf but because they keep history (if configured correctly).

(This applies to programming as much as to system administration. In fact programs are famous for having their performance, memory usage, and so on suffer a death of a thousand tiny little cuts, each change insignificant by itself but the cumulative total resulting in disaster.)
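To make the point about history concrete, here is a minimal sketch of how kept metrics can expose a slowdown that never looks alarming from one day to the next. It assumes you have one latency sample per day, oldest first. The function name, the window sizes and the 1.5x threshold are arbitrary choices for this example, not anything a real metrics system prescribes.

```python
from statistics import median


def gradual_slowdown(history, baseline_n=7, recent_n=7, threshold=1.5):
    """Compare the median of the newest samples with the median of the oldest.

    history is a list of latency samples, oldest first.
    Returns recent/baseline if that ratio is at or above threshold, else None.
    """
    if len(history) < baseline_n + recent_n:
        return None  # not enough history to say anything
    baseline = median(history[:baseline_n])
    recent = median(history[-recent_n:])
    if baseline <= 0:
        return None  # cannot form a meaningful ratio
    ratio = recent / baseline
    return ratio if ratio >= threshold else None


# Made-up data: latency creeps up by 2% a day for 60 days. No single day
# looks like a problem, but the start and the end of the history differ a lot.
creeping = [10.0 * 1.02 ** day for day in range(60)]
flat = [10.0] * 60

print(gradual_slowdown(creeping))  # a ratio well above the threshold
print(gradual_slowdown(flat))      # None
```

Without the old samples there is nothing to compare against, which is why the history matters as much as the current numbers.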

NonObviousPerformanceIssues written at 01:01:57

2012-12-03

A reason for detailed commit messages: as a guard against errors

There are lots of common reasons to write good descriptive commit messages in your version control system. But I've recently come to realize a somewhat unusual reason why they matter: writing good commit messages makes it much more likely that you'll be able to recognize and recover from inadvertent modifications. In theory inadvertent modifications should never make it into your commits. In practice this is always a risk and it happens periodically. Now suppose that you are coming back to the modification after the fact and you've tracked an oddity or a potential problem to part of a commit. How do you know whether this is a deliberate, intended modification or an inadvertent one that snuck into the commit by accident?

(You can't simply revert the odd modification because you don't know if it's actually part of solving a real problem. If it was and you revert it you've just reintroduced a problem, possibly an obscure one.)

The more generic the commit message is, the fewer real clues it gives you about this (note that things like bug numbers or 'fixed problem <X>' are mostly generic here). With a generic commit message you basically have to reverse engineer the whole change (including the specific modification you've spotted) to see if it's a plausible solution to whatever problem it's theoretically supposed to be solving, and even if you do, you may be left with uncertainty. But if the commit message is specific about what the commit includes (and why), you have a much better chance of knowing that the modification you're looking at is inadvertent: that it has nothing to do with what's supposed to be in the commit (and thus the fix or change the commit is making) and snuck in by accident somehow. Conversely, if the modification is mentioned in the commit message you know that it was definitely intentional (although it may still be mistaken).

(Much the same logic applies if the whole change is not a good one, but I'm assuming that you don't have those because people don't commit stuff that is significantly bad.)
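The 'is this modification mentioned in the commit message?' question can be crudely approximated in code. The sketch below is only a heuristic under stated assumptions: the function name, the example message and the file names are all hypothetical, and matching on path names says nothing about whether the message explains why a file changed. A path that comes back is merely a candidate for an inadvertent modification, not proof of one.

```python
import os


def unmentioned_paths(message, changed_paths):
    """Return the changed paths that the commit message never names.

    A path counts as mentioned if its full path or its base name appears
    anywhere in the message (a crude, case-sensitive substring check).
    """
    return [p for p in changed_paths
            if p not in message and os.path.basename(p) not in message]


# Hypothetical commit: the message is specific about one file, but a second
# file was changed in the same commit without being mentioned.
message = ("Make scripts/rebuild-aliases setgid mail so that the mail admins "
           "can rebuild the alias database without needing root.")
changed = ["scripts/rebuild-aliases", "ntp.conf"]

print(unmentioned_paths(message, changed))
```

This only helps when commit messages are specific in the first place. With a generic message every changed path comes back and you learn nothing.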

This is probably less of an issue for code and thus programmers than for the sort of configuration and control files that sysadmins tend to deal with; code is in a way often more understandable on its own than a configuration file change is. Especially opaque are what I'll call 'factual' changes, which have no clues about what they're trying to achieve; the classic factual change is a permissions change, where you can easily see what it was but by itself you have no idea of what it was trying to achieve. That these factual changes are so obvious creates a temptation to write very terse commit messages, because in many cases the commit message will just be repeating the change itself. I've come to think that these are among the most important commit messages precisely because they are so disconnected from any higher goal. You can't work out from the change itself whether it's a good or a bad change, so you desperately need to know intentions and context.

(This can't be a new observation and thus I'm sure I'm very late to the party on this one. I've just been thinking about this recently as I made various commit messages at work, sometimes with the feeling that I was being overly detailed in my descriptions (a feeling that I now think is wrong).)

GoodCommitMessagesVsErrors written at 02:46:41

