SSDs and understanding your bottlenecks

February 29, 2012

In a comment on my entry on five years of PC changes, it was suggested that I should have used a SSD for the system disk. I kind of addressed this in the original entry on my new machine's hardware specifications, but I want to use this to talk about understanding where your bottlenecks are (and aren't).

The simple way of talking about the benefits of SSDs is to say that they accelerate both reads and writes, especially synchronous writes and random IO in general (because SSDs have no seek delays). But phrasing it this way is misleading. What SSDs actually accelerate is real disk IO, which is not the same as either OS-level reads and writes or what you think might produce disk IO. This is fundamentally because every modern system and environment tries to keep as much in memory as possible, because everyone is very aware that disks are really, really slow.

(Even SSDs are slow when compared to RAM.)

Thus when you propose accelerating any disk with a SSD, there are two questions to ask: how much do you use the disk in general and how much actual disk IO is happening. There's also a meta-question, which is how much of this IO is actually causing visible delays; it's quite possible for slow IO to effectively be happening in the background, mostly invisible to you.

Although I haven't measured this, my belief is that system disks on Unix machines are in many ways a worst case for SSDs. I tend to think that my desktop environment is relatively typical: I normally use only a few programs, many of them are started once and then stay running, and I often already have an already executing instance of many of the programs I re-run (for example, xterms and shells; a Unix system is basically guaranteed to always have several instances of /bin/sh already running). All of these act to limit the amount of OS-level reading and writing being done and increase the effectiveness of OS memory reuse and caching. Even on a server, this pattern is likely to be typical; you need an unusual environment to be really using lots of programs from /usr/bin and libraries from /usr/lib and so on, and doing so more or less at random.

(Also, note that much system disk IO is likely to be sequential IO instead of random IO. Loading programs and reading data files is mostly sequential, for example.)

Given this usage pattern, the operating system almost certainly doesn't need to cache all that much in order to reduce IO on the system disk to almost nothing. How likely it is to be able to do that depends on how much memory the system has and what you're doing with it. Here the details of my hardware matter, specifically that I have 16 GB of RAM and don't run all that much that uses it. Ergo, it is all but certain that my OS will be able to keep enough in memory to reduce system disk IO to almost nothing in normal use. If the system disk is barely being used most of the time, making it an SSD isn't going to do very much most of the time; there just isn't anything there for the SSD to accelerate.

Now, here's something important: saying that an SSD wouldn't make a different most of the time isn't the same thing as saying an SSD would never make a difference. Clearly an SSD would make a difference some of the time, because my system does sometimes do IO to the system disk. Sometimes it does a fair bit of IO, for example when the system boots and I start up my desktop environment. If you gave me an SSD for free, or if 250 GB SSDs were down in the same $50 price range that basic SATA disks currently are, I would use them. But they aren't, not right now, and so my view is that SSDs for system disks are not currently worth it in at least my environment.

(I also feel that they're probably not that compelling for server system disks for the same reasons, assuming that your server disks are not doing things like hosting SQL database storage. They are potentially attractive for being smaller, more mechanically reliable, and drawing less power. I feel that they'll get really popular when small ones reach a dollar per GB, so a 60 GB SSD costs around $60; 60 GB is generally plenty for a server system disk and that price is down around the 'basic SATA drive' level. It's possible that my attitudes on pricing are strongly influenced by the fact that as a university, we mostly don't have any money.)

Note that user data is another thing entirely. In most environments it's going to see the lion's share of disk IO, both reads and writes, much more of it will be random IO than the system disk sees, and a lot of it will be things that people are actually waiting for.

PS: it's possible that the inevitable future day when I switch to SSDs for my system disk(s) will cause me to eat these words. I'm not convinced it's likely, though.

Sidebar: mirroring and SSDs

Some people will say that it's no problem using a single SSD for your system disk because it's only your system disk and SSDs are much more reliable than HDs (after all, SSDs do not have the mechanical failure issues of HDs). I disagree with them. I do not like system downtime, I have (re)installed systems more than enough times already, and I count on my workstations actually working so that I can get real work done.

(If you gave me an SSD for free I would probably use it as an unmirrored but automatically backed up system disk, paired with a HD that I could immediately boot from if the SSD died. But if I'm paying for it, I want my mirrors. And certainly I want them on servers, especially important servers.)

Written on 29 February 2012.
« The two sorts of display resolution improvements
A trick for dealing with irregular multi-word lines in shell scripts »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Wed Feb 29 22:50:31 2012
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.