HD usage can be limited by things other than cost per TB

October 27, 2017

I was recently reading WDC: No SSD/HDD Crossover (via), which reports Western Digital data that says that HDs will continue to have a significant price per TB advantage over SSDs for at least the next decade (the quoted figure is a 10:1 advantage). I'm perfectly prepared to believe this (I have no idea myself), but at the same time I don't think it's necessarily very relevant. The simple way to put it is that a great deal of storage is not bulk storage.

If you're doing bulk storage, then certainly the cost per TB matters a lot and HDs will likely continue to have the advantage. But if you're not, there are a number of other concerns that will probably clip the wings of HDs long before any price crossover. Two classical concerns are maintaining enough IOPS per TB, and the time it takes to restore your data redundancy after a disk is lost (whether through a RAID resynchronization or some other mechanism).

Larger and larger HDs might come with an increase in IOPS per disk, but history is fairly strongly against that; genuine sustainable IOPS per disk has been basically flat for HDs for years. This means that as your HDs grow bigger, IOPS per TB drops; the same amount of IOPS per disk is spread among more TB per disk. If you feel you need reasonably responsive random IO, this can easily mean that your usable TB per disk is basically capped. This is the situation that we're in with our fileservers, where we deliberately used 2 TB HDs instead of something larger in order to maintain a certain level of IOPS per TB.

(This IOPS limit is different from a situation where HDs simply can't provide enough IOPS to meet your needs.)
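To make the IOPS per TB arithmetic concrete, here is a minimal sketch in Python. The figure of 150 IOPS per disk is purely an assumption for illustration (it is not a number from our fileservers), and the function names are made up for this example.

```python
# ASSUMPTION: a flat 150 sustainable random IOPS per HD, whatever its size.
# This number is illustrative only; measure your own drives.
ASSUMED_IOPS_PER_DISK = 150.0


def iops_per_tb(disk_tb, iops_per_disk=ASSUMED_IOPS_PER_DISK):
    """IOPS available per TB if the whole disk is filled with data."""
    return iops_per_disk / disk_tb


def max_usable_tb(target_iops_per_tb, iops_per_disk=ASSUMED_IOPS_PER_DISK):
    """Most data you can put on one disk while still meeting the target."""
    return iops_per_disk / target_iops_per_tb


# The same IOPS per disk gets spread across more and more TB.
for size_tb in (2, 4, 8, 12):
    print(f"{size_tb:>2} TB disk: {iops_per_tb(size_tb):6.1f} IOPS per TB")
```

Under these assumed numbers, if you decide that you need whatever a 2 TB disk gives you in IOPS per TB, then `max_usable_tb()` says that you can only use 2 TB of any larger disk, however big it is.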

The time required to restore full data redundancy after a disk failure goes up as you put more and more data on a single disk. If you lose a giant disk, you get to copy a giant disk's worth of data, and the bigger your disks are the longer this takes. At a certain point many people decide that they can't afford such long rebuild times, and so they have to cap the usable TB per disk. Alternately they have to build in more redundancy, which requires more disks and results in higher costs per usable TB of space (not raw TB of space).

(This has already happened once; as disks got larger, people moved away from RAID-5 in favour of RAID-6 in large part because of rebuild times and the resulting exposure if you lost a drive in a RAID-5 array.)
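As a rough sketch of how rebuild time scales with disk size, here is the simplest possible model in Python: the time to copy a full disk at a constant rate. The 150 MB/s sustained rate is an assumption for illustration, and real rebuilds on a busy system are usually slower than this best case.

```python
def rebuild_hours(disk_tb, rebuild_mb_per_s=150.0):
    """Best-case hours to copy a full disk's worth of data.

    ASSUMPTION: the rebuild runs at a constant rate (default 150 MB/s,
    an illustrative figure) and nothing else competes for disk IO.
    Uses decimal units: 1 TB = 1e12 bytes, 1 MB = 1e6 bytes.
    """
    total_bytes = disk_tb * 1e12
    seconds = total_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 3600.0


for size_tb in (2, 4, 8, 12):
    print(f"{size_tb:>2} TB disk: about {rebuild_hours(size_tb):.1f} hours")
```

In this model the rebuild time grows linearly with the amount of data on the disk, so a disk that holds six times as much data leaves you exposed for six times as long.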

If your usable TB per disk is capped in this way, the only thing that larger, cheaper per TB HDs do for you is perhaps drive down the price of smaller right-sized disks (in our case, this would be 2 TB disks). Unfortunately, disks seem to have a floor price and as disk capacity increases, what you get for this floor price in a decent quality drive seems quite likely to go over the maximum TB that you can use. More to the point, the cost of SSDs with the same capacity is going to keep coming down toward where they're affordable enough. This is the real-world SSD inflection point for many environments; not the point where the price per TB of SSDs reaches that of HDs, but the point where you might as well use SSDs instead of the HDs that you can use, or at least you can afford to do so.
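Here is a small Python sketch of why the price per raw TB can be misleading once your usable TB per disk is capped. All of the prices here are hypothetical, picked only so that the raw price ratio works out to the quoted 10:1; the 2 TB cap is our own figure from above.

```python
def cost_per_usable_tb(price, raw_tb, usable_cap_tb):
    """Price per TB you can actually use, given a cap on usable TB per disk."""
    usable_tb = min(raw_tb, usable_cap_tb)
    return price / usable_tb


CAP_TB = 2.0  # the usable TB per disk cap (2 TB in our case)

# HYPOTHETICAL prices, chosen only to give a 10:1 raw price per TB ratio.
hd_price, hd_raw_tb = 160.0, 8.0    # raw: 20 per TB
ssd_price, ssd_raw_tb = 400.0, 2.0  # raw: 200 per TB

raw_ratio = (ssd_price / ssd_raw_tb) / (hd_price / hd_raw_tb)
usable_ratio = (cost_per_usable_tb(ssd_price, ssd_raw_tb, CAP_TB)
                / cost_per_usable_tb(hd_price, hd_raw_tb, CAP_TB))

print(f"SSD to HD price ratio per raw TB:    {raw_ratio:.1f}")
print(f"SSD to HD price ratio per usable TB: {usable_ratio:.1f}")
```

With these made-up numbers the SSD is ten times the price per raw TB but only two and a half times the price per TB that you can actually use, because three quarters of the large HD goes unused.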

Comments on this page:

Another way that pure cost per TB may be irrelevant – even in bulk storage: I care about bulk hot storage… and SSDs draw – very roughly – two orders of magnitude less power while idle.

I’ve not seen anyone try to even estimate cost per terabyte year, unfortunately. But it seems to me that by that measure, there is a tipping point either within reach or possibly already passed. And I assume that idle power consumption is not easily cut very much; if anything, it should be easier to reduce for SSDs than HDs for physics-based reasons, probably the same reasons HD IOPS rates are largely flat. If power draw is more or less constant, then the cost per TB year tipping point is effectively determined by a threshold on the absolute cost per TB. And that’s coming down inexorably.

Other factors have already flipped. Storage density matters in bulk storage at data centre scale. SSDs caught up to HDs on that front just recently, and will be pulling ahead from here on out.

Platters are the new tape: cheap but cumbersome. (Esp. as software comes to expect SSD-level I/O performance… already happening.) Not yet, but soon.

Written on 27 October 2017.


Last modified: Fri Oct 27 01:08:27 2017