
2017-10-27

HD usage can be limited by things other than cost per TB

I was recently reading WDC: No SSD/HDD Crossover (via), which reports Western Digital data that says that HDs will continue to have a significant price per TB advantage over SSDs for at least the next decade (the quoted figure is a 10:1 advantage). I'm perfectly prepared to believe this (I have no idea myself), but at the same time I don't think it's necessarily very relevant. The simple way to put it is that a great deal of storage is not bulk storage.

If you're doing bulk storage, then certainly the cost per TB matters a lot and HDs will likely continue to have the advantage. But if you're not, there are a number of other concerns that will probably clip the wings of HDs long before then. Two classical concerns are maintaining enough IOPS per TB, and the time it takes to restore your data redundancy after a disk is lost (whether through a RAID resynchronization or some other mechanism).

Larger and larger HDs might come with an increase in IOPS per disk, but history is fairly strongly against that; genuine sustainable IOPS per disk has been basically flat for HDs for years. This means that as your HDs grow bigger, IOPS per TB drops; the same amount of IOPS per disk is spread among more TB per disk. If you feel you need reasonably responsive random IO, this can easily mean that your usable TB per disk is basically capped. This is the situation that we're in with our fileservers, where we deliberately used 2 TB HDs instead of something larger in order to maintain a certain level of IOPS per TB.

(This IOPS limit is different from a situation where HDs simply can't provide enough IOPS to meet your needs.)
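To put some rough numbers on this, here is a minimal sketch of the arithmetic; the figure of roughly 150 sustainable random IOPS per 7200 RPM drive is my own assumption for illustration, not something from the WDC report.

    # A rough sketch with an assumed figure: a 7200 RPM HD sustains on the
    # order of 150 random IOPS regardless of its capacity, so IOPS per TB
    # falls as the drive gets bigger.
    HD_IOPS = 150  # assumed sustainable random IOPS per drive

    for size_tb in (2, 4, 8, 14):
        print(f"{size_tb:2d} TB drive: {HD_IOPS / size_tb:5.1f} IOPS per TB")

A 2 TB drive gives you about 75 IOPS per TB under this assumption; a 14 TB drive gives you barely 10.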

The time required to restore full data redundancy after a disk failure goes up as you put more and more data on a single disk. If you lose a giant disk, you get to copy a giant disk's worth of data, and the bigger your disks are the longer this takes. At a certain point many people decide that they can't afford such long rebuild times, and so they have to cap the usable TB per disk. Alternatively they have to build in more redundancy, which requires more disks and results in higher costs per usable TB of space (not raw TB of space).

(This has already happened once; as disks got larger, people moved away from RAID-5 in favour of RAID-6 in large part because of rebuild times and the resulting exposure if you lost a drive in a RAID-5 array.)
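Here is a similar back-of-the-envelope sketch for rebuild times; the 150 MB/s sustained copy rate is again my assumption for a best-case whole-disk copy, and real resynchronizations are often slower because the array is still serving regular IO.

    # Another rough sketch with an assumed number: a straight whole-disk
    # copy to the replacement drive at about 150 MB/s sustained.
    REBUILD_MB_PER_SEC = 150  # assumed sustained rebuild rate

    for size_tb in (2, 4, 8, 14):
        hours = (size_tb * 1_000_000) / REBUILD_MB_PER_SEC / 3600
        print(f"{size_tb:2d} TB drive: roughly {hours:4.1f} hours to rebuild")

Under this assumption a 2 TB drive rebuilds in under four hours, while a 14 TB drive takes more than a day, all the while with your redundancy reduced.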

If your usable TB per disk is capped in this way, the only thing that larger, cheaper-per-TB HDs do for you is perhaps drive down the price of smaller right-sized disks (in our case, this would be 2 TB disks). Unfortunately, disks seem to have a floor price, and as disk capacities increase, what that floor price buys you in a decent quality drive seems quite likely to go over the maximum TB that you can use. More to the point, the cost of SSDs at the capacity you can actually use is going to keep coming down toward where they're affordable enough. This is the real-world SSD inflection point for many environments; it's not the point where the price per TB of SSDs reaches that of HDs, but the point where you might as well use SSDs instead of the HDs that you can use, or at least where you can afford to do so.
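As an entirely hypothetical illustration of that inflection point (none of these prices come from anywhere real), the comparison that matters here is a right-sized SSD against the HD floor price, not SSD price per TB against HD price per TB:

    # Hypothetical prices purely for illustration; what matters is the
    # shape of the comparison, not the specific dollar figures.
    USABLE_TB_PER_DISK = 2    # our cap from the IOPS and rebuild-time limits
    HD_FLOOR_PRICE = 80       # assumed: cheapest decent-quality HD of any size
    SSD_PRICE_PER_TB = 300    # assumed SSD pricing

    ssd_price = SSD_PRICE_PER_TB * USABLE_TB_PER_DISK
    print(f"right-sized SSD: ${ssd_price}, HD floor price: ${HD_FLOOR_PRICE}")
    # The inflection point is when ssd_price drops close enough to the HD
    # floor price that you might as well buy the SSD.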

HDUsageLimits written at 01:08:27

