Wandering Thoughts archives

2016-06-18

It's easier to shrink RAID disk volumes than to reshape them

Once your storage system is using more than a single disk to create a pool of storage, there are a number of operations that you may want to do in order to restructure that pool of storage. Two of them are shrinking and reshaping. It's common for volume managers and modern filesystems like btrfs to be able to shrink a storage pool by removing a disk (or a set of mirrored disks), although not all modern filesystems support doing this. It's also becoming increasingly common for RAID (sub)systems to support reshaping RAID pools to do things like change from RAID-5 to RAID-6 (or vice versa); modern filesystems may also implement this sort of reshaping if they support RAID levels that can use it. Often shrinking and reshaping are lumped together as 'yeah, we support reorganizing storage in general'.

In thinking about this whole area lately, I've realized that shrinking is fundamentally easier to do than reshaping because of what it involves at a mechanical level. When you shrink a pool of storage, you do so by moving data to a new place; you move it from disk A, which you are getting rid of, to free space on other disks. When all the data has been moved off of disk A, you're done. By contrast, reshaping is almost always an in-place operation. You don't copy all the data to an entirely different set of disks, then copy it back in a different arrangement; instead you must very carefully shuffle it around in place, keeping exacting records of what has and hasn't been shuffled so you know how to refer to it.

For obvious reasons, filesystems et al already have plenty of code for allocating, writing, and freeing blocks. To implement shrinking, 'all' you need is an allocation policy that says 'never allocate on this entity' plus something that walks over the entire storage tree, finds anything allocated on the to-be-removed disk, triggers a re-allocation and re-write, and then updates bits of the tree appropriately. The tree walker is not trivial, but because all of this mimics what the filesystem is already doing you have natural answers for many questions about things like concurrent access by ordinary system activity, handling crashes and interruptions, and so on. Fundamentally, the whole thing is always in a normal and consistent state; it just has less and less of your data on the to-be-removed disk over time.
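To make the shape of this concrete, here is a minimal sketch of shrinking as 'exclude the disk from allocation, then walk and re-allocate'. Everything in it is a toy assumption of mine: the Pool class, its flat block-to-disk map standing in for the storage tree, and the 'most free space' allocation policy are all hypothetical and not taken from any real filesystem.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Pool:
    """Toy storage pool (hypothetical; not any real filesystem's structures)."""

    free: dict[str, int]                                   # disk -> free blocks
    location: dict[int, str] = field(default_factory=dict)  # block -> disk
    no_alloc: set[str] = field(default_factory=set)         # 'never allocate here'

    def allocate(self) -> str:
        """Ordinary allocator: pick the allowed disk with the most free space."""
        candidates = [d for d, n in self.free.items()
                      if d not in self.no_alloc and n > 0]
        if not candidates:
            raise RuntimeError("no free space on the remaining disks")
        disk = max(candidates, key=lambda d: self.free[d])
        self.free[disk] -= 1
        return disk

    def write(self, block: int) -> None:
        """Ordinary write path; it respects no_alloc with no special casing."""
        self.location[block] = self.allocate()

    def evacuate(self, disk: str, limit: Optional[int] = None) -> int:
        """Move up to `limit` blocks off `disk`; return how many were moved."""
        self.no_alloc.add(disk)
        moved = 0
        # The 'tree walker': here just a scan over the block map.
        for block, where in list(self.location.items()):
            if where != disk:
                continue
            if limit is not None and moved >= limit:
                break
            self.location[block] = self.allocate()  # re-allocate and re-write
            self.free[disk] += 1                    # release the old copy
            moved += 1
        return moved

    def remove(self, disk: str) -> None:
        """Drop a disk from the pool once nothing lives on it."""
        if disk in self.location.values():
            raise RuntimeError("disk still holds data")
        del self.free[disk]
        self.no_alloc.discard(disk)
```

Calling evacuate() with a small limit and then stopping leaves an entirely ordinary pool; the disk simply has less data on it, and ordinary writes in the meantime go through the same allocator and just avoid it.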

This is not true for reshaping. Very few storage systems do anything like a RAID reshaping during normal steady-state operation. This means you need a whole new set of code, you're going to have to be very careful to manage things like crash resistance, and a pool of storage that's in the middle of a reshaping looks very different from how it does in normal operation (which means that you can't just abandon a reshaping in mid-progress in the way you can abandon a shrink).
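As a rough illustration of why the middle of a reshape is a special state, here is a toy sketch of an in-place restripe from two data disks to three, with no parity at all. The Array class, the layout functions, and the single 'watermark' progress record are assumptions I made up for the example; real RAID implementations are considerably more involved. Copying in ascending block order happens to be safe here only because the array is growing, so each write lands on a cell whose old contents have already been migrated.

```python
from __future__ import annotations

from dataclasses import dataclass

OLD_DISKS = 2  # assumed starting stripe width (toy value)
NEW_DISKS = 3  # assumed target stripe width (toy value)


def old_pos(i: int) -> tuple[int, int]:
    """(disk, offset) of logical block i in the old layout."""
    return (i % OLD_DISKS, i // OLD_DISKS)


def new_pos(i: int) -> tuple[int, int]:
    """(disk, offset) of logical block i in the new layout."""
    return (i % NEW_DISKS, i // NEW_DISKS)


@dataclass
class Array:
    """Toy striped array being reshaped in place (hypothetical model)."""

    cells: dict[tuple[int, int], str]  # (disk, offset) -> contents
    nblocks: int
    watermark: int = 0  # blocks below this are in the new layout

    def read(self, i: int) -> str:
        # Every read has to consult the progress record to pick a layout.
        pos = new_pos(i) if i < self.watermark else old_pos(i)
        return self.cells[pos]

    def reshape_step(self) -> bool:
        """Migrate one block in place; return False once the reshape is done."""
        i = self.watermark
        if i >= self.nblocks:
            return False
        data = self.cells[old_pos(i)]
        self.cells[new_pos(i)] = data  # may overwrite an already-migrated cell
        self.watermark = i + 1         # a real system must persist this safely
        return True
```

After a few steps, some cells hold data that is only correct under the new layout and others data that is only correct under the old one. Without the watermark neither layout can read the whole array, which is why you can't simply walk away from a half-finished reshape.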

(This is a pretty obvious thing if you think about it, but I hadn't really considered it before now.)

PS: Not all 'shrinking' is actually shrinking in the form I'm considering here. Removing one disk from a RAID-5 or RAID-6 pool of storage is really a RAID reshape.

(It's theoretically possible to design a modern filesystem where RAID reshapes proceed like shrinking. I don't think anyone has done so, although maybe this is how btrfs works.)

tech/VolumeShrinkingVsReshaping written at 02:52:27

