Btrfs's mistake in limiting itself to two-way mirroring

March 31, 2015

Recently, I tweeted:

That btrfs still will not do more than two-way mirroring immediately disqualifies it for many serious uses as far as I'm concerned.

On the surface this may sound like a silly thing to be annoyed at btrfs over, since more than two-way mirroring is something that only a small number of people playing in the enterprisy, (over-)cautious, cost-is-no-object world will ever use. Two-way mirrors are pretty reliable, after all, and almost no one actually uses more than two-way mirroring (and the people who do may not be entirely sensible).

This is too small a view of the situation. The problem with having a maximum of two-way mirroring is not steady state operation, it's when you're migrating storage from one disk to another (or from one set of disks to another). Supporting three (or more) way mirroring makes it simple to do this while preserving full redundancy; you attach the new disk as a third mirror, wait for things to resynchronize, and then detach the old disk. If things go wrong with the new disk during this process, no sweat, your old disks are still there and working away as normal.
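To make the attach, resync, detach sequence concrete, here is a toy model of it in Python. It is only a sketch: the disk names, the block count, and the function `migrate_by_attach` are all made up for illustration, and it does not correspond to any real filesystem's commands or API. It records the smallest number of good copies that any block has after each step.

```python
# Toy model of migrating a mirror by attach / resync / detach.
# Disk names and block counts are hypothetical, for illustration only.

def migrate_by_attach(old_disk, keep_disk, new_disk, n_blocks):
    """Return the minimum per-block copy count observed after each step."""
    # Each disk maps to the set of block numbers it holds a valid copy of.
    mirror = {
        old_disk: set(range(n_blocks)),
        keep_disk: set(range(n_blocks)),
    }

    def min_copies():
        # Smallest number of disks holding any single block right now.
        return min(
            sum(block in held for held in mirror.values())
            for block in range(n_blocks)
        )

    # Step 1: attach the new disk as a third, initially empty, mirror.
    mirror[new_disk] = set()
    history = [min_copies()]

    # Step 2: resynchronize onto the new disk, one block at a time.
    for block in range(n_blocks):
        mirror[new_disk].add(block)
        history.append(min_copies())

    # Step 3: detach the old disk only after the resync has finished.
    del mirror[old_disk]
    history.append(min_copies())
    return history


if __name__ == "__main__":
    print(migrate_by_attach("old", "keep", "new", 4))
```

In this model the copy count never drops below two at any step, which is the whole point: the old disk keeps carrying its full share until you decide to remove it.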

At this point some people may suggest 'rebalancing' operations, where you attach the third disk and then tell your sophisticated filesystem to migrate by moving all the data from the old disk to the new disk; I believe that btrfs supports this by adding the new disk then deleting the old disk. This is not good enough, because if things go wrong partway through it will generally leave part of your data non-redundant (whatever data has already been migrated to the new disk). It's strictly better to run the new disk in parallel with the old disks and then decide that you trust it enough to drop the old disk out, and that requires real multi-way mirroring.
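The difference between the two approaches shows up if the new disk dies partway through. The following is a minimal sketch under simplifying assumptions of my own: data is divided into equal chunks, chunks are migrated in order, and in the add-then-delete case the old disk's copy of a chunk is released as soon as the chunk lands on the new disk. The function name `surviving_copies` and its parameters are hypothetical, and this is not a description of btrfs's actual internals.

```python
# Toy comparison: how many copies of each chunk survive if the NEW disk
# fails after `migrated` chunks have been copied onto it.
# The chunk model and all names here are illustrative assumptions.

def surviving_copies(n_chunks, migrated, three_way):
    """Return the surviving copy count for each chunk.

    three_way=False models add-then-delete: a migrated chunk no longer
    has a copy on the old disk.
    three_way=True models a third mirror: the old disk keeps every chunk
    until the resync is complete.
    """
    result = []
    for chunk in range(n_chunks):
        on_staying_disk = 1  # the disk that is not being replaced
        if three_way:
            on_old_disk = 1
        else:
            on_old_disk = 0 if chunk < migrated else 1
        # The new disk has failed, so its copies count for nothing.
        result.append(on_staying_disk + on_old_disk)
    return result


if __name__ == "__main__":
    print("add then delete:", surviving_copies(5, 2, three_way=False))
    print("third mirror:   ", surviving_copies(5, 2, three_way=True))
```

With five chunks and a failure after two have been migrated, the add-then-delete case leaves those two chunks with a single copy each, while the third-mirror case still has two copies of everything.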

What btrfs does if you give it more than two disks in a raid-1 setup is actually potentially useful behavior (it mirrors each piece of data on two out of three drives, giving you more disk space). But the right solution here would be to support both this and a way to tell btrfs that you want N-way mirroring instead of just 2-way mirroring. As it is, only having two-way mirroring is yet another reason why I may never use btrfs on my own machines.
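The space tradeoff between the two behaviors can be put in rough numbers. This sketch assumes an idealized allocator that ignores metadata overhead and chunk granularity; the formula used for the two-copy case (half the total, capped by what fits outside the largest disk) is a common rule of thumb and my assumption here, not something taken from btrfs itself. Both function names are made up.

```python
# Idealized usable-capacity estimates; sizes are in any consistent unit.
# These formulas are simplifying assumptions, not btrfs's real accounting.

def two_copy_capacity(sizes):
    """Usable space when every piece of data lives on exactly two disks."""
    total = sum(sizes)
    # Each piece needs a partner copy on a different disk, so the largest
    # disk can use no more space than all the other disks combined.
    return min(total / 2, total - max(sizes))


def n_way_capacity(sizes):
    """Usable space when every disk holds a full copy of everything."""
    return min(sizes)


if __name__ == "__main__":
    disks = [1000, 1000, 1000]
    print("two copies across three disks:", two_copy_capacity(disks))
    print("three-way mirror:             ", n_way_capacity(disks))
```

Under these assumptions, three equal 1000-unit disks give 1500 units with two copies of everything but only 1000 units as a true three-way mirror, which is why having both options available would be the right answer.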

(I think that this is an important feature for home machines, which are both the machines most likely to see drive replacements over time and the place where overall drive systems may be the flakiest. You just know that someday someone is going to attach a dubious USB 3.0 external drive to their home system temporarily in order to swap internal drives, with predictable results partway through.)

(Of course, this sort of artificial limitation in btrfs's RAID support is partly fallout from what I feel is btrfs's core mistake.)


Comments on this page:

By Georg Sauthoff (gsauthof) at 2015-09-05 08:54:05:

I agree, only being able to create a 2-way mirror is an artificial limitation of Btrfs.

But you can still replace a failed disk and have full redundancy while doing so.

For RAID-1, Btrfs has a replace command that copies everything to a new device - and once everything is finished the old device is removed.

During the replace execution a 'btrfs fi show' lists the RAID-1 filesystem with 3 devices that all have the same space usage.

Written on 31 March 2015.

Last modified: Tue Mar 31 23:23:00 2015