Why I want a solid ZFS implementation on Linux

February 9, 2014

The short version of this is 'ZFS checksums and ZFS scrubs'. Without strong per-block integrity protection, there are two issues that I increasingly worry about on my Linux workstations with mirrored disks. The first is read errors on the remaining live disk when resynchronizing a RAID-1 mirror after it loses a disk. The second is slow data loss due to undetected read errors and corrupted on-disk data. Slow data loss is also a worry for backups kept on a single backup disk or, especially, a single archival disk (I'll have more than one archive disk, but cross-verifying them may be very painful).

(ZFS also offers flexible space management for filesystems, but this is less of an issue for me. In practice the filesystems on my workstation just grow slowly over time, which is a scenario that's already handled by LVM. I might do some reorganization if I could shrink filesystems easily but probably not much.)

ZFS's block checksums combined with regular scrubs basically immunize me against these creeping problems. Unless I'm very unlucky I can pretty much count on any progressive disk damage getting repaired, and if I am unlucky at least I'll know about it and maybe I can retrieve things from backups. Of course in theory Btrfs can do all of this too, but Btrfs remains not ready for production, and unlike with ZFS on Linux this applies to its fundamental code, not just the bits that connect the core code to Linux.
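
(To make the checksum-plus-scrub idea concrete, here is a toy sketch in Python. It is not how ZFS is actually implemented; the two lists standing in for mirrored disks, the separately stored checksum list, the use of SHA-256, and the `scrub` function are all assumptions made up for illustration. The point is only that a checksum kept apart from the data tells you which side of a mirror is the good one, which plain RAID-1 cannot do.)

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum of one block (SHA-256 is just a stand-in choice here)."""
    return hashlib.sha256(data).hexdigest()


def scrub(mirror_a, mirror_b, checksums):
    """Verify every block on both sides of a hypothetical two-way mirror.

    A bad copy is overwritten from the good copy. If both copies fail
    their checksum, the block is reported as unrecoverable.
    Returns (repaired, unrecoverable) lists of block indexes.
    """
    repaired, unrecoverable = [], []
    for i, expected in enumerate(checksums):
        ok_a = checksum(mirror_a[i]) == expected
        ok_b = checksum(mirror_b[i]) == expected
        if ok_a and ok_b:
            continue
        if ok_a:
            mirror_b[i] = mirror_a[i]   # repair side B from side A
            repaired.append(i)
        elif ok_b:
            mirror_a[i] = mirror_b[i]   # repair side A from side B
            repaired.append(i)
        else:
            unrecoverable.append(i)     # detected, but time for backups
    return repaired, unrecoverable


# Toy data: three blocks, written identically to both "disks".
blocks = [b"alpha", b"beta", b"gamma"]
checksums = [checksum(b) for b in blocks]
disk_a = list(blocks)
disk_b = list(blocks)

disk_b[1] = b"bexa"     # silent corruption on one side only
disk_a[2] = b"gamna"    # corruption on both sides of the same block
disk_b[2] = b"gamxa"

result = scrub(disk_a, disk_b, checksums)
print(result)
```

(In this sketch block 1 is quietly repaired from the good side, while block 2 is damaged on both sides and can only be reported. That second case is the 'at least I'll know about it' situation.)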

(That ZFS is not integrated into the mainline kernel also makes it somewhat risky to use ZFS on distributions like Fedora that stick closely to the current mainline kernels and update frequently. Btrfs is obviously much better off here, so I really wish it was stable and proven in widespread usage.)

I suppose the brute-force overkill solution to this dilemma is an OmniOS-based fileserver that NFS-exports things to my Linux workstation, but there are various drawbacks to that (especially at home).

(Running my entire desktop environment on OmniOS is a complete non-starter.)

(This is sort of the background explanation behind a tweet.)

Written on 09 February 2014.

Last modified: Sun Feb 9 20:58:04 2014