In-place migration to Linux software RAID

February 28, 2012

Suppose that you have an existing system that is operating without mirrored disks and you want to change that; you want to add a second physical disk to the system and then wind up with software RAID mirroring of appropriate filesystems. This generally goes by the name of 'in-place migration'. Knowledge of how to do this used to be more common because back in the old days, distribution installers couldn't set up mirrored system disks during installation; these days installers can and so needing to do this by hand is much rarer.

(In place migration is easy with software RAID implementations that store the metadata 'out of band', outside of the disk space being mirrored. Solaris DiskSuite can more or less trivially do in-place migration to mirrors, with only a minor pause to remount most filesystems. Unfortunately for us, Linux software RAID is not such a thing; it stores its metadata 'in-band', at either the start or the end of the partition being mirrored.)

When this question came up here recently, I said that there are two ways to do in-place migration: the traditional, well-tested approach that everyone used to use, and a theoretically possible approach that I at least have never tested. The well-tested approach is not quite literally in-place; the theoretical one is, but it is trickier and untested.

The traditional approach goes like this:

  1. arrange to have an identical partition on your second disk. The traditional way to do this is to use identical disks and copy the partition table from the first disk to the second with sfdisk.
  2. create a mirror using only the second disk's partition and a missing (aka failed) device, for example:
    mdadm -C /dev/md0 -l raid1 -n 2 /dev/sdb3 missing

    Make very, very sure that you are using the second disk's partition for this, not the first disk. The first disk should not be mentioned anywhere in the command line.

  3. mkfs the new mirror and mount it somewhere; we usually used /mnt.
  4. copy the existing filesystem to the mirror using the tool of your choice. I prefer to use dump and restore, but tastes differ.
  5. edit /etc/fstab to mount the filesystem from the mirror.
  6. unmount the current filesystem and immediately remount the mirror in its place. (Doing this for the root filesystem requires a reboot, among other things, and is outside the scope of this entry.)
  7. hot-add the old filesystem's partition on the first disk to the mirror, for example:
    mdadm -a /dev/md0 /dev/sda3

    Since you're adding a new device to a mirror, the mirror resyncs onto the new device. You can watch the progress of the resync in /proc/mdstat, and on modern systems you may get email from mdadm when it finishes.
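The whole traditional procedure can be condensed into a shell sketch. The device names (/dev/sda3, /dev/sdb3), the filesystem (/home on ext3) and the use of dump and restore are assumptions for illustration; adjust them to your layout, and do not run this verbatim on a live system:

```shell
# Copy the partition table from the first disk to the second
# (identical disks assumed).
sfdisk -d /dev/sda | sfdisk /dev/sdb

# Create a degraded mirror using only the second disk's partition.
mdadm -C /dev/md0 -l raid1 -n 2 /dev/sdb3 missing

# Put a filesystem on the mirror and mount it somewhere temporary.
mkfs -t ext3 /dev/md0
mount /dev/md0 /mnt

# Copy the live filesystem (here /home) onto the mirror.
cd /mnt && dump -0 -f - /home | restore -rf -

# After editing /etc/fstab, swap the filesystems and complete the mirror.
umount /home
mount /dev/md0 /home
mdadm -a /dev/md0 /dev/sda3

# Watch the resync progress.
cat /proc/mdstat
```

Any other file-level copying tool (cp -a, rsync, tar pipelines) works in place of dump and restore; the important thing is that the copy is complete and consistent, which is why a busy filesystem may need downtime during the copy step.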

Back in the days of Ubuntu 6.06 and similar systems we did this a lot and we never had problems (at least if we weren't shuffling filesystems around at the same time). This is not quite in-place because it involves copying the filesystem, and on a sufficiently busy filesystem it may be troublesome to get a complete and accurate copy (eg, you may need an extended downtime to halt all other activity on the filesystem).

The theoretical way that is fully in-place is to set up your new partition on your new disk and then do something like this:

  • shrink the filesystem so that there is enough space for software RAID metadata at the end of the partition. You will need to experiment to find out exactly how much space the metadata needs.
  • unmount the filesystem.
  • create a software RAID mirror using only this partition (plus a missing device), with a metadata format that lives at the end of the partition; metadata versions 0.90 and 1.0 go at the end, while 1.1 and 1.2 go at the start.

    In theory this doesn't write to anything except the metadata area. Note that I have neither tested nor verified that this is true in practice; that's why this is a theoretical way. You will want to test the heck out of this (probably in a virtual machine).

  • change /etc/fstab to mount the filesystem from the mirror and remount it.
  • hot-add the partition on the second disk and let the mirror resync on to it.
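Under the same caveats (this is untested, the device names /dev/sda3 and /dev/sdb3 and the mount point /data are hypothetical, and an extN filesystem is assumed), the theoretical procedure might look something like:

```shell
# Shrink the filesystem to leave room at the end of /dev/sda3
# for the RAID metadata. extN filesystems can only be shrunk
# while unmounted, so unmount first and fsck before resizing.
umount /data
e2fsck -f /dev/sda3
resize2fs /dev/sda3 <new-smaller-size>

# Create a one-sided mirror with metadata at the end of the
# partition (version 0.90 here; 1.0 also goes at the end).
mdadm -C /dev/md0 -l raid1 -n 2 -e 0.90 /dev/sda3 missing

# Edit /etc/fstab, then mount the filesystem from the mirror.
mount /dev/md0 /data

# Hot-add the second disk's partition and let the mirror
# resync onto it.
mdadm -a /dev/md0 /dev/sdb3
```

How much space to leave for the metadata is exactly the part that needs experimentation; `mdadm --examine` on a test array will show where the superblock landed.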

Depending on how long it takes to shrink the filesystem and whether or not it can be done live, this may require less downtime than the other approach.

I doubt I'll ever use the theoretical way. While it's vaguely neat, it'll clearly take a bunch of work and testing to develop into something that can be used for real and it has only marginal advantages over the tried and true way (especially on extN filesystems, where resize2fs can only shrink unmounted filesystems).

Comments on this page:

From Arcady at 2012-03-14 13:38:21:

The second approach works, we've done it a couple of times. It requires only one short downtime to add the disk and to add the first disk to the RAID. The rest can happen in multiuser. The first approach is cleaner and safer, but requires at least two downtimes. - Arcady
