Don't use dd as a quick version of disk mirroring

October 16, 2014

Suppose, not entirely hypothetically, that you initially set up a server with one system disk but have come to wish that it had a mirrored pair of them. The server is in production and in-place migration to software RAID requires a downtime or two, so as a cheap 'in case of emergency' measure you stick in a second disk and then clone your current system disk to it with dd (remember to fsck the root filesystem afterwards).

(This has a number of problems if you ever actually need to boot from the second disk, but let's set them aside for now.)

Unfortunately, on a modern Linux machine you have just armed a time bomb that is aimed at your foot. It may never go off, or it may go off more than a year and a half later (when you've forgotten all about this), or it may go off the next time you reboot the machine. The problem is that modern Linux systems identify their root filesystem by its UUID, not its disk location, and because you cloned the disk with dd you now have two different filesystems with the same UUID.

(Unless you do something to manually change the UUID on the cloned copy, which you can. But you have to remember that step. On extN filesystems, it's done with tune2fs's -U argument; you probably want '-U random'.)

Most of the time, the kernel and initramfs will probably see your first disk first and inventory the UUID on its root partition first and so on, and thus boot from the right filesystem on the first disk. But this is not guaranteed. Someday the kernel may get around to looking at sdb1 before it looks at sda1, find the UUID it's looking for, and mount your cloned copy as the root filesystem instead of the real thing. If you're lucky, the cloned copy is so out of date that things fail explosively and you notice immediately (although figuring out what's going on may take a bit of time and in the mean time life can be quite exciting). If you're unlucky, the cloned copy is close enough to the real root filesystem that things mostly work and you might only have a few little anomalies, like missing log files or mysteriously reverted package versions or the like. You might not even really notice.

(This is the background behind my recent tweet.)

Written on 16 October 2014.
« Why system administrators hate security researchers every so often
My experience doing relatively low level X stuff in Go »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Thu Oct 16 02:13:02 2014
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.