2014-10-16
Don't use dd as a quick version of disk mirroring
Suppose, not entirely hypothetically, that you initially set up a
server with one system disk but have come to wish that it had a
mirrored pair of them. The server is in production and in-place
migration to software RAID requires a downtime or two, so as a cheap
'in case of emergency' measure you stick in a second disk and then
clone your current system disk to it with dd (remember to fsck the
root filesystem afterwards).
(This has a number of problems if you ever actually need to boot from the second disk, but let's set them aside for now.)
Unfortunately, on a modern Linux machine you have just armed a time
bomb that is aimed at your foot. It may never go off, or it may go
off more than a year and a half later (when you've forgotten all
about this), or it may go off the next time you reboot the machine.
The problem is that modern Linux systems identify their root
filesystem by its UUID, not its disk location, and because you
cloned the disk with dd you now have two different filesystems
with the same UUID.
(Unless you do something to manually change the UUID on the cloned
copy, which you can. But you have to remember that step. On extN
filesystems, it's done with tune2fs's -U argument; you probably
want '-U random'.)
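
As a rough sketch of that step, assuming the clone's root filesystem is an unmounted extN filesystem, something like the following would do it. The function name reuuid_clone, the RUN variable, and the /dev/sdb1 device name are made up for illustration; only e2fsck and tune2fs are real commands. It prints the commands by default, since pointing this at the wrong disk is exactly the kind of mistake you don't want to make. The fsck comes first because recent versions of tune2fs may refuse to change the UUID of a filesystem that hasn't just been checked.

```shell
# Give a cloned extN filesystem a fresh random UUID.
# Dry run by default: the commands are only printed. Set RUN=1 to
# really execute them (needs root, and the filesystem must be unmounted).
# The device you pass must be the CLONE, not the original.
reuuid_clone() {
    dev=${1:?usage: reuuid_clone DEVICE}
    run=echo
    [ "${RUN:-0}" = 1 ] && run=
    # e2fsck exits with 1 or 2 when it fixed something; 4 and up mean trouble.
    $run e2fsck -f "$dev"
    [ $? -lt 4 ] || return 1
    $run tune2fs -U random "$dev"
}

# Example (hypothetical device name):
#   reuuid_clone /dev/sdb1          # show what would be run
#   RUN=1 reuuid_clone /dev/sdb1    # actually do it
```

Note that this only changes the UUID in the filesystem itself. The copied /etc/fstab and boot loader configuration on the clone still mention the original UUID, which is one of the things you would have to deal with if you ever needed to boot from the second disk.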
Most of the time, the kernel and initramfs will probably see your
first disk first and inventory the UUID on its root partition first
and so on, and thus boot from the right filesystem on the first
disk. But this is not guaranteed. Someday the kernel may get around
to looking at sdb1 before it looks at sda1, find the UUID it's
looking for, and mount your cloned copy as the root filesystem
instead of the real thing. If you're lucky, the cloned copy is so
out of date that things fail explosively and you notice immediately
(although figuring out what's going on may take a bit of time and
in the meantime life can be quite exciting). If you're unlucky,
the cloned copy is close enough to the real root filesystem that
things mostly work and you see only a few little anomalies, such as
missing log files or mysteriously reverted package versions. You
might not even really notice.
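
One way to find out before a reboot decides for you is to look for filesystem UUIDs that show up on more than one block device. The following is a minimal sketch, not a polished tool: find_duplicate_uuids is a made-up name, and it assumes its input is lines of 'NAME UUID' such as those printed by lsblk -rno NAME,UUID from util-linux.

```shell
# Read "NAME UUID" lines on standard input and print every UUID that
# appears on more than one device, followed by the devices sharing it.
# Lines with no UUID (whole disks, unformatted partitions) are skipped.
find_duplicate_uuids() {
    awk 'NF >= 2 {
             count[$2]++
             devs[$2] = (devs[$2] == "" ? $1 : devs[$2] " " $1)
         }
         END {
             for (u in count)
                 if (count[u] > 1)
                     print u ": " devs[u]
         }'
}

# Typical use on a live system:
#   lsblk -rno NAME,UUID | find_duplicate_uuids
```

No output means no duplicates were found. Not every duplicate is a problem; the member devices of a software RAID array report the same UUID on purpose, for example. Two ordinary partitions such as sda1 and sdb1 sharing one UUID is the situation described here.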
(This is the background behind my recent tweet.)