2005-07-25
Reliably archiving things
Reliably archiving things is a troublesome issue. There are plenty of places to find figures for how long CD-Rs, DVD-Rs, and various sorts of tapes last, plenty of debate about those figures, and plenty of people who will sell you solutions. I don't have much to add to their numbers and marketing information.
But I have been around for long enough to have burnt my fingers and seen other people's fingers burnt, and thus I have arrived at what I will modestly call:
Chris's first law of archiving: don't.
If some piece of data is important to you, the very last thing you want to do is to put it onto some piece of media and then dump it on a shelf. Sure, the media may last more than your lifetime, but in 20 years will there still be equipment to read the media, an interface to talk to that equipment, OS software to talk to everything, and software to read the format?
I've concluded that the most reliable way to ensure that something stays accessible is to keep it on live, spinning magnetic media: keep it on disk on your system. (Ideally you'll also store the source code for something that can read it, if only as a reference for what the format is.) When you change systems, don't leave the data on the old system; copy it to the new system. Always keep the data on your primary system, because it's the one you take the most care of.
This doesn't guarantee that your important archival data will stay accessible. But it vastly increases the chances that ten years from now you will still be able to do something with it, and will not be stuck trying to find an Exabyte tape drive and software to read SGI EFS filesystem dumps (for example).
People will say that this uses a lot of disk space. My reply is that most people don't have all that much important archival data, and disk space is cheap and growing cheaper every day.
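If you do copy the data forward every time you change systems, it's worth checking that the copy actually matches the original before you retire the old machine. Here is a minimal sketch of one way to do that in Python, by comparing SHA-256 checksums of every file in two directory trees. The function names (file_digest, tree_digests, compare_trees) are made up for illustration, and the sketch assumes both the old and new copies are reachable as ordinary directories (for example, with the old disk mounted on the new system).

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each regular file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(old_root: Path, new_root: Path) -> list[str]:
    """Return relative paths from old_root that are missing or differ in new_root."""
    old = tree_digests(old_root)
    new = tree_digests(new_root)
    return sorted(name for name, digest in old.items() if new.get(name) != digest)
```

An empty list from compare_trees(old, new) means every file under the old tree is present in the new tree with identical contents. It says nothing about whether you will still have software to read those files, which is the other half of the problem.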
Sidebar: backups versus archives
Backups are not archives; they have different goals.
Backups are what you do to make sure you can get back data if your disks melt down. Backups are tied to a particular system and only have to be durable over the short term (the length of time it takes to replace the system when it explodes).
You should definitely keep on doing backups.