Things that could happen to your archives
In the spirit of my old entry on things that could happen to your backups and to reinforce yesterday's entry on not trying to archive things, here's an incomplete list of things that have been known to go wrong with archives. If you're thinking of doing archives, you should be thinking about how you're going to avoid these.
- you aren't archiving everything you need to archive.
- the archive program doesn't work right; it writes a corrupt or
incomplete archive, fails to notice or complain enough about read
errors, or its archive doesn't capture a consistent and usable
state of whatever you want to archive.
With archives you should definitely be doing a full read of the archive and verifying it against the data on disk before you remove anything from disk.
(In general archives are subject to many of the woes of backups. Take them as read.)
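As a rough illustration of the 'full read and verify' step, here is a minimal sketch in Python. It assumes the archive is a tar file whose member names are paths relative to the directory being archived; `verify_archive` and `sha256_of` are names made up for this example, and a real check would also want to look at things this ignores (permissions, symlinks, files that changed while the archive was being written).

```python
import hashlib
import posixpath
import tarfile
from pathlib import Path


def sha256_of(fileobj, chunk_size=1 << 20):
    """Hash a file-like object by reading all of it, one chunk at a time."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_archive(archive_path, source_dir):
    """Compare a tar archive against source_dir.

    Returns a list of problems. An empty list means every regular file
    under source_dir is in the archive with identical contents.
    Assumption: member names are relative to source_dir.
    """
    source_dir = Path(source_dir)
    problems = []
    seen = set()
    # Hashing every member forces a full read of the archive itself.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            name = posixpath.normpath(member.name)  # "./a.txt" -> "a.txt"
            seen.add(name)
            on_disk = source_dir / name
            if not on_disk.is_file():
                problems.append(f"in archive but not on disk: {name}")
                continue
            with tar.extractfile(member) as archived, open(on_disk, "rb") as current:
                if sha256_of(archived) != sha256_of(current):
                    problems.append(f"contents differ: {name}")
    # Anything on disk that never showed up in the archive was not archived.
    for path in sorted(source_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(source_dir).as_posix()
            if rel not in seen:
                problems.append(f"on disk but not in archive: {rel}")
    return problems
```

The idea is that you only remove anything from disk when the returned list is empty. Note that this only tells you the archive was good at the moment you checked it; it says nothing about the media a few years later.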
- the archive media degrades over time.
This is what most everyone talks about, and for good reason; if your data isn't there any more, nothing else matters. But it's only the tip of the iceberg for what you need and what can go wrong.
- one or more pieces of archive media were physically damaged or
destroyed due to a mishap, accident, water leak, fire, etc.
If you care about real archives, you need more than one copy of any piece of data (and they should not be in the same place). Accidents and mishaps happen, especially to things sitting in the corner.
- you've lost track of one or more pieces of archive media; they're stored somewhere, but you don't know specifically where any more.
- in general you've lost track of what media you have and/or what data you've archived.
- you've lost track of what is on each piece of archive media, so
while you know you have an archival copy of <X> you don't know
which one of fifty tapes it's on (and no one is going to go search
through all fifty tapes unless it is really, really important).
- you don't have anything that can read the media any more.
- the media reading hardware that you carefully saved has quietly stopped working sometime during the years that it was in storage.
- you can't connect the media reading hardware to any of your current systems; it requires an obsolete interface that is no longer supported.
- you have an interface card for the obsolete interface you need, but
it uses a bus type that is no longer supported on your machines.
(I have some PCI SCSI cards. The odds that I will be able to put them in machines drop by the day.)
- you have all of the hardware you need and you even saved cables too,
but the OS driver for the hardware was removed several years ago
after it became unmaintained because no kernel hacker had a copy
of the hardware to test with any more.
- all of your hardware works for the first N tapes (or disks, or whatever),
then something breaks due to the amount of wear you're putting
on old hardware. Since it's all obsolete hardware, there are no
longer any spare parts, maintenance and cleaning kits, or the
expertise to use any of these even if you had them.
- you didn't write down what format the archives are in because it was obvious at the time.
- you don't have any software that can read the archive format.
- the details of the archive format either were never documented or
were only documented in ancient documentation that you got rid
of years ago. You earn bonus irony points if you carefully included
the documentation in your archives.
- the software you have that can read the archive format doesn't run on any of your current machines.
- the old OS you need to run the software to read the archive format doesn't work on any of your current machines.
- you have source code for software to read the archive format, but
it doesn't compile on the current version of the OS because the
compiler has gotten stricter, the library interfaces have changed,
and the OS has moved from 32-bit to 64-bit.
- your commercial archiving system requires a license key, but the
company that made it is out of business now and certainly not
issuing any new ones. Your old license key expired five years ago.
(Yes, there are people who do long term archiving with commercial software.)
- you have forgotten all of the details about how to work with the
media, the archive format, and any surviving software. In theory
you could with sufficient effort re-master all of the pieces and
reverse engineer the format and extract the data. In practice you
don't have the time to do all of this (because it is not a high
enough priority), and so the archives are unreadable and
will never be extracted.
It's common to discover this shortly before your last media reader is decommissioned, because this is when everyone decides that you should move the data from the old media (and format) on to some new media. This is often the first time anyone has thought about the archives for years.
(Even if you can remember all of this, it not infrequently turns out that you simply don't have enough time to cycle all of your old media through to read all of the data off of it.)
There are probably many more, but I have less painful experience with archives than I do with backups.
(Although we had an interesting time when the last 9-track reel to reel tape drive was being taken out of service. I don't think we got all of the old historical 9-track tapes copied that we wanted to.)
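A number of the items in the list are really bookkeeping failures: not knowing where a piece of media is, what is on it, or what format it was written in. As a sketch of what writing this down could look like, here is a tiny catalog in Python. Everything in it is hypothetical (the `MediaRecord` fields, the function names, the JSON layout); the point is only that each piece of media gets a record of where it lives, what format it is in, what wrote it, and what it holds.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MediaRecord:
    label: str           # what is written on the physical media
    location: str        # where this copy is stored
    media_type: str      # what kind of drive is needed to read it
    archive_format: str  # the 'obvious at the time' format, written down
    written_with: str    # the software (and version) that wrote it
    contents: list = field(default_factory=list)  # what was archived onto it


def find_media(catalog, item):
    """Return the labels of every piece of media that holds item."""
    return [rec.label for rec in catalog if item in rec.contents]


def save_catalog(catalog, path):
    """Write the catalog as plain JSON, which should stay readable."""
    with open(path, "w", encoding="utf-8") as out:
        json.dump([asdict(rec) for rec in catalog], out, indent=2)


def load_catalog(path):
    """Read a catalog previously written by save_catalog."""
    with open(path, encoding="utf-8") as src:
        return [MediaRecord(**entry) for entry in json.load(src)]
```

Of course the catalog is itself something you can lose track of, so it needs to live somewhere that people actually look at, not only on the archive media.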