Looking back at a year of our disk-based backup system
It's now a bit over a year since we first deployed our disk-based backup system (although I only wrote it up in May, after we built the second machine). This makes it a good moment to look back and talk about how things are going, especially since someone I know asked me just this question recently.
On the whole the answer is that things are going well and quietly; with one exception, there haven't really been any surprises or gotchas. The one exception is that we are seeing somewhat more read errors on the hard drives than we expected or would like. It's unlikely to be a bad batch of drives, since we also put some of the same drives into our iSCSI backends and they haven't been having anywhere near the same rate of errors.
(We're reasonably careful about handling the disks, including letting them spin down before we remove them from their enclosures, but we do handle them and move them around more than desktop drives probably experience in normal use. Possibly consumer desktop SATA drives are more sensitive and fragile than one might expect.)
Since we are using these disks strictly for relatively short-term backups, I don't consider this a problem. Things happen to backups all the time; that's why you have more than one backup of anything. Our periodic longer-term archival storage runs are still done to LTO tape (as mentioned in the original entry).
One nice benefit that we didn't entirely expect is that our disk-based backups have drastically sped up small restores of recently deleted files, which are the most common sort of restore request we get. We can now usually do these in a few minutes (and without having to get up from our desks to go move tapes around), which we quite appreciate.