Our disk-based backup system

May 9, 2009

Our solution to the tape backup cost problem has been to move to a disk-based backup system. The key enabler was that when we built the latest version of our iSCSI backends, we discovered that we could get a pretty nice 12-bay eSATA-based enclosure reasonably cheaply, and it even has 'tray-less' drive bays that can be hot-swapped.

(Tray-less drive bays are drive bays where you simply slide the bare drives in and out; you do not need to mount them in some sort of a carrier or a tray beforehand.)

From the right angle, a swappable disk is essentially the same as a tape, which makes a 12-bay enclosure essentially the same as a tape library (except much cheaper). 1 TB SATA disks are probably more expensive on a cost per gigabyte basis than LTO tapes, but not so much more expensive as to make this infeasible (at the medium scale that we operate at). And with only a bit of persuasion, Amanda is happy to treat filesystems on disks as tapes.

(In some ways it is too happy to treat them as tapes; it turns out that you have to give Amanda a staging disk in order to get parallel dumps, even when the dumps are going to disk anyway.)

So that is our current disk-based backup system. Each backup server is essentially an iSCSI backend that is running different software: a 1U server, with one system disk and one Amanda staging disk, connected to a 12-bay eSATA enclosure that's loaded with 1 TB SATA drives. Each of the 1 TB drives is divided up into three logical 'tapes' (to reduce the amount of space wasted when we have a slow dump day), and Amanda cycles through them exactly as if they were real physical tapes. Periodically we exhaust the 'tapes' loaded into the enclosure, so we pull the oldest hard drives out, set them aside, and stick more in.
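To make the 'three tapes per disk' reasoning concrete, here is a rough Python sketch of the arithmetic. Only the 12 bays, the 1 TB drives, and the three 'tapes' per drive come from the description above; the nightly dump sizes, the function names, and the assumption that each nightly run consumes exactly one 'tape' are mine, purely for illustration.

```python
# Illustrative arithmetic only. The dump sizes below are invented, and
# "one tape per nightly run" is an assumption, not something Amanda requires.

BAYS = 12            # drive bays in the enclosure (from the text)
DISK_GB = 1000       # treating a 1 TB drive as 1000 GB for round numbers
TAPES_PER_DISK = 3   # logical 'tapes' per drive (from the text)


def tape_size_gb(disk_gb, tapes_per_disk):
    """Size of each logical 'tape' when a disk is split evenly."""
    return disk_gb / tapes_per_disk


def tapes_per_load(bays, tapes_per_disk):
    """How many logical 'tapes' a fully loaded enclosure holds."""
    return bays * tapes_per_disk


def wasted_gb(dump_sizes_gb, disk_gb, tapes_per_disk):
    """Unused space if every nightly dump gets its own 'tape'.

    Assumes each dump fits on a single 'tape'; raises otherwise.
    """
    size = tape_size_gb(disk_gb, tapes_per_disk)
    for dump in dump_sizes_gb:
        if dump > size:
            raise ValueError(f"{dump} GB dump does not fit on a {size:.0f} GB tape")
    return sum(size - dump for dump in dump_sizes_gb)


if __name__ == "__main__":
    nightly = [300, 50, 120]  # hypothetical: one busy day, two slow ones
    print("tapes per enclosure load:", tapes_per_load(BAYS, TAPES_PER_DISK))
    print("waste, 3 tapes/disk:", round(wasted_gb(nightly, DISK_GB, 3)), "GB")
    print("waste, 1 tape/disk: ", round(wasted_gb(nightly, DISK_GB, 1)), "GB")
```

With these made-up numbers, three nights of dumps leave about 530 GB unused when each disk is split three ways, versus 2530 GB when each disk is a single 'tape'. The price of splitting is that no single run can be larger than one slice of the disk.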

(We found a source for nice plastic 3.5" HD carry cases, so we don't have to stack the bare drives up. They look something like VHS tape cases.)

Having what is effectively a tape library gives us a significant boost in backup capacity all by itself (plus it gives us much better coverage during weekends and holidays). If we need still more, we can add another backup server for what I believe works out to be at most half the cost of a tape drive alone.

We're keeping our existing tape-based backup system for periodic long-term archival backups. After all, we have a lot of perfectly good LTO tapes, and they're probably more durable over several years than SATA HDs. (For the disk-based system, the long-term durability of SATA HDs is not a concern, since with daily backups a given disk will get reused within at most a few months.)

(Necessary disclaimer: this backup system is the hard work of a number of people here, and my contribution was relatively small.)

Sidebar: the disk replacement schedule

If you are nervous about disaster recovery, you will want to pull and replace disks as soon as they're (fully) used, so that you can move them to your immediate offsite location, and you may want to accept wasted space and use only one 'tape' per disk. Locally, we accept the greater potential loss in a disaster like a machine room fire in exchange for faster and easier restores of recently deleted files.
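The tradeoff can be put in rough numbers. The Python sketch below is only an illustration: the `Policy` class and `runs_on_site` function are hypothetical names, and it assumes one 'tape' per nightly run and a 12-bay enclosure. Under those assumptions, the number of nightly runs sitting in the enclosure is both what you can restore from without fetching a disk and what you would lose in a machine room fire.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """A hypothetical disk replacement policy, for illustration only."""
    name: str
    tapes_per_disk: int
    pull_when_full: bool  # True: move each disk offsite as soon as it fills


def runs_on_site(policy, bays=12):
    """Upper bound on nightly runs held in the enclosure at once.

    Assumes one 'tape' per nightly run. If disks are pulled as soon as
    they fill, only the disk currently being written holds recent runs;
    otherwise every bay does.
    """
    if policy.pull_when_full:
        return policy.tapes_per_disk
    return bays * policy.tapes_per_disk


local = Policy("pull oldest disks when exhausted", tapes_per_disk=3, pull_when_full=False)
cautious = Policy("pull each disk as soon as it is full", tapes_per_disk=1, pull_when_full=True)

for p in (cautious, local):
    print(f"{p.name}: up to {runs_on_site(p)} nightly runs on site")
```

The same number measures both sides of the tradeoff, which is why you cannot shrink the disaster exposure without also shrinking the window of easy restores.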

