Our disk-based backup system

May 9, 2009

Our solution to the tape backup cost problem has been to move to a disk-based backup system. What made this practical is that when we built the latest version of our iSCSI backends, we discovered that we could get a pretty nice 12-bay eSATA-based enclosure reasonably cheaply, and it even has 'tray-less' drive bays that can be hot-swapped.

(Tray-less drive bays are drive bays where you simply slide the bare drives in and out; you do not need to mount them in some sort of a carrier or a tray beforehand.)

From the right angle, a swappable disk is essentially the same as a tape, which makes a 12-bay enclosure essentially the same as a tape library (except much cheaper). 1 TB SATA disks are probably more expensive per gigabyte than LTO tapes, but they're not so much more expensive as to make this infeasible (at the medium scale that we operate at). And with only a bit of persuasion, Amanda is happy to treat filesystems on disks as tapes.

(In some ways it is too happy to treat them as tapes; it turns out that you have to give Amanda a staging disk in order to get parallel dumps, even when the dumps are going to disk anyways.)

So that is our current disk-based backup system. Each backup server is essentially an iSCSI backend that is running different software: a 1U server, with one system disk and one Amanda staging disk, connected to a 12-bay eSATA enclosure that's loaded with 1 TB SATA drives. Each of the 1 TB drives is divided up into three logical 'tapes' (to reduce the amount of space wasted when we have a slow dump day), and Amanda cycles through them exactly as if they were real physical tapes. Periodically we exhaust the 'tapes' loaded into the enclosure, so we pull the oldest hard drives out, set them aside, and put fresh ones in.
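To make the 'wasted space' point concrete, here is a rough Python sketch (not the actual tooling; the function name, the example dump sizes, and the simplification that every run consumes exactly one whole 'tape' that is never appended to or spanned are all assumptions for illustration). It compares one 'tape' per drive against three for a run of small nightly dumps.

```python
import math

def drive_usage(dump_sizes_gb, drive_gb=1000, tapes_per_drive=3):
    """Return (drives_consumed, wasted_gb) for a sequence of nightly dumps.

    Simplified model (an assumption, not Amanda's exact behaviour):
    each run uses up one whole 'tape', and a dump must fit on one 'tape'.
    """
    tape_gb = drive_gb / tapes_per_drive
    for size in dump_sizes_gb:
        if size > tape_gb:
            raise ValueError(
                f"a {size} GB dump does not fit on a {tape_gb:.0f} GB 'tape'"
            )
    # Every run burns one 'tape'; drives fill up tapes_per_drive runs at a time.
    drives = math.ceil(len(dump_sizes_gb) / tapes_per_drive)
    wasted = drives * drive_gb - sum(dump_sizes_gb)
    return drives, wasted

slow_days = [100, 250, 50]  # hypothetical dump sizes in GB
print(drive_usage(slow_days, tapes_per_drive=3))  # (1, 600)
print(drive_usage(slow_days, tapes_per_drive=1))  # (3, 2600)
```

In this model, three 'tapes' per drive puts the same three nights on one drive instead of three. The price is that no single night's dump can be bigger than a third of a drive.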

(We found a source for nice plastic 3.5" HD carry cases, so we don't have to stack the bare drives up. They look something like VHS tape cases.)

Having what is effectively a tape library gives us a significant boost in backup capacity all by itself (plus it gives us much better coverage during weekends and holidays). If we need still more we can add another backup server for what I believe works out to be at most half the cost of a tape drive alone.

We're keeping our existing tape-based backup system for periodic long-term archival backups. After all, we have a lot of perfectly good LTO tapes, and they're probably more durable over several years than SATA HDs. (Long-term durability of the SATA HDs in the disk-based system is not a concern, since with daily backups a given disk will get reused within at most a few months.)

(Necessary disclaimer: this backup system is the hard work of a number of people here, and my contribution was relatively small.)

Sidebar: the disk replacement schedule

If you are nervous about disaster recovery, you will want to pull and replace disks as soon as they're (fully) used, so that you can move them to your immediate offsite location, and you may want to accept wasted space and use only one 'tape' per disk. Locally, we accept the greater potential loss in a disaster like a machine room fire in exchange for faster and easier restores of recently deleted files.
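The numbers behind this trade-off can be sketched as follows. This is only an illustration: it assumes one backup run per night and one 'tape' consumed per run, and the function and key names are made up.

```python
def rotation_summary(bays=12, tapes_per_drive=3):
    """Summarise how long backups stay online and how soon a drive fills.

    Assumes one run per night and one 'tape' per run (an assumption).
    """
    return {
        # Nights of backups a fully loaded enclosure holds before a swap.
        "nights_online": bays * tapes_per_drive,
        # Nights before a drive is fully used and could be moved offsite.
        "nights_until_drive_full": tapes_per_drive,
    }

print(rotation_summary())
# {'nights_online': 36, 'nights_until_drive_full': 3}
print(rotation_summary(tapes_per_drive=1))
# {'nights_online': 12, 'nights_until_drive_full': 1}
```

Under these assumptions, one 'tape' per disk lets every night's backup leave the machine room the next day, but a full enclosure only covers 12 nights of restores instead of 36.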

Comments on this page:

From at 2009-05-09 09:20:17:

Fascinating to think about, especially in the light of using tape for longer term backups or perhaps archiving (we do both). I'm not sure how different it is because I've never used Amanda, but Bacula has direct support for disk based backup.

It's also useful to consider in the light of increasing storage space (we're up to 6TB and growing pretty quickly) and how this might bleed into HSM technologies.


From at 2009-05-09 10:07:03:

Thanks for the interesting write-up. What's kept me from pursuing something similar in the past has been 1) lack of characterization of the stability of off-line disks (unlike tapes); and 2) lack of off-site vendors that want to handle hard drives. It seems like you're avoiding (1) for the most part by using LTO for long-term archives; has the market changed with respect to (2)?


From at 2009-05-09 10:48:03:

This is a very cool write up! Thanks for posting it.

I'm curious about how you've solved some of the problems I've encountered with AMANDA and virtual tapes, namely Amanda's propensity to fill only a tiny amount of each tape on average and to never append to tapes.

I haven't gotten my config engineered to the specs that I want, but my plan is to wind up with several external USB drives which contain a ton of 'tapes' that Amanda can go through, thereby using the drive more efficiently. If I only had one virtual tape per drive, my dailies would get overwritten in however many days/tapes I've got. 5 tapes, 5 days, in other words.

My eventual plan will be to have those tapes doing dailies and have the tape library slogging away on the weekly archives. I'm interested in your thoughts on this.

BTW - We just bought Amanda Enterprise from zmanda. I'll be writing about it as soon as I get some experience with it.

Matt Simmons

By cks at 2009-05-10 00:37:59:

I don't know how commercial off-site vendors would react to hard drives these days; we're small and cheap enough that our 'off-site' solution is to store our backup media in some space we have in another building on campus.

Amanda continues to never append to 'tapes'; this is part of why we have each disk split up into multiple 'tapes' as far as Amanda is concerned, so that we minimize the space wasted if Amanda only puts a little bit of data on a 'tape'. (This is just one way that Amanda's handling of disk-based backups is less than ideal.)

I wouldn't trust USB for this for at least two reasons. First, USB enclosures and external drives generally don't seem very robust, and second, USB write speeds are famously slow, quite possibly slow enough to cause problems. USB drives are cheap, but in this case you really do get what you pay for.

Written on 09 May 2009.


Last modified: Sat May 9 02:28:14 2009