Wandering Thoughts archives

2005-08-15

Check your backups

Backups have been in the geek news recently, courtesy of the chilling tale of the online comic Penny Arcade's backup failure. Making backups is important, but there's an equally important and far less appreciated piece: checking your backups to make sure that you can actually restore them.

You may say 'but, I'm sure my system would scream a lot if something was wrong'. Let me tell you a story about that.

Once upon a time there was a young and innocent system administrator. He had an Ultrix MIPS DECStation to take care of (which says something about how long ago this was), and part of taking care of it was backing it up. Dutifully he arranged tape backups; because he worked at a university, they were tape backups over the network to a remote tape drive.

Unfortunately the Ultrix network backup program, rdump, insisted on logging in to the tape server as the wrong user (and which user it used was hard-coded). No problem; this was a university, so he had full Ultrix source code. Changing which user rdump used was a simple text edit and recompile.

While doing this, he noticed that rdump's Makefile compiled things without optimization. Since this was on a MIPS-based system, where compiler optimization was important for decent performance, the system administrator fixed that when he recompiled rdump.

About a year later he accidentally did something to a relatively unimportant file and decided he wanted to restore it from a backup. He queued up the right tape, got the rrestore program talking to it, and rrestore promptly told him that the backup was corrupted. This could happen sometimes, so he tried another tape; then another; then all of them. Not one was good.

I will cut to the punchline: the MIPS compiler had an optimizer bug. Compiling rdump with optimization on ran into this bug (which was why the original rdump Makefile didn't do that), and the bug made rdump produce corrupted and unrecoverable output while (of course) thinking all was fine.

A year's worth of backups were literally worthless. The young system administrator had a small heart attack, thanked his lucky stars he had found this before he needed to restore anything important, recompiled rdump without optimization, and immediately scheduled some full backups. And tested them afterwards, just to be sure.

So, having stubbed my toe, I strongly urge you: test your backups by trying to restore at least something from them. If you don't, you don't actually know if you have backups, you just think you do. (And remember, 'optimism is not a plan'.)

sysadmin/CheckYourBackups written at 01:35:15


