Losing track of part of our Amanda configuration and then recovering it
Yesterday, I wrote an entry on two ways to have Amanda always
make full backups of a filesystem, and
mentioned that we'd started out with one way (forcing them with
amadmin) and had switched to the second way (configuring
'dumpcycle 0') a few years ago. There's a story as to why I wrote an entry
about it now, instead of a few years ago.
For years, we've always been making full backups of our mail spool,
because in our experience 'incremental' backups were almost as big
and took longer. We started out doing this with our first method,
and it worked fine for years. Then we started having the mail spool
not back up at all once in a while. This was obviously bad, and we
eventually worked out that it was because 'amadmin .. force' skips doing
a backup at all if a full backup won't fit (as covered in yesterday's
entry). Having worked out this logic and
found the second and for us better approach of 'dumpcycle 0', we
switched over to it and moved on.
Then, over time, we forgot about the whole chain of logic. Recently
we were looking at always doing full backups for another filesystem
for reasons outside the scope of this entry, and we couldn't figure
out why our existing full backups filesystem was set up to use this
odd, indirect method of doing it instead of the obvious direct approach
of forcing things. To make it more puzzling, our cron setup still
had a commented out 'amadmin force' invocation. Fortunately we
had archives of our old discussions, and we were able to go through
them to recover the context of this bit of our Amanda configuration
to understand the 'why' (and more of the 'what', because we hadn't
remembered the gotcha of using 'amadmin force' instead of 'dumpcycle 0').
(I've sort of written about this, when I wrote about how you should document why you didn't do attractive things. This isn't quite what I was thinking of in that entry, but it's certainly in the general area.)
Our fix for this was to put a big comment about the situation in
amanda.conf (before the 'dumpcycle 0') to document the
why of this configuration setting.
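
As a rough illustration of the sort of comment I mean (this is not our
actual configuration; the dumptype name and the comment wording here are
made up for the example), it might look something like this:

```
# WHY 'dumpcycle 0' INSTEAD OF FORCING FULL BACKUPS FROM CRON:
# We always want full backups of the mail spool, because incrementals
# were almost as big and took longer. We used to force them with
# 'amadmin <config> force <host> <disk>', but a forced full backup
# that won't fit is skipped entirely, so some nights the mail spool
# was not backed up at all. Do not go back to 'amadmin force'.
define dumptype mailspool-always-full {   # hypothetical name
    comment "mail spool: always do full backups"
    dumpcycle 0
}
```

The important part is not the exact wording but that the reason sits
right next to the setting it explains, where anyone looking at the
setting will see it.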
We've also revised our cron setup to entirely remove the commented
out 'amadmin force' bits, so that in the future we can't stumble
over them and start wondering why they were there.
(We do often write up the 'why' of changes and configurations, but we mostly do it in our worklog system, which makes such information less immediately accessible and obvious. Here we now have strong evidence to say that we should make sure this information is very visible (ie, our getting confused this time). Our worklogs can also have the problem of assumed context, including vaguely mentioning problems that we hadn't recorded in worklog.)