You should delete obsolete data files
October 10, 2009
Around here, our systems have layers of accreted history; scripts that are generating files that are used by, well, we're sometimes not entirely sure any more. Every so often we reach the point where we turn another one of the file-generating scripts off (we take them out of crontabs, we remove invocations of them from other scripts and so on).
Having done this for a while now, I have a suggestion: when a data file becomes obsolete and is no longer updated, you should immediately delete it. Don't keep it around just in case anything still refers to it, because if there are any remaining users, you want things to break right away, while you still remember what you just changed.
(If you are lucky, things will break with error messages about 'cannot read file X' and it will be obvious why. But sometimes things will just malfunction, and then it really helps to have a recent change to blame.)
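As a rough illustration of doing the deletion at the moment you turn the generating script off, here is a minimal Python sketch. The function name `retire_data_file`, the `reason` argument and the logger name are all made up for this example; the point is only that the removal leaves a dated log entry behind, so there is a recent change to blame if something does break.

```python
import logging
from pathlib import Path

# Hypothetical logger name; point it at wherever you record changes.
log = logging.getLogger("retire")


def retire_data_file(path, reason):
    """Delete an obsolete data file and log the removal.

    Returns True if a file was removed, False if it was already gone.
    """
    p = Path(path)
    try:
        p.unlink()
    except FileNotFoundError:
        log.info("already gone: %s (%s)", p, reason)
        return False
    # Log after the unlink succeeds, so the log never claims a deletion
    # that did not actually happen.
    log.warning("deleted obsolete data file %s: %s", p, reason)
    return True
```

You would call this in the same change that takes the script out of the crontab, with the reason set to something like the name of the script you just turned off.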
The problem with leaving such files around just in case is that you still get breakage, but it is much more subtle breakage. What happens is that the file slowly slips further and further out of correspondence with reality (as reality keeps changing but it doesn't), and sooner or later this divergence starts producing odd results. Things work for old accounts (or old bits of data in general) but not for new enough ones; things go to the wrong place; deleted things mysteriously resurface or still part-work. Straightforward, immediate breakage is more painful (and perhaps more embarrassing if you overlooked something important) but is much better in the long run.
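If you suspect you already have files like this lying around, one crude way to look for them is to check modification times. The sketch below assumes that a data file that has not been modified in some number of days is a candidate for being obsolete; that is only a heuristic (some files are legitimately static), and the function name and threshold are invented for the example.

```python
import os
import time


def find_stale_files(directory, max_age_days, now=None):
    """Return sorted paths of regular files in directory whose mtime
    is more than max_age_days old. Does not recurse or follow symlinks."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                stale.append(entry.path)
    return sorted(stale)
```

This only reports candidates; deciding whether a given file is really obsolete still takes a human who knows (or can find out) what used to generate it.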
I admit that this is hard for me to do; I'm a packrat by nature, and even in an environment with version control systems and backups my instinct is to keep old data files around just in case. But just in case almost never shows up, so I need to wield the