Another aphorism of system administration

August 24, 2005

If a piece of information doesn't have to be correct for the system to work, sooner or later it won't be.

(Yanked out of my spam by ASN summary, where a version of it got a mention in passing.)

It's easy to see this aphorism in action. For example, if your hostname to IP address mapping is wrong, you notice right away. But nothing common breaks if the reverse mapping from an IP address to a hostname is missing, and lo, it is missing all over the place.
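To make the asymmetry concrete, here is a minimal sketch of checking whether an IP address has reverse DNS that agrees with its forward DNS. The function name `ptr_status` and its three result labels are made up for illustration, and it only handles IPv4 because `socket.gethostbyname_ex` does. The resolver arguments exist only so the check can be exercised without touching real DNS.

```python
import socket

def ptr_status(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Classify an IPv4 address's reverse DNS as 'missing', 'mismatch', or 'ok'.

    'missing'  - no hostname comes back for the IP address
    'mismatch' - a hostname comes back, but it does not resolve back to the IP
    'ok'       - the hostname resolves back to the same IP
    """
    try:
        # Both resolvers return (hostname, aliases, addresses).
        hostname = reverse(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return "missing"
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return "mismatch"
    return "ok" if ip in addresses else "mismatch"
```

Calling `ptr_status("192.0.2.10")` uses the system resolver. Nothing on the machine being checked depends on the answer, which is rather the point.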

And of course comments in source code are the classical case. Nothing breaks if the comment doesn't describe what the code is actually doing, so often the comment doesn't.

Corollary:

Attempting to validate a non-essential piece of information will inevitably turn up lots of perfectly valid systems that have the information wrong.

This most often comes up in antispam efforts, where people desperately start attempting to insist on correct information in previously non-essential bits and discover that lots of people have it wrong or broken, often people who they still wanted to talk to. For example, for years almost no one cared what an SMTP HELO or EHLO greeting said, so real mail servers have all sorts of broken greetings.
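As a rough illustration of what such validation might look like, here is a sketch of a purely syntactic check on a HELO or EHLO argument. The name `helo_looks_valid` and the specific rules are assumptions for this example: it accepts a bracketed address literal or a dotted hostname with at least two labels, and treats bare IP addresses, single-label names, and labels with odd characters as broken. A real mail server may well draw the line elsewhere.

```python
import ipaddress
import re

# One DNS label: letters, digits, and interior hyphens, at most 63 characters.
_LABEL = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")

def _is_ip(text):
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False

def helo_looks_valid(arg):
    """Return True if a HELO/EHLO argument is syntactically plausible."""
    arg = arg.strip()
    if arg.startswith("[") and arg.endswith("]"):
        # Address literal, e.g. [192.0.2.1] or [IPv6:2001:db8::1].
        literal = arg[1:-1]
        if literal.lower().startswith("ipv6:"):
            literal = literal[5:]
        return _is_ip(literal)
    if _is_ip(arg):
        # A bare IP address without brackets is a common broken greeting.
        return False
    labels = arg.rstrip(".").split(".")
    return len(labels) >= 2 and all(_LABEL.match(label) for label in labels)
```

Run against the greetings in real mail logs, a check like this tends to flag plenty of senders that you still want mail from, so its result is probably better used as one signal among several than as grounds for outright rejection.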

There are two fundamental reasons for this:

  • People are lazy; they don't like doing things that seem to just be make-work. (This is one big reason why security is a pain, too.)
  • It's hard to notice incorrect information that nothing depends on.

Corollary: if you want to ensure that some piece of information is correct, you must make something important check it and depend on it. The more important the better, because otherwise people may just ignore the fact that your checker is either screaming or broken.


Comments on this page:

From 67.190.163.211 at 2005-08-25 10:19:48:

I think your characterization is slightly incorrect, especially in the case of PTR records.

There's a negative correlation between the amount of attention paid to maintaining DNS records, and the amount of spam coming from a box. This may be because networks with staff who don't have time to change PTRs when they change A records also don't pay much attention to host security, etc.

In a sense, that makes lack of reverse DNS a sign of lack of competence at stopping bad things happening on that network. It becomes a useful predictor of behavior.

By cks at 2005-08-25 14:33:35:

There's a difference between a heuristic and a sure thing. Missing or bad PTR information is the former, even for SMTP mail. (So is a bad SMTP HELO; we have a pile of exemptions for people who can't manage to get it right but we still want to get email from.)

Once you step outside of SMTP email, I think missing/bad PTR data is an even less valuable heuristic.

For example, a quick check of 8,885 different IP addresses that made good requests from a web server here over the past 28 days or so shows that about 28% of them have missing or bad PTR data. I don't think we have quite that many spammers et al making proper web requests (and I happen to know that at least some legitimate web search engine crawlers have bad PTR information).

Written on 24 August 2005.
Last modified: Wed Aug 24 17:35:56 2005