Why I think that stupid spamming is actively wasteful
In reaction to my last entry, a commentator wrote:
You assume it's more cost efficient for the spammer to fix his system rather than just have a slightly higher percentage of broken addresses in his list than otherwise. I'd guess the broken addresses cost the spammer virtually nothing in resources or time.
I used to feel this way, that spamming was basically free, but I've shifted my views over time. My current belief is that in today's Internet environment, sending spam to addresses is not so cheap that it's pointless to measure. In fact, I suspect that modern spammers are often email-rate-limited, so sending to bad addresses directly displaces email that could go to potentially good addresses.
First off, let's take an easy case, that of people exploiting webmail systems via compromised accounts (as happened with us). Whether the spammers are using 'mules' to enter things by hand or they're driving the webmail systems by automation, it seems extremely likely that the spammer will have a relatively low sending rate limit (either the mules can only type and click so fast, or the webmail server software can and will only respond so fast). Thus, every clearly bad email address emailed to is a possibly good email address not mailed to.
(I'm making what I feel is the safe assumption that spammers have basically an infinite supply of potentially good email addresses they could spam.)
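To make the displacement argument concrete, here is a minimal sketch of the arithmetic, assuming a spammer with a fixed sending budget and an effectively unlimited supply of addresses. The function name, the budget of 10,000 messages, and the bad-address fractions are all hypothetical numbers picked for illustration, not measurements of any real spammer.

```python
def potentially_good_reached(send_budget: int, bad_fraction: float) -> int:
    """Messages that go to potentially good addresses under a fixed budget.

    send_budget:  total messages the spammer can get out before being
                  rate-limited or cut off (hypothetical figure).
    bad_fraction: share of the address list that is clearly bad (0.0 to 1.0).
    """
    if not 0.0 <= bad_fraction <= 1.0:
        raise ValueError("bad_fraction must be between 0.0 and 1.0")
    # Every send slot used on a bad address is a slot that a potentially
    # good address did not get, since the address supply is assumed unlimited.
    wasted = round(send_budget * bad_fraction)
    return send_budget - wasted


if __name__ == "__main__":
    budget = 10_000  # hypothetical sending limit
    for bad_fraction in (0.0, 0.05, 0.20):
        good = potentially_good_reached(budget, bad_fraction)
        print(f"bad fraction {bad_fraction:.0%}: "
              f"{good} potentially good, {budget - good} displaced")
```

The point of the sketch is only that under a hard sending limit the loss is linear in the fraction of bad addresses; it says nothing about how large that limit actually is for any given spammer.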
But let's suppose that the spammer has no message submission problems; they can stuff the queue with as much email to as many addresses as they want. The next limitation is the sending mailer itself. Spammers very often use compromised machines with whatever MTA setup the machine already has, which is extremely unlikely to be configured for high sending volumes. The MTA will likely only be able to do DNS lookups and route messages so fast and make so many simultaneous delivery attempts, either through software limits or through machine capacity limits. Here again, bad addresses clearly displace potentially good ones.
(It's not uncommon for me to connect to the SMTP port on a machine that's sending out spam and have it report a temporary failure because of resources exceeded.)
Finally we have the actual delivery. Ignoring greylisting, I've seen clear evidence that large mail providers pay attention to delivery volumes and especially delivery volumes to bad addresses. Even here we've periodically seen temporary SMTP failures from the likes of GMail with messages to the effect of 'slow down, you're trying to send us too much too fast'. Every address a spammer tries to send to at such providers is one more point in their internal scoring systems for 'this IP is probably sending spam', and probably even more so for bad addresses; again bad addresses are displacing potentially productive ones and pushing the sending IP that much closer to when the provider will choke it off. Greylisting has similar but smaller effects (since it won't necessarily choke off future potentially good email addresses, just delay things). The effects of all of this are going to be magnified if the spammer is hijacking a compromised machine with a normal MTA that's set up for normal mail volumes.
You can build very custom infrastructure that has no problems with all of this (although you're still going to run into issues with destinations choking you off for too much volume). But I don't think most spammers these days are using anything that sophisticated, so all of those spammers are very likely to be email-rate-limited in their spamming.