Speculation about what comment spammers think they're doing here

September 22, 2012

To summarize briefly, comment spam attempts here show some odd behavior: when I add sources to my IP blocks, I see significant hits on those blocks, but the level of non-blocked comment spam attempts stays more or less the same (it just comes from new IPs). It's as if the comment spammers keep trying from the old IPs but also add new IPs. I'm a firm believer that spammers are generally not stupid; whatever strange things they're doing are being done for reasons that make sense to the spammers. So the real question I'm left with is what the comment spammers are targeting here. What is the actual goal that their software presumably thinks it's dutifully achieving?

What their software actually does almost all of the time is fill in all of the text fields on the 'add a comment' page (including my honeypot field that you are not supposed to touch), submit it for previewing, and then not do anything more. In particular the spammers seem to basically never attempt to resubmit the spam to actually post it; one POST and they're done.
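As a rough illustration of why filling in every text field gives the spammers away, here is a minimal sketch of a honeypot check. The field names and the classify_preview function are hypothetical; the post doesn't say what the real form fields are called or how the real check is written.

```python
# Hypothetical name for the hidden field that real people never fill in.
HONEYPOT_FIELD = "website"


def classify_preview(form: dict[str, str]) -> str:
    """Return 'spam' if the honeypot field has any content, else 'ok'."""
    if form.get(HONEYPOT_FIELD, "").strip():
        return "spam"
    return "ok"


# Software that fills in every text field trips the honeypot.
bot_form = {
    "name": "x",
    "comment": "buy pills",
    HONEYPOT_FIELD: "http://spam.example",
}
# A person never sees the honeypot field, so it comes back empty.
human_form = {"name": "Dan", "comment": "Nice post", HONEYPOT_FIELD: ""}

print(classify_preview(bot_form))
print(classify_preview(human_form))
```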

I've come up with two speculations on what they're doing so far. First, the spammer software could think that it's actually succeeding in posting spam comments and it could be targeting 'so many comments posted successfully'. This is a bit of a stretch but the raw text of a comment is (re)displayed on the preview page (although the HTML version is not shown if the honeypot field was touched). Software that simply searched for its submitted spam text might be satisfied and conclude that the comment had been successfully posted.
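To make this first speculation concrete, here is a small sketch of how such a naive success check could be fooled. Both functions are invented for illustration: render_preview is a stand-in for a preview page that always echoes the raw comment text, and naive_bot_thinks_posted is a guess at what simple spam software might do. Neither is the real wiki code or any real spam tool.

```python
import html


def render_preview(comment: str, honeypot_touched: bool) -> str:
    """Hypothetical preview page.

    The raw text is always redisplayed in the edit box; the rendered
    HTML version only appears if the honeypot was left alone.
    """
    page = "<textarea>" + html.escape(comment) + "</textarea>"
    if not honeypot_touched:
        page += "<div class='rendered'>" + html.escape(comment) + "</div>"
    return page


def naive_bot_thinks_posted(submitted: str, response_body: str) -> bool:
    """Guess at a naive check: is my text anywhere in the response?"""
    return submitted in response_body


spam = "cheap watches here"
page = render_preview(spam, honeypot_touched=True)
print(naive_bot_thinks_posted(spam, page))
```

Even though the honeypot was touched and nothing was rendered or posted, the text search still finds the spam in the edit box, so software like this would count the attempt as a success.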

Second, the spammer software could be trying to flood a (presumed) moderation queue with a high volume of spam submissions in the hopes that something would get through by mistake. The software would then be targeting 'so many comments submitted into the queue' and it would continue to pound away even if nothing seemed to be getting through; after all, the people behind the moderation queue only have to make a mistake once.

(I feel that one of the principles of the modern Internet spam game is 'automated work is cheap'. If the spammer can just leave software running to do something, they might as well keep it banging away; the cost of leaving it running is probably low enough that even a single success pays for it. In an environment where you have to rent botnets in 15-minute blocks and so on, this may not be quite as true as I've been assuming.)


Comments on this page:

By Dan.Astoorian at 2012-09-22 09:29:35:

Perhaps the simplest explanation for the IP address patterns is that botnets are not static; they grow as new hosts get infected, and the hosts within them that have dynamic IP addresses don't keep them forever.

I don't know how the comment spamming software is typically designed, but I'm not sure there's any incentive for it or its operators to even check whether a posting is successful. If it succeeded, then it's done its job and no further action is needed; otherwise, it would take someone real work to figure out why it didn't get posted, the payoff for doing that work is relatively low, and there's probably no significant reduction in costs by pruning that forum from its list of targets for the next attempt. So why even check whether or not it worked?

"Never test for an error condition you don't know how to handle." --Steinbach's Guideline for Systems Programmers.

By cks at 2012-09-24 11:11:27:

My problem with the shifting or growing pool of IPs explanation is that the patterns I'd expect from it don't seem to match what I'm seeing. If the pool of IPs that the spammers were using was steadily increasing, I'd expect to see more and more non-blocked spam attempts and I generally haven't. If the IPs were shifting, I'd expect the attempts from old, now-blocked IPs to level off and drop over time and again this doesn't seem to happen; they still have a significant volume (more attempts than from unblocked IPs, generally).
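One way to look at this is to tally attempts per day, split by whether the source IP was already blocked. The sketch below assumes a made-up record format of (day, ip, was_blocked) tuples and uses invented sample data with documentation-range IPs; it is not drawn from the real logs.

```python
from collections import Counter


def daily_counts(attempts):
    """Tally attempts per day as {day: (blocked, unblocked)}.

    attempts is an iterable of (day, ip, was_blocked) tuples; this
    record format is an assumption made for the example.
    """
    blocked = Counter()
    unblocked = Counter()
    for day, _ip, was_blocked in attempts:
        if was_blocked:
            blocked[day] += 1
        else:
            unblocked[day] += 1
    days = sorted(set(blocked) | set(unblocked))
    return {day: (blocked[day], unblocked[day]) for day in days}


# Invented sample data, purely to show the shape of the output.
sample = [
    ("2012-09-20", "192.0.2.1", True),
    ("2012-09-20", "192.0.2.2", False),
    ("2012-09-21", "192.0.2.1", True),
    ("2012-09-21", "192.0.2.1", True),
    ("2012-09-21", "198.51.100.7", False),
]

for day, (n_blocked, n_unblocked) in daily_counts(sample).items():
    print(day, "blocked:", n_blocked, "unblocked:", n_unblocked)
```

A steadily growing pool should show the unblocked column climbing over time, and a shifting pool should show the blocked column tailing off; a flat unblocked column next to a persistently high blocked column fits neither.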

(Also, a lot of the IPs seem to be static IPs of servers (which is one of the things that has changed over time; they certainly used to be botnet IPs in part).)

You're right that I may be assuming too much about spam software. They might only care about 'made N attempts that were not obviously blocked' (ie, that got 200-series responses or something).



Last modified: Sat Sep 22 02:54:06 2012