A steady change in the source of blog comment spam attempts

December 21, 2014

Wandering Thoughts has been in operation for long enough that I've been able to observe a slow shift in the sources of comment spam attempts over the years. Roughly speaking (and relying on a fallible memory), in the beginning many of the comment spam attempts came from what appeared to be open proxies or otherwise compromised machines, to the point where I tried using DNS blocklists like the CBL and SBL as defenses (which didn't work out in the end). Then, at least as I perceived it, the comment spam sources largely shifted to dodgy foreign hosting providers broadly located where you'd expect them to be (Eastern Europe, Russia, and China). And then lately the majority of the still-unblocked sources have shifted to US-based hosting providers and datacenters.

At the moment, the largest group of sources seems to come from IP address space assigned to 'DataShack LC' and 'WholeSale Internet, Inc'. Where sub-delegation information is readily accessible through whois, the specific IP addresses appear to have been delegated in very small slices to entities that appear to be Chinese based on their names; a typical example is 69.197.128.163, currently assigned to 'Zhou Pizhong' via 69.197.128.160/29. The IP addresses almost never have reverse DNS information available.

For a long time I've been reluctant to explicitly block US hosting providers, for various reasons. I've now decided that that's over for me; large netblocks for these persistent sources are now going into my blocklist. Hopefully it will never affect someone using a VPN (or a personal cloud Unix machine) to try to leave a legitimate comment here.
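As a rough illustration of what blocking by netblock amounts to, here is a minimal Python sketch using the standard ipaddress module. It is not how this blog actually implements its blocks; the BLOCKED_NETWORKS list and the is_blocked() helper are hypothetical names, and the only entry is the /29 delegation mentioned above, used purely as an example.

```python
import ipaddress

# Hypothetical blocklist for illustration only. The single entry is the
# /29 delegation quoted in the text, not a real list of blocked networks.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("69.197.128.160/29"),
]


def is_blocked(addr: str) -> bool:
    """Return True if addr falls inside any network in BLOCKED_NETWORKS."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED_NETWORKS)


if __name__ == "__main__":
    # The example address from the text sits inside the /29.
    print(is_blocked("69.197.128.163"))
    # The /29 covers .160 through .167, so .168 is outside it.
    print(is_blocked("69.197.128.168"))
```

A real blocklist would hold much larger prefixes than a /29, which is exactly why collateral damage to legitimate commenters becomes a concern.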

One of several reasons that this depresses me is that it implies that being a source of repeated persistent comment spamming is no longer enough to get people terminated from even US-based hosting (if it ever was). Or at least from second-tier US hosting, since I still don't see many, if any, comment spam attempts from the large but inexpensive providers like AWS, Linode, and so on.

(I noticed part of this shift to hosting providers a couple of years ago, but back then it was mostly to European hosting providers and many of them were in dodgy areas.)

PS: Mind you, some of this apparent shift in comment spam sources turns out to be a bit illusory. My very first spam comment came from a US hosting provider, as did a lot of sources from a big incident early on. And I haven't kept any sort of records over the years, or even often tried particularly hard to identify the sources and keep notes. The most extensive sort of 'notes' I have are all of the various network areas I've blocked from leaving comments because their volume of comment attempts irritated me, and that's not exactly a scientific process.


Comments on this page:

As an active blogger myself, I have found the best way to beat automated form spam is by implementing some sort of proof-of-work puzzle on the client through JavaScript.

On my blog, I implement Hashcash through the WordPress plugin system. Any comment that does not submit a valid Hashcash token is immediately sent to the WordPress spam queue. This also means that my blog has an idea of how much comment spam has been avoided:

Akismet has protected your site from 1,053,481 spam comments already. There are 2,552 comments in your spam queue right now.

5,791 comments have been approved, and are visible on the posts.

Prior to Hashcash, I was seeing a good number of the flattering type of spam comments come through the system. "Nice Blog, thanks for sharing this kind of information.", "Very good info. Lucky me I ran across your website by accident. I have book-marked it for later!", etc.

For every comment currently in the spam queue, I see the following attached at the bottom of each comment:

[WORDPRESS HASHCASH] The poster sent us '0 which is not a hashcash value.

I can't say enough good things about client-side proof-of-work puzzles, whether they are memory hard, CPU hard, or network hard.
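For readers unfamiliar with the idea, a CPU-hard proof-of-work check can be sketched roughly as follows. This is a simplified, Hashcash-style illustration in Python, not the WordPress plugin's actual algorithm and not the real Hashcash stamp format (which uses SHA-1 and a specific header layout). The function names, the "challenge:nonce" string layout, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash to check that the token meets the difficulty."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce; expected cost is about 2**difficulty hashes."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    # "comment-form-123" is a made-up per-form challenge string.
    nonce = solve("comment-form-123", 12)
    print(nonce, verify("comment-form-123", nonce, 12))
```

The asymmetry is the point: the submitter has to grind through many hashes, while the server verifies with a single one, so a spam bot that never runs the client-side code simply never produces a valid token.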

Written on 21 December 2014.


Last modified: Sun Dec 21 01:36:31 2014