== When comment spammers attack

WanderingThoughts has been getting little one-off bits of comment spam for [[some time|MyFirstCommentSpam]], but late last night I had my first actual comment spam attack run. There's a number of interesting and odd little bits about it.

The first sign of trouble was a single comment spam posted in the evening of December 20th, from a PacBell DSL line. I removed it pretty rapidly, but on December 23rd the spammer came back from various other IPs to post 12 more spam comments between 3:40am and 3:43am.

(The spammer doesn't appear to have ever checked to see if their posted comment from the 20th worked, so I'm not sure why they only posted one comment then.)

During both comment spam runs, the spammer also deluged WanderingThoughts with a bunch of page requests from a variety of IPs (26 URLs from 23 different IPs the first time, 300 URLs from 75 different IPs the second time). Each request had two distinct signatures:

* a bunch of whitespace after the '_HTTP/1.1_' in the _GET_ command, instead of an immediate end of line.
* a _Referer_ header of '_!http://_' (no trailing slash), presumably in an attempt to make them look more legitimate.

I can only guess that this is an attempt to hide the comment spam posts in a bunch of other traffic. (Of course, I get sufficiently little traffic that the actual effect was to make both incidents stand out like sore thumbs.)

The spam comments were all identical. They looked like this:

> http://pd2.funnyhost.com
> desk3
> [url=http://pd4.funnyhost.com]desk4[/url]
> [link=http://pd6.funnyhost.com]desk6[/link]

Presumably funnyhost.com has simply picked the four most common ways of making links in blog comments and slammed them all in one spam in an attempt to cover all the bases. Some Googling suggests that they've been spamming like this quite actively for quite some time.

Funnyhost.com is claimed to be owned by 'Home Media', located in Malaysia, but has its websites hosted by Dotster.com.
Cleverly, their web pages only display for certain browsers; things like _lynx_ and _wget_ get empty (0 byte long) pages.

Once you get the real pages, they're a bunch of advertising links, leavened with JavaScript popups, some JavaScript to disguise links, and some images fetched from 'cache.revenuedirect.com'. The popup I looked at is served by 'webpdp.gator.com', aka Claria and apparently a common popup ad company. The advertising links are sent through 'pagead2.googlesyndication.com' (a real Google domain) before reaching the advertised websites.

The interesting domain in this is revenuedirect.com, which claims (according to their front page) to let you earn money from your domain names without needing to develop a website by 'monetizing' your traffic. They appear to do this by supplying canned websites (that seem to look a lot like funnyhost.com's) loaded with popups and disguised Google ads; presumably they collect a cut of the revenue.

So the model seems to be:

* Google and Claria pay people for running advertising.
* revenuedirect.com bundles up a canned setup for exploiting Google and Claria.
* funnyhost.com buys into revenuedirect.com's services, then spams a lot of blogs to draw traffic to their websites.

This all makes a handy illustration of [[affiliate marketing being dead|AffiliateMarketingIsUndead]]. Again.
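
(As a postscript: the two request signatures described earlier are simple enough to check for mechanically. What follows is only a rough sketch in Python, assuming you can get at the raw text of each request, meaning the request line plus headers; a typical access log may not preserve the trailing whitespace. The function name _spam_signatures_ and the keys of its result are made up for illustration and are not anything WanderingThoughts actually runs.)

```python
import re

# Signature 1: whitespace between 'HTTP/1.1' and the end of the request line.
TRAILING_WS = re.compile(r"^GET \S+ HTTP/1\.1[ \t]+$")


def spam_signatures(raw_request: str) -> dict:
    """Report which of the two signatures a raw HTTP request shows.

    raw_request is assumed to be the request line followed by the
    headers, with lines separated by CRLF (bare LF is tolerated).
    """
    lines = raw_request.replace("\r\n", "\n").split("\n")
    request_line = lines[0]

    referer = None
    for line in lines[1:]:
        if line == "":
            break  # a blank line ends the headers
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "referer":
            referer = value.strip()

    return {
        "trailing_whitespace": bool(TRAILING_WS.match(request_line)),
        # Signature 2: a Referer of exactly 'http://', with no host or slash.
        "bare_referer": referer == "http://",
    }


if __name__ == "__main__":
    sample = "GET /blog/ HTTP/1.1   \r\nHost: example.org\r\nReferer: http://\r\n\r\n"
    print(spam_signatures(sample))
```

Reporting the two signatures separately instead of as a single yes/no is deliberate; a request that shows only one of them might be an ordinary client with a quirk, so I would want to see both before treating it as part of a run like this.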