2005-09-24
A spammer roundup
It's time for a roundup of the status and activities of my various perennial spammers and spam sources. (Unfortunately I can't literally round them up, nor use something like Roundup (tm) to make them disappear. Spammers are more persistent than weeds.)
Hotmail's spam problem continues unabated, despite attempts to get Microsoft's attention. This week alone we rejected 330 attempts by Hotmail to get us to accept non-'@hotmail.com' addresses from them. They involved 266 different email addresses from 86 different domains; the clear 'winner' among the domains was msn.com (222 times), but then there were such domains as 'onlineuklotttery.com', 'betterdaysloto.com', and 'unionbanksite.com'.
Hotmail's other spam problem is also still happening every so often. Just today we refused a Hotmail email from 82.169.144.3, part of SBL19800, listed as an advance fee fraud source since April 4th.
The referer spammers still hit me once or twice a day, always for the spam category page and always from compromised machines (often listed on the XBL). Mostly it's for online card game gambling sites, although a couple of times it's been for online pharmacies. The web site hosting has moved to the IP address 161.58.59.8, Verio Web Hosting, with a reverse DNS of 'blackjack-123.com'. (Google shows that this IP address has been hosting Referer spam websites for quite a long time.)
They are getting creative in the domain names; I have to enjoy 'www.evilplots.com'. There doesn't seem to be any particular commonality in the domain registration information. All of the ones for the past week use 209.200.14.204, 64.234.220.141, and/or 161.58.59.8 as their nameservers, under various names; 64.234.220.141 is part of SBL17672, a ROKSO listing for Traffix.
The major comment spammers from CommentSpamWritLarge are still trying to post comments; they've made 182 attempts (from 108 different IP addresses) since the early morning of September 18th. 72 attempts were from just one IP address, 208.62.160.29, 'millwood.simplecom.net', part of Bellsouth's IP range. The claimed user agent was 'libwww-perl/5.803', so apparently one of the spammers has a Perl program to do this sort of stuff. (A Google search shows that we are not the only web site getting hit by these people.)
Of the big previous sources, 209.200.11.96/28 (previously the leading source) seems to have disappeared. Still appearing to at least some extent were 80.237.140.233, 168.143.113.0/24, and 207.248.240.119.
As always, neither group appears to care in the least that their attempts are completely fruitless.
It's a multi-protocol world after all
I just fixed a wee bug in DWiki's Atom syndication feeds. The bug was that https:// URLs (such as references to Red Hat's Bugzilla) got mangled in Atom feeds, and only in Atom feeds, to be prefixed with the web site's URL.
DWiki normally generates shortened URLs that have full paths but omit the 'http://website/' bit (for various reasons). But when it generates Atom feed entries, DWiki needs to generate only absolute, fully qualified URLs (the Atom spec calls for this, among other reasons). This means that it needs to be able to recognize which URLs are already fully qualified (because they refer to external websites) and which ones aren't. To tell if a URL was already fully qualified, DWiki was just looking for 'http://' at the start of the URL it had. So DWiki thought https:// URLs weren't fully qualified and 'helpfully' qualified them in the Atom feed entries.
When I wrote that code, I had forgotten that it's a multi-protocol world (technically, a multi-scheme world). And in a multi-scheme world, checking for just one scheme is almost certainly a bug. In this case, I should have been checking to see if the URL had any scheme at all (which takes somewhat more code; DWiki now goes to this effort).
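As a rough illustration of what 'checking for any scheme at all' can look like, here is a minimal Python sketch. It is not DWiki's actual code; the names has_scheme, absolutize, and site_root are made up for this example, and the pattern is simply the scheme syntax from RFC 3986 (a letter, then letters, digits, '+', '-', or '.', followed by a colon).

```python
import re

# Scheme syntax per RFC 3986: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":"
_SCHEME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*:')

def has_scheme(url):
    """Return True if url starts with any URL scheme, not just 'http://'."""
    return _SCHEME_RE.match(url) is not None

def absolutize(url, site_root):
    """Hypothetical helper: qualify a site-relative URL, leave others alone.

    site_root is assumed to look like 'http://website/'.
    """
    if has_scheme(url):
        # Already fully qualified (http, https, ftp, mailto, ...).
        return url
    return site_root.rstrip('/') + '/' + url.lstrip('/')
```

With a check like this, an https:// link is passed through untouched, while a shortened '/some/page' URL still gets the site prefix added.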
As a result, I have a new mantra: if my code is looking for http:// and I'm not about to connect to a web server, I probably have a bug. (The magnitude of the bug may vary, but as a minimum all my code should look for https:// too.)
(The wonder of having a blog and talking about my own code bugs is that I can display my stupid programming moments in public. Perhaps it'll goad me into writing higher quality code from the start.)