Weekly spam summary on March 24th, 2007
This week, we:
- got 12,733 messages from 249 different IP addresses.
- handled 21,567 sessions from 1,259 different IP addresses.
- received 197,829 connections from at least 58,846 different IP addresses.
- hit a highwater of 8 connections being checked at once.
This is up from last week, although the messages received count remains down from the usual levels.
The per-day numbers show an interesting general decline in the number of new distinct IP addresses talking to us over the week (and the general Thursday dip also makes me wonder).
Kernel level packet filtering top ten:
Host/Mask            Packets   Bytes
188.8.131.52           19714   1084K
184.108.40.206/24      15468    751K
220.127.116.11         13923    724K
18.104.22.168/24       11192    672K
22.214.171.124/24      10640    482K
126.96.36.199           4493    216K
188.8.131.52            3979    212K
184.108.40.206          3645    170K
220.127.116.11          2383    143K
18.104.22.168           2245    135K
This is down from last week, partly because at least some of the active webmail subnets seem to have quieted down a bit.
- 22.214.171.124 and 126.96.36.199 kept trying to send stuff with origin addresses that had tripped our spamtraps.
- 188.8.131.52 reappears from last week.
- 184.108.40.206 is in acceleratebiz.com IP address space, and we don't talk to that any more. Considering its current hostname is 'mail.thefreebiediscount.com', I can't imagine that we're missing much.
- 220.127.116.11 is a telecomitalia.it IP address and returns from earlier this month.
- 18.104.22.168 kept trying with a bad HELO.
- 22.214.171.124 is a Taiwanese IP address with no reverse DNS.
Connection time rejection stats:
62832 total
38554 dynamic IP
17429 bad or no reverse DNS
 5222 class bl-cbl
  262 acceleratebiz.com
  185 class bl-sbl
  160 class bl-pbl
  154 class bl-sdul
  127 dartmail.net
  101 class bl-dsbl
   94 cuttingedgemedia.com
   72 class bl-njabl
(Note that I don't always put specific domain blocks in this list, even if they show up in the overall numbers.)
The highest SBL source this week is SBL52715 (a spam source and landing pages /27, listed only today) at 108 rejections. Next is SBL50181 (good old microcamp.com.br's compromised web server, listed since January 18th) at 37 rejections.
Nine of the top 30 most rejected IP addresses were rejected 100 times or more this week; the leaders are 126.96.36.199 (455 rejections, bad reverse DNS), 188.8.131.52 (247 rejections, generic fastwebnet.it), 184.108.40.206 (221 rejections, bad reverse DNS), and 220.127.116.11 (217 rejections, verizon dynamic IP). It's striking that only two out of the nine are not in zen.spamhaus.org.
Fourteen of the top 30 are currently in the CBL, twelve are currently in bl.spamcop.net, fourteen are currently in the PBL, and a grand total of 20 are in zen.spamhaus.org.
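For reference, checking an address against a DNSBL such as zen.spamhaus.org or bl.spamcop.net is a DNS lookup of the address's reversed octets under the list's zone. The Python below is a minimal sketch of that general convention, not the tooling behind the numbers above; the function names and the injectable `resolver` parameter are made up for illustration, and treating any answer at all as 'listed' is a simplification.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to look up when testing an IPv4 address against a DNSBL."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="zen.spamhaus.org", resolver=socket.gethostbyname):
    """Return True if the DNSBL name resolves to anything at all.

    `resolver` is a hypothetical hook so that the lookup can be replaced
    in tests; by default it performs a real DNS query.
    """
    try:
        resolver(dnsbl_query_name(ip, zone))
    except OSError:
        # socket.gaierror (no such name) is a subclass of OSError; this
        # also treats other lookup failures as 'not listed'.
        return False
    return True
```

For example, `dnsbl_query_name("127.0.0.2")` produces `2.0.0.127.zen.spamhaus.org`, and a different list is selected by passing its zone as the second argument.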
This week Hotmail had:
- no messages accepted.
- no messages rejected because they came from non-Hotmail email addresses.
- 32 messages sent to our spamtraps.
- no messages refused because their sender addresses had already hit our spamtraps.
- 5 messages refused due to their origin IP address (two in the CBL, two from Nigeria, and one in SBL49971).
And the final numbers:
|what||# this week||(distinct IPs)||# last week||(distinct IPs)|
Now those are the sort of numbers on bad bounces that I like to see.
As usual, bad HELOs have no sources that particularly stand out; the highest is 18.104.22.168 (63 rejections).
Bad bounces were sent to two different bad usernames this week. Both went to plausible usernames that have never existed here (to the best of my memory), and this week they both came from machines in the USA.
How comment spammers behave
One of the things that watching your logs while trying out various comment spam precautions is good for is seeing how comment spammers seem to behave, or at least how the comment spammers that drop by WanderingThoughts behave. (Your mileage may vary, since there are a lot of comment spammers out there and they can't all be using the same tools.)
As before, I'm only really interested in defeating the automated comment spammers; a dedicated person is always going to be able to leave comments here. (And I'm not interested in making it so that people writing comments can't include links.)
So, my observations on comment spammers to date:
- they will hit any POST form with a submit button that they can
see. They don't seem to spam the search box (which is a GET form
without an explicit submit button), but they do regularly try to submit
comment spam through DWiki's login form.
(The most amusing login form spammer is the one that believes in being honest; they start all of their spam attempts with 'sorry, but i need money...'.)
- however, they almost never go past the first form submission. The
single greatest reduction in successful comment spam that I ever
managed was changing my comment form so that you had to preview before
actually posting your comment; almost every spammer previewed and then
just went away.
- some but not all of them fill in any form field that they spot; my
comment form's honeypot field gets a regular stream of programs that
trip over it, but there are about as many spammers who don't.
- the basic User-Agent checking I do
is surprisingly effective. It is also a very cheap check to make,
since you can even do it in Apache itself.
- a fair number of them harvest your comment form from one IP and then
submit from another (or a pool of others). This is really easy to see
in the full web logs, and so my 'must submit from the same /24'
precaution trips up a reasonable number of would-be comment spammers.
However, the really interesting thing is that a number of comment spammers modify the hidden form field that this precaution relies on. All of the spammers that modify it seem smart enough to try putting in IP addresses, but they make them up randomly instead of using the IP address they're POSTing the form from, and they don't notice that the field is not formatted as a straight IP address. (And sometimes they stick some newlines on the end.)
They may be doing this partly because I called the field 'previp'. (My current format for it is the IP address less the last octet, so the real version looks like 'A.B.C.', with no newline at the end.)
Looking at some numbers, it appears that most comment spammers that don't trip up on the honeypot field make up random IP addresses to put in this field instead of leaving it alone.
- comment spammers almost always write their comment spam using all four
popular syntaxes for making links at once. These seem to be:
- a bare URL:
- a full HTML link:
(I have seen one spammer that turned the initial < into &lt;.)
(I'm not sure what uses the last two forms, but they turn up a lot.)
The links don't necessarily all go to the same website, but the presence of all four forms in the same comment is a pretty good danger sign.
(As the result of a recent aggressive (and temporarily successful) spam run, WanderingThoughts currently rejects comments that contain any of the last three forms of links, since they don't work here anyways.)
- while typical comment spam attempts to include more links than normal, it's not a lot more than normal; for example, the recent aggressive comment spam only had four links per comment (one in each link format).
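To make the honeypot and 'must submit from the same /24' precautions above concrete, here is a minimal Python sketch of what such checks could look like. It is not DWiki's actual code: the `honeypot` field name, the function names, and the shape of the `form` dictionary are assumptions for illustration. Only the 'previp' field name and its 'A.B.C.' format come from the description above.

```python
import ipaddress


def ip_prefix(ip):
    """Return an IPv4 address less its last octet, e.g. '192.0.2.' for '192.0.2.77'."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    return str(addr).rsplit(".", 1)[0] + "."


def check_comment_post(form, remote_ip):
    """Return a list of reasons to reject a comment POST (empty if it looks fine).

    `form` is assumed to be a dict of submitted field values.
    """
    problems = []

    # Honeypot: a field that people never fill in, so any content in it
    # suggests a program that fills in every field it spots.
    # ('honeypot' is a hypothetical field name.)
    if form.get("honeypot", ""):
        problems.append("honeypot field filled in")

    # 'previp' was set when the form was served, to the fetching IP less
    # its last octet. Exact comparison means that a made-up full IP
    # address or a trailing newline both fail.
    if form.get("previp", "") != ip_prefix(remote_ip):
        problems.append("previp does not match submitting /24")

    return problems
```

With this sketch, a POST from 192.0.2.77 carrying `previp` of '192.0.2.' and an empty honeypot passes, while one carrying a random full address such as '10.1.2.3' is flagged.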
I also have some negative results. First, it's not worth checking for
Referer values; almost every comment spammer that made it past
my basic User-Agent checks sent the right value.
Also, very soon after I changed my comment form to only have a preview option at the start I saw a significant jump in comment spam attempts. From this I formed the hypothesis that comment spammers are unduly attracted to forms with only one submit button; however, various experiments I've tried since then suggest that this isn't the case.
(I changed things so the first 'add comment' page had two form submission buttons and the backend DWiki code just made them do the same thing. But I didn't see any reduction in comment spam attempts, even across various variants of how the buttons were named and so on.)
Randomly engaging NumLock considered irritating
Dear Fedora Core 6 X server: please stop randomly turning my NumLock on. It's getting really old by now, especially since I use a BTC-5100C mini keyboard and so turning NumLock on sprinkles numbers around my typing instead of the letters that I expected.
(It also makes various fvwm2 operations not fire, since I'm not hitting shift+alt+mouse button, I'm 'hitting' shift+alt+numlock+mouse button. I'd tell fvwm2 to ignore the state of NumLock entirely, except it currently serves as a useful cue to me that hey, NumLock got turned on again.)
Perhaps this is some accessibility feature that I am accidentally waking up, but it seems unlikely; I'm running in a bare session, without the usual Gnome or KDE stuff started up. Nor is there any apparent pattern for when it happens, although it happens fairly infrequently and I probably don't notice it right away when it does.
PS: this is unlikely to be hardware failure since it is happening on two machines, although both have BTC-5100C keyboards. (I really like them.)