2006-04-23
The sort of command line I can wind up typing
Here's the sort of command line I can wind up typing:
spam/wtd daemon | expsyslog.pl | tcpwrhits -i | sed 10q | while (read foo) { n=`{echo $foo}; echo '| '^$n '|' `{checksmtp $n(2)}; }
(This will probably be wrapped in your browser, but it is all one line.)
I wrote that on the fly, in one pass, although I had to think it through a bit. I'm not going to claim that this is a typical Unix command line, but I do think it's the sort of thing experienced Unix users wind up doing every so often. It's also a good illustration of the density of little custom scripts in real environments.
What it produces is more or less the 'count / IP address / why' table from this week's spam summary, in DWikiText form ready to be dumped into place.
(The shell syntax will look a bit strange, since I use rc.)
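For readers who don't speak rc, here is a rough Python sketch of what the `sed 10q | while (...)` tail of that pipeline does: take the first ten report lines, split each into fields, and emit one table row per line with the check result appended. It assumes that `tcpwrhits` prints a count followed by an IP address on each line (the command line only shows that the second field is what goes to `checksmtp`), and the `check` callable is a hypothetical stand-in for running `checksmtp`.

```python
from itertools import islice


def table_rows(report_lines, check, limit=10):
    """Build '| count | ip | why' rows from the first `limit` report lines.

    `check` is a hypothetical stand-in for running checksmtp on an IP.
    The 'count ip' field order is an assumption about tcpwrhits output.
    """
    rows = []
    for line in islice(report_lines, limit):      # same job as `sed 10q`
        fields = line.split()                     # same job as n=`{echo $foo}
        if len(fields) < 2:
            continue                              # skip lines without an IP field
        why = check(fields[1])                    # $n(2) is the second field
        # rc's '| '^$n prefixes every field with '| '
        rows.append(" ".join("| " + f for f in fields) + " | " + why)
    return rows


if __name__ == "__main__":
    sample = ["872 217.40.27.106", "720 213.76.217.20"]
    for row in table_rows(sample, lambda ip: "dialup"):
        print(row)
```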
Sidebar: an index of commands
spam/wtd | Spits out the last week's worth of some sort of log, in this case the daemon syslogs. |
expsyslog.pl | Expand 'last message repeated N times' in syslog logs. |
tcpwrhits | Generate a report of IP addresses rejected at connection time by our SMTP frontend (and how many times each was rejected). |
checksmtp | Tell me what our SMTP frontend would do for a connection from a given IP address. |
(Except for expsyslog.pl, all of these are completely specific to
our environment and thus pretty uninteresting.)
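Since expanding 'last message repeated N times' is the one generally useful piece, here is a minimal Python sketch of the idea. It is not the actual expsyslog.pl. It assumes the classic syslogd marker text, and it simplifies in two ways: the expanded copies keep the timestamp of the original message, and it does not track messages per host.

```python
import re

# Matches the classic syslogd marker, for example:
#   "Apr 16 03:40:20 mx last message repeated 3 times"
REPEAT_RE = re.compile(r"last message repeated (\d+) times?\s*$")


def expand_repeats(lines):
    """Yield log lines, replacing each 'repeated N times' marker with
    N copies of the most recent real message."""
    previous = None
    for line in lines:
        line = line.rstrip("\n")
        match = REPEAT_RE.search(line)
        if match and previous is not None:
            for _ in range(int(match.group(1))):
                yield previous
        else:
            previous = line
            yield line
```

With this in place, anything downstream that counts rejections per IP address sees one line per rejection instead of undercounting the repeated ones.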
Some CBL stats for the week ending on April 22nd, 2006
As mentioned in this week's spam summary, this week I decided to change our SMTP frontend's configuration to get statistics on the CBL that were better than my previous quick SMTP connection stats. Now that this week's up, the results are in:
- the CBL rejected 41% of our incoming SMTP connections this week.
- 75% of the connections we rejected were rejected for being in the CBL.
- more tellingly, 85% of the IP addresses that we rejected at connection time were rejected for being in the CBL.
Looking at how often each CBL-listed IP address tried to connect to us:
1 try | 2 tries | 3 tries | 4 tries | 5-10 tries | 11-20 tries | more |
61.9% | 16.8% | 9.3% | 3.5% | 6.4% | 1.3% | 0.7% |
This is startlingly different than the quick stats from a couple of weeks ago, and I have no explanation why. It seems that at least this week, most of the zombie machines are not reused; they get one rejection and then that's it. It's possible that current ratware treats 5xx SMTP rejections differently than 4xx rejections; our rejections were all 5xx ones.
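The bucketing behind a table like the one above can be sketched as follows. This assumes the per-IP rejection counts are already in a dict; the bucket boundaries are taken from the table header, and the function names are made up for illustration.

```python
from collections import Counter

# (low, high, label) bucket boundaries, matching the table header.
BUCKETS = [
    (1, 1, "1 try"),
    (2, 2, "2 tries"),
    (3, 3, "3 tries"),
    (4, 4, "4 tries"),
    (5, 10, "5-10 tries"),
    (11, 20, "11-20 tries"),
]


def bucket_label(tries):
    """Return the table column that a given number of tries falls into."""
    for low, high, label in BUCKETS:
        if low <= tries <= high:
            return label
    return "more"


def bucket_percentages(tries_per_ip):
    """Map each bucket label to the percentage of IPs that fall into it."""
    labels = [label for _, _, label in BUCKETS] + ["more"]
    total = len(tries_per_ip)
    if total == 0:
        return {label: 0.0 for label in labels}
    counts = Counter(bucket_label(n) for n in tries_per_ip.values())
    return {label: round(100.0 * counts[label] / total, 1) for label in labels}
```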
Looking only at the IP addresses that tried 11 times or more (494 out
of 24,256 total IP addresses), the average is 32 rejections per IP, but
the median is 15 rejections, the 75% level is 35 rejections, and the
90% level is 61 rejections. There's one IP with 490 rejections, five
with between 200 and 240, 19 with between 100 and 199, 86 with 50 to
99 rejections, and 81 with 20 to 49 rejections. If I knew more about
gnuplot, I would do up a nice accumulated density chart or the like.
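Short of a chart, the average, median, and percentile levels quoted above are easy to compute with Python's standard `statistics` module. This is only a sketch: the choice of the "inclusive" interpolation method is an assumption, and other percentile definitions will give slightly different numbers on small samples.

```python
import statistics


def summarize(counts):
    """Return the mean, median, 75% level and 90% level of per-IP counts."""
    quartiles = statistics.quantiles(counts, n=4, method="inclusive")
    deciles = statistics.quantiles(counts, n=10, method="inclusive")
    return {
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "p75": quartiles[2],   # third quartile
        "p90": deciles[8],     # ninth decile
    }
```

A mean well above the median, as here, is the signature of a long tail: a handful of very prolific IP addresses pull the average up.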
I did up some rough 'distance' numbers, crudely measuring how far apart the earliest and the latest rejections were for IP addresses that tried more than once. It's a fairly wide distribution; some IP addresses made attempts throughout the entire week (and these were not prolific IP addresses). For example:
- 59.16.53.89 made 5 attempts between Apr 16 03:40:13 and Apr 23 02:06:56.
- 211.225.173.48 made 9 attempts between Apr 16 04:11:12 and Apr 23 02:10:23.
- 81.202.185.180 made 13 attempts between Apr 16 04:28:28 and Apr 23 02:09:50.
- 81.203.125.210 made 4 attempts between Apr 16 03:55:07 and Apr 23 02:40:46.
I'm wary of my statistical analysis, so I'll just quote one more figure: 41% of the IP addresses that tried more than once made a connection a day (or more) after their first one. (This may be understating the case, since I haven't filtered out IP addresses that first got rejected less than 24 hours ago.)
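A crude 'distance' measure of this sort can be sketched in Python as below. The input format, (IP, syslog timestamp) pairs, is an assumption. Syslog timestamps carry no year, so the year is supplied as a parameter, and `%b` parsing assumes English month names.

```python
from datetime import datetime, timedelta


def parse_stamp(stamp, year=2006):
    """Parse a syslog-style 'Apr 16 03:40:13' timestamp (the year is assumed)."""
    return datetime.strptime(f"{stamp} {year}", "%b %d %H:%M:%S %Y")


def attempt_spans(rejections):
    """Map each IP to (attempt count, earliest rejection, latest rejection)."""
    spans = {}
    for ip, stamp in rejections:
        when = parse_stamp(stamp)
        if ip in spans:
            count, first, last = spans[ip]
            spans[ip] = (count + 1, min(first, when), max(last, when))
        else:
            spans[ip] = (1, when, when)
    return spans


def share_returning_after(spans, gap=timedelta(days=1)):
    """Fraction of multi-attempt IPs whose last try came `gap` or more
    after their first one."""
    repeaters = [(first, last) for count, first, last in spans.values() if count > 1]
    if not repeaters:
        return 0.0
    late = sum(1 for first, last in repeaters if last - first >= gap)
    return late / len(repeaters)
```

Filtering out IP addresses whose first rejection is less than a day old, as the caveat above suggests, would mean dropping entries whose `first` is later than the end of the log minus `gap` before computing the share.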
Tentative conclusion: zombie machines do get reused, but many of them get reused only slowly.
Finally, let's look at our CBL rejections broken down by their ASN. This is a reasonably good proxy for how much of a zombie source various ISPs and countries are for us.
# of different IPs | ASN | (owner) |
1570 | AS4766 | Korea Telecom (Korea) |
1492 | AS4837 | China169 (China) |
1106 | AS4134 | Chinanet (China) |
900 | AS19262 | Verizon (US) |
519 | AS9318 | Hanaro (Korea) |
395 | AS12322 | Proxad (France) |
384 | AS3352 | Telefonica (Spain) |
357 | AS6478 | AT&T Worldnet (US) |
355 | AS20115 | Charter Communications (US) |
285 | AS5462 | Telewest Broadband (England) |
Many of the usual suspects from SpamByASN and XBLStats-2005-08-06 show up again, like bad pennies.
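How the IP addresses were mapped to ASNs isn't described here. One plausible approach, sketched below, is a longest-prefix match against a table of announced prefixes followed by a count of distinct IPs per ASN. Everything in the example is hypothetical: the function names, the table layout, and the test data (documentation-only prefixes and reserved ASN numbers, not real routing data).

```python
import ipaddress
from collections import defaultdict


def asn_for(ip, prefix_table):
    """Return the ASN of the most specific matching prefix, or 'unknown'.

    prefix_table is a list of (ip_network, asn_label) pairs; a linear scan
    is fine for a sketch but far too slow for a full routing table.
    """
    addr = ipaddress.ip_address(ip)
    best = None
    for network, asn in prefix_table:
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, asn)
    return best[1] if best else "unknown"


def distinct_ips_by_asn(ips, prefix_table):
    """Return (distinct IP count, ASN) pairs, largest count first."""
    seen = defaultdict(set)
    for ip in ips:
        seen[asn_for(ip, prefix_table)].add(ip)
    pairs = [(len(addresses), asn) for asn, addresses in seen.items()]
    return sorted(pairs, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    # Hypothetical table: documentation prefixes and reserved ASNs only.
    table = [
        (ipaddress.ip_network("192.0.2.0/24"), "AS64500"),
        (ipaddress.ip_network("198.51.100.0/24"), "AS64501"),
    ]
    for count, asn in distinct_ips_by_asn(["192.0.2.1", "192.0.2.2", "198.51.100.7"], table):
        print(count, "|", asn, "|")
```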
(There are probably additional interesting numbers to run that I just can't think of at the moment.)
Weekly spam summary on April 22nd, 2006
This week's statistics are atypical, because in pursuit of better CBL statistics I moved our CBL check before all of our other connection time checks (including our greylisting) and pretty much stopped adding IP addresses to our kernel filters during the week.
Bearing that in mind, this week we:
- got 12,845 messages from 226 different IP addresses.
- handled 17,723 sessions from 788 different IP addresses.
- received 141,631 connections from at least 38,000 or so different IP addresses.
- hit a highwater of 50 connections being checked at once, reached today (this Saturday).
This is all up from last week, but not too much. The per day table is more or less flat, with a peak of 28,000 connections this Monday.
Kernel level packet filtering top ten:
Host/Mask           Packets   Bytes
66.116.103.133         8967    456K
212.216.176.0/24       5903    294K
202.43.219.0/24        5015    254K
222.112.161.1          3436    165K
210.109.97.184         3365    162K
61.128.0.0/10          3293    165K
218.0.0.0/11           2011    101K
220.160.0.0/11         1861   93084
222.146.58.254         1801   88876
213.29.7.190           1722    103K
Here we see the effects of pretty much not adding anything to the kernel filters all week. This leaves very few active individual IP addresses:
- 66.116.103.133 hit spamtraps (although not early enough to save some of our users) and then kept mailing and mailing.
- 222.112.161.1 and 210.109.97.184 are Korean IP addresses without working reverse DNS.
- 222.146.58.254 reappears from last week, still trying to send phish spam email.
- 213.29.7.190 is a centrum.cz mail machine.
Connection time rejection stats:
79352 total
58949 class bl-cbl
 8680 dynamic IP
 8007 bad or no reverse DNS
 2071 class bl-ordb
  466 class bl-njabl
  429 class bl-dsbl
   67 class bl-sdul
   39 class bl-sbl
   30 class bl-spews
    8 class bl-opm
Yes, you read that right; 75% of our rejections were due to CBL listings. This isn't too surprising; the last time I looked at the stats (although over a shorter period) it was actually higher. The popularity of the ORDB is probably because of not putting heavy rejection sources into the kernel filters; just four IP addresses accounted for 80% of the ORDB rejections.
This week was obviously the week of really active connection time rejection sources, since practically none of them got put into the kernel filters. Here's a little table of the top ten:
Count | IP | Why |
872 | 217.40.27.106 | dialup |
720 | 213.76.217.20 | dialup |
599 | 81.241.234.166 | baddns |
599 | 63.196.46.20 | bl-ordb |
570 | 210.109.97.184 | baddns |
501 | 83.111.79.10 | bl-ordb |
499 | 212.248.91.226 | bl-cbl |
366 | 72.11.98.58 | bl-ordb |
352 | 87.0.64.88 | bl-cbl |
352 | 211.156.161.173 | baddns |
(The fourth ORDB IP address is 146.145.107.123, with 189 rejections; it's down at #24 on the top 30 most rejected IP addresses.)
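The rounded percentages quoted in this summary and in the CBL stats can be rechecked from the raw counts. The sketch below uses only figures that appear in this entry; the variable names are mine.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


TOTAL_CONNECTIONS = 141631   # connections received this week
TOTAL_REJECTIONS = 79352     # connection time rejections
CBL_REJECTIONS = 58949       # class bl-cbl
ORDB_REJECTIONS = 2071       # class bl-ordb

# The four ORDB-listed IP addresses mentioned in the text and table.
TOP_ORDB = {
    "63.196.46.20": 599,
    "83.111.79.10": 501,
    "72.11.98.58": 366,
    "146.145.107.123": 189,
}

cbl_share_of_rejections = percent(CBL_REJECTIONS, TOTAL_REJECTIONS)
cbl_share_of_connections = percent(CBL_REJECTIONS, TOTAL_CONNECTIONS)
top_ordb_share = percent(sum(TOP_ORDB.values()), ORDB_REJECTIONS)

if __name__ == "__main__":
    print("CBL share of rejections: ", cbl_share_of_rejections)
    print("CBL share of connections:", cbl_share_of_connections)
    print("Top four share of ORDB:  ", top_ordb_share)
```

These come out at 74.3%, 41.6%, and 79.9%, in line with the rounded 75%, 41%, and 80% figures in the text.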
The Hotmail stats are up a bit this week:
- 3 messages accepted.
- 1 message rejected because it came from a non-Hotmail email address (from hotmail.fr; possibly I should fix that).
- 7 messages sent to our spamtraps.
- no messages refused because their sender addresses had already hit our spamtraps.
- 3 messages refused due to their origin IP address (two from SBL37487 (oh look, our old friends Gilat-Satcom), and one from Ghana).
The final set of numbers:
what | # this week | (distinct IPs) | # last week | (distinct IPs) |
Bad HELOs | 953 | 44 | 709 | 63 |
Bad bounces | 21 | 16 | 70 | 53 |
Bad bounces have dropped like a stone, although I'm not going to hold
my breath hoping that they stay there. The count of bad HELOs is up a
bit, but that's not surprising because I didn't throw prolific sources
into the kernel level blocks this week like I usually do.
This week's really prolific bad HELOs: 217.13.30.114 (184 times),
63.138.75.163 (145 times), 213.123.26.96 (138 times), and 66.240.116.170
(96 times). By contrast, last week the most prolific source only had 67
rejections.