2006-06-26
How not to report spam (part 1)
For my sins, I am on one of the aliases here that gets a certain number of reports of spamming theoretically committed by UofT IP addresses. (I am not one of the people who has to deal with them, fortunately; it is a thankless job.) This exposes me to a fair number of good examples of how not to report spam.
Today's example comes to us from an official government organization in a large South American country. All the information they gave us was:
- the date (with the format spelled out: +1)
- the time (with the time zone, as an offset from GMT: +1)
- the sending IP address.
- the 'SMTP ID', apparently something generated by their system.
- the virus type it was identified as.
- the Subject line of the mail.
Unfortunately, the IP address is the IP address of our main outgoing
SMTP gateway. It sends a considerable amount of email, and little
details like the MAIL FROM and the RCPT TO of the problematic
message would have been useful.
(Disclaimer: despite my grumbles, Vernon Schryver's remarks about spam complaints definitely apply. Even people making imperfect spam reports are doing us a favour that they don't have to. It would just be faster to fix the issue if we got more information.)
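To make the gap concrete, here is a minimal Python sketch that checks a report for the envelope details that would have helped. The field names and the `missing_fields` helper are hypothetical, chosen only for illustration; this is not any standard report format.

```python
# Hypothetical field names; this is not a standard report format.
REPORTED = {"date", "time_with_zone", "sending_ip", "smtp_id", "virus_type", "subject"}
# The envelope sender and recipient are what let a busy gateway find the message.
USEFUL = REPORTED | {"mail_from", "rcpt_to"}


def missing_fields(report: dict) -> list[str]:
    """Return the useful fields that the report leaves out or leaves empty."""
    return sorted(field for field in USEFUL if not report.get(field))


# A report shaped like the one described above (all values are placeholders).
example_report = {
    "date": "2006-06-26",
    "time_with_zone": "10:15 -0300",
    "sending_ip": "192.0.2.25",
    "smtp_id": "placeholder-id",
    "virus_type": "placeholder-virus",
    "subject": "placeholder subject",
}
print(missing_fields(example_report))
```

With only the six reported fields present, the helper flags the two envelope addresses as missing.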
2006-06-25
Weekly spam summary on June 24th, 2006
This week, we:
- got 13,681 messages from 253 different IP addresses.
- handled 18,870 sessions from 835 different IP addresses.
- received 303,478 connections from at least 47,309 different IP addresses.
- hit a highwater of 7 connections being checked at once.
Connection volume is majorly up from last week; other numbers are up slightly, except the highwater (which is down). The per day table:
| Day | Connections | different IPs |
| Sunday | 63,522 | +7,971 |
| Monday | 143,435 | +6,640 |
| Tuesday | 21,068 | +6,387 |
| Wednesday | 21,889 | +7,733 |
| Thursday | 21,137 | +6,998 |
| Friday | 17,960 | +6,695 |
| Saturday | 14,467 | +4,885 |
The spam storm from last Saturday evidently continued through Sunday and Monday, although apparently not from all that many IP addresses.
Kernel level packet filtering top ten:
| Host/Mask | Packets | Bytes |
| 204.202.15.180 | 11580 | 571K |
| 199.239.233.177 | 8647 | 427K |
| 198.66.222.20 | 8280 | 408K |
| 61.128.0.0/10 | 5559 | 277K |
| 218.0.0.0/11 | 5172 | 257K |
| 212.216.176.0/24 | 4954 | 249K |
| 204.202.9.161 | 4556 | 225K |
| 70.229.186.3 | 4259 | 199K |
| 220.160.0.0/11 | 3687 | 182K |
| 219.128.0.0/12 | 2823 | 140K |
This is down from the levels of last week, especially at the top of the table.
- 204.202.15.180, 199.239.233.177, 198.66.222.20, and 204.202.9.161 all reappear from last week, and again got blocked for repeatedly trying to send us stuff that had already hit our spamtraps.
- 70.229.186.3 is an Ameritech ADSL customer who appears to be running a Microsoft mailer with an internal hostname that wouldn't have gotten past our HELO name checks anyways.
Connection time rejection stats:
34428 total
16667 bad or no reverse DNS
14647 dynamic IP
1785 class bl-cbl
162 class bl-dsbl
147 class bl-spews
135 class bl-sbl
124 class bl-njabl
70 class bl-sdul
36 class bl-ordb
Given the connection volume jump this week, it's surprising that all of these stats are lower than last week. I can only guess that a lot of IP addresses didn't make it through our greylisting or something.
Twelve of the top 30 most rejected IP addresses were rejected more
than 100 times, but only one (218.254.82.97, at 1210 rejections)
hit the heights of activity seen last week. 22 are currently
in the CBL, 7 are currently in bl.spamcop.net, and one is in the
SBL.
Of course the one listing is 222.252.173.9, part of SBL39408, which is a /15 listing for a major Vietnamese network area that is apparently full of spam sources and has been listed since April 10th. (It came up here back in May.)
Out of curiosity I looked at the most 'popular' SBL listings:
| rejections | SBL listing | since when | why |
| 74 | SBL38558 | 02-Mar-2006 | datanetmedia.com / prospermedia.com (QWest) |
| 25 | SBL42599 | 28-May-2006 | random spammer in HE.NET |
| 9 | SBL41338 | 04-May-2006 | Russian spam source (okclub.org) |
| 9 | SBL41015 | 27-Apr-2006 | phish source |
| 6 | SBL43251 | 10-Jun-2006 | spam haven in HE.NET |
I have to say that this doesn't look too good for HE.NET. Or QWest. It's kind of sad that some of our most active SBL-rejected spam sources are in the United States, connected by major ISPs.
Hotmail is looking better this week:
- no messages accepted.
- 1 message rejected because it came from a non-Hotmail email address.
- 7 messages sent to our spamtraps.
- no messages refused because their sender addresses had already hit our spamtraps.
- no messages refused due to their origin IP address
And the closing numbers:
| what | # this week | (distinct IPs) | # last week | (distinct IPs) |
| Bad HELOs | 420 | 48 | 375 | 54 |
| Bad bounces | 18 | 17 | 25 | 13 |
The most prolific source of bad HELO names this week was 68.88.211.161
(claiming to be 'maplehill.MHCM.local'), which failed to take the hint
139 times; unfortunately this is common behavior for the Microsoft
mailer that it seems to run.
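As a rough illustration of what a HELO name check might look like, here is a minimal Python sketch. The two rules are assumptions inferred only from the examples mentioned in these summaries (an internal '.local' name, and a bare name with no domain); the real checks are not described here and are likely more extensive. The function name is hypothetical.

```python
def looks_like_bad_helo(name: str) -> bool:
    """Apply two assumed rules: no unqualified names, no '.local' names."""
    name = name.strip().rstrip(".").lower()
    if "." not in name:
        # A bare word such as 'yinyang' has no domain part at all.
        return True
    if name.endswith(".local"):
        # Internal-only names such as 'maplehill.MHCM.local'.
        return True
    return False


for helo in ("maplehill.MHCM.local", "yinyang", "mail.example.org"):
    print(helo, looks_like_bad_helo(helo))
```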
We saw bad bounces to both 38-character hex strings from before, as well as to the usual suspects: plausible real users (including 'webmaster' and 'noreply'), a random alphanumeric string, and three all-numeric usernames.
2006-06-18
Weekly spam summary on June 17th, 2006
This week, we:
- got 12,612 messages from 249 different IP addresses.
- handled 17,714 sessions from 803 different IP addresses.
- received 245,591 connections from at least 48,624 different IP addresses.
- hit a highwater of 8 connections being checked at once.
Connection volume is up substantially from last week, although nothing else seems to be up much (especially the highwater). The per day table:
| Day | Connections | different IPs |
| Sunday | 19,554 | +9,196 |
| Monday | 17,987 | +7,349 |
| Tuesday | 19,967 | +6,725 |
| Wednesday | 19,737 | +6,848 |
| Thursday | 23,173 | +7,102 |
| Friday | 21,914 | +6,399 |
| Saturday | 123,259 | +5,005 |
In other words, we did about half this week's connection volume today. That would be yet another spam storm in progress.
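The 'about half' figure can be checked directly from the per day table; a short Python sketch of the arithmetic:

```python
# Connections per day, copied from the table above.
connections = {
    "Sunday": 19_554,
    "Monday": 17_987,
    "Tuesday": 19_967,
    "Wednesday": 19_737,
    "Thursday": 23_173,
    "Friday": 21_914,
    "Saturday": 123_259,
}

total = sum(connections.values())
saturday_share = connections["Saturday"] / total

print(total)                    # 245591, matching the weekly total
print(f"{saturday_share:.1%}")  # 50.2%
```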
Kernel level packet filtering top ten:
| Host/Mask | Packets | Bytes |
| 204.202.15.180 | 18790 | 927K |
| 199.239.233.177 | 13568 | 669K |
| 220.229.62.220 | 12171 | 619K |
| 198.66.222.14 | 11162 | 551K |
| 204.202.9.161 | 7360 | 363K |
| 198.66.222.20 | 6693 | 330K |
| 61.128.0.0/10 | 6003 | 303K |
| 218.0.0.0/11 | 4885 | 242K |
| 200.83.2.213 | 4765 | 286K |
| 155.212.2.42 | 4667 | 230K |
This is well up from last week, especially at the quite aggressive top end; it's been quite a while since we had a week with that many IP addresses sending us over 10,000 fruitless packets.
- 204.202.15.180, 199.239.233.177, 204.202.9.161, 198.66.222.14, and 198.66.222.20 were all sources of phish spam that hit our spamtraps (I can tell from the MAIL FROM addresses). (Two actually made our lists back in May.)
- 155.212.2.42 hit our spamtraps and kept on sending madly, but I'm not sure whether it was phish spam or regular spam.
- 220.229.62.220 reappears from last week, still with bad reverse DNS.
- 200.83.2.213 is a Chilean IP address with bad reverse DNS, probably part of vtr.net.
Clearly this is the week of phish spam. Somewhat to my surprise the
prolific sending boxes are not Windows machines; they all seem to be
running Sendmail or Postfix, likely on Unix. I'm disappointed that so
many Unix boxes seem to be getting hijacked by the phish spammers. (All
of these machines got rejected with MAIL FROMs that were clearly set
by the spammers to look more authentic, so I don't think this is just
the usual case of a 'send mail to people' CGI-BIN getting abused.)
Connection time rejection stats:
46264 total
21356 bad or no reverse DNS
21158 dynamic IP
2284 class bl-cbl
255 class bl-dsbl
203 class bl-sdul
67 class bl-njabl
44 class bl-spews
31 class bl-ordb
29 class bl-sbl
The usual suspects are up substantially from last week. This week was also the week of really aggressive connection attempts; three IP addresses were rejected more than a thousand times. The top five are:
1713 200.83.2.213
1162 218.254.83.47
1161 218.254.82.97
433 211.144.69.247
172 61.247.78.210
Of the 30 most rejected IP addresses, 29 were rejected more than
100 times. 25 are currently in the CBL, 14 are currently in
bl.spamcop.net, and 211.144.69.247 is in SBL42856 as
being under the control of the ROKSO-listed Mailtrain
(it's also in the CBL, so it's probably a compromised machine).
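For readers unfamiliar with how 'is currently in bl.spamcop.net' gets checked: DNS blocklists are conventionally queried by reversing the octets of the IPv4 address, appending the list's zone, and looking the result up in DNS. A minimal Python sketch follows; the function names are hypothetical, and treating any successful resolution as 'listed' is a simplification (a failed lookup can also mean a DNS problem rather than 'not listed').

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a blocklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the query name resolves (needs working DNS)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name (not listed), or the lookup itself failed.
        return False


print(dnsbl_query_name("211.144.69.247", "bl.spamcop.net"))
```

The printed name is `247.69.144.211.bl.spamcop.net`; whether it resolves today says nothing about its status in 2006.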
Hotmail's numbers for this week:
- 2 messages accepted.
- 4 messages rejected because they came from non-Hotmail email addresses (all from other Hotmail properties).
- 10 messages sent to our spamtraps.
- 1 message refused because its sender address had already hit our spamtraps.
- 1 message refused due to its origin IP address being in the SBL (196.3.62.3, in two SBL listings: SBL31791 and SBL35001, both of which date from late 2005, both of which are listed for advance fee fraud spam sent through Hotmail).
These numbers are a disappointment, although they're not catastrophic. I am particularly irked by Hotmail's willingness to continue to accept email from places that have spammed through it before.
And the final set of numbers:
| what | # this week | (distinct IPs) | # last week | (distinct IPs) |
| Bad HELOs | 375 | 54 | 727 | 47 |
| Bad bounces | 25 | 13 | 34 | 18 |
We had two bounces to the 38-character hex string from before, but also another bounce to a new 38-character
hex string, 8B407639D45C5742ADD3987F7E013C41288C3A (which I am about
to become the only Google hit for, just like with the other one). The
most prolific bad bounce destination this week was noreply, followed
by a bunch of old usernames, some garbage alphanumeric sequences, and
one bounce to an all-digit username.
2006-06-11
Weekly spam summary on June 10th, 2006
Our SMTP listener died on Tuesday evening and was restarted, so some of this week's statistics are incomplete. This week, we:
- got 12,614 messages from 245 different IP addresses.
- handled 17,611 sessions from 882 different IP addresses.
- received 95,812 connections from at least 38,234 different IP addresses since 21:10 Tuesday. (And about 43,000 connections from at least 16,000 different IP addresses up to Tuesday morning at 4am.)
- hit a highwater of 10 connections being checked at once since 21:10 Tuesday.
At a rough guess, this makes the volume about the same as last week, maybe up a bit. The per-day information is unfortunately completely useless, but seems more or less flat from what I can reconstruct.
(It's possible that a significant volume surge on Tuesday took down the SMTP listener; it generally dies on an internal error deep in the depths of the C library. I assume something is getting messed up between threading and other fun issues.)
Kernel level packet filtering top ten:
| Host/Mask | Packets | Bytes |
| 61.128.0.0/10 | 7354 | 378K |
| 66.58.176.187 | 5961 | 303K |
| 218.254.83.47 | 5753 | 276K |
| 220.229.62.220 | 5694 | 290K |
| 205.206.60.232 | 5666 | 272K |
| 218.0.0.0/11 | 5192 | 260K |
| 220.160.0.0/11 | 4331 | 216K |
| 193.74.71.23 | 4209 | 253K |
| 82.225.205.16 | 4206 | 202K |
| 65.214.61.113 | 4189 | 201K |
This time, pride of place goes to a large aggregate bit of China. It was there last week, but not that high. Of the individual IP addresses:
- 66.58.176.187 and 218.254.83.47 return yet again from last week; at this rate they may earn themselves permanent blocks.
- 220.229.62.220 is part of a Taiwanese netblock, and can't be successfully resolved to a hostname. Since it claims to be something with 'adsl' in the name, we probably don't want to talk to it anyways. (It also appears to be 'dns.maze.com.tw'.)
- 205.206.60.232 is a generic Telus IP address that we reject as a 'dialup'; it's also listed in dsbl.org as an open relay.
- 205.206.60.232 mailed a spamtrap address and then kept trying to send us more mail with the same MAIL FROM.
- 82.225.205.16 is a generic proxad.net IP address. Uh, no. It's also on a pile of DNSbls, including bl.spamcop.net at the moment.
- 65.214.61.113 is another server that mailed a spamtrap address and then kept trying to send; however, they stand out because they've been trying and trying since May 23rd.
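The 'mailed a spamtrap address and then kept trying to send with the same MAIL FROM' pattern can be sketched in a few lines of Python. This is only an illustration of the idea; the class, its method names, and the choice to key purely on the envelope sender are assumptions, not a description of the actual system.

```python
class SpamtrapMemory:
    """Remember envelope senders that have mailed a spamtrap address."""

    def __init__(self, spamtraps):
        self.spamtraps = {addr.lower() for addr in spamtraps}
        self.tainted_senders = set()

    def note_delivery(self, mail_from: str, rcpt_to: str) -> None:
        """Mark the sender as tainted if the recipient is a spamtrap."""
        if rcpt_to.lower() in self.spamtraps:
            self.tainted_senders.add(mail_from.lower())

    def should_refuse(self, mail_from: str) -> bool:
        """Refuse any later message that reuses a tainted MAIL FROM."""
        return mail_from.lower() in self.tainted_senders


# Placeholder addresses throughout.
memory = SpamtrapMemory(["trap@example.org"])
memory.note_delivery("phisher@example.net", "trap@example.org")
print(memory.should_refuse("phisher@example.net"))   # True
print(memory.should_refuse("someone@example.com"))   # False
```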
Connection time rejection stats:
41600 total
19791 bad or no reverse DNS
16897 dynamic IP
2579 class bl-cbl
544 class bl-dsbl
244 class bl-sdul
216 class bl-ordb
179 class bl-njabl
133 class bl-sbl
113 class bl-spews
This is down a bit from last week, which may just be the effects of the Tuesday evening SMTP listener restart (since it restarts the greylisting process for everyone).
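To show why a restart would reset greylisting for everyone, here is a minimal in-memory greylist sketch in Python. The keying on the (IP, MAIL FROM, RCPT TO) triple and the five-minute delay are the common textbook choices and are assumptions here; how the real listener greylists is not described in these summaries.

```python
import time


class Greylist:
    """Temporarily reject senders until they retry after a delay."""

    def __init__(self, delay_seconds: float = 300.0, clock=time.monotonic):
        self.delay = delay_seconds
        self.clock = clock
        self.first_seen = {}  # kept in memory only, so lost on restart

    def check(self, ip: str, mail_from: str, rcpt_to: str) -> bool:
        """Return True to accept, False to tell the sender to try later."""
        key = (ip, mail_from.lower(), rcpt_to.lower())
        now = self.clock()
        first = self.first_seen.setdefault(key, now)
        return now - first >= self.delay


# A fake clock makes the behaviour easy to see without waiting.
fake_now = [0.0]
greylist = Greylist(clock=lambda: fake_now[0])
first_try = greylist.check("192.0.2.1", "a@example.org", "b@example.com")
fake_now[0] = 301.0
retry = greylist.check("192.0.2.1", "a@example.org", "b@example.com")

# A 'restart' is just a fresh object: everyone starts over.
restarted = Greylist(clock=lambda: fake_now[0])
after_restart = restarted.check("192.0.2.1", "a@example.org", "b@example.com")
print(first_try, retry, after_restart)
```

Because the table lives only in memory, a sender that had already passed its delay is deferred again after the restart, which would shift rejections away from the later connection time checks for a while.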
Out of the top 30 most rejected IP addresses, 18 had more than 100
rejections; the champion was our friend 218.254.83.47 (587 times), with
second place going to 210.50.131.218 (only 234, and rejected due to
being on the DSBL). 22 of the top 30 are currently
in the CBL, and only 7 are currently in bl.spamcop.net.
Hotmail stats are looking quite good:
- 3 messages accepted.
- 1 message rejected because it came from a non-Hotmail email address.
- no messages sent to our spamtraps.
- 1 message refused because its sender address had already hit our spamtraps.
- 1 message refused due to its origin IP address being in the CBL.
On the other hand, the one rejected non-Hotmail email address was from the domain 'mail2agent.net', with Microsoft DNS servers but registered with the contact email of 'eurolottowinner@mail2agent.net'. This looks alarmingly like Hotmail backsliding into the whole original problem.
And the final set of numbers:
| what | # this week | (distinct IPs) | # last week | (distinct IPs) |
| Bad HELOs | 727 | 47 | 288 | 69 |
| Bad bounces | 34 | 18 | 27 | 23 |
Surprisingly (to me) there is no single huge spike source of bad
HELO names; there's only four that had 50 or more rejections,
in fact.
There were another four bounces to the 38-digit hex string and a bunch of bounces to plausible login names (many of which used to exist here), but unlike last week, only one bounce to an all-digit username.
2006-06-04
Weekly spam summary on June 3rd, 2006
This week, we:
- got 11,560 messages from 225 different IP addresses.
- handled 16,969 sessions from 1005 different IP addresses.
- received 135,139 connections from at least 46,180 different IP addresses.
- hit a highwater of 12 connections being checked at once.
Apart from slightly higher numbers of IP addresses talking to us this week, this is a clone of last week's numbers. Since the per day volume fluctuated, I'll include the table this week:
| Day | Connections | different IPs |
| Sunday | 14,968 | +6,360 |
| Monday | 22,460 | +6,890 |
| Tuesday | 20,133 | +6,642 |
| Wednesday | 21,142 | +7,553 |
| Thursday | 17,879 | +5,624 |
| Friday | 20,882 | +7,370 |
| Saturday | 17,675 | +5,741 |
This isn't a major fluctuation as those go; clearly things are a bit random. (Perhaps one day I will add deliveries by day to this table, although it's harder to construct.)
Kernel level packet filtering top ten:
| Host/Mask | Packets | Bytes |
| 65.126.217.71 | 17288 | 879K |
| 218.254.83.47 | 7490 | 360K |
| 66.58.176.187 | 5555 | 283K |
| 198.187.200.0/24 | 5080 | 305K |
| 61.128.0.0/10 | 4282 | 214K |
| 218.0.0.0/11 | 4014 | 200K |
| 212.216.176.0/24 | 3588 | 183K |
| 220.160.0.0/11 | 3413 | 171K |
| 213.177.135.32 | 2800 | 134K |
| 63.252.170.25 | 2629 | 123K |
Overall this seems quieter than last week, although there's one obvious huge exception.
- 65.126.217.71 is a QWEST IP address that kept HELO'ing as 'yinyang', with no domain name or anything. Declined.
- 218.254.83.47 and 66.58.176.187 return from last week, evidently still not done yet.
- 213.177.135.32 and 63.252.170.25 are CBL-listed and gave us bad HELO names on top of it.
198.187.200.0/24 is an outdated and now erroneous listing that I only just noticed. Whoops. (See, there's more than one reason for me to do these summaries. Finding such outdated listings is one of those generic problems, partly because I never built an infrastructure to manage it all when I set these things up.)
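One small piece of such infrastructure would be a way to ask which filter entry covers a given address. A minimal Python sketch using the standard `ipaddress` module follows; the entry list is copied from this week's table, while the `matching_entries` helper is hypothetical and says nothing about how the kernel filter itself is configured.

```python
import ipaddress

# Host/Mask entries copied from this week's top ten table.
FILTER_ENTRIES = [
    "65.126.217.71", "218.254.83.47", "66.58.176.187", "198.187.200.0/24",
    "61.128.0.0/10", "218.0.0.0/11", "212.216.176.0/24", "220.160.0.0/11",
    "213.177.135.32", "63.252.170.25",
]


def matching_entries(ip: str, entries=FILTER_ENTRIES) -> list[str]:
    """Return every entry (single host or CIDR block) that covers the address."""
    addr = ipaddress.ip_address(ip)
    # A bare address parses as a /32 network, so hosts and blocks share one path.
    return [entry for entry in entries if addr in ipaddress.ip_network(entry)]


print(matching_entries("198.187.200.17"))  # ['198.187.200.0/24']
print(matching_entries("218.254.83.47"))   # ['218.254.83.47']
```

Note that 218.254.83.47 matches only its own entry: 218.0.0.0/11 stops at 218.31.255.255, which is why that host shows up in the table separately.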
Connection time rejection stats:
44525 total
21085 bad or no reverse DNS
19378 dynamic IP
2400 class bl-cbl
322 class bl-sdul
233 class bl-dsbl
153 class bl-spews
142 class bl-sbl
131 class bl-njabl
68 class bl-ordb
Rejections are up on last week, and more than I'd expect from
the slight overall traffic growth. 24 of the top 30 most rejected
IP addresses had more than 100 rejections, with the champion being
64.191.63.117 (382 times); our friend 218.254.83.47 is the runner up
with 379 rejections. 24 of the top 30 are currently in the CBL and 10
are currently in bl.spamcop.net.
Hotmail stats are low but not groovy:
- no messages accepted.
- no messages rejected because they came from non-Hotmail email addresses.
- 10 messages sent to our spamtraps.
- 1 message refused because its sender address had already hit our spamtraps.
- 1 message refused due to its origin IP address being part of Gilat-Satcom.
Meanwhile Yahoo continues to slap us with the spam trout, although I have yet to write a script to generate numbers for how badly.
The last set of numbers:
| what | # this week | (distinct IPs) | # last week | (distinct IPs) |
| Bad HELOs | 288 | 69 | 462 | 64 |
| Bad bounces | 27 | 23 | 18 | 16 |
Once again there were several bounces to our friend the 38-digit
hex string, plus to a number of real (ex) usernames, plus random
ones. The new pattern this week is bounces to all-digit usernames
of various lengths, ranging from 03 to 41291175.
2006-06-03
The fundamental problem of spam
Recently, yet another article on the death of email ran in the Register, 'The time has come to ditch email' (which I saw due to a Slashdot article). As usual, it advocates replacing SMTP email with something that is more 'secure', whatever exactly this means.
Unfortunately, this misses the fundamental problem of spam:
You want to get email from strangers, but only good strangers.
Telling good strangers from bad strangers is a hard problem, to put it one way. There is no indication that computers are going to be any good at it any time soon, and certainly current technology is not up to the job. Magic new security technology for a new email protocol would have to be very magic to solve the problem, and so far no one has even come close. Worse, a great many people (including the author of the piece in the Register) seem completely oblivious to the issue.
Indeed, today's antispam technology has false positives and false negatives precisely because it has to use heuristics like 'did a copy get emailed to a lot of other people' or 'does it have bad phrases' as a proxy for the real question.
(If you think that assigning people identities on the Internet will solve this problem, please see TwoSidesOfIdentity.)
(This idea isn't original to me; I think I picked it up in Usenet's news.admin.net-abuse.email.)