2014-06-15
Weird spammer behavior: a non-relaying relay attempt
One of the interesting things about running a sinkhole SMTP server that accepts everything and basically serves as a spamtrap is that I get to see all sorts of odd and crazy spammer behavior. Take the following SMTP transaction log:
220 hisokusa.cs.toronto.edu go-smtpd
HELO smelektronik.de
250 hisokusa.cs.toronto.edu Hello 86.34.202.208
MAIL FROM: <jhbrrhf@smelektronik.de>
250 Okay, I'll believe you for now
RCPT TO: <XXXX@hawkwind.utcs.utoronto.ca>
250 Okay, I'll believe you for now
RCPT TO: <betty@hawkwindbase.f9.co.uk>
250 Okay, I'll believe you for now
RCPT TO: <mail@hawkwise.fsbusiness.co.uk>
250 Okay, I'll believe you for now
DATA
....
This is a CBL-listed IP address and the spaces after the ':' in
MAIL FROM and RCPT TO are typical of badly implemented spamware
(it's not RFC-compliant, although many mailers will accept it).
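As a rough illustration of what 'not RFC-compliant' means here, the following Python sketch flags command lines that have whitespace between the ':' and the '<'. The function name is made up for this example, and this is not how my sinkhole actually checks things; it only shows the pattern.

```python
import re

# RFC 5321 writes these commands as "MAIL FROM:<path>" and "RCPT TO:<path>",
# with nothing between the colon and the opening angle bracket.
_SPACE_AFTER_COLON = re.compile(r"^(MAIL FROM|RCPT TO):[ \t]+<", re.IGNORECASE)

def has_space_after_colon(line):
    """Return True if an SMTP command line has whitespace after the ':'."""
    return _SPACE_AFTER_COLON.match(line.strip()) is not None
```

Run over the log above, this would return True for every MAIL FROM and RCPT TO line, and False for a line written as 'RCPT TO:<...>'.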
The interesting thing is the second and third RCPT TO addresses.
My sinkhole here is not the MX target for any of them (of course).
Sometimes you'll see deliberate relay attempt probes, but this
doesn't seem to be one of them. Instead it looks like the spammer's
software is just clumping lexically similar domains together and
then dumping N addresses on the MX target of the first one,
regardless of whether the additional addresses will ever get accepted
(almost no MTA will, because almost all are configured to not relay
these days).
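To make the 'would be a relay' distinction concrete, here is a minimal Python sketch that splits RCPT TO addresses by whether their domain is one the server handles. The function name and the idea of a simple set of local domains are assumptions for illustration; a real MTA's relay check is more involved.

```python
def split_recipients(rcpt_addrs, local_domains):
    """Split RCPT TO addresses into (local, would_be_relay) lists."""
    local_set = {d.lower() for d in local_domains}
    local, relay = [], []
    for addr in rcpt_addrs:
        # Drop the angle brackets, then take everything after the last '@'.
        domain = addr.strip("<>").rpartition("@")[2].lower()
        (local if domain in local_set else relay).append(addr)
    return local, relay
```

If we assume the sinkhole only handles hawkwind.utcs.utoronto.ca, the second and third addresses from the log land in the would-be-relay list, which is exactly what a normally configured MTA would refuse.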
About all I can guess is that someone wrote software that either has a bug or that is simply extremely sloppy and wrong, and the authors either never tested it or don't care. Perhaps they make their money from selling it to people who simply don't notice that an appreciable amount of their delivery attempts can never succeed. I suppose the customers are probably not in a position to really notice this behavior.
(My logs show three such attempts so far in a few days, from two different IPs in total. It all appears to be the same spam run.)
2014-05-31
When trying to unsubscribe from spam can be not completely crazy
For reasons beyond the scope of this entry, I've started running a
little 'sinkhole' SMTP server to collect email for what is essentially
a disused set of old addresses of mine. The server takes in everything,
logs the messages to disk, and then tells the senders that they
weren't delivered; later, I can go through the logged messages to
see if anything interesting has shown up. In the process of doing
this I've noticed that a surprising amount of the spam comes with
List-Unsubscribe headers with URLs. So I've been using some of
them to see what happens (especially when I recognize the sender as
a long term repeat offender).
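For the curious, pulling the URLs out of a logged message is simple. This is a minimal Python sketch using the standard email module; the function name is made up, and it assumes the header follows the usual convention of comma-separated URLs in angle brackets.

```python
import re
from email import message_from_string

def unsubscribe_urls(raw_message):
    """Return every URL listed in a message's List-Unsubscribe headers."""
    msg = message_from_string(raw_message)
    urls = []
    for value in msg.get_all("List-Unsubscribe", []):
        # Each URL sits inside its own pair of angle brackets.
        urls.extend(re.findall(r"<\s*([^<>\s]+)\s*>", value))
    return urls
```

This returns mailto: entries as well as http ones; which of them is worth trying is a separate question.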
Normally it's an article of faith that you should never, ever use a spammer's unsubscribe procedure. Doing so only confirms that your address is live and perhaps helps the spammer, reducing the load on their systems and so on. I'm not sure that what I'm doing here is exactly sensible (although it makes for an interesting experiment), but I don't think it's completely crazy.
The big difference between my situation and the normal situation is that the addresses I am 'unsubscribing' are already dead addresses as far as I'm concerned. They basically no longer get valid email (and some of them don't even exist) so any increase in the amount of spam coming to them is immaterial; it's just more grist for the sinkhole to reject. Also, these addresses have been aggressively defended for what is now years and spammers have been trying to bang away at them for years. It's clear that the repeat spammers I can recognize simply don't notice or don't care about any extra load on their mail sending systems that I've been creating with prior tactics like blackholing SMTP traffic from their IPs in the firewall. If they did they would have removed such unresponsive addresses by now, often years ago.
At one level I don't care about the load on my sinkhole, but at another level I do. I'm running the sinkhole to collect interesting new things, and yet more spam from some usual suspect is completely uninteresting. If I can make it go away by unsubscribing rather than making the sinkhole's environment more complicated, so much the better for my real goals.
Or at least that's my thinking so far. I may change my mind later and stop doing this.
(But I'd also find it very interesting if unsubscribing actually seemed to reduce the spam from the usual suspects, or even increased it. Seeing if this will happen is one reason I'm bothering with the experiment at all.)
2014-05-28
Yahoo Groups has a bad spam problem and they don't care
For a while, one of the things I've noticed any time I look at our mail system is that we seem to be getting spam from Yahoo Groups. Today I decided to look at the magnitude of that spam. I'm afraid that the results are really terrible.
Over the past roughly 30 days, our main spam filter has seen 29,545 email messages with an envelope origin address at returns.groups.yahoo.com. 28,869 of them have been scored as pretty definitely spam; that's 97.7% of the messages. That's what you could call not very good, and in fact it makes Yahoo Groups the single largest envelope origin domain on all of the spam we've received over that time span.
(Interestingly the second highest is the null sender address, but it's used by less than a third as many messages as come from Yahoo Groups.)
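The tallying behind numbers like these is not complicated. Here is a minimal Python sketch of it; the (envelope_sender, is_spam) record format is hypothetical and is not what our filter really logs, and the function name is made up for the example.

```python
from collections import Counter

def top_spam_origins(records, n=2):
    """Count spam messages per envelope origin domain.

    records is an iterable of (envelope_sender, is_spam) pairs; this
    input format is hypothetical, chosen only for illustration.
    """
    counts = Counter()
    for sender, is_spam in records:
        if not is_spam:
            continue
        # The null sender has no domain, so give it its own label.
        domain = sender.rpartition("@")[2].lower() if "@" in sender else "<>"
        counts[domain] += 1
    return counts.most_common(n)

# The percentage quoted above, from the two counts in the text.
total, spam = 29_545, 28_869
spam_share = f"{spam / total:.1%}"
```

Here spam_share works out to the 97.7% figure given above.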
As they say, 'but wait it gets worse'. We also have an optional
DATA-time rejection based on spam scoring. This system logs not
just basic envelope information but also the Subject: headers
from rejected messages, which means that I can look at what they
are for rejected Yahoo Groups messages. Those subject lines are,
well, interesting. Here, let me show you a random sample of what
I saw when I looked:
Subject: [PMX:SPAM] Hot Angelina Jolie Sex Scandal
Subject: [PMX:SPAM] Do You Want To Get Closer To The Next Love?
Subject: [PMX:SPAM] Look For Horny Chicks Near Your Home
Subject: [PMX:SPAM] nice hot girl sex on the office desk
Subject: [PMX:SPAM] Absolutely Free Cyber Casual Affaires
Subject: [PMX:SPAM] Do You Want To Have A Chat With Hot Cybersingles?
I think you get the point. By the way, this excludes many, many subject lines that contain various sorts of sex language, partly because I decided that I didn't want Wandering Thoughts to turn up in Internet searches for those words.
(We log the Subject: line because it gets annotated by the
spam scoring system and thus gives us some additional data on why
the system rejected a message if we ever need it.)
All of this is clearly sex spam. And based on both the subject lines and on the fact that it was detected by our spam system (which I'm sure is not state of the art for major online email providers), this should be stuff that Yahoo is more than capable of detecting on the way out of their systems. Yet they don't. The spam flows unimpeded and has been flowing for what I believe is now a very long time (because I've been casually noticing this stuff from Yahoo Groups for years now).
One corollary is that you almost certainly want to get any remaining legitimate groups off of Yahoo Groups, if you're involved with any. The odds are increasing that places will reject all email from Yahoo Groups (or blackball the emitting IP addresses, although Yahoo may not use dedicated IPs for Groups).
(See also this early 2012 data on top domains on spam messages and the discussion of Yahoo Groups there.)
2014-05-14
Modern mail forwarding is leaky
As I noted in my entry on how DMARC would affect us, modern mail forwarding is now leaky in practice. In the old days, if you forwarded your mail from address A to address B you could be reasonably sure that everything that made it to address A would also make it to address B. In the modern world this is no longer necessarily so; it's quite likely that some amount of the mail that was accepted by the mail system for address A will not get accepted by address B. This lost email has leaked out of the system (and the senders may or may not ever find out about it).
Of course there are all sorts of things that will cause mail to leak, and not all of them are bad. Certainly some of the leaked email will be spam that mail system B does a better job of recognizing than mail system A (which is especially likely when mail system B is a lot larger and more sophisticated than mail system A). In a world with an increasing number of DMARC 'reject' policies, some of it may be email that is considered illegitimate by the origin domain's policy (whether or not it actually is illegitimate). But it can also be email that is mis-classified as bad by mail system B, or email that is simply caught up as collateral damage because mail system B sees too much bad stuff coming from mail system A.
(There are various ways for the collateral damage to happen beyond the straightforward.)
Naturally it's somewhat hard to measure how much nominal leakage there is and very hard to measure how much leakage of legitimate email there is (at least without doing intrusive and privacy violating monitoring of bounces and their content).
Of course, leaky forwarding is not new. Forwarding has been slowly becoming leakier and leakier over the years as spam and other bad stuff became an increasingly large part of email and as places got very varied levels of anti-spam filtering. But I'm not certain that our users understand that and I'm pretty certain that our documentation about how to set up forwarding doesn't contain any real discussion of the possibilities (and I suspect that that should change).
PS: I'm implicitly ignoring in this anyone who wants to forward all of their email, spam included, from mail system A to mail system B. That just doesn't work at all these days; there will be a firehose of 'leakage' as mail system B laughs at your spam.
2014-04-23
How Yahoo's and AOL's DMARC 'reject' policies affect us
My whole interest in understanding DMARC started with the simple question of how Yahoo's and AOL's change to a DMARC 'reject' policy would affect us and our users, and how much of an effect it would have. The answer turns out to be that it will have some effects but nothing major.
The most important thing is that this change doesn't significantly affect either our users forwarding their email to places that pay attention to DMARC or our simple mailing lists because neither of them normally modify email on the way through (which means the DKIM signatures stay intact, which means that email really from Yahoo or AOL will still pass DMARC at the eventual destination). Of course it's possible that some people are forwarding email in ways that modify the message and thus may have problems, but if so they're doing something out of the ordinary; our simple mail forwarding doesn't do this.
(We allow users to run programs from their .forward files, so
people can do almost arbitrarily complex things if they want to.)
There is one exception to this. Email that our commercial anti-spam
system detects as being either spam or a virus has its Subject:
header modified, which will invalidate any previously valid DKIM
signature, which means that it will fail to forward through us to
DMARC respecting places (such as GMail). This would only affect
people who forward all email (not just non-spam email) and then
only if the email was legitimately from Yahoo or AOL in the first
place (and got scored or mis-scored as spam). I think that this is
a sufficiently small thing that I'm not worried about it, partly
because places like GMail now seem to be even stricter than our
anti-spam system is so some percentage of potentially dodgy email
is already not being forwarded successfully.
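To see why a changed Subject: breaks things, here is a toy Python illustration. It is emphatically not real DKIM (there is no RFC canonicalization, no body hash and no public-key signature); it only hashes a couple of headers to stand in for 'the data the signature covers'. The address and subject are made up, while the '[PMX:SPAM]' tag is the sort of annotation our system adds.

```python
import hashlib

def header_digest(headers, signed_names=("from", "subject")):
    """Hash the named headers; a toy stand-in for what a DKIM signature covers."""
    h = hashlib.sha256()
    for name in signed_names:
        h.update(f"{name}:{headers[name].strip()}\r\n".encode("utf-8"))
    return h.hexdigest()

# Hypothetical message headers as the original sender signed them.
original = {"from": "someone@yahoo.com", "subject": "Lunch on Friday?"}
signed_digest = header_digest(original)

# The same message after a spam filter has tagged the Subject: header.
tagged = dict(original, subject="[PMX:SPAM] " + original["subject"])
still_valid = header_digest(tagged) == signed_digest
```

Here still_valid comes out False: the tagged message no longer matches what was signed, so a verifier downstream would see a bad signature.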
People who forward their email to DMARC-respecting places will be affected in one additional way. The simple way to put it is that our forwarding is now imperfect, in that we'll accept some legitimate messages but can't forward them successfully. These would be emails from legitimate Yahoo or AOL users that were either sent from outside those places or that got modified in transit by, eg, mailing lists. A user who forwards their email to GMail is now losing these emails more or less silently (to the user). In extreme cases it's possible that they'll get unsubscribed from a mailing list due to these bounces.
This also affects any local user who was sending email out through
our local mail gateway using their AOL or Yahoo From: address.
To put it one way, I don't think we have very many people in this
situation and I don't think that they'll have many problems fixing
their configurations to work again.
(I'd like to monitor the amount of forwarding rejections but I can't think of a good way to dig the information out of our Exim logs, since mailing lists generally change the envelope sender address. This makes it tempting to have our inbound SMTP gateway do DMARC checks purely so I can see how many incoming messages fail them.)
PS: writing this entry has been a useful exercise in thinking through the full implications of our setup, as I initially forgot that our anti-spam filtering would invalidate DKIM signatures under some circumstances.
At least partially understanding DMARC
DMARC is suddenly on my mind because of the
news that AOL changed its DMARC policy to 'reject',
following the lead of Yahoo which did this a couple of weeks ago.
The short version is that a DMARC 'reject' policy is what I
originally thought DKIM was doing: it locks
down email with a From: header of your domain so that only you
can send it. More specifically, all such email must not merely have
a valid DKIM signature but a signature that is for the same domain
as the From: domain; in DMARC terminology this is called being
'aligned'. Note that the domain used to determine the DMARC policy
is the From: domain, not the DKIM signature domain.
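As a minimal sketch of the alignment idea in Python: the check below only covers the strictest form, where the DKIM signing domain must exactly equal the From: domain. DMARC also has a relaxed mode based on organizational domains, which needs more machinery and is not attempted here. The function names are made up for the example.

```python
from email.utils import parseaddr

def from_domain(from_header):
    """Return the lowercased domain of a From: header value."""
    _, addr = parseaddr(from_header)
    return addr.rpartition("@")[2].lower()

def strictly_aligned(from_header, dkim_d):
    """True if the DKIM d= domain exactly matches the From: domain."""
    domain = from_domain(from_header)
    return bool(domain) and domain == dkim_d.lower().rstrip(".")
```

So a message with a yahoo.com From: that was signed by some mailing list's own domain would have a perfectly valid DKIM signature and still not be aligned.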
(I think that DMARC can also be used to say 'yes, really, pay attention to my strict SPF settings' if you're sufficiently crazy to break all email forwarding.)
This directly affects anyone who wants to send email with a From:
of their Yahoo or AOL address but not do it through Yahoo/AOL's SMTP
servers. Yahoo and AOL have now seized control of that and said 'no you
can't, we forbid it by policy'. Any mail system that respects DMARC
policies will automatically enforce this for AOL and Yahoo.
(Of course this power grab is not the primary goal of the exercise;
the primary goal is to cut off all of the spammers and other bad
actors that are attaching Yahoo and AOL From: addresses to their
email.)
This indirectly affects anyone who has, for example, a mailing list
(or a mail forwarding setup) that modifies the message Subject:
or adds a footer to the message as it goes through the list. Such
modifications will invalidate the original DKIM signature of
legitimate email from a Yahoo or AOL user and then this bad DKIM
signature will cause the message to be rejected by downstream mailers
that respect DMARC. The only way to get such modified emails past
DMARC is to change the From: header away from Yahoo or AOL, at
which point their DMARC 'reject' policies don't apply.
DMARC by itself does not break simple mail relaying and forwarding (including for simple mailing lists), ie all things where the message and its headers are unmodified. An unmodified message's DKIM signature is still valid even if it doesn't come directly from Yahoo or AOL (or whoever) so everything is good as far as DMARC is concerned (assuming SPF sanity).
Note that Yahoo and AOL are not the only people with a DMARC 'reject'
policy. Twitter has one, for example. You can check a domain's DMARC
policy (if any) by looking at the TXT record on _dmarc.<domain>,
eg _dmarc.twitter.com. I believe the 'p=' bit is the important
part.
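Once you have the TXT record's text (from something like 'dig +short TXT _dmarc.twitter.com'), picking out the policy is easy. Here is a minimal Python sketch; it does no DNS lookup itself, the function name is made up, and the sample record in the usage note is a hypothetical one rather than any real domain's.

```python
def dmarc_policy(txt_record):
    """Return the p= value from a DMARC TXT record string, or None."""
    tags = {}
    for part in txt_record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip().lower()] = value.strip()
    # Anything that does not declare itself as DMARC is not a DMARC record.
    if tags.get("v") != "DMARC1":
        return None
    policy = tags.get("p")
    return policy.lower() if policy else None
```

For example, dmarc_policy("v=DMARC1; p=reject; rua=mailto:reports@example.com") gives 'reject', while an unrelated TXT record such as an SPF one gives None.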
PS: I suspect that more big free email providers are going to move to publishing DMARC 'reject' policies, assuming that things don't blow up spectacularly for Yahoo and AOL. Which I doubt they will.
2014-03-30
One of my worries: our spam filtering in the future
I've mentioned in the past that we rely on a commercial anti-spam system for our spam filtering. What I haven't mentioned is that it isn't supported on and doesn't run on any version of Ubuntu after Ubuntu 10.04 LTS. 10.04 is now rather long in the tooth and with the impending release of Ubuntu 14.04 it will fall out of support in a bit over a year. This doesn't leave us completely up the creek, as the vendor supports Red Hat Enterprise 6, but it does raise a concern: is the vendor still actually interested in this product?
(It's not as if the vendor is deliberately ignoring Ubuntu; the most recent Linux distribution that the vendor supports was released in 2011 (and that's Debian 6).)
Since I do have this concern, every so often I get to worry about how we'd replace this commercial package (either because of the vendor effectively dropping it or because of licensing problems, which have been known to happen). Right now the commercial system has three great virtues: it works quite well, it doesn't require any administration, and it's basically a black box. I suppose the fact that it doesn't really cost us any money is a fourth virtue.
(The university has a site license, the costs for which are covered by the central mail system.)
There are probably other commercial options, but I don't know how much they'd cost or how well they work, and the thought of trying to evaluate the alternatives fills me with dread. I know that there are free alternatives (for both anti-spam and anti-virus stuff) but I suspect that they are not hands free and automatically maintained black boxes and I don't know how well they work. Evaluating the free options would be somewhat less of a hassle than evaluating commercial options (with free options there is no wrestling with vendors) but it wouldn't be a picnic either.
One part of me thinks that I should spend some time on keeping current with at least the free options for anti-spam filtering, just so I can be prepared if the worst happens. Another part of me thinks that that's a lot of work with no immediate payoff (in fact that doing the work now is probably a complete waste of time) and that I should defer it until we know we need a different anti-spam system, if ever.
I don't have any answers right now, just worries. So there you go.
2014-03-14
Guessing whether people will unsubscribe from your mailing lists
Suppose that you have an administrative mailing list (or mailing lists), you understand that people can always unsubscribe one way or another, and you want to have some idea if people are going to do so. Here is my modest suggestion on a simple question to ask yourself about the messages going to the mailing list: are the mailing list messages actionable?
(Alternately you've been forced to run some mailing lists that people can't officially unsubscribe from and you'd like some guess at how many people actually read the messages.)
'Actionable' is jargon, but it's useful jargon. An actionable message is one that causes people to do things. So, does your average message tell the average recipient about something that the recipient needs to do (or know) right now or very soon? If it does not, some recipients may find the messages interesting but I think that a lot of people won't and are going to drop them.
(But, you say, your messages are full of interesting things. That's nice, but look, would your 'interesting things' go even moderately viral on Facebook or Twitter or wherever, even among a limited audience? If the answer is 'of course not' then they are nowhere near as interesting as you think. Being genuinely interesting is a very, very high bar.)
Obviously, the more actionable your messages are and the more people they're actionable for, the better off you are; if they're less and less actionable for fewer and fewer people, well, that's not good. Completely non-actionable cheerful messages from eg your Dean (or some other high manager of your choice) almost certainly go straight to the round file.
(The unfortunate but honest truth is that today we simply don't have a good communication system for these sorts of newsletter type things, at least if they're supposed to be relatively private. Email has stopped being it for all sorts of reasons that I don't feel like trying to write down in this entry.)
2014-02-21
You should segregate different traffic to different mailing lists
In a comment on my entry about how people can always unsubscribe from things, billings wrote:
We actually run a couple mailing lists at work that are set up so you can't unsubscribe, because they are the official communications from the college to staff, students or faculty (each their own list). I argued that we should allow people to unsubscribe if they want, but university policy makes us have to maintain the list for stuff like emergency alerts and official messages from the Dean. [...]
It is my considered opinion that this is a bad idea. Why it's a bad idea is a direct consequence of how people can always unsubscribe and how these two examples are not like each other.
To be blunt, it's quite likely that a lot of people are going to be completely uninterested in official messages from the Dean. If they come out with any frequency, people are going to 'unsubscribe' from them in some way on their client. The more emergency messages resemble the Dean's official messages, the more likely that they too have been quietly 'unsubscribed' from (whether or not the user intended to do this). If you really want people to get emergency messages (and you probably do), this is a very bad thing.
What you want to do here is to differentiate the emergency messages
from the Dean's messages as much as possible. The more you
differentiate, the less likely that people will miss emergency
messages. Because of how people are generally going to do filtering,
you want the user-visible message headers to be as different as
possible, ie a different From:, a different To: and cc:, a
completely different Subject: (with no common prefix), and so on.
Since mail clients may at least potentially notice list-related
headers when assessing messages, you want to use a different actual
mailing list too.
This also applies to less drastic splits in purpose, of course. But don't split too finely, because then you make it too much work for people to unsubscribe from everything they don't want and they start using blunt hammers. To be honest there are only really three categories: lots of people are going to unsubscribe, you hope that only a few people unsubscribe, and you desperately want no one ever to unsubscribe. In an ideal world you'd have very few list splits beyond that (you might want to have a few different purpose-based splits in the middle category).
2014-02-18
People can always unsubscribe from your mailing lists
At work, every so often I get added to a mailing list where the people running the list are firmly convinced that it's so important that some or all of the recipients shouldn't be allowed to unsubscribe. There is no gentle way to put it: these people are operating under a tragic and dangerous misconception. The reality is people can always unsubscribe from your mailing lists. The only question is where they do it and if you find out about it.
Specifically, if you don't allow people to unsubscribe at the mailing list level, what inevitably happens is that people silently 'unsubscribe' in their mail client. Any modern mail reading environment offers several different ways of doing this, some of which you will like more than others. Unfortunately the way you will like most (or dislike least) is the one that users are the least likely to use.
The obvious and theoretically best way is to create a filtering rule that junks your mailing list messages (and if you're lucky, is precise enough that it only junks those as opposed to, say, all email that you ever try to send to them). The problem with this is that writing rules is kind of a pain in the ass so in practice the users are likely to use the easier approach, namely clicking whatever 'mark as spam' button their mail client has (it almost certainly has one). After they do this a few times, any competent modern mail client will silently start disappearing your new messages as spam. If you are lucky, this marking only affects mail to that particular user. If you're unlucky, what they're doing is helping to persuade a shared mail provider that your messages should be marked as spam for everyone.
In some modern environments the users don't even have to go this far. If they repeatedly delete your mailing list messages unread or barely read, their mail client is smart enough to learn from this and to start automatically categorizing your email as low-importance. Again, if you're lucky the system did this categorization narrowly enough to just take out the mailing list; if you're unlucky, it will take out more. The good news is that this will probably not reach out to affect other users, although you never know.
So on the whole you're much better off letting people unsubscribe even from mailing lists that you think are really important. At least that way you get to know if your audience agrees with you and you can rethink your efforts if they don't.
(In fact unsubscribes are a really good signal that your efforts are not working. You might say 'but if people unsubscribe we can't get them back with better content'; in practice this is almost certainly a delusion. Once someone has decided that your messages are not worth their time it doesn't really matter how they implement the decision because they're extremely unlikely to pay your future messages enough attention to reconsider it, even if they are just deleting them without any further consequences.)