Wandering Thoughts archives

2011-01-29

The amusement of minimalist spam

I just received the following extremely minimalist and to the point spam message:

This is to notify you of a payment Order of usd$8.5m created into Atm Card in your name from the skye bank, you are advice to reconfirm us 1 Name 2 Address 3 Phone 4 ID.

Some days, I like to think that the advance fee fraud spammers are just as tired of the whole thing as the rest of us are.

Tired416Spammers written at 00:27:15

2011-01-23

The question of how long our greylisting interval should be

One of the things that our frontend anti-spam system does is opt-in greylisting for people who don't mind the downsides of it. One of the decisions you need to make with greylisting is how long sending mail servers have to wait before you'll accept their message; right now, our system is configured with a delay of an hour.

It doesn't have this delay because I thought carefully about it or did research. It has an hour's delay because that's the standard default for the particular open source greylisting software that we're using, or more specifically it was the standard default as of late 2006 or perhaps early 2007 when we first installed it. Even if the default was sensible then, a lot can happen on the Internet in a few years, especially in spam and anti-spam (which is a fast-moving field in general).

All of this raises the question of how long a greylisting interval we should use today, and how to figure out what it should be. If we generously assume that no legitimate SMTP server will give up on retrying, what we want to study is basically the decay rate of the sending sources that do give up, which by assumption are bad sources. If we see that, say, 90% have given up after a minute, 95% have given up after five minutes, and 99% have given up after ten minutes, we can conclude that an hour of greylisting delay is a lot of overkill; we could turn it down to ten minutes and still get rid of almost as much spam while not delaying legitimate email anywhere near as much.
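As a rough sketch of that calculation, assume we somehow have, for each source that eventually gave up, the number of seconds between its first and last delivery attempt. The function name and the input format here are made up for illustration; they are not something our greylisting daemon provides. Given that, the fraction of bad sources that a particular delay would still stop is just the fraction whose retry span is no longer than that delay:

```python
from bisect import bisect_right


def fraction_given_up(retry_spans, delays):
    """Return {delay: fraction of sources whose retry span <= delay}.

    retry_spans: seconds between first and last attempt, one value per
                 source that gave up (hypothetical input format).
    delays:      candidate greylisting delays, in seconds.
    """
    spans = sorted(retry_spans)
    if not spans:
        # No data at all; report zero rather than dividing by zero.
        return {d: 0.0 for d in delays}
    # bisect_right counts how many spans are <= d in the sorted list.
    return {d: bisect_right(spans, d) / len(spans) for d in delays}


if __name__ == "__main__":
    # Invented numbers shaped like the example in the text, not real data.
    spans = [0] * 90 + [200] * 5 + [500] * 4 + [3000]
    for delay, frac in fraction_given_up(spans, [60, 300, 600]).items():
        print(f"{delay:5d}s: {frac:.0%} of giving-up sources stopped")
```

One caveat is that the spans can only be observed up to whatever delay was in force when the data was collected, so this can tell us whether a shorter delay would do about as well but says nothing about longer ones.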

The greylisting daemon itself is the best place to capture this information; it already keeps track of first-seen and last-seen information for every greylisting record, and it could log these as it expires entries from its internal database. Unfortunately our greylisting daemon doesn't support doing this. I've considered trying to reconstruct this information from the Exim logs, but so far it's struck me as sufficiently annoying that I haven't looked into how to do it (which is laziness speaking).

(If you write a greylisting daemon, please include an option to log this sort of information.)
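To make the request concrete, here is a minimal sketch of what such logging could look like. Everything in it is hypothetical: the Entry record, the expire_entries function, the in-memory dictionary and the log line format are invented for illustration and do not describe how our daemon (or any particular greylisting daemon) actually stores or expires its records.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("greylist")


@dataclass
class Entry:
    """One greylisting record (hypothetical layout)."""
    first_seen: float      # Unix time of the first delivery attempt
    last_seen: float       # Unix time of the most recent attempt
    attempts: int = 1
    passed: bool = False   # did the source ever get through?


def expire_entries(db, now, max_age):
    """Drop entries idle for more than max_age seconds, logging each one.

    db maps some key (for example a (source IP, sender, recipient)
    tuple) to an Entry. Returns the list of expired keys.
    """
    expired = [k for k, e in db.items() if now - e.last_seen > max_age]
    for key in expired:
        e = db.pop(key)
        # The span is how long the source kept retrying before going quiet.
        log.info(
            "expire key=%s first_seen=%d last_seen=%d span=%d attempts=%d passed=%s",
            key, e.first_seen, e.last_seen,
            e.last_seen - e.first_seen, e.attempts, e.passed,
        )
    return expired
```

With lines like these in the logs, working out how quickly bad sources give up becomes a matter of collecting the span values for entries that never passed.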

I'm also not sure how much useful data I can generate from our logs, since we only have a relatively small number of addresses that have opted in to greylisting in the first place. Unless we have greylisted addresses that are getting a sufficiently large amount of spam from a diversity of sources (in terms of programs and spam senders), looking at our current data might just give me biased answers.

(Ideally someone would have already done all of this analysis on a big site with a lot of email and thus a lot of spam from all sorts of sources and senders, and published the results. I'm not holding my breath on that one, partly because I suspect that any site large enough to generate interesting data is not going to share it because it's a competitive advantage.)

GreylistingTimeQuestion written at 01:04:34

2011-01-08

Finding out if you've been hit by careful, clever spammers

When I found my example of careful and clever spammers compromising a blog, I of course thought about letting the blog's owner know about the problem so they could clean it up. But the more I thought seriously about doing that, the more uncertain I got, because of the fundamental problem here: how do I make sure that I'm notifying the blog's owner instead of handing my email address to the spammer?

Let's turn this around to the flipside question: if you're a blog owner, how can you set things up so that people can notify you of such a compromise?

(Of course, the best thing to do is to detect such compromises somehow, perhaps through external monitoring. But this may not always be feasible.)

The usual modern way of handling blog feedback is with comments or a feedback form of some sort on your website. But smart people aren't going to want to leave you a comment through your already compromised blog; since the spammers compromised your blog, they could be blocking or filtering comments in addition to doing smart things with their spam. Even an email address on the blog's domain is too dangerous to use, since the spammers might have compromised more than just your website. Instead you need something that is not only off the blog but sufficiently far off the blog that people can see that it's unlikely to have been compromised.

But when you move to off-blog mechanisms, there's another problem; people may be able to trust a GMail address or the like not to be compromised, but how do they know it actually belongs to you instead of to the spammer? They certainly can't immediately trust anything on your blog; since the spammer controls it, the spammer could have changed your 'how to contact me' information to point off to one of their GMail accounts instead of yours.

So what you need is not just alternate paths to reach you, but alternate paths where people can see that they've reached the right person. Usefully, the modern web has a bunch of these in the form of all of those social network web services (Facebook, Twitter, etc.), provided that you actually use your account on the services. Ongoing activity and participation will help to validate your account as real and (probably) not compromised and convince people that they have found the right person; conversely, an inactive account could have been registered by the spammer last week as part of an ever more elaborate scheme, or just belong to someone else with your name.

(I suspect that in practice, spammers don't go to this amount of effort just for a compromised blog. In fact, I suspect that they generally don't even try to filter and block on-blog comments and feedback unless it can be entirely automated. Of course I lack direct experience with this, so I could easily be wrong.)

SpamCompromiseNotification written at 01:42:14


