Wandering Thoughts archives

2010-12-29

Spam as a tax on public participation in open source projects

One of the things that has struck me lately is that spam has become an implicit tax on publicly participating in various open source projects. The mechanisms of this are fairly simple: if you have a sufficiently popular open source project, the spammers are sitting there mining the project mailing lists and their web spinoffs for email addresses and then spamming the heck out of them. There are probably also spammers mining web-based bug trackers, where the bug trackers expose this information.

(Actually, this is probably a simplification. Based on my personal experience, it seems more likely that there are people harvesting fresh 'hot' addresses from these mailing lists and then selling them to spammers, primarily advance fee fraud spammers.)

Open source projects are particularly susceptible to this because they still make heavy use of public mailing lists and so reveal your email address when you take part in them. 'Taking part' can be quite minor; just briefly dipping into the conversation at the wrong time can be enough to get your address harvested.

This is a tax because spam degrades the usefulness of an email address. The more spam an address gets, the more that will be missed by whatever automated anti-spam defenses the address has and thus the more you'll have to deal with personally; the end stage is the address becomes dead.

The obvious workaround is to use a revocable email address whenever you need to participate in public (and expect to revoke it periodically and switch to a new one). One of the problems with this is that it damages the long-term usefulness of old mailing list messages (and email addresses in VCS commit logs and so on), since many of the addresses in them will no longer be valid.

PS: possibly I am over-generalizing from my experience, but I don't really think so. Alternately, perhaps regular participants develop a thick skin for spam (or you have to have a thick skin for spam in order to be a regular participant).

SpamParticipationTax written at 00:59:53

2010-12-13

A program that I want to write: a 'sink' SMTP server

Mostly for historical reasons, my office workstation still runs its own mailer and I still get a very small amount of email to it. I get many, many more spam attempts, because for very many years (in the pre-spam days) it was the primary email address that I used and I used it widely. Over the years I've put up an ever-increasing set of anti-spam precautions that wind up rejecting almost all attempted SMTP connections.

(Technically I wind up dropping connections after sending a 5xx non-greeting banner.)

There are two drawbacks to this approach. The first is that there are a lot of mailers out there that don't like it when their SMTP connection attempt is refused this way and immediately retry it (some common Microsoft mailer, probably Exchange, is especially prone to this). These retries clutter up my logs and annoy me. The second is that I'm curious enough that I'd like to know what sort of spam the spammers are trying to send me, or at least information like what addresses here they're trying to spam; after all, this could be a great way to build up an interesting corpus of current spam.

What I need here is some sort of 'sink' SMTP server. This would be a very simple SMTP server that would cheerfully accept more or less anything you told it, log it all, and reply with a 5xx error after the end of the DATA phase (if mailers still explode at this I might make it lie and accept it with a 2xx). Since I expect to get a decent amount of spam (some of it in very bursty waves) I would like this server to be reasonably efficient, even in the face of vaguely hostile clients.
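As a rough illustration of the behaviour described above, here is a minimal sketch of one sink session. It is written in Python purely for illustration, and every name in it (`sink_session`, the `sink.example` banner, the exact reply texts) is a hypothetical choice, not anything from an existing server. It accepts every command, logs each line the client sends, and refuses only after the end of DATA. It handles one blocking connection at a time and skips details such as dot-unstuffing and multi-line EHLO replies, so it does not address the efficiency requirement.

```python
def sink_session(conn, log, final_reply=b"554 5.7.1 Message refused\r\n"):
    """Run one SMTP dialog on a connected socket, logging every client line.

    `log` is any object with an append() method (a plain list works).
    Pass a 2xx `final_reply` to "lie" and pretend the message was accepted.
    """
    rfile = conn.makefile("rb")
    conn.sendall(b"220 sink.example ESMTP\r\n")  # hypothetical banner
    in_data = False
    for raw in rfile:
        line = raw.rstrip(b"\r\n")
        log.append(line)
        if in_data:
            # A lone "." ends the message body; only now do we refuse.
            if line == b".":
                in_data = False
                conn.sendall(final_reply)
            continue
        verb = line.split(b" ", 1)[0].upper()
        if verb == b"DATA":
            in_data = True
            conn.sendall(b"354 Go ahead\r\n")
        elif verb == b"QUIT":
            conn.sendall(b"221 Bye\r\n")
            break
        else:
            # HELO, EHLO, MAIL, RCPT and anything else: cheerfully accept.
            conn.sendall(b"250 OK\r\n")
    rfile.close()
    conn.close()
```

Because the final reply is a parameter, switching from a 5xx rejection to a 2xx "acceptance" is a one-argument change if mailers turn out to misbehave on the rejection.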

Since I still get real email, I can't have this be a normal server that listens on the SMTP port and just accepts connections (the real email has to go to the real mailer's SMTP agent). Fortunately I already do most of my anti-spam precautions in an inetd-like frontend, so I can have the frontend pass 'spam' connections to the sink SMTP server instead of just rejecting them (by passing the new file descriptor to the sink SMTP server over a Unix domain socket).
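The descriptor hand-off mentioned above can be sketched as follows, again in Python purely for illustration. The function names `pass_connection` and `receive_connection` and the `note` payload are hypothetical; `socket.send_fds` and `socket.recv_fds` are standard library calls available on Unix in Python 3.9 and later. This assumes the frontend and the sink already share a connected Unix domain socket (`channel`).

```python
import socket

def pass_connection(channel, conn, note=b"spam"):
    """Frontend side: send the accepted connection's descriptor to the sink.

    At least one byte of ordinary data must accompany the descriptor,
    so a short note is sent along with it.
    """
    socket.send_fds(channel, [note], [conn.fileno()])
    # The receiver gets its own duplicate, so the frontend can close its copy.
    conn.close()

def receive_connection(channel):
    """Sink side: rebuild a socket object from the passed descriptor."""
    msg, fds, flags, addr = socket.recv_fds(channel, 1024, 1)
    if not fds:
        return None  # channel closed, or a message arrived with no descriptor
    return socket.socket(fileno=fds[0])
```

The socket returned by `receive_connection` refers to the same underlying connection the frontend accepted, so the sink can carry on the SMTP dialog from the very first banner line.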

Of course, I'd love to use this to try out one of the languages I'd like to learn. Now that I've had it pointed out to me by a commentator, node.js is the obvious candidate; it's practically built for this, plus it already has support for passing file descriptors over Unix domain sockets. Go is possible and I'd definitely like to explore it, but it doesn't have support for file descriptor passing and it's not clear how to add it.

SinkSMTPServerDesire written at 01:53:48
