A program that I want to write: a 'sink' SMTP server

December 13, 2010

Mostly for historical reasons, my office workstation still runs its own mailer and I still get a very small amount of email to it. I get many, many more spam attempts, because for very many years (in the pre-spam days) it was the primary email address that I used and I used it widely. Over the years I've put up an ever-increasing set of anti-spam precautions that wind up rejecting almost all attempted SMTP connections.

(Technically I wind up dropping connections after sending a 5xx non-greeting banner.)

There are two drawbacks to this approach. The first is that a lot of mailers out there react badly when their SMTP connection attempt is refused this way and immediately retry it (some common Microsoft mailer, probably Exchange, is especially prone to this). These retries clutter up my logs and annoy me. Second, I'm curious enough that I'd like to know what sort of spam the spammers are trying to send me, or at least information like what addresses here they're trying to spam; after all, this could be a great way to build up an interesting corpus of current spam.

What I need here is some sort of 'sink' SMTP server. This would be a very simple SMTP server that would cheerfully accept more or less anything you told it, log it all, and reply with a 5xx error after the end of the DATA phase (if mailers still explode at this I might make it lie and accept it with a 2xx). Since I expect to get a decent amount of spam (some of it in very bursty waves) I would like this server to be reasonably efficient, even in the face of vaguely hostile clients.
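As a rough illustration of how little protocol this needs, here is a minimal sketch in Python of such a sink: a per-connection state machine that says yes to every command, collects the DATA lines, and answers the final '.' with a 5xx. The class name `SinkSession`, the `sink` hostname, the reply texts, and the `handle` coroutine are all made up for this example, and it makes no attempt at real efficiency or at saving messages anywhere other than a log line.

```python
import asyncio
import logging

log = logging.getLogger("smtpsink")  # hypothetical logger name


class SinkSession:
    """One SMTP conversation: accept everything, reject after DATA."""

    def __init__(self, final_reply=b"554 5.7.1 Message rejected\r\n"):
        # Swap in a 2xx reply here if mailers explode on the 5xx.
        self.final_reply = final_reply
        self.in_data = False
        self.commands = []   # every command line seen, for logging
        self.body = []       # message lines from the most recent DATA
        self.done = False

    def banner(self):
        return b"220 sink ESMTP\r\n"

    def feed(self, line):
        """Take one line (CRLF already stripped); return reply bytes or b""."""
        if self.in_data:
            if line == b".":
                self.in_data = False
                log.info("commands=%r body_lines=%d",
                         self.commands, len(self.body))
                return self.final_reply
            # Undo SMTP dot-stuffing: a leading '.' was added by the client.
            self.body.append(line[1:] if line.startswith(b".") else line)
            return b""
        self.commands.append(line)
        parts = line.split(None, 1)
        verb = parts[0].upper() if parts else b""
        if verb in (b"EHLO", b"HELO"):
            return b"250 sink\r\n"   # advertise no ESMTP extensions
        if verb == b"DATA":
            self.in_data = True
            self.body = []
            return b"354 Go ahead, end with <CRLF>.<CRLF>\r\n"
        if verb == b"QUIT":
            self.done = True
            return b"221 Bye\r\n"
        return b"250 OK\r\n"         # MAIL, RCPT, RSET, anything else


async def handle(reader, writer):
    """asyncio stream callback that drives one SinkSession."""
    session = SinkSession()
    writer.write(session.banner())
    try:
        while not session.done:
            raw = await reader.readline()
            if not raw:
                break                # client hung up
            reply = session.feed(raw.rstrip(b"\r\n"))
            if reply:
                writer.write(reply)
                await writer.drain()
    except (ValueError, ConnectionError):
        pass                         # overlong line or reset: just drop it
    finally:
        writer.close()
```

Something like `asyncio.start_server(handle, "127.0.0.1", 2525)` would be enough to poke at it by hand; the address and port there are arbitrary.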

Since I still get real email, I can't have this be a normal server that listens on the SMTP port and just accepts connections (the real email has to go to the real mailer's SMTP agent). Fortunately I already do most of my anti-spam precautions in an inetd-like frontend, so I can have the frontend pass 'spam' connections to the sink SMTP server instead of just rejecting them (by passing the new file descriptor to the sink SMTP server over a Unix domain socket).
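The descriptor hand-off can be sketched roughly as follows. This assumes Python 3.9 or later on a Unix system, where `socket.send_fds` and `socket.recv_fds` wrap the underlying SCM_RIGHTS machinery; the helper names `pass_connection` and `receive_connection` and the `b"spam"` note are invented for the example, and the real frontend is not written this way.

```python
import socket


def pass_connection(channel, conn, note=b"spam"):
    """Frontend side: hand an accepted connection to the sink.

    'channel' is a connected Unix domain socket to the sink server.
    At least one byte of ordinary data has to travel with the descriptor.
    """
    socket.send_fds(channel, [note], [conn.fileno()])
    conn.close()  # the descriptor in flight stays open for the receiver


def receive_connection(channel):
    """Sink side: receive one descriptor and wrap it as a socket object."""
    note, fds, _flags, _addr = socket.recv_fds(channel, 1024, 1)
    if not fds:
        return None, note  # peer closed or sent no descriptor
    return socket.socket(fileno=fds[0]), note
```

In a quick experiment, two `socket.socketpair()` calls can stand in for the frontend-to-sink channel and for the SMTP connection itself: pass one end of the second pair through the first, then check that bytes written on the other end show up on the socket that `receive_connection` returns.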

Of course, I'd love to use this to try out one of the languages I'd like to learn. Now that I've had it pointed out to me by a commentator, node.js is the obvious candidate; it's practically built for this, plus it already has support for passing file descriptors over Unix domain sockets. Go is possible and I'd definitely like to explore it, but it doesn't have support for file descriptor passing and it's not clear how to add it.

Comments on this page:

From at 2010-12-13 11:10:41:

You could probably do this with postfix - there's a hook for applying a policy at the end of DATA:


- zdw

By cks at 2010-12-13 11:29:39:

I suspect that the tricky bit with a conventional mailer would be getting the message itself saved away. Conventional mailers will log everything else, but they're not really oriented towards saving the DATA data if they're rejecting the message.

From at 2010-12-13 18:48:49:

Isn't this what milters are for? You do specify you need to save the message body for review, but perhaps you could focus on writing a custom milter (then plugging into Postfix, Sendmail, etc) for this type of behavior rather than writing your own SMTP server?


By cks at 2010-12-14 11:46:58:

There are two things about trying to use an existing mailer. Well, three things.

First, obviously I'd like to use this as a real project to learn a new language with. Using an existing mailer, even with a milter I write myself, more or less defeats that.

Second, I'm pretty confident that handling the actual SMTP protocol is pretty simple if I don't want to support much more ESMTP than I need to accept EHLOs and don't care about things like checking valid addresses. To the extent that it's work, it's an interesting way of exploring a new language.

Finally, I don't think that any existing mailer can be used by passing it file descriptors over Unix domain sockets. I would probably have to invoke it inetd-style, which means a new process for each connection. I'd really like to avoid that for load reasons.

python -m smtpd -n -c DebuggingServer localhost:1025
Written on 13 December 2010.

Last modified: Mon Dec 13 01:53:48 2010