A surprise discovery about procmail (and wondering about what next)

September 12, 2018

I've been using procmail for a very long time now, and over that time I generally haven't paid much attention to the program itself. It was there in the operating systems I used, it worked, and so everything was fine; it was just sort of there, like cat. Thus, I was rather surprised to stumble over the 2010 LWN article "Reports of procmail's death are not terribly exaggerated" (via, sort of via, via, via Planet Debian), which covers how procmail development and maintenance had stopped. Things don't exactly seem to have gotten more lively since 2010 (for example, the procmail domain seems to have mostly vanished, and then there's the message from Philip Guenther that's linked to from the Wikipedia page). This raises a number of questions.

The obvious question is whether this even matters (as LWN notes in the original article). Procmail still works fine, and just as importantly, it's still being packaged by Debian, Ubuntu, and so on. There are outstanding Debian bugs, but Debian appears to also be fixing issues in their patches (and there's a 2017 patch in there, so it's not all old stuff). While we have quite a few users that depend a lot on procmail and we'd thus have real problems if, say, Ubuntu stopped packaging it, this doesn't appear likely to happen any time soon.

(Actually, if Ubuntu dropped procmail our answer would likely be to start building the package ourselves. It's not like it changes much.)

But, well, procmail is sort of Internet software, and I've said before that Internet software decays if not actively maintained. Knowing that procmail is only sort of being looked after does make me a little bit uncomfortable. However, this raises the question of what alternatives I (and we) would have for equivalent mail filtering systems. Many people seem to use Sieve, but I believe that has to be integrated into your MTA instead of run through a program in the way that procmail operates, and I don't think it can run external programs (which is important for some people). The closest thing to procmail that I've read about is maildrop, but it's slightly more limited than procmail in several spots and I'm not sure it could fully cover the various ways people here use procmail for spam filtering and running spam filters.

Exim itself has its own filtering system (documented here). Exim filters are more powerful than Exim-based Sieve filters (they can deliver to external programs, for example), but of course they require Exim specifically and couldn't be moved to another mailer. They're still not quite as capable as procmail; specifically, Exim filters can't directly write to MH format directories (which matters to me because of how I now do a bunch of mail filtering).

We've historically declined to enable either Sieve based filtering or Exim's own filtering in our mail system on the grounds that we wanted to preserve our freedom to change mailers. In light of what I've now learned about procmail, I'm wondering if that's still the right choice. We also don't currently have maildrop installed on our central mail machine (where people already run procmail); perhaps we should change that as well, to give people the option (even if they most likely won't take it).

PS: A quick check suggests that we have around 195 people or so who are using procmail (in that they have it set up in their .forward), which is actually more than I expected. Not all of them are necessarily using our mail system much any more, though.
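A check like this can be approximated with a short script. The following is only a sketch in Python, not the check that was actually run: it assumes home directories sit directly under a single root and that "using procmail" means a non-comment line of the .forward mentions it. The function names are made up for illustration.

```python
import re
from pathlib import Path

# Assumption: a .forward "uses procmail" if any non-comment line mentions it,
# typically as a pipe such as "|/usr/bin/procmail".
PROCMAIL_RE = re.compile(r"\bprocmail\b")


def uses_procmail(forward_text):
    """Return True if any non-comment line of a .forward mentions procmail."""
    for line in forward_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and PROCMAIL_RE.search(line):
            return True
    return False


def count_procmail_users(home_root):
    """Count directories directly under home_root whose .forward uses procmail."""
    count = 0
    for forward in Path(home_root).glob("*/.forward"):
        try:
            text = forward.read_text(errors="replace")
        except OSError:
            continue  # unreadable or vanished file; skip it
        if uses_procmail(text):
            count += 1
    return count
```

Something like count_procmail_users("/home") would give the rough number; it says nothing about whether those people still actively use their mail.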


Comments on this page:

The best thing about maildrop is that when you haven't looked at the recipe in six months, it only takes a few seconds to figure out what you were doing and how you want to do a new thing.

Here's my "90% of everything anybody ever wants to do in a filter" example:

# Pass every message through SpamAssassin first.
xfilter "/usr/bin/spamc -U /var/lib/spamassassin/socket"
if (/^X-Spam-Status: Yes/:h)
{
    to Maildir/spam/
}

# Drop duplicates: reformail -D exits 0 if the Message-ID is already cached.
`reformail -D 16000 duplicate.cache`
if ( $RETURNCODE == 0 )
    exit

if (/^X-Facebook.*/:h)
    to Maildir/facebook/

if (/^Subject: \[Slack\] Notifications.*/:h)
    to Maildir/slack/

I've been seeing comments about Procmail's bitrot (I'm reluctant to call it death) for a while now too.

I'm probably one of the heaviest users of Procmail that I know. (I believe in making the computer do things for me.)

% egrep "^\s*:" ~/.procmailrc.d/* | wc -l
566

(Yes, I INCLUDERC multiple files in a subdirectory to make sorting & maintaining recipes easier.)

Almost all of them are doing a relatively simple test that I expect maildrop could also do. (I've not looked.)

I do use Message-ID caches within folders, but do not like to use them for all of my email because I want copies of messages from mailing lists not copies that were sent to me directly (via To: / CC: / BCC:). Seeing as how the direct copy almost always arrives first, a simple caching technique would do the opposite of what I want. So, I pipe copies of messages from mailing lists to external programs that will remove messages from my inbox that contain the message ID. (They actually move it from my Inbox to Trash and mark the message as read.)
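The "keep the list copy, retire the direct copy" step could look roughly like the following Python sketch. It is not the external program described above; it assumes both Inbox and Trash are Maildirs (the comment does not say which format is in use), and retire_direct_copy is a hypothetical name.

```python
import mailbox


def retire_direct_copy(message_id, inbox, trash):
    """Move inbox messages with this Message-ID to trash, marked as read.

    inbox and trash are mailbox.Maildir instances (an assumption made for
    this sketch). Returns the number of messages moved.
    """
    moved = 0
    for key in list(inbox.keys()):
        msg = inbox[key]
        if (msg.get("Message-ID") or "").strip() != message_id:
            continue
        msg.add_flag("S")      # "S" is the Maildir "seen" flag
        msg.set_subdir("cur")  # seen messages belong in cur/, not new/
        trash.add(msg)
        inbox.remove(key)
        moved += 1
    return moved
```

The filter would call this with the Message-ID of each mailing list copy as it is delivered, so that it does not matter which copy arrived first.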

I've also done things like taking copies of messages from RSS feeds (via rss2email) and piped them into custom scripts that would extract URLs for images and feed them into something that would pre-cache them in my local caching proxy. (This was mostly when I was on slow DSL and is not much of an issue now.)

One of the other probably strange things that I do with Procmail is have it apply a RegEx to email looking for ${Bounding}${KeyWords}${Bounding} and set a variable in Procmail. I do this multiple times, each match appending the topic to the variable. Finally, I call formail to add said topic(s) to the Keywords: header.
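In Python rather than procmail plus formail, the same idea might be sketched as below. The topic list is invented for illustration, \b word boundaries stand in for whatever ${Bounding} really is, and only the Subject and a plain-text body are searched.

```python
import re
from email import message_from_string


def matching_topics(text, topics):
    """Return the topics that appear in text as whole words, in list order."""
    found = []
    for topic in topics:
        pattern = r"\b" + re.escape(topic) + r"\b"
        if re.search(pattern, text, re.IGNORECASE):
            found.append(topic)
    return found


def tag_keywords(msg, topics):
    """Append matching topics to the message's Keywords: header."""
    payload = msg.get_payload()
    body = payload if isinstance(payload, str) else ""  # skip multipart bodies
    text = (msg.get("Subject") or "") + "\n" + body
    found = matching_topics(text, topics)
    if found:
        existing = msg.get("Keywords")
        if existing:
            found = [existing] + found
        del msg["Keywords"]  # no error if the header is absent
        msg["Keywords"] = ", ".join(found)
    return msg


if __name__ == "__main__":
    demo = message_from_string("Subject: procmail news\n\nNothing else.\n")
    print(tag_keywords(demo, ["procmail", "exim"])["Keywords"])
```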

I add the X-Priority: header with a value of 3 (normal) if it's not there already.

I also have a number of mailing lists that I add the List-Post: header so that my MUA behaves the way that I like.
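Both header tweaks amount to "add it only if it's missing". A minimal Python sketch follows; the function name and the list address are made up, and the List-Post value follows the usual <mailto:...> form.

```python
from email import message_from_string


def add_default_headers(msg, list_post=None):
    """Add X-Priority (and optionally List-Post) only when not already set."""
    if "X-Priority" not in msg:
        msg["X-Priority"] = "3"  # 3 is "normal"
    if list_post and "List-Post" not in msg:
        msg["List-Post"] = "<mailto:%s>" % list_post
    return msg


if __name__ == "__main__":
    demo = message_from_string("Subject: hello\n\nbody\n")
    add_default_headers(demo, list_post="some-list@example.org")
    print(demo.as_string())
```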

Suffice it to say that I'm quite entrenched in Procmail. I have yet to see anything that can qualify as a drop in replacement for me.

By James (trs80) at 2018-09-14 05:06:05:

Dovecot has its own LDA (although I see they now recommend the LMTP server version) that features a Sieve plugin with vendor extensions to call external programs. I can't see anything about MH support though.

I've recently seen FDM recommended, although I haven't looked at it yet myself. I suppose I should start said looking. My group's use of procmail, while light and relatively trivial, is currently integral to our workflow.

Have you seen my article on that? The gist is just use a basic MDA that dumps your mail into a Maildir spool; then you can write yourself one or more programs in your favourite language that kick the mail from there to wherever you want it to end up. The entirety of the job of such code is opening and reading files and then moving them, for which any language whatsoever will do, so the only concern is how far you want to library up your mail parsing.

By cks at 2018-09-19 23:31:28:

I like your approach in the abstract, but in practice I currently think it would have two flaws for me. The first is that I would have to put together (and probably create) a mail filtering environment that either interpreted filtering rules or directly embodied those rules in programs. I'm not terribly enthused about doing this myself; I really want a canned environment that someone else reliably maintains and that I just write rules for. The second is that my entire current mail environment is oriented around mbox format files and MH folders; nothing understands Maildir. So I'd be writing to Maildir, manipulating it, and then probably reconstructing mbox files so I could feed them to other tools, which feels at best awkward.

D’oh. You have mentioned MH so often, I should have remembered. No matter though – it is a one-file-per-mail format, just like Maildir, so all arguments and conclusions apply equally to it. Mbox OTOH would be a lot more annoying to deal with, that is true. (Esp. if you get mail delivered into Mbox files…)
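To illustrate the one-file-per-mail point for MH specifically, here is a rough Python sketch that files a raw message into an MH folder using the standard library's mailbox.MH class. The function name and paths are placeholders, and this is not a description of anyone's actual setup.

```python
import mailbox


def deliver_to_mh(raw_message, mh_root, folder):
    """Store raw_message (bytes) in mh_root/folder; return its message number."""
    mh = mailbox.MH(mh_root, create=True)
    if folder in mh.list_folders():
        target = mh.get_folder(folder)
    else:
        target = mh.add_folder(folder)
    target.lock()  # guard against other cooperating writers
    try:
        key = target.add(raw_message)  # written as the next numbered file
        target.flush()
    finally:
        target.unlock()
    return key
```

A delivery agent would read the message from standard input and call something like deliver_to_mh(data, "/path/to/Mail", "inbox"), with the folder chosen by whatever rules apply.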

As for a canned environment, there really isn’t much of it to can (given a one-file-per-mail format): looping over a directory’s entries, reading each and/or passing to a parser, and ultimately moving the file wherever it’s supposed to go. It’s just a couple lines of scaffolding, the rest of the code is rules. But instead of someone’s weird special-purpose language, you get to write them in whichever plain-jane language you use every day, so there’s never a question of whether you’ll be able to do some unusual thing, and much less of a question of whether you’ll be able to reconstruct its operation half a decade later.
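The scaffolding described here can be sketched in a few lines of Python. This is an illustration under assumptions, not the code from the article: the spool is a Maildir new/ directory, the rules in choose_folder are invented examples, and the destination is on the same filesystem so that the rename is atomic.

```python
import os
from email import policy
from email.parser import BytesHeaderParser


def choose_folder(headers):
    """The rules: return a destination folder name, or None to leave it alone."""
    if str(headers.get("X-Spam-Status", "")).startswith("Yes"):
        return "spam"
    if "List-Id" in headers:
        return "lists"
    return None


def filter_spool(spool_new, dest_root):
    """Move each message in spool_new to dest_root/<folder>/new/ per the rules."""
    moved = {}
    parser = BytesHeaderParser(policy=policy.default)
    for name in sorted(os.listdir(spool_new)):
        path = os.path.join(spool_new, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            headers = parser.parse(f)  # reads headers only, not the body
        folder = choose_folder(headers)
        if folder is None:
            continue
        for sub in ("tmp", "new", "cur"):  # make the target a valid Maildir
            os.makedirs(os.path.join(dest_root, folder, sub), exist_ok=True)
        os.rename(path, os.path.join(dest_root, folder, "new", name))
        moved[name] = folder
    return moved
```

Everything outside choose_folder is the fixed scaffolding; adding a new rule means adding a few lines to that one function.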

I don’t know if this approach is right for you, but if it weren’t for the Mbox question, I’d suggest you test it out and see how it feels. With Mbox in the mix… could be that it’s not worth it then. For me, it has completely removed the quiet irritation with my mail filtering setup from my life.

Written on 12 September 2018.

Last modified: Wed Sep 12 01:48:09 2018