Wandering Thoughts archives

2011-07-07

An interesting gotcha with Exim and .forward processing

Yesterday I described how Exim implements traditional .forward semantics where putting your own address in your .forward means 'deliver it to me, bypassing my .forward'. Because Exim is a mailer construction kit, this isn't a specific feature for .forward handling; it's a generic feature that happens to give you this result.

So far, so good. Now, let's talk about our .forward-nonspam feature. In the abstract, this is just another .forward-style router that reads a different file and only triggers under some conditions. Concretely, we need several routers in sequence, each of them doing one step of the processing logic:

  1. if .forward-nonspam exists and the message is not spam, expand .forward-nonspam
  2. if the message is spam, .forward-nonspam exists, and .forward does not exist, discard the message
  3. if .forward exists, expand .forward

If you have both a .forward-nonspam and a .forward, the third rule will only be triggered for spam messages because your .forward-nonspam skims off non-spam messages first.

Well. Mostly. You see, although all three of these routers are conceptually a single block of .forward processing, Exim doesn't know this; as far as Exim is concerned, they are three separate and completely unrelated routers. Now suppose you put your own address into .forward-nonspam and also have a .forward, as you might do to create a simple 'put all non-spam email into my regular inbox and all spam mail into a file' system, and you get a non-spam message. Exim processes things until it reaches the first router, expands your .forward-nonspam, gets your address and restarts routing it, gets to the first router again, sees that the router has already handled this address, and only skips that router, not all three .forward-processing routers. So your address falls through to the third router, which says 'sure, you have a .forward, I'll handle this' and dumps the non-spam message into the file for spam email.

Oops.
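
To make the failure concrete, here is a toy Python model of the three routers and of the 'skip a router that already handled an identical ancestor address' rule. It is only a sketch: it is not Exim configuration, the router names are invented, and the file contents and paths (such as /u/cks/spam-mail) are hypothetical.

```python
# Toy model of the three .forward-style routers; this is not Exim code.
# The file contents and paths below are hypothetical.
FILES = {
    "cks": {
        ".forward-nonspam": ["cks"],        # non-spam: my normal inbox
        ".forward": ["/u/cks/spam-mail"],   # everything else: a file
    }
}


def has(user, name):
    """Does this user have the named file?"""
    return name in FILES.get(user, {})


# (router name, "does this router apply?", file to expand; None = discard)
ROUTERS = [
    ("forward_nonspam",
     lambda u, spam: has(u, ".forward-nonspam") and not spam,
     ".forward-nonspam"),
    ("discard_spam",
     lambda u, spam: (spam and has(u, ".forward-nonspam")
                      and not has(u, ".forward")),
     None),
    ("userforward",
     lambda u, spam: has(u, ".forward"),
     ".forward"),
]


def route(user, spam, chain=()):
    """Return the final destinations for one address.

    chain holds (ancestor address, router that handled it) pairs.
    """
    for name, applies, filename in ROUTERS:
        if (user, name) in chain:
            continue  # this router already handled an identical ancestor
        if not applies(user, spam):
            continue
        if filename is None:
            return []  # discard the message
        destinations = []
        for item in FILES[user][filename]:
            if item.startswith("/"):
                destinations.append(item)  # file delivery, not re-routed
            else:
                destinations += route(item, spam, chain + ((user, name),))
        return destinations
    return ["/var/mail/" + user]  # no .forward-style router applied


print(route("cks", spam=False))
print(route("cks", spam=True))
```

In this model both calls end up at the spam file: the non-spam message is expanded by the first router, the re-routed address skips only that router, and the third router then picks it up.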

The fix for this is to split the third router into two routers, one for the case where you do have a .forward-nonspam (where it would only handle messages that are explicitly spam-tagged) and a second one for the case where you have no .forward-nonspam (where it would handle everything). However, this requires an annoying level of repetition in the Exim configuration file.

(For technical reasons I think that you can't combine this into a single condition on a single router that works exactly right.)
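
As a rough illustration of the split into two routers, here is a toy Python model (again, not Exim configuration). The router names forward_spam_only and forward_plain, the users, and all file contents and paths are hypothetical; the only thing taken from the text is the logic of which router applies when.

```python
# Toy model of the fix: the last .forward router is split in two.
# This is not Exim code; users, files and paths are hypothetical.
FILES = {
    "cks": {
        ".forward-nonspam": ["cks"],
        ".forward": ["/u/cks/spam-mail"],
    },
    "pat": {
        ".forward": ["/u/pat/all-mail"],    # no .forward-nonspam at all
    },
}


def has(user, name):
    """Does this user have the named file?"""
    return name in FILES.get(user, {})


# (router name, "does this router apply?", file to expand; None = discard)
ROUTERS = [
    ("forward_nonspam",
     lambda u, spam: has(u, ".forward-nonspam") and not spam,
     ".forward-nonspam"),
    ("discard_spam",
     lambda u, spam: (spam and has(u, ".forward-nonspam")
                      and not has(u, ".forward")),
     None),
    # With a .forward-nonspam, .forward only handles spam-tagged mail.
    ("forward_spam_only",
     lambda u, spam: (has(u, ".forward") and has(u, ".forward-nonspam")
                      and spam),
     ".forward"),
    # Without a .forward-nonspam, .forward handles everything.
    ("forward_plain",
     lambda u, spam: has(u, ".forward") and not has(u, ".forward-nonspam"),
     ".forward"),
]


def route(user, spam, chain=()):
    """Return the final destinations for one address."""
    for name, applies, filename in ROUTERS:
        if (user, name) in chain:
            continue  # this router already handled an identical ancestor
        if not applies(user, spam):
            continue
        if filename is None:
            return []  # discard the message
        destinations = []
        for item in FILES[user][filename]:
            if item.startswith("/"):
                destinations.append(item)  # file delivery, not re-routed
            else:
                destinations += route(item, spam, chain + ((user, name),))
        return destinations
    return ["/var/mail/" + user]  # falls through to normal delivery


print(route("cks", spam=False))
print(route("cks", spam=True))
print(route("pat", spam=False))
```

Here the non-spam message for cks falls through every .forward-style router on its second pass and lands in /var/mail/cks, spam still goes to the file, and a user with only a .forward is handled as before.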

Sidebar: the technical reasons

The condition you need is 'if .forward exists and either .forward-nonspam doesn't exist or the message is spam'. Exim has special support for securely and correctly checking for file existence over NFS, but this support is only available in the require_files router condition. However, we need to use a condition check with a '${if ...}' string expansion to check 'is spam'. You can't 'or' together separate router conditions (they are all implicitly 'and'ed together instead), and the does-file-exist check that's available in a ${if ...} expansion doesn't work the right way over NFS.

In theory you could get around this with various evil hacks involving Exim string expansion, maybe.

(Talking to myself: one could rephrase the condition as 'if .forward exists and, if the message is non-spam, .forward-nonspam doesn't exist' and then write this as a single require_files condition with a conditional string expansion in it.)
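
The rephrasing is easy to get backwards, so here is a small truth-table check in Python. It only checks the boolean logic: that 'if .forward exists and, if the message is non-spam, .forward-nonspam doesn't exist' selects the same cases as the two split routers from the fix. It says nothing about whether require_files can actually express this in Exim; the function names are made up for the illustration.

```python
from itertools import product


def split_router_fix(forward, forward_nonspam, is_spam):
    """The two routers from the fix, or'ed together."""
    spam_only = forward and forward_nonspam and is_spam
    no_nonspam_file = forward and not forward_nonspam
    return spam_only or no_nonspam_file


def rephrased(forward, forward_nonspam, is_spam):
    """'.forward exists and, if non-spam, .forward-nonspam doesn't exist'."""
    if is_spam:
        nonspam_clause = True  # the 'if non-spam' part does not apply
    else:
        nonspam_clause = not forward_nonspam
    return forward and nonspam_clause


# Compare the two phrasings over all eight combinations of inputs.
for case in product([False, True], repeat=3):
    assert split_router_fix(*case) == rephrased(*case), case
print("all 8 cases agree")
```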

sysadmin/EximForwardGotcha written at 15:49:07

How Exim makes traditional .forward semantics work

Traditional .forward semantics allow you to put your own address in your .forward; this means 'deliver to me, bypassing my .forward'. As a mailer construction kit, Exim doesn't have any specific support for handling .forwards; it has some generic features that you can build .forward handling out of. As a consequence of this, it doesn't have any specific handling for this odd bit of .forward semantics and instead supports it in a generic way. I've mentioned this before in an entry on the power of Exim routers but I just pointed to the official Exim documentation for details, and the official documentation is a little bit opaque.

Each message that Exim handles starts out with some number of top level addresses, each of which is routed separately. In the process of doing this, individual routers may replace the current address with one or more new addresses (through, for example, expanding a .forward). Exim then normally tries to recursively route these new addresses just as if they were top level addresses, although it keeps track of the fact that they are 'children' of some address.

(With aliases and simple mailing lists and .forwards that forward mail to people who also have .forwards, you can have a many level chain of descendant addresses that were created from a single top level address.)

When Exim is doing this recursive routing for a particular top level address, it remembers which routers have already handled which addresses. Then if the address currently being routed is the same as one of its ancestor addresses and the ancestor address has already been processed by a particular router, Exim skips that router, acting as if the router was inapplicable to the address or wasn't there at all (instead of having the router re-process an address that it has already processed once); processing the address will fall through to the next router (or routers). In a typical Exim configuration, what's next after the router that handles .forwards is the router that sends people's mail to /var/mail/<user>.
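
A minimal sketch of this rule as a toy Python model may help. It is not how Exim is implemented; the router names mirror a typical configuration, but the .forward contents and the /u/cks/archive path are hypothetical.

```python
# Toy model of "skip a router that already handled an identical ancestor
# address". This is not Exim code; the data below is made up.
FORWARDS = {"cks": ["cks", "/u/cks/archive"]}  # hypothetical .forward


def userforward(address):
    """Redirect router: expand the user's .forward, if there is one."""
    if address in FORWARDS:
        return ("redirect", FORWARDS[address])
    return None  # decline; routing moves on to the next router


def localuser(address):
    """Delivery router: put the message in the user's mailbox."""
    return ("deliver", "/var/mail/" + address)


ROUTERS = [("userforward", userforward), ("localuser", localuser)]


def route(address, chain=()):
    """Return the list of final destinations for one address.

    chain holds (ancestor address, router that handled it) pairs for this
    one chain of expansions.
    """
    for name, router in ROUTERS:
        if (address, name) in chain:
            continue  # same address, same router as an ancestor: skip it
        result = router(address)
        if result is None:
            continue
        kind, value = result
        if kind == "deliver":
            return [value]
        destinations = []
        for item in value:
            if item.startswith("/"):
                destinations.append(item)  # file delivery, not re-routed
            else:
                destinations += route(item, chain + ((address, name),))
        return destinations
    return ["unrouteable: " + address]


print(route("cks"))
```

In this model the first pass expands the .forward, the re-routed 'cks' skips userforward and falls through to localuser, and the result is a delivery to /var/mail/cks plus the archive file.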

This skipping of routers has to happen separately for each top level destination. If an email message is sent both directly to cks and to sysadmins, an alias that cks is on, you don't want cks to have one copy of the message handled by his .forward and another copy wind up in /var/mail/cks. Also, this skipping of routers is completely separate from how Exim merges several copies of the same destination together and does only a single delivery to each unique destination (so that in this case cks's .forward will handle only one copy of the message).

(In fact the check has to be separate for each chain of address expansion. We need to be sure that this skipping is only triggered for genuinely recursive addresses and routers.)
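
To see why the bookkeeping has to be per chain, here is a toy Python comparison of per-chain tracking against a deliberately naive variant that shares one 'already handled' record across the whole message. Both are illustrations only, not Exim's implementation, and the .forward contents and path are hypothetical.

```python
# Toy comparison: per-chain tracking versus one shared record for the
# whole message. Not Exim code; the data below is made up.
ALIASES = {"sysadmins": ["cks"]}
FORWARDS = {"cks": ["/u/cks/archive"]}  # hypothetical: file everything

ROUTERS = [("aliases", ALIASES), ("userforward", FORWARDS)]


def route(address, seen, shared):
    """Return the destinations for one address.

    seen is a set of (address, router) pairs. With shared=False each
    chain of expansions gets its own copy; with shared=True one set is
    mutated for the whole message (the naive variant).
    """
    for name, table in ROUTERS:
        if (address, name) in seen:
            continue  # treated as already handled: skip this router
        if address not in table:
            continue
        if shared:
            seen.add((address, name))
            child_seen = seen
        else:
            child_seen = seen | {(address, name)}
        destinations = []
        for item in table[address]:
            if item.startswith("/"):
                destinations.append(item)
            else:
                destinations += route(item, child_seen, shared)
        return destinations
    return ["/var/mail/" + address]


def deliver(recipients, shared):
    """Route every top level address, then merge duplicate destinations."""
    message_wide = set()
    destinations = []
    for recipient in recipients:
        start = message_wide if shared else set()
        destinations += route(recipient, start, shared)
    return sorted(set(destinations))


print(deliver(["cks", "sysadmins"], shared=False))
print(deliver(["cks", "sysadmins"], shared=True))
```

With per-chain tracking both copies go through the .forward and are then merged into one delivery. With the shared record, the copy that arrives via sysadmins wrongly skips the .forward and an extra copy lands in /var/mail/cks.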

In theory this skipping of routers applies to any type of router. In practice only a few of Exim's various types of routers can replace addresses with new addresses and so can possibly trigger this; most of the routers simply give destinations for addresses. At the same time, nothing restricts this to only happening to your router for .forwards; for example, an accidental alias loop will cause the alias handling router to be skipped in a similar way and the results there could be a lot more peculiar (I suspect that one common result would be a 'no such user' error in addition to the message getting delivered to everyone on the alias).

One corollary of all of this is that it's potentially dangerous to create an address-expanding router that returns different results depending on stuff that can change during address routing; for example, a router that returns a different expansion based on the envelope sender address. Such a router won't get invoked a second time on the same address in a recursive situation, even if it would have returned a different, non-looping result. In its loop-breaking behavior, Exim implicitly assumes that every router returns the same thing when recursively invoked on the same address.

(Exim does not literally memoize the result of evaluating the router for a given address, although it does cache and memoize the result of a lot of lookups that routers do.)

Sidebar: one way to get an alias loop

Suppose that you have a generic group alias, and a member of the group is going to be away. They think 'I know, I'll forward my email to the generic group alias to make sure that things get handled even if people email me directly'. The pernicious thing is that this appears to work if they test by mailing themselves, because then it's a .forward loop; the incoming mail goes .forward → alias → .forward, the .forward is skipped the second time around, and it all looks good. Only when the group alias is emailed directly does it become an alias loop (going alias → .forward → alias). Pick a rarely used group alias and it could be a while before this blows up.

PS: if you want to catch this in an Exim configuration, I think what you want is a second router that applies to all aliases and just errors them out with 'alias loop detected'. Assuming that both routers accept the exact same set of addresses in the same situations, the only time this second alias-handling router can trigger is if the first one is skipped for some reason, and generally the only way that that can happen is in the situation above, ie there's a loop.

(Disclaimer: I just came up with this idea and haven't actually tested it.)
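
For what it's worth, the loop and the proposed second router can be played out in a toy Python model. This does not test the idea against real Exim; it only shows the logic under the simplified skipping rule described in this entry. The addresses (helpdesk, alice, bob) and the router name alias_loop are hypothetical.

```python
# Toy model of an alias loop and a 'loop catcher' router; not Exim code.
# All addresses here are hypothetical.
ALIASES = {"helpdesk": ["alice", "bob"]}
FORWARDS = {"alice": ["helpdesk"]}  # alice is away
USERS = {"alice", "bob"}


def aliases(address):
    if address in ALIASES:
        return ("redirect", ALIASES[address])
    return None


def alias_loop(address):
    # Accepts exactly what 'aliases' accepts, so it can only run when
    # 'aliases' has been skipped for this address.
    if address in ALIASES:
        return ("fail", "alias loop detected: " + address)
    return None


def userforward(address):
    if address in FORWARDS:
        return ("redirect", FORWARDS[address])
    return None


def localuser(address):
    if address in USERS:
        return ("deliver", "/var/mail/" + address)
    return None


def route(address, routers, chain=()):
    """Return the outcomes (deliveries or errors) for one address."""
    for router in routers:
        name = router.__name__
        if (address, name) in chain:
            continue  # this router already handled an identical ancestor
        result = router(address)
        if result is None:
            continue
        kind, value = result
        if kind != "redirect":
            return [value]  # a delivery or an error message
        outcomes = []
        for child in value:
            outcomes += route(child, routers, chain + ((address, name),))
        return outcomes
    return ["no such user: " + address]


PLAIN = [aliases, userforward, localuser]
GUARDED = [aliases, alias_loop, userforward, localuser]

print(route("alice", PLAIN))       # the .forward loop looks fine
print(route("helpdesk", PLAIN))    # the alias loop does not
print(route("helpdesk", GUARDED))  # the second router reports the loop
```

In this model, mail to alice is delivered to both mailboxes, mail to helpdesk produces a 'no such user' result for the looped address, and with the extra router that result becomes an explicit 'alias loop detected' error. Mail to alice is unaffected by the extra router, since the alias router is never skipped in that case.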

sysadmin/EximForwardHow written at 00:49:29

This is part of CSpace, and is written by ChrisSiebenmann.
Twitter: @thatcks

* * *

Categories: links, linux, programming, python, snark, solaris, spam, sysadmin, tech, unix, web

This is a DWiki.
GettingAround
(Help)

Search:
By day for July 2011: 2 3 4 5 6 7 8 10 11 12 14 15 17 18 20 22 23 24 25 26 27 28 29 31; before July; after July.

Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.