Wandering Thoughts archives

2025-02-11

A surprise with rspamd's spam scoring and a workaround

Over on the Fediverse, I shared a discovery:

This is my face when rspamd will apparently pattern-match a mention of 'test@test' in the body of an email, extract 'test', try that against the multi.surbl.org DNS blocklist (which includes it), and decide that incoming email is spam as a result.

Although I didn't mention it in the post, I assume that rspamd's goal is to extract the domain from email addresses and see if the domain is 'bad'. This handles a not uncommon pattern of spammer behavior where they send email from a throwaway setup but direct your further email to their long term address. One sees similar things with URLs, and I believe that rspamd will extract domains from URLs in messages as well.
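To make the assumed behavior concrete, here is a minimal Python sketch of that kind of extraction. It is not rspamd's actual code: the regular expression, the function name `surbl_query_names`, and the choice to accept a dot-less 'domain' are all assumptions for illustration. The one piece that is standard is the shape of a domain-based DNS blocklist query, where the extracted name is prepended to the list's zone.

```python
import re

# Hypothetical pattern: a local part, '@', then a "domain" that is allowed
# to have no dots at all (which is what lets 'test@test' through).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*)")


def surbl_query_names(text, zone="multi.surbl.org"):
    """Return the DNS names a naive checker would look up for addresses in text."""
    names = []
    for match in EMAIL_RE.finditer(text):
        domain = match.group(1).lower()
        name = f"{domain}.{zone}"
        if name not in names:  # look each distinct name up only once
            names.append(name)
    return names


print(surbl_query_names("Log in as test@test, or write to help@example.com."))
```

A real implementation would presumably do more work than this, such as reducing hostnames to their registered domain before the lookup, but the sketch shows how a bare 'test' can end up being queried.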

(Rspamd is what we currently use for scoring email for spam, for various reasons beyond the scope of this entry.)

The sign of this problem was message summary lines in the rspamd log that included annotations like the following (with a line split and spacing added for clarity):

[...] MW_SURBL_MULTI(7.50){test:email;},
PH_SURBL_MULTI(5.00){test:email;} [...]

As I understand it, the 'test:email' bit means that the thing being looked up in multi.surbl.org was 'test' and that it came from the email message (I don't know whether that specifically means the body of the message or whether it could also have been in the headers). SURBL reasonably lists 'test', presumably for testing purposes, much as many IP-based DNSBLs list various 127.0.0.* IPs. Extracting a dot-less 'domain' from a plain text email message is a bit aggressive, but we get the rspamd that we get.
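If you want to hunt for these annotations in your own logs, a small Python sketch along the following lines can pull them apart. It assumes the annotations always look like the two shown above, a symbol name, a score in parentheses, and semicolon-terminated options in braces. The regular expression and the `parse_symbols` name are made up for this illustration and are not part of rspamd.

```python
import re

# SYMBOL(score){option;option;}, with the braces part treated as optional.
SYMBOL_RE = re.compile(r"([A-Z0-9_]+)\((-?\d+(?:\.\d+)?)\)(?:\{([^}]*)\})?")


def parse_symbols(summary):
    """Split a log summary fragment into (symbol, score, options) tuples."""
    results = []
    for name, score, opts in SYMBOL_RE.findall(summary):
        # Options are ';'-terminated, so drop the empty trailing piece.
        options = [o for o in opts.split(";") if o]
        results.append((name, float(score), options))
    return results


line = "MW_SURBL_MULTI(7.50){test:email;}, PH_SURBL_MULTI(5.00){test:email;}"
for name, score, options in parse_symbols(line):
    print(name, score, options)
```

Filtering the results for symbols with 'SURBL' in the name and an option starting with 'test:' would then flag messages that hit this particular problem.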

(You might wonder where 'test@test' comes from; the answer is that in Toronto it's a special DSL realm that's potentially useful for troubleshooting your DSL.)

Fortunately rspamd allows exceptions. If your rspamd configuration directory is /etc/rspamd as normal, you can put a 'map' file of SURBL exceptions at /etc/rspamd/local.d/map.d/surbl-whitelist.inc.local. You can discover this location by reading modules.d/rbl.conf, which you can find by grep'ing the entire /etc/rspamd tree for 'surbl' (yes, sometimes I use brute force). The best documentation I could find on what you put into maps is "Maps content" in the multimap module documentation; the simple version is that you appear to put one domain per line, and comment lines starting with '#' are allowed.
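As an illustration of that simple version of the format, here is a minimal Python sketch that reads map-style text with one entry per line and '#' comment lines. The sample contents are an assumption (the obvious entry for this problem would be 'test', but that is a guess at what such a file would hold), and `parse_map` is a hypothetical helper. Rspamd's own map parser may well accept more than this.

```python
def parse_map(text):
    """Return the entries of a simple one-per-line map, skipping comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and '#' comment lines carry no entry
        entries.append(line.lower())
    return entries


# Assumed example contents for surbl-whitelist.inc.local.
SAMPLE = """\
# Local SURBL exceptions
# 'test' shows up in mentions of test@test
test
"""

print(parse_map(SAMPLE))
```

Running something like this over your exceptions file is a cheap way to check that it contains what you think it does before you rely on it.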

(As far as I could tell from our experience, rspamd noticed the existence of our new surbl-whitelist.inc.local file all on its own, with no restart or reload necessary.)

spam/RspamdAtTestSurprise written at 22:41:41;

