The risks of spam filtering (part 1)
While spam filtering is 'dangerous' in that it can trigger on legitimate email and incorrectly classify it as spam, how dangerous it is depends on what you do when the filter triggers. In increasing order of danger, there are three general things that people do:
- reject the email message during the SMTP conversation.
- discard the email message.
- bounce the email message back to the alleged sender.
The danger of the second option is obvious: the sender of a legitimate email message receives no indication that their email didn't reach the recipient. To the sender it looks just as if the recipient got it and is ignoring it, while to the recipient it looks like the sender never sent it in the first place.
The first and the third options both let senders of legitimate email know when their email didn't go through. The problem with the third option, and why it is the worst, is what happens with properly identified spam email. Most spam emails have forged sender information, which means that your mail server will be deluging innocent bystanders with what is effectively spam (to them); in the trade this is known as backscatter, and it makes people increasingly irate.
(Because of how spammers currently operate, rejecting email during the SMTP conversation is far less likely to generate backscatter, and if it does happen anyway it's not your fault, because it's not your machine that is sending the bounces.)
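To make the comparison concrete, here is a minimal sketch of the three choices as a small Python model. Everything in it is hypothetical and for illustration only: the names `Action`, `Outcome`, and `handle_flagged` are invented, the reply strings are placeholders, and the model assumes (as the text above does) that spam software does not generate bounces when it is rejected.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    REJECT = "reject during the SMTP conversation"
    DISCARD = "accept, then silently discard"
    BOUNCE = "accept, then bounce to the envelope sender"


@dataclass(frozen=True)
class Outcome:
    smtp_reply: str           # what our server tells the connecting machine
    we_generate_bounce: bool  # does OUR machine send a new message?
    real_sender_told: bool    # does a legitimate sender find out?
    backscatter: bool         # does an innocent bystander get our bounce?


def handle_flagged(action: Action, sender_forged: bool) -> Outcome:
    """Model what happens to a message the filter has flagged as spam.

    sender_forged=True stands for typical spam; sender_forged=False stands
    for a legitimate message that was misclassified.
    """
    if action is Action.REJECT:
        # The connecting machine keeps responsibility for the message. A real
        # mail system reports the failure to its own user; spam software is
        # assumed to just move on.
        return Outcome("550 rejected as spam", False, not sender_forged, False)
    if action is Action.DISCARD:
        # We claim success and then throw the message away: nobody is told.
        return Outcome("250 OK", False, False, False)
    if action is Action.BOUNCE:
        # We accept, then mail the envelope sender. If that address was
        # forged, our bounce lands on a bystander.
        return Outcome("250 OK", True, not sender_forged, sender_forged)
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    for action in Action:
        for forged in (False, True):
            print(action.name, "forged" if forged else "legitimate",
                  handle_flagged(action, forged))
```

In this model, only bouncing ever produces backscatter and only discarding leaves a legitimate sender uninformed, which is the ordering argued for above.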
Some spam filtering techniques don't explicitly reject email messages during SMTP conversations, but have a failure mode where your mail system never actually accepts the email and the sender's mail system eventually gives up on the message; the most well-known technique that can do this is greylisting. This is equivalent to rejecting the email during the SMTP conversation and has the same effects; if the sender is legitimate, they'll get a message that their email didn't go through, and if it's a spammer the message will probably just silently disappear.
(This is not unique to spam filtering; because modern mail systems insist that the domain of the sender address actually exists, persistent DNS issues can cause a similar 'defer until the sending machine times out the message' failure.)
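The greylisting failure mode mentioned above can be sketched in a few lines. This is a simplified illustration, not how any particular greylisting implementation works: the `Greylist` class, its 300-second delay, and the reply strings are all assumptions made for the example, and times are passed in explicitly so the behavior is easy to follow.

```python
class Greylist:
    """Toy greylister keyed on (client IP, envelope sender, envelope recipient)."""

    def __init__(self, min_delay: float = 300.0):
        self.min_delay = min_delay  # seconds a new triplet must wait
        self.first_seen = {}        # triplet -> time of first attempt

    def check(self, client_ip: str, mail_from: str, rcpt_to: str, now: float) -> str:
        key = (client_ip, mail_from.lower(), rcpt_to.lower())
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 OK"
        # Temporary failure: a well-behaved sender queues and retries later.
        return "451 greylisted, please try again later"


if __name__ == "__main__":
    sender, rcpt = "alice@example.org", "bob@example.net"

    # A sender that retries from the same IP eventually gets through.
    gl = Greylist()
    print(gl.check("192.0.2.1", sender, rcpt, now=0))
    print(gl.check("192.0.2.1", sender, rcpt, now=600))

    # A sender whose retries each come from a different IP always looks new,
    # so it is deferred every time until its own mail system gives up.
    gl = Greylist()
    for attempt, ip in enumerate(["192.0.2.10", "192.0.2.11", "192.0.2.12"]):
        print(gl.check(ip, sender, rcpt, now=attempt * 600))
```

The second case is the one that matters here: the message is never explicitly rejected, but it is never accepted either, so the outcome for the sender is the same as a rejection during the SMTP conversation.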