== Some stats and notes on relay attempts for our external mail gateway

After discovering [[something attempting some open relay checks OpenRelayChecks]], I got curious about whether this was a one-off or if there were clear signs of other open relay checks. To give you a spoiler, the answer is that I can't completely tell because there is a bunch of noise in my data (and on top of that I'm not sure how to analyze it), but it seems possible.

What I can easily get from Exim's logs is triples of IP address, _MAIL FROM_, and _RCPT TO_ for rejected relay attempts. I have no good way to reconstruct these into sessions, so I can't easily tell someone connecting five times and making a single relay attempt each time apart from someone connecting once and trying a whole series of _RCPT TO_s.

(I admit that somewhere around here it becomes very tempting to pour all of this data into SQLite and start doing ad hoc queries, because I could really use some _GROUP BY_ clauses right now.)

My raw data covers about 90 days of logs and has 18,290 such triples. These relay attempts come from 1880 different source IPs; out of these, 540 IPs only occur once (so they connected, did a _MAIL FROM_ and a _RCPT TO_, got a failure, and gave up). Almost all of the origin/destination address pairs are unique (the big exception is _test@live.com_ and its Yahoo destination), but there is a little bit of duplication in _RCPT TO_ addresses (and almost none in _MAIL FROM_s). At a minimum there appears to be some well-written spam software that immediately gives up if it gets a relaying denied message, rather than trying multiple _RCPT TO_s.

The most active source IPs used multiple _MAIL FROM_s. For example, the single most active source IP used 23 different _MAIL FROM_s, almost all of them with multiple _RCPT TO_s. I take these to be genuine attempts to use us as a relay, made without particularly noticing (or caring) that none of them work.

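For illustration, here is a minimal sketch of what the SQLite approach might look like for the per-IP numbers above. It uses Python's standard `sqlite3` module. The table name `attempts`, its column names, and the sample triples are all made up for the example; parsing the real Exim log lines into triples is not shown.

```python
import sqlite3

# Hypothetical sample triples: (source IP, MAIL FROM, RCPT TO).
# Real triples would come from parsing Exim's logs, which is not shown here.
TRIPLES = [
    ("192.0.2.1", "a@example.com", "x@example.org"),
    ("192.0.2.1", "b@example.com", "y@example.org"),
    ("192.0.2.1", "b@example.com", "z@example.org"),
    ("198.51.100.7", "test@example.net", "q@example.org"),
    ("203.0.113.9", "root@example.com", "probe@example.org"),
    ("203.0.113.9", "admin@example.com", "probe@example.org"),
]


def load(triples):
    """Load triples into an in-memory SQLite table (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE attempts (ip TEXT, mail_from TEXT, rcpt_to TEXT)")
    db.executemany("INSERT INTO attempts VALUES (?, ?, ?)", triples)
    return db


def per_ip_summary(db):
    """Per source IP: total attempts, distinct senders, distinct recipients."""
    return db.execute(
        """SELECT ip,
                  COUNT(*) AS attempts,
                  COUNT(DISTINCT mail_from) AS senders,
                  COUNT(DISTINCT rcpt_to) AS recipients
           FROM attempts
           GROUP BY ip
           ORDER BY attempts DESC, ip"""
    ).fetchall()


def single_attempt_ips(db):
    """Count the source IPs that appear in exactly one triple."""
    row = db.execute(
        """SELECT COUNT(*) FROM
           (SELECT ip FROM attempts GROUP BY ip HAVING COUNT(*) = 1)"""
    ).fetchone()
    return row[0]


db = load(TRIPLES)
summary = per_ip_summary(db)
once = single_attempt_ips(db)
for row in summary:
    print(row)
print("IPs seen only once:", once)
```

An IP with many senders but a single recipient (like the last sample IP) is the sort of pattern that might be a relay probe. This still can't reconstruct sessions, since the triples carry no connection information.
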
A few IP addresses tried repeatedly to forge valid local addresses as the _MAIL FROM_s on their relay attempts, perhaps in an attempt to increase the odds that we'd allow them through; the addresses were all administrative ones like _root_, _info_, _admin_, and so on. It's possible that these were relay probes, because they all seem to have had _RCPT TO_s of the same addresses (eg, one IP would try a whole bunch of different local _MAIL FROM_s, all _RCPT TO_'ing the same remote address). A few people tried the null sender as a _MAIL FROM_.

(From [[previous stats CSLabRejectionStats-2011-04-26]] I know that spammers forge a lot of bad local usernames on their _MAIL FROM_s, although that may not be for relay attempts.)

The top destination domains are mostly Asian. Counting only unique would-be recipients (of which there were 17500), the top five domains are:

| 1806 | yahoo.co.jp
| 1435 | hanta.co.kr
| 395 | yahoo.com.tw
| 271 | gmail.com
| 264 | ezweb.ne.jp

There were 3104 unique senders and their top five origin domains look sort of similar, but much more evenly distributed:

| 255 | yahoo.co.jp
| 202 | yahoo.com
| 160 | ezweb.ne.jp
| 158 | hotmail.com
| 155 | docomo.ne.jp

I think that this is as much random bits and pieces as I want to throw out right now. Part of my problem is that I'm not sure what useful or interesting statistics I can generate from this data, although it feels like there should be something interesting there.
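
As a rough sketch of how per-domain counts like the ones in the tables above could be produced, the following Python counts domains over unique addresses only. The function name `top_domains` and the sample addresses are hypothetical. Lowercasing whole addresses before deduplicating is an assumption of this sketch, not necessarily how the numbers above were computed.

```python
from collections import Counter


def top_domains(addresses, n=5):
    """Return the n most common domains among unique addresses.

    Assumption: addresses are compared case-insensitively, and anything
    without an '@' (such as the null sender) is skipped.
    """
    unique = {a.strip().lower() for a in addresses if "@" in a}
    counts = Counter(a.rsplit("@", 1)[1] for a in unique)
    # Sort by count (descending), then domain name, so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


# Hypothetical sample recipients, including a duplicate and a null sender.
recipients = [
    "a@yahoo.co.jp",
    "A@yahoo.co.jp",
    "b@yahoo.co.jp",
    "c@gmail.com",
    "c@gmail.com",
    "d@ezweb.ne.jp",
    "<>",
]
top = top_domains(recipients, n=2)
for domain, count in top:
    print(count, domain)
```

Deduplicating first matters here: counting raw triples instead would weight a domain by how often the same address was retried, not by how many distinct recipients it had.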