Wandering Thoughts: Recent Entries

2012-01-29

Thinking about spam rejection and abuse addresses

Somewhat recently we got a spate of spam messages to our abuse address, which set me to thinking about the mostly theoretical issue of how to treat email to it.

(It's a mostly theoretical issue for us because the volume of spam and other email to our abuse address is very low in general, so we're not at all likely to change anything about it.)

On the one hand, visible spam rejection of email to abuse addresses is one of the things that really gets on people's nerves; it's famous for rejecting real spam complaints because, of course, they contain spam. Your spam, that people are trying to complain about.

On the other hand, email to abuse is going to go through our spam scoring system and get tagged if the system thinks it's spam. Pretty much everyone here either discards spam-tagged email outright or filters it to a separate folder. My mail filtering deliberately excludes email to abuse (among a few other things), but I don't know if anyone else either bothered or even thought of it; it's not necessarily something that comes to mind when you're setting up personal email filtering.
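
As a rough illustration, an exemption like this could be sketched in Python as follows. The `X-Spam-Flag` header name, the `abuse@example.org` address, and the `choose_folder` function are all assumptions made for the example; they are not details of the scoring system or filters described here.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical values, for illustration only.
EXEMPT_ADDRESSES = {"abuse@example.org"}
SPAM_HEADER = "X-Spam-Flag"


def choose_folder(raw_message: str) -> str:
    """Return 'inbox' or 'spam' for one message."""
    msg = message_from_string(raw_message)
    recipients = {
        addr.lower()
        for _, addr in getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    }
    # Mail to an exempt address is never filtered, whatever its spam tag says.
    if recipients & EXEMPT_ADDRESSES:
        return "inbox"
    # Everything else follows the spam tag added by the scoring system.
    if (msg.get(SPAM_HEADER) or "").strip().upper() == "YES":
        return "spam"
    return "inbox"
```

The exemption is checked before the spam tag, so a real complaint that quotes the spam it is complaining about still lands somewhere a person will see it.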

And finally, I can't think of any actual real email to our abuse address that we've gotten in the last five years or so (since I moved here). It's all been spam. So as a practical matter, any filtering or rejection that we do on abuse email is unlikely to affect real complaints, because we don't get real complaints (hopefully because our users and machines don't generate spam, as opposed to people just not complaining about it).

(The other aspect of email to our abuse address is that I suspect most people are going to complain to the central university-wide abuse address instead of abuse at our specific subdomain. The central people will then get in touch with us through our internal contact address, not our abuse address.)

This is of course a specific instance of the general spam rejection versus spam filtering dilemma. If you reject email, people at least know; if you filter, there's at least a theoretical chance that you'll recover from filtering mistakes. The stakes are higher for the abuse address because it is one of the addresses that has a very high chance of false positives (non-spam classified as spam).

The most pragmatic thing to do in a situation like this is to apply spam-filtering to your abuse address. This blackholes real spam to keep it from bothering people while carefully not saying anything to real senders who had their messages misclassified. But this pragmatism sort of bothers me because it's lying to real senders just to pacify them (their email is being ignored either way but you're deliberately doing it silently so they don't know). It would be more honest to use spam rejection on the abuse address, and it might do some good to reduce the level of spam. If legitimate email to your abuse address really is vanishingly rare, it also shouldn't affect very many people.

So what's the right answer? I have no idea.

(My current approach of exempting the abuse address from my personal filtering would not be viable if it got a lot of spam. At that point I would probably remove the exemption and let spam-tagged email to the abuse address get quietly filtered away, mostly because it's easier than trying to persuade everyone that maybe we should do spam rejection for email to abuse.)

AbuseRejection written at 02:24:38

2012-01-08

The latest annoyance with Google Groups

The Google Groups spam attempts continue to roll in. This recently led me to yet another unpleasant discovery about Google Groups: as far as I can tell, there is no way to unsubscribe from a Google Groups mailing list (at least through the website). At least there's no way from the outside; it might be possible if you made a Google Groups account for yourself under the email address that is being spammed. For all of the obvious reasons I have no interest in doing that.

At this point I really don't know whether Google is evil or merely indifferent, and it doesn't really matter which. Providing people with no way to unsubscribe is yet another total failure of anything approaching responsible mailing list management. It's also a complete spammer (non-)feature; spammers never bother to implement unsubscription because of course they have no interest in ever seeing addresses go away.

Sidebar: how the spamming appears to work

I did some Groups searching using message IDs that I had, and it appears that the spammer is slightly sophisticated in their use of Groups. They have a 'distribution' mailing list, which is what I and more than 20,000 other people are on, and then they have a series of small, low-activity 'feeder' lists, which the main list is subscribed to. They send the spam to today's feeder list, the feeder list passes it to the main list, and then the main list spams us. I suspect that the feeder lists are all owned by different Google Groups identities than the main list.

This is an obvious exploit of automated anti-spam systems. From the perspective of a dumb system, clearly the problem is today's feeder list; it must have been set up by a spammer who is exploiting a legitimate main list. So eventually the system flags or does whatever to the feeder list but leaves the (apparently) innocently exploited main list alone, which just causes the spammer to make a new feeder list. This may seem stupid, but you can see why doing this the other way around would allow spammers to exploit an automated system to close down mailing lists that they don't like. Of course the real problem here is the automated abuse handling system, because you can't handle abuse reports entirely with automation.

GoogleGroupsNoUnsub written at 01:29:39

2011-12-29

An empirical exploration of whether spammers take Christmas off

There's a vague mythology that spammers take holidays off, or at least some spammers do. I decided I was curious enough to see if this was true, at least as far as our own data goes, so I looked at our filtering system's statistics. This actually gets into interesting issues, because Christmas fell on a weekend this year plus you'd expect some degree of day to day fluctuation.

(For bonus complication, a month ago includes American Thanksgiving, which spammers might also partially take off.)

The short answer is that spammers do not appear to take Christmas off, although they may take weekends off. Neither the 24th nor the 25th had the lowest spam levels for Saturdays and Sundays since November 20th, although both had relatively low levels.

  • Saturdays: 6770, 7680, 6210, 7110, and 6640 spam messages.
  • Sundays: 5810, 6300, 5970, 6680, 7400, and 5990 spam messages.

Weekends are slower than weekdays in general; weekdays average 7910 spam messages per day while weekends average 6600 per day (Saturday's average is higher than Sunday's by 500 messages or so, at least over my sample range).
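
The weekend figures can be rechecked from the two lists above with a few lines of Python. This is only a sketch: it assumes each list is in date order, so that the last entry in each is the Christmas weekend, and it cannot reproduce the weekday average because the per-day weekday counts are not listed here.

```python
from statistics import mean

# Counts copied from the lists above, assumed to be in date order
# (so the last entry in each list is the Christmas weekend).
saturdays = [6770, 7680, 6210, 7110, 6640]
sundays = [5810, 6300, 5970, 6680, 7400, 5990]

saturday_avg = mean(saturdays)
sunday_avg = mean(sundays)
weekend_avg = mean(saturdays + sundays)

christmas_eve, christmas_day = saturdays[-1], sundays[-1]

print(f"Saturday average: {saturday_avg:.0f}")
print(f"Sunday average:   {sunday_avg:.0f}")
print(f"Weekend average:  {weekend_avg:.0f}")
print("Dec 24 was the lowest Saturday:", christmas_eve == min(saturdays))
print("Dec 25 was the lowest Sunday:  ", christmas_day == min(sundays))
```

Under that ordering assumption, both Christmas days come out below their day-of-week average but above the lowest day in the sample, which matches the conclusion above.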

I admit that I'm surprised by this result, especially by the difference between the short term and the long term perspectives. If I had just looked back at the previous week I would have confidently said that spammers took the Christmas weekend off, but looking back further strongly suggests that it's just regular fluctuation.

All sorts of cautions apply here. Our numbers are small and thus potentially noisy. In addition, spam volume changes over time so going back too far has dangers and it's going to be hard to tell fluctuations in weekday to weekday volume from overall volume changes. There are probably statistical techniques to get useful information despite all of this, but I don't know of them (I don't even know enough gnuplot to throw the raw data into a plot to see if any patterns jump out).

(Gnuplot is one of those things that I really should learn sometime but I keep never getting to it.)

ChristmasSpamLevels written at 01:14:03

2011-12-19

An advance fee fraud spam aphorism

Here's an aphorism:

Any tragedy or political turmoil will be immediately seized on by advance fee fraud spammers as part of their come-on messages.

Be it the invasion of Iraq, a tragic tsunami, a big airplane crash, or the overthrow of the despotic Libyan government, you can be confident that soon your email will have messages from, say, the relative of a former regime member who needs your help to get some money out of the country.

This is a fine aphorism except that once I started actually looking at it, it looked less and less true. While it's true that this sort of thing happens quite often in reaction to world events, not any old tragedy and turmoil will do. For example, take Japan's Sendai quake and the subsequent Fukushima nuclear events. Under normal circumstances, this would be advance fee fraud gold; you could use it to spin all sorts of tragic tales to hook in marks. But I don't think I've seen a single English language spam that talks about it. My guess as to what makes the Japanese tragedy bad for advance fee fraud spammers is pretty simple: Japan is a prosperous first world country. It's pretty implausible that a person in Japan would need to reach out to someone outside their country for help, implausible enough to make potential suckers wake up and realize something's wrong.

Another recent tragedy that I haven't seen show up in advance fee fraud spam (at least not yet) is the flooding in Thailand. My theory here is that these floods are not big and well enough known to make good advance fee fraud bait. If your targets have never heard of the tragedy you're trying to exploit, it's not really helping to engage their sympathies and their belief.

So for the purposes of the aphorism, it's more that any sufficiently large third world tragedy or turmoil will be seized on for advance fee fraud come-ons. But, as usual, that makes the aphorism less sharp.

(On that note, I wonder if we're about to get a rush of North Korean related advance fee fraud spam in the wake of Kim Jong-il's death.)

AdvancedFeeEvents written at 00:25:50

2011-11-20

Google Groups fails both anti-spam and basic mailing list management

Back in September I wrote about a spammer who was using Google Groups to send out their spam. Google Groups let them add one of my addresses to a very large list and then was perfectly happy to be used to broadcast their spam messages through the list. Well. Guess what. The spam is still happening; generally two messages a day, or at least two delivery attempts a day. This is what you could politely call a total failure on Google's part from two perspectives, because none of these delivery attempts have ever succeeded.

None. Zero. That's right; for months, my system has been rejecting at SMTP time every delivery attempt from this Google Groups mailing list. This is now far more than allowing a spammer to use their services to email me (and others); it is now clear that Google Groups fails basic mailing list management practices. One very fundamental basic practice of mailing lists is you stop mailing addresses that bounce, most especially if these addresses have never had any successful deliveries, ever, which is the case here.

Ignoring rejections and bounces is not a sign of good mailing list management software, given that automatic handling of bounces and automatic unsubscription of bouncing addresses has been a basic feature of mailing list software for somewhere around a decade or more. However, it is a red letter sign of spammer mailing list software.
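
To make the point concrete, here is a minimal sketch of the sort of bounce handling being described. It is not taken from any real mailing list package; the class names, the `record_attempt` method, and both thresholds are hypothetical choices made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds; real list software picks its own policy.
MAX_FAILURES = 5          # for addresses that have accepted mail before
MAX_FAILURES_NEVER_OK = 2  # for addresses that have never accepted anything


@dataclass
class Subscriber:
    address: str
    delivered_ever: bool = False
    consecutive_failures: int = 0


@dataclass
class MailingList:
    subscribers: dict = field(default_factory=dict)

    def subscribe(self, address: str) -> None:
        self.subscribers[address] = Subscriber(address)

    def record_attempt(self, address: str, permanent_failure: bool) -> None:
        """Update one subscriber after a delivery attempt."""
        sub = self.subscribers.get(address)
        if sub is None:
            return
        if not permanent_failure:
            # A successful delivery resets the failure count.
            sub.delivered_ever = True
            sub.consecutive_failures = 0
            return
        sub.consecutive_failures += 1
        # Addresses that have never accepted a message get less patience.
        limit = MAX_FAILURES if sub.delivered_ever else MAX_FAILURES_NEVER_OK
        if sub.consecutive_failures >= limit:
            del self.subscribers[address]
```

With a policy anything like this, an address that rejects every delivery attempt is dropped after a handful of messages instead of being mailed for months.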

I am not sure I believe that Google has consciously turned Google Groups into a tool for spammers; I would certainly like to think better of Google and it seems hardly worth Google's time, all things considered. But the alternative is to conclude that Google Groups is written and run by people who are incompetent (on one level or another), and that seems equally uncharacteristic of Google.

GoogleGroupsFails written at 01:09:51

2011-11-02

Attention marketers: blog comments are not email

This is another one of those entries that I shouldn't have to write and that will never be read by the people who need to read it, but I'm going to tilt at windmills today. I am doing so because today, someone left a more or less marketing comment on a random entry here in an attempt to get in touch with me.

Let's skip the whole marketing side of this and go straight to the problem, which is that blog comments are not email. I don't mean that in just a direct sense; I mean that they have completely different purposes. Email is a private message to a single person. At least in theory, a blog comment is part of a public conversation that is related to the entry that it is attached to (it is a reaction, a comment, or so on).

(This distinction matters to me because I want other people who read the comments here to get something of value from doing so. That's really the point of having public comments instead of encouraging people to email me; the resulting conversation and notes are visible in public for other people.)

A comment that is not part of this public conversation is noise, not signal, and simply by existing it invites people to waste their time by reading it. Whether or not they realize it, people who leave 'comments as email' comments are adding noise to your blog; they are willing to inconvenience all of your readers in order to get your attention.

Well, they got my attention, but probably not in the way they wanted. I do not like having noise added to my blog. When it happens, I do not think well of the people behind it, especially when they could just as well have sent email instead.

(It is my personal opinion that all of this is intuitively obvious, even if people cannot necessarily immediately explain why c-o-e comments are bad.)

(And of course I remove the comment just like I remove other forms of noise, such as spam. Note that a c-o-e comment is not necessarily comment spam as such, unless you have a very broad definition of 'comment spam'. This doesn't make it non-spam either; much like a similar email, it depends on whether or not it was done in bulk. Mind you, I suspect that often it will be.)

BlogCommentsVsEmail written at 01:41:24

2011-10-05

Understanding the motivations of mail service vendors

From a comment on my entry about how modern mailing services should work:

Like any paid service, if they are not providing good service, like letting spam through, people will move to another one who does block spam better. This is their incentive to deal with spam.

This is a common misunderstanding of the incentive structure in operation.

This is not an incentive to deal with spam, this is an incentive to not get blocked. The two are very much not the same thing. Very few people who use a mailing service care very much about whether or not it sends spam; instead, they care about whether it can deliver their email mailouts (and do so without having them scored as spam by the recipients). No one is actually paying the mail service to not deliver spam, so it has no direct incentive to do so; if anything the straightforward incentives go the other way, towards giving spammers the benefit of the doubt so that they keep paying.

(There is some indirect incentive, in that people sort of care about your overall reputation and your overall reputation is affected if websites have lots of stories about how you are a big spam source and have a spam problem.)

For many mail service providers, trying to block spam is one of the most cost effective ways of keeping your mail delivery rates up. However, if you are big enough you become like GMail, Hotmail, and Google Groups; you are too big and popular for anti-spam systems to block unless you become a truly epic emitter of spam. At this point providers lose much of their business incentive to block and reduce spam.

(Note, however, that every mail service provider does less than they could to block spam through their service because every MSP knows that if they adopted full best practices, they would go out of business because customers would find them unattractive. This should not be surprising.)

Nor do mail service providers have a real incentive to refuse to do business with all spammers. Some spammers make unattractive customers (they are unsavory, a big support burden, may attract legal attention, and are high risks to cheat you on payment). However, some spammers make very attractive customers; they pay on time and well, they don't cause you support headaches, they're reputable businesses, and so on. They just send UBE and want you to send it for them. See many examples (and also).

You may appeal to moral qualities as part of the 'incentive' that providers have to deal with spam, i.e. that they really intrinsically care about not sending spam and are not merely doing it because of business needs. However, mailing service providers (and their employees) have consciously chosen to enter a line of business that draws spammers to it and intrinsically enables spam. It's hard to avoid the conclusion that they consider some degree of spam a necessary evil in order to make money; from there, to misuse an alleged Winston Churchill witticism, it's only a question of how much spam.

MailerMotivations written at 02:05:44

2011-10-03

My idea of how a modern mailing service should work

From one perspective, I can totally understand why small companies want to outsource handling outgoing mail to a dedicated mail provider. The days when you could just install a MTA, plug in some settings, and be done are long over; these days doing a decent job of sending mail and getting it delivered to as many places as possible requires a significant amount of specialized expertise, and the expertise goes up if you want to use HTML mail. You could learn all of this, but why? It's better to outsource and let full-time specialists handle it for you.

On the other hand, as a sysadmin on the receiving end of these mail services I have some issues. Specifically, they get abused by spammers and they have a strong incentive to spend as little money as they can get away with on preventing this (money spent preventing spam is pure expense). On average, the only contact I have with a mailing service is being sent some form of spam (there are many mailing services and I don't sign up with very many places that use them).

Thus I have formed a theory about how such a modern mailing service should work: normally and by default it should proxy outgoing email through your server, using a dedicated proxy agent (not an MTA that you set up). All of the hard work would still be done by the mailing service on their machines and you would continue interacting with them as normal; it's just that the final delivery would emerge from your machine, on your IP address, instead of directly from one of their IP addresses.

The advantage for everyone is that this would make your mail unambiguously your mail, and avoid any contamination with other people who are also using the mailing service provider. The mailing service provider would effectively become less of a provider of mail and somewhat more a provider of mail handling software (and expertise), software that just happened to run on their servers as a service.

This clearly doesn't work for everyone in all situations, so the mailing service would still have an option to send out the mail for you. But I think that 'the mail comes out your IP address' should be the default starting case.

(Since this is the era of running companies out of AWS, it's possible that I'm drastically underestimating how many people would need the mailing service to send out email for them; maybe you simply can no longer assume that people have dedicated IP addresses in address space that hasn't been badly abused and contaminated.)

ModernMailingServiceIdea written at 01:28:37

2011-09-25

Danger signs for mail senders in SMTP conversations

This is another one of those entries that I write for people who are never going to read it, but I don't care; I just feel like pointing out the relatively obvious.

Suppose that you are someone who runs a mailing list service. Like everyone else who offers such a service, spammers will attempt to (ab)use it. Thus, one of the important things that you need to do is detect signs that you have a spammer's mailing list, and these days you certainly can't count on abuse complaints to tell you this.

As I've mentioned before, SMTP time rejections can be an important signal. The corollary of this is that the kind of SMTP rejection matters, and in particular you should really pay attention to MAIL FROM and DATA rejections and consider them a significant warning sign. This is because there are many fewer reasons for rejecting at those stages than for rejecting at RCPT TO time. If your mail is rejected at RCPT TO time, well, there's any number of explanations besides 'it's spam'; the user's account could have expired, for example.

(And, let us admit, a disturbingly large number of mail systems have temporary glitches that cause equally temporary RCPT TO failures. This is why real mailing list management software pretty much never automatically removes addresses on a single RCPT TO failure.)

Since they don't have these relatively innocent explanations, mail rejections at MAIL FROM or especially at DATA are often signs of something serious going on. In particular a permanent failure at DATA time almost invariably means that the recipient's system really dislikes the message for some reason; if you're running a mailing list service, the usual case is that it's spam. A MAIL FROM rejection can have more innocent explanations, including a misconfigured MTA on the other side, but it is still more of a danger sign than a RCPT TO rejection.

(A significant volume of RCPT TO failures is still a danger sign, in part because it means that either the list of addresses is old or that the mailing list was badly maintained before it moved to your service. And if a mailing list has a few good mail-outs and then suddenly its RCPT TO failures spike upwards significantly, well, that's a bad sign itself. It could be that a whole bunch of user accounts just coincidentally got expired or filled up, but it's more likely that a bunch of anti-spam systems that reject at RCPT TO time suddenly woke up.)
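
One way a mailing list service might encode this ranking is sketched below. The level names, the `danger_level` and `rcpt_failures_spiked` functions, and the spike factor are all hypothetical; the only thing taken from the discussion above is the ordering (DATA worst, then MAIL FROM, then RCPT TO) and the idea that a sudden jump in RCPT TO failures matters.

```python
from enum import Enum


class Stage(Enum):
    MAIL_FROM = "MAIL FROM"
    RCPT_TO = "RCPT TO"
    DATA = "DATA"


def danger_level(stage: Stage, reply_code: int) -> str:
    """Rate one SMTP reply as 'ignore', 'low', 'medium' or 'high'."""
    if not 500 <= reply_code <= 599:
        # Only 5xx replies are permanent rejections; 4xx ones are temporary.
        return "ignore"
    if stage is Stage.DATA:
        # The recipient saw the message and refused it.
        return "high"
    if stage is Stage.MAIL_FROM:
        # Could be a misconfigured MTA, but worse than a RCPT TO failure.
        return "medium"
    # RCPT TO rejections have many innocent explanations.
    return "low"


def rcpt_failures_spiked(previous_rates, latest_rate, factor=3.0):
    """True if the latest RCPT TO failure rate is well above the earlier average.

    Rates are fractions of recipients rejected per mail-out; the factor of 3
    is an arbitrary illustrative threshold.
    """
    if not previous_rates:
        return False
    baseline = sum(previous_rates) / len(previous_rates)
    return latest_rate > 0 and latest_rate > baseline * factor
```

A single 'low' result means little on its own, which is why the spike check looks at the failure rate across whole mail-outs instead of individual rejections.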

Of course, all of this presumes that you are trying hard to run a 'clean' mailing list service instead of any of the various alternatives. I'm not convinced that there is or can be any such thing these days, as convenient as it would be for modern web applications if there was.

SMTPDangerSigns written at 01:31:13

2011-09-24

Some recent Google spam problems

With my anti-spam hat on, I'm not a fan of Google. It's not that they're active spammers themselves, it's that their services are not infrequently used to send spam in various ways and Google is famously indifferent to it, or at least apparently indifferent, and has been for some time.

(I've felt for years that there was basically no point in trying to file any sort of anti-spam report with Google because it would just go into a black hole, assuming that they didn't reject it outright.)

Every so often spammers find a new way to exploit some Google property for spam purposes, and I get to be irritated at Google all over again. There have been two particularly noteworthy incidents relatively recently, one from phish spam and one from a spammer for hire. The phish spam issue is due to Google Docs, as opposed to any of Google's mail-sending properties. Google Docs allows you to make forms and then collect replies to them (in a Google Docs spreadsheet, apparently). Oh, and you can evidently supply a significant amount of styling to these forms if you want. You can guess how phish spammers can put this feature to use, especially since one of the perennial problems for phish spammers is hosting the phish form somewhere reliable and then collecting the results. Given that phish spammers continue to do this, I expect that Google Docs is also reliable in not closing them down.

The second and even more irritating recent spam incident is that a 'spam for hire' outfit in the Middle East appears to have worked out how to use Google Groups and other Google services to actually host their mail lists and do their spam mailouts. Google evidently allowed them to import a huge mailing list (over 20,000 addresses), did not make any attempt to confirm addresses, and now lets them send plenty of spam to and through it. Of course there is no 'report this mail as spam' link in the messages and I know better than to bother trying to find any abuse contact for Google Groups.

(I know how many addresses it has because the spam mailing list is a public Google Group so you can see its 'subscriber' count if you look it up. Since the spam list they put me on is helpfully called 'total005', I can also make some decent guesses at the existence of other ones.)

Now that one spammer has blazed this trail, I expect that I can look forward to blocking lots of other Google Groups in the future. (I block them on my end, not by using any Google feature. At this point I have no trust in anything Google Groups might offer to keep me from getting spammed.)

RecentGoogleSpam written at 00:35:11

This is part of CSpace, and is written by ChrisSiebenmann.
