Wandering Thoughts archives


A fundamental problem with challenge/response anti-spam systems

A fundamental problem with challenge/response (CR) systems is that they implicitly assume that your correspondents want you to read their email more than you do. Just look at who's doing the work: your correspondents are doing work so that you don't have to.

(Of course, observe that the people for whom this is the most true are the spammers.)

The cases where this is not true, where you want to read the mail more than the sender wants you to, are exactly where CR systems break down. You might think that this is rare, but it's actually quite common; mailing lists are the obvious example. (Trust me; most mailing list managers are completely indifferent about whether the mail reaches you.)

CR proponents like to claim that whitelisting will solve this problem. It doesn't. The fundamental problem of whitelisting is that what you want to whitelist are abstract identities, like 'my friend' or 'my bank', but the only things available are crude proxies like the email's origin address. And the relationship between the proxies and the real thing changes all the time.

This leads to an important rule:

People who use challenge/response systems should not expect anyone else to expend effort to get their email to them.

Or, the short version: your CR system dropping email is your problem, not mine.

(Disclaimer: of course, CR systems also have fundamental problems on the real Internet. There's no way to avoid spamming random bystanders with challenge messages, since almost all spam has forged origin addresses.)

spam/CRProblem written at 23:26:26

The importance of printable objects

I have a small defect in the Python code I produce: I rarely bother to make my classes printable or to give them a repr(). Most of the classes will never be printed, and the default repr value is good enough to distinguish two instances from each other.

But this is a mistake, nicely illustrated by my grump about assert's weakness as a debugging tool. Objects having a useful string value makes it much easier to dump out information about the state of things when a problem comes up. You can cope without it and I usually have, but it's working harder than you should have to.

While the convention of making the repr value something that can be used to reproduce the object is nice, don't let it stop you from having a repr value of some sort. You're not really losing anything when the alternative is a '<foo.bar object at 0xdeadbeef>' thing, although you probably should make sure that you can still tell apart two instances that happen to have identical values.
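As a rough illustration of the point above, here is a minimal sketch of a repr that shows an object's values while still including id() so that two instances with identical values can be told apart. The class name Point and its fields x and y are hypothetical, made up purely for this example.

```python
class Point(object):
    """Hypothetical example class; the name and fields are illustrative only."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Show the values, plus id() so that two instances that happen to
        # have identical values can still be distinguished in debug output.
        return "<%s.%s x=%r y=%r at 0x%x>" % (
            self.__class__.__module__,
            self.__class__.__name__,
            self.x,
            self.y,
            id(self),
        )


a = Point(1, 2)
b = Point(1, 2)
print(repr(a))
print(repr(b))
```

This repr cannot be pasted back in to recreate the object, but it is still far more useful in a debugging dump than the default.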

(You can do without this if your objects have both an equality operator and a hash operator. With just an equality operator, you may someday wind up trying to figure out why an object is not found as the key in a dictionary when you can see that it's right there darnit.)
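A minimal sketch of the 'both equality and hash' case, assuming a hypothetical Key class invented for this example. The important property is that the hash is computed from exactly the same values that equality compares, so two equal objects always land in the same dictionary slot. (In Python 3, a class that defines only `__eq__` has its `__hash__` set to None and so cannot be used as a dictionary key at all.)

```python
class Key(object):
    """Hypothetical value-like class; the name and fields are illustrative."""

    def __init__(self, name, number):
        self.name = name
        self.number = number

    def _values(self):
        # The single source of truth for both equality and hashing.
        return (self.name, self.number)

    def __eq__(self, other):
        if not isinstance(other, Key):
            return NotImplemented
        return self._values() == other._values()

    def __ne__(self, other):
        # Needed explicitly on Python 2; harmless on Python 3.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal objects must hash equally, so hash the compared values.
        return hash(self._values())

    def __repr__(self):
        return "Key(%r, %r)" % self._values()


table = {Key("a", 1): "found"}
# A distinct but equal instance finds the same dictionary entry.
lookup = table.get(Key("a", 1))
print(lookup)
```

Since the values feed the hash, objects like this should not have those values changed while they are in use as dictionary keys.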

The default repr function for instances is more or less equivalent to:

"<%s.%s object at 0x%x>" % (self.__class__.__module__, self.__class__.__name__, id(self))

The one wart is that id() should really return a Python long that's always positive, instead of an integer that's sometimes negative on 32-bit platforms. On 32-bit platforms you can mask id() with 0xffffffff to get the right value and avoid annoying warnings, but of course this blows up on 64-bit platforms.
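One possible way around the hard-coded mask, offered as a sketch rather than anything from the original text, is to derive the mask from the platform's pointer size. This assumes that id() is a memory address that fits in a native pointer, which is a CPython implementation detail; the helper names unsigned_id and default_style_repr are made up for this example.

```python
import struct

# Number of bits in a native pointer on this platform (typically 32 or 64).
POINTER_BITS = struct.calcsize("P") * 8
POINTER_MASK = (1 << POINTER_BITS) - 1


def unsigned_id(obj):
    """Return id(obj) folded into the non-negative pointer-sized range."""
    return id(obj) & POINTER_MASK


def default_style_repr(obj):
    """Approximate the default instance repr, using the masked id."""
    cls = obj.__class__
    return "<%s.%s object at 0x%x>" % (
        cls.__module__,
        cls.__name__,
        unsigned_id(obj),
    )


class Thing(object):
    """Hypothetical class used only to demonstrate the helpers."""


t = Thing()
print(default_style_repr(t))
```

On interpreters where id() is already non-negative, the mask leaves the value unchanged.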

python/PrintImportance written at 00:09:05


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.