Wandering Thoughts archives

2007-02-24

Thesis: any server push technology inevitably breeds spam

Consider various server push technologies, where things come to you instead of you having to seek them out: email, instant messaging, voice over IP phone service, and even text messaging on cell phones. All of them have spam problems, and those problems are generally growing.

This is not a coincidence. Any server push technology will get overrun by spammers, because server push inherently gives them access to people and is thus very, very attractive. As a consumer of server push technology, your only recourse from the onslaught is to hide, to block, to filter; you can't actually get away.

(The push technology provider can't keep all the spammers out, if only because sooner or later some of them are in its own marketing department.)

Client pull technology is much more resilient. The spammers have to be attractive to get you to visit even once, then genuinely interesting to keep you around, and you can easily get away. Thus it is a feature, not a problem, that things like syndication feeds do not have a server push option.

(And indeed much of the spammer activity in client pull technology like the web is about being attractive, for example getting a high Google search rank for some valuable keywords.)

spam/PushBreedsSpam written at 21:51:04;

The problem with browser minimum font size settings

Long ago, when I griped about Slashdot's redesign, Oscar del Rio left a comment suggesting that I use Firefox's minimum font size setting to cut this off. This isn't an approach that I like for a relatively simple reason: I'm willing to have some text set small, just not the main text.

I want websites to be able to set less important things in small font sizes, but I don't want them shrinking down the text I'm actually here to read. If I set Firefox's minimum font size up large enough that the main text is always readable, I completely take out the small font sizes on those less important things and they wind up too big.

(Yes, this is a hideously belated followup. Part of the fun of writing it was trying to figure out if I'd already written an entry talking about this; I ultimately resorted to trawling the archives with grep. Sometimes I am not the most organized blog-writing person in the world.)

web/MinimumFontSizeProblem written at 19:27:44;

Most world-editable wikis are doomed

The Linux iSCSI project keeps its documentation in a world-editable wiki. I should really say kept, because it's hard to find much usable documentation in the wiki at the moment; most of the pages are overgrown with wiki spam. Some pages have had a thousand edits in two days, all of them spam. All of this makes the project's wiki an unfortunately excellent illustration of why most open wikis are doomed.

The problem is that there are just more spammers out there automating their attacks than most wikis have people to fix the damage. Wikipedia survives because it has a critical mass of people who look after it, but it's an exception; very few wikis attract that many people. With a critical mass, you can block spammers and fix spam damage fast enough to discourage spammers and keep your wiki attractive; without it, you drown under a slowly rising tide of spam (and there is some evidence that existing spam attracts more spammers).

(It's not enough to have some dedicated people; you need to have enough that none of them have to spend too much time tending the wiki. Cleaning out spammers is drudge-work, and too much drudge-work burns people out.)

It's possible that the iSCSI wiki was hit so hard because it doesn't use rel="nofollow" on external links. On the other hand, there's a fair amount of evidence that spammers just don't care about that and will hit anything within reach. And open-edit wiki pages are eminently within reach.

I don't have any answers for how a new wiki is supposed to survive long enough to (potentially) get a critical mass of users, although I wish I did. I just know that I can't imagine running an open-edit wiki myself if I had any choice in the matter, and I continue to be glad I didn't try to build one.

web/OpenWikiDoom written at 18:43:32;

Some things to remember when implementing __getitem__

I've recently been doing some work on classes derived from list and tuple that fiddle with the behavior of __getitem__, and ran into a couple of surprises that I am going to write down so that I remember them in the future:

  • for the simple 'obj[i:j]' case, __getitem__ is not called if the class has a __getslice__ method. list and tuple both have one, despite the method being labeled as deprecated since Python 2.0.

  • the moment you use an i or j that is not a simple integer (including floating point numbers, but not including long integers), it turns into a call to __getitem__ with a slice object as the key. (Syntactically, this is still simple slicing.)

Supporting the full slice syntax in a __getitem__ implementation makes my head hurt; you can get handed a slice object, or a tuple that can contain at least slice objects, ellipsis objects, and numbers (and probably more that I don't know about). Just raise TypeError instead; it's what lists and tuples do.
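
To see what actually arrives, here is a small sketch using a hypothetical KeyEcho class (my own name, not from any library) whose __getitem__ simply returns its key. It deliberately has no __getslice__, so every subscript form goes through __getitem__.

```python
class KeyEcho(object):
    """Hypothetical helper: hands back whatever key __getitem__ receives."""

    def __getitem__(self, key):
        return key


echo = KeyEcho()

plain = echo[3]              # a plain integer
simple = echo[1:5]           # a slice object with no step
stepped = echo[1:5:2]        # a slice object with a step
multi = echo[1:5, ..., 7]    # a tuple holding a slice, Ellipsis, and an int
```

Printing each of these variables is a quick way to find out what a given subscript expression turns into before deciding what to support.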

Checking that your __getitem__ has not been called with such things, so that you can raise TypeError appropriately, is harder than it should be. I personally wish that __getitem__ wasn't so generic; it seems un-Pythonic to have to inspect the type of your argument to figure out what to do with it.
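
A minimal sketch of that sort of type inspection, using a hypothetical StrictList class of my own invention. It assumes a Python with a single int type and no __getslice__, so that all slicing arrives through __getitem__; on an interpreter where list still has __getslice__, simple integer slices would bypass this method entirely.

```python
class StrictList(list):
    """Hypothetical list subclass: accepts integer indexes and step-less
    slices, and raises TypeError for every other kind of key."""

    def __getitem__(self, key):
        if isinstance(key, slice):
            if key.step is not None:
                raise TypeError("extended slices are not supported")
            # Re-wrap the result so slicing returns our own class.
            return StrictList(list.__getitem__(self, key))
        if not isinstance(key, int):
            # Catches tuples, Ellipsis, floats, and anything else odd.
            raise TypeError("index must be an integer, not %s"
                            % type(key).__name__)
        return list.__getitem__(self, key)


s = StrictList([10, 20, 30, 40])
```

With this, s[1] and s[1:3] work as usual, while s[::2], s[1, 2], and s[1.0] all raise TypeError.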

(The better way would be to have one method to implement plain subscripting, one for simple slices, and a third one for extended slicing. Unfortunately it's too late for that now.)
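
You can approximate that three-way split yourself with a small base class. This is only a sketch; SplitSubscript and the get_index, get_slice, and get_extended hook names are all hypothetical, and it still has to do the type inspection once on everyone's behalf.

```python
class SplitSubscript(object):
    """Hypothetical base class that fans __getitem__ out to three hooks."""

    def __getitem__(self, key):
        if isinstance(key, slice):
            if key.step is None:
                return self.get_slice(key.start, key.stop)
            return self.get_extended(key)
        if isinstance(key, tuple) or key is Ellipsis:
            return self.get_extended(key)
        return self.get_index(key)

    # Each hook refuses by default; subclasses override what they support.
    def get_index(self, i):
        raise TypeError("plain subscripting is not supported")

    def get_slice(self, start, stop):
        raise TypeError("simple slices are not supported")

    def get_extended(self, key):
        raise TypeError("extended slicing is not supported")


class Letters(SplitSubscript):
    """Example subclass: supports obj[i] and obj[i:j] only."""
    data = "abcdef"

    def get_index(self, i):
        return self.data[i]

    def get_slice(self, start, stop):
        return self.data[start:stop]


letters = Letters()
```

Here letters[2] and letters[1:4] reach the overridden hooks, while letters[::2] and letters[1, 2] fall through to the default get_extended and raise TypeError.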

python/GetitemSurprise written at 01:07:40;



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.