Wandering Thoughts archives

2006-03-09

Making things simple for busy webmasters

It's always nice when people's software saves me from having to wonder if they're up to no good by handing out obvious signs of it. Take, for example, the spate of people whose web crawling software advertises itself by having the User-Agent string of:

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

Evidently no one told them not to stutter. (There are a couple of variations in what they claim to be, but that one is the most common. Needless to say, no real User-Agent string (MSIE's included) has an extra 'User-Agent: ' on the front.)

The IP addresses that sourced these are scattered all over; a couple of them are (still) on the XBL, and a couple are in SPEWS.

(And I give bonus points to the person with the User-Agent string "W3C standards are important. Stop fucking obsessing over user-agent already.", which I stumbled over while scanning our logs today. I can certainly agree with the sentiment.)

Another good one is the stealth spider that sends a completely blank Referer: header, instead of omitting it; it stands out like a sore thumb in my log scans. This comes from all over, with 157 different IP addresses over the past 28 days or so, 50 of them currently listed in the XBL.

ObviousNogoodniks written at 16:42:15; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.