Making things simple for busy webmasters
It's always nice when people's software saves me from having to wonder if they're up to no good by handing out obvious signs of it. Take, for example, the spate of people whose web crawling software advertises itself with a User-Agent string of:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Evidently no one told them not to stutter. (There are a couple of
variations in what they claim to be, but that one is the most common.
Needless to say, no real User-Agent string (MSIE's included) has a
'User-Agent: ' on the front.)
The IP addresses that sourced these are scattered all over; a couple of them are (still) on the XBL, and a couple are in SPEWS.
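For anyone who wants to look for the same stutter in their own logs, here is a minimal sketch in Python. It is not the scan I actually run. It assumes Apache's "combined" log format, and the names `COMBINED` and `stuttering_agent` are made up for illustration. The regular expression is deliberately simple and will not cope with escaped quotes inside the logged fields.

```python
import re

# Assumption: Apache "combined" log format, where the last two quoted
# fields are the Referer and the User-Agent. Escaped quotes inside a
# field are not handled by this simple pattern.
COMBINED = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def stuttering_agent(line):
    """Return (ip, agent) if the logged User-Agent value itself begins
    with 'User-Agent:', otherwise None (including unparseable lines)."""
    m = COMBINED.match(line.strip())
    if m is None:
        return None
    agent = m.group("agent")
    if agent.lower().startswith("user-agent:"):
        return m.group("ip"), agent
    return None
```

Feeding each log line through `stuttering_agent` and collecting the non-`None` results gives the set of source IP addresses, which can then be checked against whatever DNS blocklists you care about.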
(And I give bonus points to the person with the User-Agent string
"W3C standards are important. Stop fucking obsessing over user-agent
already.", which I stumbled over while scanning our logs today. I can
certainly agree with the sentiment.)
Another good one is the stealth spider that sends a completely blank
Referer: header, instead of omitting it; it stands out like a sore
thumb in my log scans. This comes from all over, with 157 different IP
addresses over the past 28 days or so, 50 of them currently listed in