How to have your web spider irritate me intensely

June 15, 2006

It's very simple: put what should be in your User-Agent header into the Referer header instead. The next time I read my Referer logs, you're sure to provoke me into spasms of teeth-grinding irritation. I can only conclude that people pulling this stunt are attempting advertising through other people's public Referer logs.

(For bonus points, fetch my syndication feeds without any attempt at conditional GET.)
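For anyone unsure what the well-behaved version looks like, here is a minimal sketch using only the Python standard library. The bot name, contact URL, and function names are made up for illustration. The crawler's identity goes in User-Agent, Referer is left alone, and the validators saved from the previous fetch are sent back so the server can answer 304 Not Modified.

```python
import urllib.error
import urllib.request

# Hypothetical identity; a real crawler would use its own name and contact URL.
USER_AGENT = "ExampleFeedBot/1.0 (+http://example.com/bot.html)"


def build_feed_request(url, etag=None, last_modified=None):
    """Build a GET request that identifies itself and revalidates its cache."""
    headers = {"User-Agent": USER_AGENT}  # identity goes here, not in Referer
    if etag:
        # Validator saved from the ETag header of the previous response.
        headers["If-None-Match"] = etag
    if last_modified:
        # Validator saved from the Last-Modified header of the previous response.
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers)


def fetch_feed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if the feed is unchanged."""
    req = build_feed_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(req) as resp:
            return (resp.read(),
                    resp.headers.get("ETag"),
                    resp.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # urlopen reports 304 Not Modified as an HTTPError; nothing changed.
            return None, etag, last_modified
        raise
```

The caller is expected to store the returned validators and pass them back in on the next poll; without that, every fetch re-downloads the whole feed.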

Today's offender is the 'Strategic Board Bot', run by strategicboard.com from the IP address 212.143.103.125 (a netvision.net.il IP address, but also where 'www.strategicboard.com' et al. point). Since they aren't fetching our robots.txt either, they've earned an immediate listing in our kernel IP filters.
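Honouring robots.txt is not much work either. As a rough sketch (the helper name and the sample rules are invented for illustration), Python's standard urllib.robotparser can answer "may I fetch this?" from the text of a robots.txt file that the crawler has already retrieved:

```python
import urllib.robotparser


def allowed_by_robots(robots_txt, user_agent, url):
    """Check a URL against the text of an already-fetched robots.txt."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: everything under /private/ is off limits to all robots.
SAMPLE_RULES = "User-agent: *\nDisallow: /private/\n"

print(allowed_by_robots(SAMPLE_RULES, "ExampleFeedBot", "http://example.com/blog/"))
print(allowed_by_robots(SAMPLE_RULES, "ExampleFeedBot", "http://example.com/private/x"))
```

A crawler would fetch /robots.txt once per site (with its real User-Agent), cache it for a while, and run this check before each request.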

Strategic Board itself has no useful information in its WHOIS record and appears to be in the business of indexing and searching blogs (which makes their non-use of conditional GET all the more serious; anyone specifically pulling syndication feeds should be using it). Of course, they have no 'how to contact us about our robot' information that I can see, at least not in as much poking at their web page as I'm willing to do.

Strategic Board also wins extended bonus points because they didn't use to do this; they apparently just started yesterday. So they deliberately decided to 'advertise' by hijacking Referer and putting a mere 'SB' into their User-Agent string. (A couple of early requests had 'HTTP Remote File Test' as the User-Agent instead.)

Written on 15 June 2006.

Last modified: Thu Jun 15 14:20:41 2006