Another stupid spider mistake

August 25, 2006

To follow up my earlier entry on this stuff, I just saw another stunned monkey moment:

  • you can't randomly add trailing slashes to URLs any more than you can randomly remove them.
  • especially when the URL includes a query parameter, because then the slash lands in the query and changes it. And that always works really well.
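
To make the second point concrete, here is a minimal sketch using Python's standard urllib.parse. The example.org URL and the query_params helper are made up for illustration; they are not taken from the spider's actual requests.

```python
from urllib.parse import urlsplit, parse_qs

def query_params(url):
    """Return the parsed query parameters of a URL as a dict of lists."""
    return parse_qs(urlsplit(url).query)

# A made-up URL with a query parameter (an assumption, not from my logs).
original = "http://example.org/blog/?page=2"
# What you get if you blindly stick a slash on the end.
mangled = original + "/"

print(query_params(original))  # {'page': ['2']}
print(query_params(mangled))   # {'page': ['2/']}
```

The slash does not end up on the path at all; it becomes part of the parameter's value, so the server is now being asked for a different thing.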

From the pattern of the stealth spider's requests, I think it is adding a trailing slash to any URL that doesn't end in a filename with an extension. This is stunningly braindead, as extensions are nothing more than a hack workaround so that webservers don't need real metadata about what MIME type a file is.
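
As a rough sketch of what I am guessing the spider does (this is my reconstruction from the request pattern, not its real code, and spider_guess and the example.org URLs are hypothetical):

```python
def spider_guess(url):
    """Hypothetical reconstruction of the broken heuristic.

    Append '/' unless the URL already ends in '/' or its last
    segment looks like 'name.ext'. Note that it never looks at
    whether the URL has a query string.
    """
    last_segment = url.rsplit("/", 1)[-1]
    if url.endswith("/") or "." in last_segment:
        return url
    return url + "/"

for url in (
    "http://example.org/pics/cat.png",         # left alone
    "http://example.org/blog/SomeEntry",       # gets a slash it never had
    "http://example.org/blog/SomeEntry?page=2",  # the query gets mangled
):
    print(url, "->", spider_guess(url))
```

A page with no extension is a perfectly good page, not a directory, so the second and third URLs come out as something the server never published.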

(It also has other problems, like not properly resolving relative URLs that use '..'.)
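
For comparison, resolving relative URLs with '..' is something a standard library will do for you. A small sketch with Python's urllib.parse.urljoin, using made-up example.org URLs:

```python
from urllib.parse import urljoin

# The page the link was found on (a made-up URL for illustration).
base = "http://example.org/a/b/c.html"

# '..' steps up one directory from the base page's directory, /a/b/.
up_one = urljoin(base, "../d.html")
# Two levels up lands at the top of the site.
up_two = urljoin(base, "../../e.html")

print(up_one)  # http://example.org/a/d.html
print(up_two)  # http://example.org/e.html
```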

This stunned monkey moment is brought to you by the idiotic stealth spider running from, which is claimed to belong to a 'Goo Khim Yeung' as (Assuming that the WHOIS information is accurate, which it isn't always.)

Written on 25 August 2006.

Last modified: Fri Aug 25 11:35:05 2006