Another stupid spider mistake

To follow up on my earlier entry on this stuff, I just saw another stunned monkey moment:

  • you can't randomly add a trailing slash to URLs any more than you can randomly remove one.
  • especially when the URL includes a query parameter, because then you're changing the value of the query parameter. And that always works really well.
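To make the query problem concrete, here is a small sketch using Python's standard urllib.parse. The URL and the 'format' parameter are made up for illustration; they are not taken from the spider's actual requests.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL, used only for illustration.
original = "http://example.org/blog/SomeEntry?format=atom"
# What a spider gets if it blindly sticks a slash on the end.
mangled = original + "/"

def describe(url):
    """Return the (path, parsed query parameters) of a URL."""
    parts = urlsplit(url)
    return parts.path, parse_qs(parts.query)

print(describe(original))  # ('/blog/SomeEntry', {'format': ['atom']})
print(describe(mangled))   # ('/blog/SomeEntry', {'format': ['atom/']})
```

The path is untouched; the slash lands inside the parameter value, so the server is being asked for a format called 'atom/' instead of 'atom'.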

From the pattern of the stealth spider's requests, I think it is adding a trailing slash to any URL that doesn't end in a filename with an extension. This is stunningly braindead, as extensions are nothing more than a hack so that webservers don't need real metadata about a file's MIME type.

(It also has other problems, like not properly resolving relative URLs that use '..'.)
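For reference, resolving a relative URL that uses '..' is not something a spider needs to invent for itself. A minimal sketch with Python's urllib.parse.urljoin, again using made-up URLs rather than anything from my logs:

```python
from urllib.parse import urljoin

# Hypothetical page URL and relative link, used only for illustration.
base = "http://example.org/blog/python/SomeEntry"
relative = "../unix/OtherEntry"

# urljoin drops the last segment of the base path ('SomeEntry'),
# then lets '..' step up out of 'python/' before adding 'unix/OtherEntry'.
resolved = urljoin(base, relative)
print(resolved)  # http://example.org/blog/unix/OtherEntry
```

A correct resolver never sends a request with a literal '..' left in the path.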

This stunned monkey moment is brought to you by the idiotic stealth spider running from 204.11.99.2, which WHOIS claims is part of 204.11.99.0/29, assigned to a 'Goo Khim Yeung'. (Assuming that the WHOIS information is accurate, which it isn't always.)

This is part of CSpace, and is written by ChrisSiebenmann.

Written on 25 August 2006.

Last modified: Fri Aug 25 11:35:05 2006