Another stupid spider mistake
To follow up on my earlier entry on this stuff, I just saw another stunned monkey moment:
- you can't randomly add a trailing slash to URLs any more than you can randomly remove them.
- especially when the URL includes a query parameter, because then the slash gets tacked onto the end of the query string and you're changing the query. And that always works really well.
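To make the query problem concrete, here is a small sketch using Python's standard urllib.parse. The example.com URL and the 'page' parameter are made up for illustration; they are not taken from the spider's actual requests.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL with a query parameter (not from the real logs).
original = "http://example.com/blog/?page=2"
# What you get if you blindly tack a slash onto the end of the URL.
mangled = original + "/"

original_parts = urlsplit(original)
mangled_parts = urlsplit(mangled)

# The path is untouched; the slash has landed inside the query string.
print(original_parts.path, parse_qs(original_parts.query))
print(mangled_parts.path, parse_qs(mangled_parts.query))
```

The path is '/blog/' in both cases, but the value of 'page' goes from '2' to '2/', which is a different query as far as the server is concerned.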
From the pattern of the stealth spider's requests, I think it is adding a trailing slash to any URL that doesn't end in a filename with an extension. This is stunningly braindead, as extensions are nothing more than a hack workaround so that webservers don't need real metadata about what MIME type a file is; the lack of one tells you nothing about whether a URL is a directory.
(It also has other problems, like not properly resolving relative URLs that use '..'.)
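For reference, resolving '..' properly is not hard if you let a library do it. This is a minimal sketch with Python's urllib.parse.urljoin; the base URL and the relative links are invented examples, not anything the spider actually fetched.

```python
from urllib.parse import urljoin

# Hypothetical page the relative links were found on.
base = "http://example.com/a/b/page.html"

# '..' climbs one directory up from the directory of the base page.
up_one = urljoin(base, "../c/other.html")
up_two = urljoin(base, "../../top.html")

print(up_one)  # http://example.com/a/c/other.html
print(up_two)  # http://example.com/top.html
```

A spider that requests literal '..' segments, or drops them entirely, ends up asking for URLs that no page ever linked to.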
This stunned monkey moment is brought to you by the idiotic stealth spider running from 204.11.99.2, which is claimed to belong to a 'Goo Khim Yeung' as 204.11.99.0/29. (Assuming that the WHOIS information is accurate, which it isn't always.)