Wandering Thoughts archives

2006-05-08

Link: an excerpt from On Writing Well

Here are chapters 2 through 4 of William Zinsser's On Writing Well, a classic book on, well, writing well. Just start with the opening of chapter 2 and keep going:

Clutter is the disease of American writing. We are a society strangling in unnecessary words, circular constructions, pompous frills and meaningless jargon.

Remind you of any computer manuals you've read recently? (Hopefully it does not remind you too much of WanderingThoughts. I try, but I know I have a long way to go.)

On Writing Well itself can be gotten from the online bookseller of your choice. (My choices are Canadian.)

(From a comment on a Slashdot article about writing.)

links/OnWritingWell written at 14:31:30

A really stupid web spider

Today WanderingThoughts had a visit from the worst stealth spider that I've ever seen. Given the previous contestants this is a fairly tall order, but I'm confident I have a winner. The spider:

  • made two requests for directories without the trailing slash, earning it redirections to the proper URLs.
  • followed the redirections, making two valid requests.
  • promptly made 95 bad requests by failing to treat <a href="..."> URLs with a leading slash properly.

I've seen spiders that didn't handle absolute path URLs before, but this is a new and spectacular level of failure. They failed to crawl a single page past their two start pages; all things considered I'm surprised that they even handled the initial redirections properly.
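For the record, here is a minimal sketch of what handling these URLs properly looks like, using Python's standard urllib.parse.urljoin. The example.org host and the paths are stand-ins, not the real site, and naive_resolve is only my guess at one way a spider could get this wrong; I don't know what this spider's actual bug was.

```python
from urllib.parse import urljoin

# Hypothetical page URL; example.org stands in for the real site.
PAGE = "http://example.org/blog/web/"


def resolve(page_url, href):
    """Resolve an href against the page it appeared on (RFC 3986 rules)."""
    return urljoin(page_url, href)


def naive_resolve(page_url, href):
    """An assumed failure mode: blindly gluing the href onto the page URL."""
    return page_url + href


# A leading slash means "start from the root of the site", not from the page.
absolute_path = resolve(PAGE, "/blog/links/OnWritingWell")
# -> http://example.org/blog/links/OnWritingWell

# A relative link is resolved against the directory of the page.
relative = resolve(PAGE, "ReallyStupidSpider")
# -> http://example.org/blog/web/ReallyStupidSpider

# Without the trailing slash, "web" counts as a file name and gets replaced,
# which is why directory requests are redirected to the slash-terminated URL.
no_slash = resolve("http://example.org/blog/web", "ReallyStupidSpider")
# -> http://example.org/blog/ReallyStupidSpider

# The guessed bug produces a URL that does not exist on the server.
broken = naive_resolve(PAGE, "/blog/links/OnWritingWell")
# -> http://example.org/blog/web//blog/links/OnWritingWell
```

Any spider that delegates to a standard URL-joining routine gets all three cases right for free, which is part of why this level of failure is so remarkable.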

(They're a stealth spider because they claimed to be a variety of harmless Windows-based browsers. This is utterly false: first, real browsers would have gotten the requests right, and second, very few browsers make 99 requests in 14 seconds from 42 different IP addresses in the same subnet.)

The details

All 99 requests were made in the span of 14 seconds, from 42 different IP addresses between 66.90.95.207 and 66.90.95.254. WHOIS says that this is part of a /18 owned by fdcservers.net. Unfortunately, fdcservers.net does not have a working whois server and these IP addresses have no reverse DNS; the IPs answer on port 25, but only with a very generic identification.
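The arithmetic here can be checked with a rough sketch using Python's standard ipaddress module. It assumes the /18 is the usual aligned CIDR block containing these addresses; WHOIS is the real authority on the allocation, and the variable names are just for illustration.

```python
import ipaddress

# The endpoints of the observed source address range.
first = ipaddress.ip_address("66.90.95.207")
last = ipaddress.ip_address("66.90.95.254")

# How many addresses the range covers, inclusive of both ends.
range_size = int(last) - int(first) + 1

# The aligned /18 that contains the range (assumption: standard CIDR alignment).
block = ipaddress.ip_network("66.90.95.207/18", strict=False)

# Request figures taken from the logs described above.
requests, seconds, sources = 99, 14, 42
rate_per_second = requests / seconds
requests_per_source = requests / sources

print(f"range covers {range_size} addresses, {sources} of them were used")
print(f"containing /18: {block} ({block.num_addresses} addresses)")
print(f"{rate_per_second:.1f} requests/second, "
      f"{requests_per_source:.1f} requests per source IP")
```

Under that assumption the block works out to 66.90.64.0/18, and 42 of the 48 addresses in the observed range made requests, at roughly seven requests a second overall.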

There's some evidence from Google searches that this is a botnet for some sort of spam, eg here. The 66.90.110.* IP range that this person reports also came by our server, on May 4th. The requests show some traces of a similarly incompetent spider, but they had the luck to hit an area of the site with mostly relative links (and Apache generously fixed up some of their mistakes, like the requests with '/../' in them).

(Nothing from 66.90.95. has hit here before today, at least for the past 28 days of logs that we have, and 66.90.110. only hit us the once on May 4th.)

web/ReallyStupidSpider written at 02:26:46



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.