2005-11-14

Banning MSNBot: an open letter to MSN Search

I understand that MSN Search wants to be bigger than Google in search. One necessary step towards this is that people must be willing to let MSNBot, your search spider, index their content. Today, I changed our robots.txt to ban MSNBot, joining various other people, and so you're a little bit further from your goal. (Probably not enough further that you care very much.)

I'm not doing this because I dislike Microsoft. I'm doing this for a simple reason, the same reason other people have: right now, MSNBot is not a responsible search spider.

Here is an incomplete list of the sort of things MSNBot routinely does on this site:
All of these behaviors are undesirable. Most of them are aggressively antisocial. None of them should be news to MSN Search, because two months ago when I first started noticing these issues someone I know who worked at Microsoft put me in email contact with some members of the MSN Search team. They got in touch with me, got information from me, and then disappeared; the last time I heard from them was September 30th.

The only change in MSNBot's behavior since September 7th is that it has become a little less enthusiastic about crawling unchanging and error URLs and it stopped pulling our large binary files for a week or two. For all I can tell, these are routine fluctuations in MSNBot's crawling behavior.

That's why I've finally banned MSNBot; not just because it does antisocial things, but also because after two months of waiting I no longer believe that MSNBot will get fixed any time soon. On the Internet, two months is a very long time to tolerate antisocial behavior, far longer than you should expect people to wait. (And if not for the hope created by my brief and fleeting contact with MSN Search programmers, I would not have waited this long.)

So, in summary: I don't enjoy banning MSNBot, but I have lost patience with its bad behavior and don't expect it to change. Enough is enough; out it goes.

Why not just ban MSNBot only from crawling the various bits of bad stuff? Three reasons:
(There would be a fourth reason, 'you can't do that in robots.txt's
syntax', but MSNBot supports wildcard matching in
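For readers who have not written one, a blanket robots.txt ban of the kind this letter describes can be sketched as below. This is only an illustration, not this site's actual robots.txt: the rules text, the `msnbot` user-agent token, the `example.org` URLs, and the helper name `can_fetch` are all assumptions made for the example. The check uses Python's standard `urllib.robotparser` to show that such rules shut out one crawler while leaving everyone else alone.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption, not the real file from this site):
# ban one crawler from everything, leave all other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow:
"""


def can_fetch(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("msnbot", "SomeOtherBot"):
        print(agent, can_fetch(agent, "http://example.org/blog/"))
```

A total ban is the simplest rule to get right; banning only selected URLs would need one `Disallow` line per problem area and ongoing upkeep as the crawler's behavior shifts.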
This is part of CSpace, and is written by ChrisSiebenmann.