Spiders should respect rel="nofollow"
If you're writing a web spider, what should you do when you see a link marked rel="nofollow"?
In theory, you can treat it no differently from any other link. nofollow is not a formal specification, and the original description only talks about the resulting link not giving the target any credit.
In practice, on the Internet what people expect is in large part defined by what the 800 pound gorillas do. And both Google and MSN Search consider nofollow to be literal: don't follow this link. In fact Google explicitly documents this behavior; see one of the original postings on nofollow, or Google's description of it in their help pages.
So the real answer is: if you see a rel="nofollow" link, you shouldn't crawl the target.
Since Google (the original creators of nofollow) describe it this way, I will go so far as to say that respecting nofollow requires you to not crawl marked links.
Spider authors should do this not just because it's what people expect, but because it's genuinely useful for guiding spiders around web sites. (Especially dynamic web sites like wikis and blogs, which can have a lot of different ways of viewing more or less the same content.)
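As a rough illustration of what "don't crawl marked links" can look like in a spider, here is a minimal sketch using only Python's standard library. The names FollowableLinkParser and followable_links are made up for this example, and it assumes that rel should be treated as a space-separated, case-insensitive list of tokens; a real spider would also need URL filtering, robots.txt handling and so on.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FollowableLinkParser(HTMLParser):
    """Collect <a href> targets, skipping links marked rel="nofollow".

    Hypothetical helper for illustration; not taken from any real spider.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # Assumption: rel holds space-separated tokens, matched case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            return  # respect nofollow: never queue this target for crawling
        self.links.append(urljoin(self.base_url, href))


def followable_links(html, base_url):
    """Return absolute URLs of the links on a page that a spider may crawl."""
    parser = FollowableLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

A spider's fetch loop would then only add the URLs returned by followable_links(page_text, page_url) to its crawl queue, so nofollow targets are never requested at all.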