My blocking of some crawlers is an editorial decision unrelated to crawl volume
I recently read a lobste.rs comment on one of my entries that said, in part:
Repeat after me everyone: the problem with these scrapers is not that they scrape for LLM’s, it’s that they are ill-mannered to the point of being abusive. LLM’s have nothing to do with it.
This may be some people's view but it is not mine. For me, blocking web scrapers here on Wandering Thoughts is partly an editorial decision of whether I want any of my resources or my writing to be fed into whatever they're doing. I will certainly block scrapers for doing what I consider an abusive level of crawling, and in practice most of the scrapers that I block come to my attention due to their volume, but I will also block low-volume scrapers simply because I don't like what they're scraping for.
Are you a 'brand intelligence' firm that scrapes the web and sells your services to brands and advertisers? Blocked. In general, do you charge for access to whatever you're generating from scraping me? Probably blocked. Are you building a free search site for a cause (and with a point of view) that I don't particularly like? Almost certainly blocked. All of this is an editorial decision on my part on what I want to be even vaguely associated with and what I don't, not a technical decision based on the scraping's effects on my site.
I am not going to even bother trying to 'justify' this decision. To some it's a decision that needs no justification, and to others it's one that can never be justified. My view is that ethics matter. Technology and our decisions of what to do with technology are not politically neutral. We can make choices, and passively not doing anything is a choice too.
(I could say a lot of things here, probably badly, but ethics and politics are in part about what sort of a society we want, and there's no such thing as a neutral stance on that. See also.)
I would block LLM scrapers regardless of how polite they are. The only difference their being more polite would make is that I would be less likely to notice (and then block) them. I'm probably not alone in this view.