2010-08-17
The two sorts of Referer spam that I see
In general, there are two sorts of Referer spam that I see here on WanderingThoughts. The first
sort is what I'll call 'plain referer spam', HTTP requests that have
Referer URLs that are various spam websites. Many of these are easily
recognizable because they are of the form 'http://web.site', with no
trailing slash on the end; this is a legal URL, but it is not one that
regular browsers ever send. Usually the domain names or the rest of the
URL make it pretty clear that this is spam instead of a crazy browser.
(You can enter a website name without the trailing slash, but your
browser will almost always add the trailing slash when it actually
visits the page. Then when you click on links on the page, it is the
slash-added URL that is sent as the Referer.)
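As a rough illustration of the 'no trailing slash' pattern, here is a minimal Python sketch. The function name looks_like_bare_site is hypothetical; it only flags Referer values that are a scheme plus a host with an empty path, and it is a sketch under that assumption rather than a description of any real filter.

```python
from urllib.parse import urlsplit

def looks_like_bare_site(referer: str) -> bool:
    """True for Referers like 'http://web.site' (host only, no trailing slash)."""
    try:
        parts = urlsplit(referer)
    except ValueError:
        # Malformed input (for example broken IPv6 brackets); not this pattern.
        return False
    return (
        parts.scheme in ("http", "https")
        and parts.netloc != ""
        and parts.path == ""      # a regular browser would have sent "/" here
        and parts.query == ""
        and parts.fragment == ""
    )
```

A match is only a hint; as noted above, the domain name or the rest of the URL is usually what makes it clear that something is spam.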
The other sort is what I call 'active text', and is much less common.
Active text is snippets of actual text in the Referer instead of
a URL; I believe I've seen HTML, BBCode, and even an attempt with
JavaScript. I find this spam both depressing and intensely irritating,
especially the JavaScript. It's depressing because the fact that spammers
are doing it implies that there is software that just puts the text from
Referer headers on some web page without escaping it first (which is
terribly wrong); it's irritating because
it's clearly attempting to exploit a security flaw and I'm a sysadmin.
(These Referer headers are clearly bogus; I don't think they even
bother pretending by, say, putting a 'http://...' on the front of
a garbage word at the start.)
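Since these values don't even carry a 'http://' prefix, a very crude check is enough to set them apart from real Referers. This is a minimal sketch; the function name lacks_url_prefix is hypothetical, and it assumes that anything not starting with 'http://' or 'https://' is worth a second look.

```python
def lacks_url_prefix(referer: str) -> bool:
    """True if the Referer value does not even start like an HTTP(S) URL."""
    value = referer.strip().lower()
    # str.startswith accepts a tuple of candidate prefixes.
    return not value.startswith(("http://", "https://"))
```

This says nothing about whether a value that does start with 'http://' is legitimate; it only catches text that makes no attempt to look like a URL.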
My memory, along with some early entries here, suggests that Referer
spam used to be a lot more common than it is now. These days a typical
volume is at most one or two attempts a day, and I'm pretty sure that
there are days with none at all. Since the spammers are not hitting me
with the same Referer over and over, I generally assume that their
current goal is to collect links from as many web pages as possible
in order to boost their search engine rankings. Naturally they are
completely indifferent to the fact that CSpace has no page that shows
Referer information, which makes their efforts here completely
useless.
(Like email spam, web spam has become something that I no longer pay much attention to unless it is really glaring and in my face.)
PS: if you have public Referer logs, I suggest that you turn them off.
If you have private Referer logs that are presented in any sort of
web interface, I suggest that you make sure that your software properly
quotes and escapes the Referer information. I view my Referer logs
in plain text with minimal processing, but I'm aware that this is
attractive only to crazy people like me.
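For a web interface, the escaping step can be as small as the following Python sketch. The function render_referer_cell and the table-cell markup are assumptions made for illustration; html.escape is the standard library call that does the actual work.

```python
import html

def render_referer_cell(referer: str) -> str:
    """Return an HTML table cell that shows the Referer as inert text."""
    # quote=True also escapes " and ', so the value is safe inside attributes too.
    return "<td>" + html.escape(referer, quote=True) + "</td>"
```

With this, an 'active text' Referer such as a script tag comes out as literal characters on the page instead of being interpreted by the browser.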
(And never, ever investigate a spammed website with anything except an
extremely secure browser configuration; you should assume that all of
them are absolutely crawling with infectious malware. I use lynx, or
sometimes wless.)