Automated web software should never fill in the Referer header
Yesterday, I noticed that Yahoo Pipes does a really irritating thing: if
someone has asked it to pull a syndication feed, it puts the Yahoo Pipes
info page about the feed into the Referer header of its feed requests.
Wrong.
I feel very strongly that no automated web software should fill in
the Referer header, ever. In practice and custom (if not in the
spec), Referer has a very well defined meaning; it is there to tell
webmasters where a real human visitor came from. If you do not have
a real human activating a link right then, you do not get to fill in
Referer. In practice, this means that if you are not a web browser
(and Yahoo Pipes is not), you do not get to ever use Referer.
Why not? Simple. You do not get to use Referer because doing so
makes the lives of webmasters harder. If software sprays irrelevant
and inaccurate information all over Referer, webmasters have to work
harder to remove it when they look at and analyze their logs. Making
webmasters work harder irritates them, and it's also pure wasted time on
their part.
Yes, it's useful to tell people this information. However, like other
people who want to convey this same information, Yahoo Pipes should
put it into the customary and appropriate place for it, that being the
User-Agent field. All sorts of feed aggregators and fetchers already
put this information there, so YP would have lots of company. The
potential argument that this makes the information harder to extract is
incorrect; since this use of Referer is not standardized, webmasters
need custom parsing code to extract the information regardless of where
you put it.
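
To give a sense of how little custom parsing that is, here is a minimal sketch of pulling URLs out of a User-Agent value on the webmaster's side. The sample agent string and the "+URL" layout are assumptions for illustration, loosely following what many fetchers do; they are not anything Yahoo Pipes actually sends.

```python
import re

# Matches http(s) URLs embedded in a User-Agent string, stopping at
# whitespace, ';' or ')' since those commonly delimit agent comments.
URL_IN_AGENT = re.compile(r"https?://[^\s;)]+")

def agent_urls(user_agent):
    """Return every URL found in a User-Agent header value, in order."""
    return URL_IN_AGENT.findall(user_agent)

# Hypothetical agent string, made up for this example.
sample = ("ExampleFetcher/1.0 (+https://example.com/fetcher-info; "
          "source: https://example.com/pipes/info?id=123)")
print(agent_urls(sample))
```

The same handful of lines would be needed to make sense of a nonstandard Referer value, which is the point: moving the information to User-Agent costs the webmaster nothing extra.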
(Considering that YP uses a User-Agent field whose entire contents are
'Yahoo Pipes 1.0', they have lots of room to add other things. Like,
say, the URL of an overall information page about their software
agent and how it behaves.)
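
As a rough sketch of what a well-behaved fetcher could do instead, the following builds a feed request that carries its identifying information in User-Agent and leaves Referer unset. The agent name, the info URL, and the build_feed_request helper are all hypothetical; this is one way to do it, not a description of how any real aggregator works.

```python
import urllib.request

# Hypothetical values for illustration; not Yahoo Pipes' real ones.
AGENT_NAME = "ExampleFetcher/1.0"
AGENT_INFO_URL = "https://example.com/fetcher-info"

def build_feed_request(feed_url, source_page=None):
    """Build a feed request that identifies itself only via User-Agent."""
    parts = ["+" + AGENT_INFO_URL]
    if source_page:
        # The per-feed info page goes here rather than into Referer.
        parts.append("source: " + source_page)
    user_agent = "{} ({})".format(AGENT_NAME, "; ".join(parts))
    # Deliberately no Referer header: no human followed a link to get here.
    return urllib.request.Request(feed_url, headers={"User-Agent": user_agent})

req = build_feed_request("https://example.org/feed.xml",
                         "https://example.com/pipes/info?id=123")
```

Passing req to urllib.request.urlopen() would then fetch the feed with the webmaster's referrer logs left untouched.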
(I have written about this before, but that was only in the context of web crawlers, not general web software.)