Automated web software should never fill in the Referer header

October 17, 2009

Yesterday, I noticed that Yahoo Pipes does a really irritating thing: if someone has asked it to pull a syndication feed, it puts the Yahoo Pipes info page about the feed into the Referer header of its feed requests.

Wrong.

I feel very strongly that no automated web software should fill in the Referer header, ever. In practice and custom (if not in the spec), Referer has a very well-defined meaning; it is there to tell webmasters where a real human visitor came from. If you do not have a real human activating a link right then, you do not get to fill in Referer. Effectively, this means that if you are not a web browser (and Yahoo Pipes is not), you do not ever get to use Referer.

Why not? Simple. You do not get to use Referer because doing so makes the lives of webmasters harder. If software sprays irrelevant and inaccurate information all over Referer, webmasters have to work harder to remove it when they look at and analyze their logs. Making webmasters work harder irritates them, and it's also pure wasted time on their part.

Yes, it's useful to tell people this information. However, like other people who want to convey this same information, Yahoo Pipes should put it into the customary and appropriate place for it, that being the User-Agent field. All sorts of feed aggregators and fetchers already put this information there, so YP would have lots of company. The potential argument that this makes the information harder to extract is incorrect; since this use of Referer is not standardized, webmasters need custom parsing code to extract the information regardless of where you put it.

(Considering that YP uses a User-Agent field whose entire contents are 'Yahoo Pipes 1.0', they have lots of room to add other things. Like, say, the URL of an overall information page about their software agent and how it behaves.)
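
As a rough sketch of what this could look like (not anything Yahoo Pipes actually does), an automated fetcher written in Python might build its feed requests as below. The function name, agent name, version, and both URLs are made-up placeholders, and the 'name/version (+URL)' layout is only a common convention among fetchers, not a standard.

```python
import urllib.request


def build_feed_request(feed_url, agent_name, agent_version, info_url):
    """Build a feed request for an automated fetcher.

    Identifying details go into User-Agent; Referer is deliberately
    left unset because no human followed a link to get here.
    """
    # Hypothetical layout: "Name/Version (+info URL)".
    user_agent = f"{agent_name}/{agent_version} (+{info_url})"
    return urllib.request.Request(feed_url, headers={"User-Agent": user_agent})


# All of these values are placeholders for illustration only.
req = build_feed_request(
    "https://example.org/feed.atom",
    "ExampleFetcher",
    "1.0",
    "https://example.com/fetcher-info",
)
```

Passing req to urllib.request.urlopen() would then send the request with that User-Agent; urllib does not add a Referer header on its own, so there is nothing extra to strip out.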

(I have written about this before, but that was only in the context of web crawlers, not general web software.)



Last modified: Sat Oct 17 00:42:36 2009