The fading HTTP Referer header and (Google) Search paywall bypasses

March 16, 2021

A lot of newspapers and other media sites have a general paywall (you must be a subscriber to see their content), but make an exception if the visitor is coming directly from an Internet search (or at least a Google search; I don't know if these sites will let in visitors from other search engines). Today, these sites generally know that you're coming from an Internet search through the HTTP Referer header that your browser puts on your request when you follow the link from the search page. That would be the same HTTP Referer header that's fading away, fundamentally because browsers don't like it.
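As a rough illustration of the kind of check involved, here is a minimal sketch in Python. It is an assumption about how such a paywall exception could work, not what any particular publisher actually runs; the SEARCH_HOSTS allowlist and the came_from_search() name are made up for the example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real site would maintain its own list.
SEARCH_HOSTS = ("google.com", "bing.com")


def came_from_search(referer):
    """Return True if the Referer value names an allowlisted search host."""
    if not referer:
        # No Referer header at all: nothing to go on.
        return False
    try:
        host = urlsplit(referer).hostname
    except ValueError:
        # Malformed header value (for example a broken IPv6 literal).
        return False
    if host is None:
        return False
    # Match the bare domain or any subdomain of it, but not lookalikes
    # such as 'notgoogle.com' or 'google.com.evil.example'.
    return any(host == h or host.endswith("." + h) for h in SEARCH_HOSTS)
```

Note that this check only looks at the host, so it works equally well whether the browser sends the full search URL or only the origin.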

Today it occurred to me that this creates some interesting issues both for Internet search engines and for places currently using a permeable paywall. For Internet search engines, it probably makes them willing to set an explicit Referrer-Policy header when browsers start defaulting to something inconvenient for them (Google Search and Bing don't appear to do this today). Search engines get a variety of things out of visibly sending traffic to media sites, or really out of visibly sending traffic to anywhere, so they have an incentive to make this traffic visible.
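To make the effect of Referrer-Policy concrete, here is a simplified Python sketch of what Referer value a browser would send under a few policies. It covers only a handful of the standard policy names, ignores userinfo in URLs, and treats only https to non-https as a downgrade, so take it as an approximation of the behaviour rather than the specification's full algorithm. The referer_for() name and the example URLs are mine.

```python
from urllib.parse import urlsplit


def referer_for(policy, source_url, dest_url):
    """Approximate the Referer sent when navigating source_url -> dest_url."""
    src, dst = urlsplit(source_url), urlsplit(dest_url)
    # Origin-only form: scheme and host (plus port), with a trailing slash.
    origin = f"{src.scheme}://{src.netloc}/"
    # Full form: path and query are kept, the fragment is always dropped.
    full = f"{src.scheme}://{src.netloc}{src.path or '/'}"
    if src.query:
        full += "?" + src.query
    same_origin = (src.scheme, src.netloc) == (dst.scheme, dst.netloc)
    downgrade = src.scheme == "https" and dst.scheme != "https"

    if policy == "no-referrer":
        return None
    if policy == "unsafe-url":
        return full
    if policy == "origin":
        return origin
    if policy == "no-referrer-when-downgrade":
        return None if downgrade else full
    if policy == "strict-origin-when-cross-origin":
        if same_origin:
            return full
        return None if downgrade else origin
    raise ValueError(f"policy not covered by this sketch: {policy}")
```

Under strict-origin-when-cross-origin (the default that browsers have been moving to), a click from a search results page to a news site carries only the search engine's origin. A search engine that explicitly set 'origin' or 'unsafe-url' would keep its traffic visible even if a browser's default became stricter than that.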

(I'd say that search engines have additional options if browsers refuse to cooperate, but the company behind the dominant Internet search engine is also the company behind the dominant Internet browser, so there's no real risk of browsers refusing. Chrome is going to do what is useful for Google, including sending Referer headers.)

If Internet search engines keep sending Referer headers, then places with permeable paywalls may not need to do anything. In theory, as the default Referrer-Policy changes, such places will need to live without knowing the terms that their visitors searched for. In practice I suspect that they already see a lot of visits with only the origin (the website, e.g. 'https://google.com') and no search terms, because that's what I see here. If for some reason the Internet search engines stop sending Referer entirely and won't do anything else to signal the origin, such as adding special query parameters to the URL, then who knows. Perhaps the paywalls will get less permeable than they are today. If nothing else, such a development would be interesting to see.
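Here is a small Python sketch of what a site can and can't learn in these cases. The 'q' parameter is what Google and Bing use for search terms in their result page URLs; the 'via=search' landing URL parameter is purely hypothetical, standing in for whatever special query parameter a search engine might add, and describe_visit() is a name I made up.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical signal parameter; no search engine is known to add this.
SIGNAL_PARAM = "via"


def describe_visit(landing_url, referer):
    """Summarize what a request reveals about where the visitor came from."""
    landing_query = parse_qs(urlsplit(landing_url).query)
    ref = urlsplit(referer) if referer else None
    # Search terms are only recoverable if the full search URL was sent.
    terms = parse_qs(ref.query).get("q", [None])[0] if ref else None
    return {
        "referer_host": ref.hostname if ref else None,
        "search_terms": terms,
        "signalled": landing_query.get(SIGNAL_PARAM, [""])[0] == "search",
    }
```

With an origin-only Referer such as 'https://www.google.com/', the host is still known but search_terms comes back as None. With no Referer at all, only something like the hypothetical URL parameter would tell the site that the visit came from a search.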

(Before I started thinking through what search engines would likely do, I guessed that publishers would wind up with a real problem. Now I think they probably will be able to just keep on as usual, because search engines are likely to take steps to keep things running as normal.)


Comments on this page:

By Jukka at 2021-03-17 00:55:37:

JSTOR is the only paywall I know that still requires the Referer field. But then again, it is a fairly large thing in university settings.

Written on 16 March 2021.

Last modified: Tue Mar 16 00:11:54 2021