Reverse engineering some settings for Google Search

January 30, 2018

As part of switching to uMatrix, I wound up destabilizing my Google cookies and thus more or less destroyed my old Google search settings (as I mentioned in passing in the original entry). This isn't the first time something like this has happened, but this time around I've managed to work out roughly what all of the magic steps are for getting those settings back to the way they used to be.

Google Search has at least three separate settings:

  • The straightforward search preferences for things like how many results are displayed on a single page. Google directly supports changing these through their search settings.

    (There are two versions of this settings page; see the end.)

  • Whether or not visiting google.com redirects you to your country Google domain (I believe the term for this is a 'country redirect'). You can turn off country redirects by visiting the magic URL www.google.com/ncr, which sets some opaque magic cookie value. I don't know how you turn country redirects back on, or if there's any way short of deleting your Google cookies.

    (As you can tell, I always turn country redirects off and leave them off. When I go to google.com, that's where I want to be; if I wanted google.ca, I would go there explicitly. I believe that Google still gives me somewhat location-based results, but at least I usually get to see the Google doodles.)

  • Whether you get what I'll call the non-JavaScript results page [JPG] or the JS results page [JPG]. One difference between these two pages that isn't obvious in my screenshots is how the URLs are handled. In the non-JS version, Google rewrites all of the actual links to be redirected through themselves for click-tracking. In the JS version, the links are intact and this tracking is done through an onmousedown attribute.

    (The JS version will do various things not pictured here, such as put in images and other content. The non-JS version sticks to plain, straightforward search results.)

    To complicate the picture, some Firefox addons (at least NoScript) will apparently fix Google's link mangling in the non-JS version, although this fixing isn't necessarily instant or complete.

How the two versions of the results page switch back and forth is complicated. As far as I can tell, it goes like this:

  • If you have JavaScript turned off and interpret <noscript> tags, you get the non-JS version, with all of the URL links rewritten to go through Google with tracking information. Google also sets a magic cookie so that this state is persistent as long as you have JS off (even if you later stop interpreting <noscript> tags).

    (If you have the magic cookie, your search result is immediately the non-JS page. If you don't, you first wind up on the JS page but a <noscript> tag redirects you to the non-JS version.)

  • If you have JavaScript turned on for Google, you get the JS version (and Google tracks your clicks on the URLs through JS). Doing a search with JS on clears the magic cookie from above.

  • If you have JS off, don't interpret <noscript> tags, and don't have the magic cookie, you get the JS version (with a <noscript> tag that would redirect you to the non-JS version if you were to interpret it). Since it's the JS version, it has the URLs not mangled (and your history works so you know if you've already read a search result), but since you're not interpreting JS, Google doesn't get to track your clicks.
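
The switching rules above can be written down as a small decision function. This is only a sketch of the behavior as I observed it, not anything Google documents; the names `Outcome` and `results_page` and the field names are made up for illustration, and it ignores addons that de-mangle links after the fact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    page: str              # "js" or "non-js" results page
    links_rewritten: bool  # True if result links are redirected through Google
    clicks_tracked: bool   # True if Google gets to see clicks on results
    cookie_after: bool     # True if the "stay on non-JS" cookie is set afterwards


def results_page(js_enabled: bool, honors_noscript: bool, has_cookie: bool) -> Outcome:
    """Model which results page a search ends up on, per the observations above."""
    if js_enabled:
        # JS page: links are intact, clicks are tracked through onmousedown,
        # and doing the search clears the cookie.
        return Outcome("js", False, True, False)
    if has_cookie:
        # The cookie sends you straight to the non-JS page.
        return Outcome("non-js", True, True, True)
    if honors_noscript:
        # The JS page arrives first, its <noscript> tag redirects to the
        # non-JS page, and Google sets the cookie.
        return Outcome("non-js", True, True, True)
    # JS off, <noscript> ignored, no cookie: the JS page with inert tracking.
    return Outcome("js", False, False, False)
```

In this model, my current uMatrix setup is `results_page(False, False, False)`: the JS page, with real links and no click tracking.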

I don't allow Google to run JavaScript (for all sorts of reasons). My NoScript setup interpreted <noscript> tags and also de-mangled the Google-mangled URLs in the non-JS version, which is the ideal outcome. My uMatrix setup doesn't interpret <noscript> tags and doesn't de-mangle the Google-mangled URLs if I get the non-JS page. The non-JS page fits within my normal browser window width; the JS page does not (even with JS off). However, the JS page does give me real URLs, so I can see whether I've visited a result before, and if I scroll the window sideways all of the search content fits.

Until I did the experimentation for this entry, I didn't know how to switch my current uMatrix setup over to the non-JS version. Now that I do, I'm still going to stay with the JS version for the better URLs for now, although I may get irritated with its various annoyances and change my mind.

(There's probably a GreaseMonkey user script to fix the URLs, but I've had memory leak problems with GreaseMonkey in the past so I'm not entirely enthused about the idea of re-enabling it. In theory I could probably also write a Firefox WebExtension addon to do this, since it's just modifying the href attribute of <a> elements that match a specific pattern.)
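
To illustrate the kind of rewriting involved, here is a minimal sketch of the URL de-mangling step in Python (an actual addon or user script would do the same thing in JavaScript against each <a> element). It assumes the rewritten links have the form `/url?q=<real target>&...` on a Google host, which is what I believe the non-JS page uses but is not something this entry verifies; `demangle` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit


def demangle(href: str) -> str:
    """Return the real target of a Google redirect link, or href unchanged."""
    parts = urlsplit(href)
    # Assumption: redirect links use the path /url, either relative or on
    # google.com or one of its subdomains.
    if parts.path != "/url":
        return href
    host = parts.netloc
    if host and host != "google.com" and not host.endswith(".google.com"):
        return href
    params = parse_qs(parts.query)
    # Assumption: the target is carried in the q (or sometimes url) parameter.
    for key in ("q", "url"):
        values = params.get(key)
        if values and values[0].startswith(("http://", "https://")):
            return values[0]
    return href
```

Links that don't match the pattern are returned untouched, so running this over every link on the page should be harmless.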

As a side note, the search settings page actually comes in two versions, a 'JavaScript' version and a non-JS one (by analogy to the search results pages). You get the non-JS version if and only if you have the magic cookie that forces you to the non-JS search page right away; having JS turned off is not enough by itself to get the non-JS settings page. This can be confusing and annoying, especially since the JS settings page doesn't work if you have JavaScript turned off.


Comments on this page:

There is also the Search Link Fix addon, which transforms any Google tracking URLs in the search results. It works pretty well for my Google searches (in Firefox 57+ and earlier versions).

As an alternative to cookie-based search settings, you can also encode some settings in a bookmark with a keyword (e.g. 'gg'), for example:

https://encrypted.google.com/search?hl=en&num=100&as_q=%s

With that you always get 100 results if you enter something like 'gg search term' - without having to set any cookie. Also, this seems to be enough to get non-localized google search results.
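
For scripts rather than bookmarks, the same idea can be sketched in Python. This is an illustration only: `expand_keyword` is a made-up helper, and the exact escaping a browser applies when it fills in %s may differ from `quote_plus`.

```python
from urllib.parse import quote_plus

# The bookmark URL from above; %s marks where the search terms go.
TEMPLATE = "https://encrypted.google.com/search?hl=en&num=100&as_q=%s"


def expand_keyword(template: str, terms: str) -> str:
    """Fill the %s placeholder with the URL-escaped search terms."""
    return template.replace("%s", quote_plus(terms))


print(expand_keyword(TEMPLATE, "search term"))
# https://encrypted.google.com/search?hl=en&num=100&as_q=search+term
```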

By cks at 2018-02-01 08:53:44:

Unfortunately the Search Link Fix addon is reversed from what I'd like; it fixes things up if you allow Google to use JavaScript and so get the JS version of the search page. But if you want to allow Google's search stuff to run JS, it looks great.

The encrypted.google.com tricks are great to know; thank you. My Google search script uses the older www.google.com/search?q=... URL format, but I should probably switch it over.

(I have an entire infrastructure of tools for doing Google searches from the X selection or through things I type to dmenu and almost all of my searches use it, so changing it to force 100 results and so on would let me mostly ignore my cookie state.)

It’s not really an older format; your existing form is fine.

The encrypted.google.com host was the HTTPS-only version they offered in the transition period before they could switch www.google.com itself to HTTPS. Strictly speaking that host has been obsolete since they made all traffic HTTPS-only. I don’t know why they’ve kept it around.

It sure is handy for uMatrix though. I use it to isolate my blocking of JS to Google Search: most of my keywords etc. are based on that host and I have JS blocked only on encrypted.google.com, not the whole google.com domain. That way other Google properties remain unaffected – particularly ones that live directly on www.google.com without a host of their own. Also, if I decide I want the JS version while I’m in the middle of a search, I can just edit the address bar to change host – e.g. for Image Search, whose JS-enabled version has much better UI.

Note that you can use the gbv parameter to force JS vs non-JS. (“Google Basic Variant”, apparently? 1=non-JS variant, 2=regular variant.) Though Image Search will override gbv=1 and redirect you to the JS variant unless you also have JS blocked…
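
As a rough sketch of how a search script could apply this, the following Python helper sets gbv on an existing search URL and optionally swaps the host. The function name `set_variant` and its arguments are invented for illustration, and the meaning of the gbv values is only as observed above.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_variant(url: str, basic: bool, host: Optional[str] = None) -> str:
    """Return url with gbv forced to 1 (non-JS) or 2 (regular), optionally on another host."""
    parts = urlsplit(url)
    # Drop any existing gbv value, keep every other parameter in order.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "gbv"]
    query.append(("gbv", "1" if basic else "2"))
    netloc = host if host is not None else parts.netloc
    return urlunsplit((parts.scheme, netloc, parts.path, urlencode(query), parts.fragment))


print(set_variant("https://www.google.com/search?q=unix+pipes&gbv=2", basic=True))
# https://www.google.com/search?q=unix+pipes&gbv=1
```

Passing `host="encrypted.google.com"` does the same thing as editing the host in the address bar.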

The as_q thing is the advanced search page version of the regular q parameter (seemingly the only docs for it?), and almost everything you can do with as_foo can be done with just q with some advanced search syntax. The as_foo parameters just make it easier to compose a search in scripts I guess.

The only thing that does not appear to be possible without a cookie is preventing the country redirect on www.google.com. There isn’t one on encrypted.google.com though.

By Mark at 2022-08-01 14:15:32:

You can use EFF's Privacy Badger addon, which demangles Google search results.

Written on 30 January 2018.

Last modified: Tue Jan 30 00:03:50 2018