Wandering Thoughts archives


Reverse engineering some settings for Google Search

As part of switching to uMatrix, I wound up destabilizing my Google cookies and thus more or less destroyed my old Google search settings (as I mentioned in passing in the original entry). This isn't the first time something like this has happened, but this time around I've managed to work out roughly what all of the magic processes are to manipulate them back to the way they used to be.

Google Search has at least three separate settings:

  • The straightforward search preferences for things like how many results are displayed on a single page. Google directly supports changing these through their search settings.

    (There are two versions of this settings page; see the end.)

  • Whether or not visiting google.com redirects you to your country Google domain (I believe the term for this is a 'country redirect'). You can turn off country redirects by visiting the magic URL www.google.com/ncr, which sets some opaque magic cookie value. I don't know how you turn country redirects back on, or if there's any way short of deleting your Google cookies.

    (As you can tell, I always turn country redirects off and leave them off. When I go to google.com, that's where I want to be; if I wanted google.ca, I would go there explicitly. I believe that Google still gives me somewhat location-based results, but at least I usually get to see the Google doodles.)

  • Whether you get what I'll call the non-JavaScript results page or the JS results page (the original entry links to a JPG screenshot of each). One difference between these two pages that isn't obvious in the screenshots is how the URLs are handled. In the non-JS version, Google rewrites all of the actual links to redirect through Google itself for click-tracking. In the JS version, the links are intact and the tracking is done through an onmousedown attribute.

    (The JS version will do various things not pictured here, such as put in images and other content. The non-JS version sticks to plain, straightforward search results.)

    To complicate the picture, some Firefox addons (at least NoScript) will apparently fix Google's link mangling in the non-JS version, although this fixing isn't necessarily instant or complete.

How the two versions of the results page switch back and forth is complicated. As far as I can tell, it goes like this:

  • If you have JavaScript turned off and interpret <noscript> tags, you get the non-JS version, with all of the URL links rewritten to go through Google with tracking information. Google also sets a magic cookie so that this state is persistent as long as you have JS off (even if you later stop interpreting <noscript> tags).

    (If you have the magic cookie, your search result is immediately the non-JS page. If you don't, you first wind up on the JS page but a <noscript> tag redirects you to the non-JS version.)

  • If you have JavaScript turned on for Google, you get the JS version (and Google tracks your clicks on the URLs through JS). Doing a search with JS on clears the magic cookie from above.

  • If you have JS off, don't interpret <noscript> tags, and don't have the magic cookie, you get the JS version (with a <noscript> tag that would redirect you to the non-JS version if you were to interpret it). Since it's the JS version, its URLs are not mangled (and your browser history works, so you can tell if you've already read a search result), but since you're not running JS, Google doesn't get to track your clicks.
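
The three cases above can be sketched as a small state machine. This is only a model of the behavior as observed in this entry, not anything Google documents; the names Browser, has_nojs_cookie, and search are made up for illustration, and the sketch assumes that having JS on always wins over the cookie.

```python
from dataclasses import dataclass


@dataclass
class Browser:
    js_enabled: bool
    interprets_noscript: bool
    # Hypothetical name for the opaque "magic cookie" described above.
    has_nojs_cookie: bool = False


def search(browser: Browser) -> tuple[str, str]:
    """Return (results page, click tracking) for one search.

    Also updates the cookie state the way the entry describes.
    """
    if browser.js_enabled:
        # Searching with JS on clears the cookie; clicks are tracked via JS.
        browser.has_nojs_cookie = False
        return ("js", "onmousedown")
    if browser.has_nojs_cookie or browser.interprets_noscript:
        # Either the cookie sends us straight to the non-JS page, or the
        # <noscript> redirect does and the cookie gets set along the way.
        browser.has_nojs_cookie = True
        return ("non-js", "redirect links")
    # JS off, <noscript> ignored, no cookie: intact links, no tracking.
    return ("js", "none")
```

Walking one browser through the sequence "NoScript-style setup, then stop interpreting <noscript>, then turn JS on once, then turn it off again" shows why the cookie matters: the second step still gets the non-JS page, but the last step lands on the JS page.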

I don't allow Google to run JavaScript (for all sorts of reasons). My NoScript setup interpreted <noscript> tags and also de-mangled the Google-mangled URLs in the non-JS version, which is the ideal outcome. My uMatrix setup doesn't interpret <noscript> tags and doesn't de-mangle the Google-mangled URLs if I get the non-JS page. The non-JS page fits within my normal browser window width; the JS page does not (even with JS off). However, the JS page does give me real URLs, so I can see whether I've visited them before, and if I scroll the window sideways all of the search content fits.

Until I did the experimentation for this entry, I didn't know how to switch my current uMatrix setup over to the non-JS version. Now that I do, I'm still going to stay with the JS version for now because of its better URLs, although I may get irritated with its various annoyances and change my mind.

(There's probably a GreaseMonkey user script to fix the URLs, but I've had memory leak problems with GreaseMonkey in the past so I'm not entirely enthused about the idea of re-enabling it. In theory I could probably also write a Firefox WebExtension addon to do this, since it's just modifying the href attribute of <a> elements that match a specific pattern.)
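
As a rough illustration of the de-mangling logic such a user script or addon would need, here is a minimal Python sketch. It assumes the mangled links have the form /url?q=<real URL>&... (with 'url' as an alternate parameter name), which is how these redirect links have commonly looked but is not something this entry verifies; the function name demangle and the host check are made up for illustration. A real addon would do the same thing in JavaScript against each <a> element's href.

```python
from urllib.parse import parse_qs, urlsplit


def demangle(href: str) -> str:
    """Return the real target of a Google redirect link, or href unchanged.

    Assumption: mangled links look like '/url?q=<target>&...', either
    relative or on a 'www.google.*' host.
    """
    parts = urlsplit(href)
    host = parts.hostname or ""
    if parts.path != "/url" or (host and not host.startswith("www.google.")):
        return href
    params = parse_qs(parts.query)  # also undoes percent-encoding
    for key in ("q", "url"):
        target = params.get(key, [""])[0]
        # Only accept absolute web URLs, so odd values are left alone.
        if target.startswith(("http://", "https://")):
            return target
    return href
```

Anything that doesn't match the assumed pattern is returned untouched, which seems like the safe failure mode if Google changes the link format.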

As a side note, the search settings page actually comes in two versions, a 'JavaScript' version and a non-JS one (by analogy to the search results pages). You get the non-JS version if and only if you have the magic cookie that forces you to the non-JS search page right away; having JS turned off is not enough by itself to get the non-JS settings page. This can be confusing and annoying, especially since the JS settings page doesn't work if you have JavaScript turned off.
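
The settings page rule is simple enough to write down as a tiny function, which makes the annoying case easy to see. As before, this is a sketch of the behavior described here; the names are hypothetical, and it assumes the non-JS settings page works regardless of whether JS is on.

```python
def settings_page(js_enabled: bool, has_nojs_cookie: bool) -> tuple[str, bool]:
    """Return (settings page version, whether it is usable)."""
    # Only the magic cookie selects the non-JS settings page.
    version = "non-js" if has_nojs_cookie else "js"
    # The JS settings page does not work with JavaScript turned off.
    usable = version == "non-js" or js_enabled
    return version, usable
```

With JS off and no cookie, settings_page(False, False) gives ("js", False): you are handed the JS settings page and can't use it.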

web/GoogleSearchSettings written at 00:03:50
