Wandering Thoughts archives

2015-04-30

I'm considering ways to mass-add URLs to Firefox's history database

I wrote yesterday about how I keep my browser history forever, because it represents the memory of what I've read. A corollary of this is that it bugs me if things I've read don't show up as visited URLs. For example, if all of the blog entries and so on here at Wandering Thoughts were to turn unvisited tomorrow, that'd make me twitch every time I read something here and saw a blue link that should instead be visited purple.

(One of the reasons for this is that links showing visited purple is a sign that they point to the right place. Under normal circumstances, if links on Wandering Thoughts suddenly go blue, something has probably broken. And when I'm drafting entries, a nominal link to an older entry that shows blue is a sign that I got the link wrong.)

Which brings me to the problem: Wandering Thoughts, and indeed this entire site, is in the process of moving from HTTP to HTTPS. The HTTP versions of all of the entries and so on are in my Firefox history database, but Firefox properly considers the HTTPS version of a URL to be a completely different URL and so not in the history. So, all of a sudden, all of my entries and links and so on are unvisited blue. At one level this is not a problem. After all, I know that I've read them all (I wrote them). In theory, I could leave everything here alone, then maybe re-visit links one by one as I use them in new entries or otherwise run across them. But the whole situation bugs me; by now, seeing all the links be purple is reassuring and the way things should be, while blue links here make me twitch.

Conceptually the fix is simple. All I have to do is get every HTTP URL for here out of my existing history database, mechanically turn the 'http:' into 'https:', and then add all of the new URLs to Firefox's history database. All of the other values, such as the last visited time, can be copied exactly from the HTTP version of the URL. The only problem is that as far as I know there is no tool or extension for doing this.
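The mechanical part of this is small enough to sketch in Python. This is only an illustration of the URL rewriting step, not something I've run against a real history database; the function names, the (url, last_visit) row shape, and the example.org host are all made up for the example.

```python
from urllib.parse import urlsplit

def https_twin(url, host):
    """Return the https: version of url if it is an http: URL on host, else None."""
    parts = urlsplit(url)
    if parts.scheme != "http" or parts.hostname != host:
        return None
    # Only the scheme changes; host, path, query and fragment are kept.
    return parts._replace(scheme="https").geturl()

def https_copies(rows, host):
    """Yield (https_url, last_visit) for every (url, last_visit) row on host.

    The row shape is a hypothetical simplification; a real history entry
    has more fields than this.
    """
    for url, last_visit in rows:
        twin = https_twin(url, host)
        if twin is not None:
            yield (twin, last_visit)

if __name__ == "__main__":
    sample = [
        ("http://example.org/blog/a", 10),
        ("http://elsewhere.test/b", 20),
    ]
    for row in https_copies(sample, "example.org"):
        print(row)
```

URLs that are already 'https:' or that belong to some other host are skipped, so only the site being moved is affected.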

(There are plenty of addons for removing history entries, which is of course exactly the opposite of what I want.)

These days, Firefox's history is stored in a SQLite database (places.sqlite in your profile directory). There are plenty of tools and packages to manipulate SQLite databases, which leaves me with merely the problem of figuring out what actually goes into a history entry in concrete detail (and then calculating everything that isn't obvious). So all of this is achievable, but on the other hand it's clearly going to be a bunch of work.

(While the Places database is documented, parts of this documentation are out of date. In particular, current Firefox places.sqlite has a unique guid field in the moz_places table.)
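To make the SQLite side concrete, here is a rough sketch using Python's standard sqlite3 module. It deliberately works on a toy table in an in-memory database, not on the real places.sqlite: the toy_places table, its handful of columns, and the way the guid is generated are all assumptions for illustration. The real moz_places table has more columns than this and per-visit data lives elsewhere in the database, which is exactly the part I'd still have to work out.

```python
import secrets
import sqlite3

def make_toy_db():
    """Build an in-memory stand-in for a history table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE toy_places (id INTEGER PRIMARY KEY, url TEXT UNIQUE, "
        "title TEXT, visit_count INTEGER, last_visit_date INTEGER, guid TEXT UNIQUE)"
    )
    db.executemany(
        "INSERT INTO toy_places (url, title, visit_count, last_visit_date, guid) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            ("http://example.org/blog/a", "Entry A", 3, 1430000000000000, "guid-a"),
            ("http://example.org/blog/b", "Entry B", 1, 1430000001000000, "guid-b"),
            ("http://elsewhere.test/x", "Other site", 7, 1430000002000000, "guid-c"),
        ],
    )
    db.commit()
    return db

def add_https_copies(db, prefix):
    """Add an https: copy of every row whose url starts with prefix.

    Returns how many rows were added; existing https: rows are left alone.
    """
    if not prefix.startswith("http://"):
        raise ValueError("prefix must be an http:// URL prefix")
    rows = db.execute(
        "SELECT url, title, visit_count, last_visit_date FROM toy_places "
        "WHERE substr(url, 1, ?) = ?",
        (len(prefix), prefix),
    ).fetchall()
    added = 0
    for url, title, visit_count, last_visit in rows:
        new_url = "https://" + url[len("http://"):]
        if db.execute("SELECT 1 FROM toy_places WHERE url = ?", (new_url,)).fetchone():
            continue
        # The guid only has to be unique here; this generator is a stand-in.
        guid = secrets.token_urlsafe(9)
        db.execute(
            "INSERT INTO toy_places (url, title, visit_count, last_visit_date, guid) "
            "VALUES (?, ?, ?, ?, ?)",
            (new_url, title, visit_count, last_visit, guid),
        )
        added += 1
    db.commit()
    return added

if __name__ == "__main__":
    db = make_toy_db()
    print(add_https_copies(db, "http://example.org/"))
```

Anything like this should only ever be pointed at a copy of the database, with Firefox not running.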

PS: The other obvious nefarious hack is to literally rewrite the URLs in all current history entries to be 'https:' instead of 'http:', possibly by dumping and then reloading the moz_places table. Assuming that you can change the URL scheme without invalidating any linkages in the database, this is simple. Unfortunately it has a brute force inelegance that makes me grumpy; it's clearly the expedient fix instead of the right one.
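For comparison, the brute force version is roughly a single UPDATE. Again this is a sketch against a made-up toy_places table in an in-memory database, not the real moz_places; it says nothing about whether other parts of the real database depend on the URL text, which is the assumption the whole hack rests on.

```python
import sqlite3

def make_toy_db():
    """Build an in-memory stand-in for a history table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE toy_places (id INTEGER PRIMARY KEY, url TEXT, visit_count INTEGER)"
    )
    db.executemany(
        "INSERT INTO toy_places (url, visit_count) VALUES (?, ?)",
        [
            ("http://example.org/a", 3),
            ("http://example.org/b", 1),
            ("http://elsewhere.test/x", 7),
        ],
    )
    db.commit()
    return db

def rewrite_to_https(db, prefix):
    """Rewrite every url starting with prefix from http: to https: in place.

    Returns the number of rows changed.
    """
    if not prefix.startswith("http://"):
        raise ValueError("prefix must be an http:// URL prefix")
    # 'http://' is 7 characters, so substr(url, 8) is everything after it.
    cur = db.execute(
        "UPDATE toy_places SET url = 'https://' || substr(url, 8) "
        "WHERE substr(url, 1, ?) = ?",
        (len(prefix), prefix),
    )
    db.commit()
    return cur.rowcount

if __name__ == "__main__":
    db = make_toy_db()
    print(rewrite_to_https(db, "http://example.org/"))
```

Unlike adding copies, this loses the record that the HTTP versions were ever visited, which is part of why it feels like the expedient fix.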

web/FirefoxAddHistoryDesire written at 23:36:51

Why I have a perpetual browser history

I've mentioned in passing that I keep my browser's history database basically forever, and I've also kind of mentioned that it drives me up the wall when web sites make visited links and unvisited links look the same. These two things are closely related.

Put simply, the visited versus unvisited distinction between links is a visible, visual representation of your current state of dealing with a (good) site. A visited link tells you 'yep, I've been there, no need to visit again'; an unvisited link tells you that you might want to go follow it. This representation of state is very important because otherwise we must fall back on our fallible, limited, and easily fooled human memories to try to keep track of what we've read and haven't read. This fallback is both error-prone and a cognitive load; mental effort you're spending to keep track of what you've read is mental effort you can't use on reading.

Of course this doesn't work on all sites (and doesn't work all the time even on 'good' sites). I'm sure you can come up with any number of sites and any number of ways that this breaks down, and so the visited versus unvisited state of a page is not important or useful information. But it works well enough on enough sites to be extremely useful in practice, at least for me.

And this is why I want my browser history to last forever. My browser history is the collected state representation of what I have and haven't read. It tracks things not just now, in my currently active browsing session as I work through something, but also back through time, because I don't necessarily forget things I've read long ago (but at the same time I don't necessarily remember them well enough to be absolutely confident that I've already read them). For that matter, I don't always get through big or deep sites in one go, so again the visited link history is a history of how far I've gotten in archives or reference articles or the like.

There is nothing else on the web that can give me this state recall, nothing else that serves to keep track of 'how far have I gotten' and 'have I already seen this'. The web without it is a much more spastic and hyperactive place. It's a relatively more hyperactive place if I only have a short-term state recall; I really do want mine to last basically forever.

(In fact for me anything without a read versus unread state indicator is an irritatingly spastic and hyperactive place. All sorts of things are vastly improved by having it, and lack of it causes me annoyance (and that example is on the web).)

web/BrowserHistoryForever written at 00:14:42


