Wandering Thoughts archives

2015-07-31

The future problem with Firefox's Electrolysis project

Firefox Electrolysis ('e10s' for short) is a project to push Firefox towards a multiprocess model like Chrome's. This is both a daunting amount of work and a praiseworthy goal with a number of benefits, but there is a problem lurking in its future: Firefox addons.

The direct problem is that any number of addons are not Electrolysis compatible for technical reasons. Firefox developers have partly worked around this with shims, but shims are an incomplete solution and can't make all addons work. Checking arewee10syet makes for depressing reading much of the time; a great many popular extensions are not working under Electrolysis (including NoScript, one of my critical extensions). It seems quite likely that a number of reasonably popular extensions will never be updated to be Electrolysis compatible, and so people will be faced with a choice between not getting Electrolysis and abandoning them (the likely choice here being 'don't go e10s').

(The popularity of an addon has no relationship with the attention and spare time of its developer(s). There are any number of popular addons that have basically been abandoned by their developers.)

The indirect problem is that at some point Mozilla is going to want to turn Electrolysis on by default in a released Firefox version. In a straightforward version of the switch, some number of reasonably popular extensions will partially or completely stop working. If people are lucky this will be obvious, so at least they know they have a different browser now; if people are unlucky, the extension will quietly stop doing whatever it does, which is bad if that is, say, 'protecting me from some sort of bad stuff'. There are various things Firefox could do here to avoid silent breakage, like not enabling Electrolysis unless all your addons are known to be e10s compatible or warning you about some addons perhaps breaking, but none of the options are particularly good ones.

(Well, they're not particularly good ones if Mozilla's goal is widespread Electrolysis adoption. Mozilla could take the safe conservative approach if they wanted to; I just don't think they will, based on past behavior.)

When this future comes to pass, knowledgeable people can go in and turn off Electrolysis in order to get a fully working browser back (at least one hopes). As for everyone else, well, I suspect we're going to see a lot of quietly or loudly upset people, and Firefox is going to leak some more browser share as well as see some more people turn off Firefox automatic updates (with the resulting damage to security).

FirefoxElectrolysisProblem written at 23:18:47

2015-07-18

Some data on how long it is between fetches of my Atom feed

Recently I became interested in a relatively simple question: on average, how much time passes between two fetches of the Atom feed for Wandering Thoughts? Today I want to give some preliminary answers to that. To make life simple, I'm looking only at the blog's main feed and I'm taking 24 hours of data over Friday (local time). Excluding feed fetch attempts that are blocked for some reason, I get the following numbers:

  • the straight average is one fetch every 12.9 seconds (with a standard deviation of 13.7 seconds).
  • the median is one fetch every 9 seconds.
  • the longest gap between two feed requests was 130 seconds.
  • 90% of the inter-request gaps were 31 seconds or less, 75% were 18 seconds or less, and 25% were 3 seconds or less.
  • 6% of the feed fetch requests came at the same time (to the second) as another request; the peak number of fetches in one second is four, which happened several times.
  • 7.5% came one second after the previous request (and this is the mode, the most common gap), 6% two seconds, 6% three seconds, and 5.5% four seconds. I'm going to stop there.

Of course averages are misleading; a thorough workup here would involve gnuplot and peering at charts (and also more than just 24 hours of data).
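
(If you want to reproduce this kind of summary, the computation is straightforward once you have the fetch times. Here is a rough Python sketch of what I mean; it assumes you've already extracted the fetch timestamps from your web server logs as Unix epoch seconds, which is the server-specific part I'm leaving out.)

    import statistics

    def gap_stats(timestamps):
        """Summarize the gaps between successive feed fetches,
        given fetch times as Unix epoch seconds."""
        times = sorted(timestamps)
        gaps = sorted(b - a for a, b in zip(times, times[1:]))

        def pct(p):
            # Nearest-rank percentile; crude but fine for a rough look.
            return gaps[min(len(gaps) - 1, p * len(gaps) // 100)]

        return {
            'average': statistics.mean(gaps),
            'stdev': statistics.pstdev(gaps),
            'median': statistics.median(gaps),
            'longest': gaps[-1],
            '90th percentile': pct(90),
            # Roughly the fraction of requests that came in the same
            # second as the previous request.
            'same-second %': 100.0 * gaps.count(0) / len(gaps),
        }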

This is an interesting question partly because every so often people accidentally publish a blog entry and then want to retract it. Retraction is difficult in the face of syndication feeds; once an entry has started to appear in people's syndication feed fetches, you can no longer just remove it. My numbers suggest strongly that even moderately popular blogs have very little time before this starts happening.

AtomFetchTimeGaps written at 03:18:04

2015-07-04

Googlebot and Feedfetcher are still aggressively grabbing syndication feeds

Somewhat more than a year ago I wrote about how I'd detected Googlebot aggressively crawling my syndication feeds, despite them being marked as 'stay away'. At the time I was contacted by someone from Google about this, and I forwarded them various information about it.

Well, you can probably guess what happened next: nothing. It is now more than a year later and Googlebot is still determinedly pounding away at fetching my syndication feed. It made 25 requests for it yesterday, all of which got 403s as a result of me blocking it back then; in fact, Googlebot is still trying on the order of 25 times a day despite getting 403s on all of its requests for this URL for literally more than a year.

(At least it seems to be down to only trying to fetch one feed URL.)

Also, because I was looking: back what is now more than a year and a half ago, I discovered that Google Feedfetcher was still fetching feeds, and as a result I blocked it. Well, that's still happening too. Based on the last 30 days or so, Google Feedfetcher is making anywhere between four and ten attempts a day, and yes, that's despite getting 403s for more than a year and a half. Apparently those don't really discourage Google's crawling activities if Google really wants your data.
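
(Counting this sort of thing is a quick bit of log grovelling. As a rough illustration, here's a Python sketch that tallies per-day 403 responses for a given URL and User-Agent substring; it assumes the standard Apache combined log format, and the URL and agent values in the example call are made-up placeholders.)

    import re
    from collections import Counter

    # The pieces of the Apache combined log format we care about:
    # the date, the requested URL, the status code, and the agent.
    LOGRE = re.compile(r'\S+ \S+ \S+ \[(\d+/\w+/\d+):[^\]]+\] '
                       r'"(?:GET|HEAD) (\S+)[^"]*" (\d+) \S+ '
                       r'"[^"]*" "([^"]*)"')

    def count_daily_403s(logfile, feed_url, agent_substr):
        """Count per-day 403s for feed_url from clients whose
        User-Agent contains agent_substr."""
        per_day = Counter()
        with open(logfile) as f:
            for line in f:
                m = LOGRE.match(line)
                if not m:
                    continue
                day, url, status, agent = m.groups()
                if (url == feed_url and status == '403'
                        and agent_substr in agent):
                    per_day[day] += 1
        return per_day

    # e.g. count_daily_403s('access.log', '/blog/?atom', 'Googlebot')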

I'd like to say that I'm surprised, but I'm not in the least. Google long ago stopped caring about being a good Internet citizen, regardless of what its propaganda may say. These days the only reason to tolerate it and its behavior is that you have no choice.

(As far as I can tell it remains the 800 pound gorilla of search traffic, although various things make it much harder for me to tell these days.)

Sidebar: The grumpy crazy idea of useless random content

If I were a really crazy person, it would be awfully tempting to divert Google's feed requests to something that fed them an endless or at least very large reply. It would probably want to be machine-generated valid Atom feed entries full of more or less random content. There are of course all sorts of tricks that could be played here, like embedding honeypot URLs on a special web server and seeing if Google shows up to crawl them.
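
(For illustration, here's a minimal Python sketch of such a generator. Everything in it, including the urn:junk IDs, is a made-up placeholder, and a real version would want to stream its output rather than build one big string.)

    import datetime
    import random
    import string

    def babble(nwords):
        """Produce nwords of lowercase gibberish."""
        def word():
            return ''.join(random.choice(string.ascii_lowercase)
                           for _ in range(random.randint(3, 10)))
        return ' '.join(word() for _ in range(nwords))

    def junk_atom_feed(nentries=50):
        """Generate a syntactically valid Atom feed of random junk."""
        now = datetime.datetime.utcnow().strftime('%Y-%m-%dT%H:%M:%SZ')
        entry = ('<entry><title>{t}</title><id>urn:junk:{i}</id>'
                 '<updated>{u}</updated><content>{c}</content></entry>')
        entries = ''.join(entry.format(t=babble(5),
                                       i=random.getrandbits(64),
                                       u=now, c=babble(200))
                          for _ in range(nentries))
        return ('<?xml version="1.0" encoding="utf-8"?>\n'
                '<feed xmlns="http://www.w3.org/2005/Atom">'
                '<title>{}</title><id>urn:junk:feed</id>'
                '<updated>{}</updated>'
                '<author><name>nobody</name></author>'
                '{}</feed>'.format(babble(3), now, entries))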

I don't care enough to do this, though. I have other fish to fry in my life, even if this stuff makes me very grumpy when I wind up looking at it.

GooglebotStillCrawlingFeeds written at 00:56:23


