Wandering Thoughts archives

2014-05-25

Firefox, DRM, and reality

Today I ran across someone retweeting yet another petition asking Firefox to reject DRM, or more specifically to reject EME. As this person doesn't usually seem to be given to supporting quixotic and crazy causes, the whole thing has pushed me over the edge about this issue. Apparently soft honesty is not working on people, so I'm going to try being blunt about the situation.

What I think most people who support this petition don't understand is that by asking Firefox to reject DRM and EME, they are asking Firefox to slit its own throat. At this point the all but certain result of Firefox rejecting EME is Firefox's browser share declining drastically, probably to the small single digits. One consequence of this would be Firefox losing its ability to influence the further evolution of the web (for exactly the same reason that almost no one cares what eg the Konqueror people think about such stuff).

Let's start with a basic fact: to most people, browsers are fungible commodities. Most people don't really care all that much about which specific browser they use provided that it can browse the web well. On the one hand this has been great for enabling alternate browsers like, well, Firefox; if you do the same job but better, people can easily switch and will do so (Chrome is an excellent example of this). On the other hand this means that most people will promptly switch to another browser if their current browser stops meeting their needs, especially when good alternatives are widely available.

There are four important browsers today: Firefox, Chrome, Safari, and Internet Explorer. Three out of the four of these have embraced EME more or less wholeheartedly and so serious EME-enabled alternatives to Firefox are available on every single major platform it runs on, including Linux (in the form of Chrome). On all platforms except Linux, an EME-enabled alternative is the platform's default browser, making it especially accessible.

A non-EME-enabled browser does not meet many people's needs. Many people care about watching video and listening to music, sometimes rather a lot (enough, sometimes, to pay for it). If their browser stops doing this, very few people will decide to stick with it for other reasons such as intellectual purity; most people will shrug and more or less immediately chuck it out in favor of some other browser that actually works.

The inevitable result of Firefox not supporting EME is that it will no longer be a browser that fully meets many people's needs. Almost all of these people will drop Firefox and switch to alternatives, which almost all of them already have conveniently at hand. Firefox's browser share will shrink to the people who don't care about EME'd content, the people who don't have an alternative, and the people who are willing to endure pain for the sake of either principles or Firefox's other attractions. I do not think that there are very many of these people.

(As a sign of how much people care about this sort of principle in the face of even a weak attraction to an arguably better browser, look at how many people moved from Firefox (genuine open source et al) to Chrome (controlled and driven by a large advertising company that is not your friend).)

The EME DRM battle was lost no later than when the other three browsers embraced EME. To demand that Firefox continue fighting it is to demand that Firefox conduct a suicide mission.

You are of course free to ask that Firefox immolate itself in the name of intellectual purity, although I don't think that this is wise or that you're going to be successful (thankfully). But please understand and be honest about what you're actually asking for.

FirefoxDRMReality written at 22:23:15

2014-05-18

What it would take to replace Firefox as my web browser

In the wake of Firefox's decision to support EME (cf) a number of people have been angrily advocating for switching away from Firefox to another browser. I think that this is wrong-headed (see here (via) for one cogent explanation of why), but let's set that aside for now and merely ask what it would take for something to replace Firefox as my web browser.

At one level the answer can be garnered from the Firefox extensions that I use. I don't just want 'a web browser'; I want a web browser with gestures and some equivalent of NoScript (practically speaking I also want the ability to edit text areas in a real editor). I also need a browser that can be remote controlled to open new URLs in new windows, because this is a big part of how I use Firefox in practice, but I think basically all of them will do that these days.
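
(As an aside, the remote control bit is easy to script. Here is a minimal sketch in Python; it assumes Firefox's -new-window command line flag, and some setups use -new-tab or a small wrapper script instead, so adjust to taste.)

    import subprocess

    def open_in_new_window(url):
        # Ask the already-running Firefox to open url in a new window.
        # -new-window is the flag assumed here; -new-tab also exists if
        # you would rather reuse an existing window.
        subprocess.call(["firefox", "-new-window", url])

    open_in_new_window("https://example.com/")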

But that isn't the full story. What I really require is a browser and a set of extensions that I can trust both now and in the future, and this is at least partly a cultural issue that goes deep. As I've found out the hard way, Chrome does not have the right culture; 'good' Chrome extensions that provided more or less the equivalent of my Firefox environment have gone bad. It's likely that any smaller open source browser would have a browser and extensions culture that was good, but it's not absolutely guaranteed, and I don't think that I can really trust a commercial browser these days (sorry, Opera, although I may be selling you short).

All of which brings me around to my strong impression that there are no real browser alternatives left. In some ways Unix users have it better than Windows and Mac users, in other ways worse; we lack highly capable native system browsers, but we have a number of free alternatives not beholden to various commercial interests, alternatives that might optimistically develop into something reasonably capable.

What this really points out is that developing an attractive, sophisticated modern browser is a lot of work, work that goes beyond the HTML rendering and associated issues (DOM, JavaScript, CSS, and so on). The days when browsers were just a thin wrapper around an HTML display widget are long gone, which means that you now need serious person-power to have a serious, competitive browser. And, well, there are not many places with that, and it's probably not very attractive to duplicate the work of, say, Firefox. So I'm not surprised that there aren't all that many options for me to pick from.

(If I were serious about evaluating Firefox alternatives I think I'd have to look at Konqueror (although it seems to lack a NoScript equivalent) and Opera. Gnome has a browser but I doubt it's going to appeal to a power user like me. There are other Unix browsers but my impression is that none of them are up to a competitive level.)

FirefoxReplacementThoughts written at 00:41:20

2014-05-04

How I set up my Firefox 29's UI

Firefox 29 came out recently and brought with it a new user interface, which a number of people had negative reactions to. I'm not one of them. Bucking my usual trend with Firefox UI redesigns, I've come around to liking this one more than Firefox's previous UIs. Before I describe how I have my UI set up, I have to make a confession: I didn't arrive at my current setup or feelings instantly and the new UI had to grow on me. I actually run what is basically Firefox Nightly (hand-built periodically from the latest trunk code), so for me the new UI dropped several months ago and I've had time to acclimatize and fiddle around.

My two priorities for my Firefox UI are that I want as much space as possible for content, especially vertical space, and that I mostly care about quick access to things I use frequently; infrequently used things can be somewhat awkward to get to. I also don't use bookmarks as I have a different solution to the problem.

Because a picture is worth a thousand words, I've put a picture of how my Firefox window looks here. My UI customizations that create this are:

  • Turn off displaying the menu bar. This is a less radical change than it sounds because you can get the menu bar back temporarily by tapping Alt.

    (That tapping Alt makes the Firefox menubar reappear is one of those irritating undiscoverable UI features. I wouldn't have found it if I hadn't accidentally hit Alt from time to time and then realized what the flickering in my Firefox window was.)

  • Strip out almost everything from the 'URL bar' area, because I want as much space as possible for the URL box itself. The S in a red circle icon (and the menu dropdown next to it) is from NoScript, one of my essential Firefox addons.

  • Use the *-bar customization feature to add my most commonly used menus back to the tab bar area over on the right. What I have there at the moment is History, Preferences, Character Encoding, and Web Developer Tools.

    (*-bar customization is part of the menu you get when you hit the right mouse button over empty space in eg the tab bar. You can also use this menu to turn back on the menu bar.)

I've moved away from having a status bar or addons bar at the bottom of my Firefox windows. It takes up extra space and it's simply not that useful any more. I kind of miss having a loading progress indicator, but on the modern web those have been misleading for years; in practice the thing I really care about is whether the page is fully loaded or not, and the per-tab throbber does that okay for me.

(Firefox 29 can also put many menus and indicators from addons in the tab bar area if you want, which helps reduce the need for a separate status bar.)

There are things I don't fully like about this UI (for example, I'd still prefer the stop/reload button to be a full button outside of the URL box), but for my tastes my current setup works decently. At first I disliked being forced to always have a tab bar even in single-tab windows, but losing the menu bar makes up for it and the result is actually more functional (eg it's now easier to move pages between being standalone windows and being in a tab in an existing window).

Firefox29Setup written at 01:10:12

2014-05-03

My Firefox 29 extensions and addons

It's been a few years since I last did a comprehensive inventory of my Firefox addons (here, then updated here and here due to memory issues), so it's about time for another go around. Besides, Firefox 29 has come out and upended the Firefox UI again. As before this is for my main browser instead of my testing browser, although the testing version has almost all of them too.

Safe browsing:

  • NoScript to disable JavaScript for almost everything. I browse with JS blocked and only enable it selectively on sites when I have to (and almost always temporarily). I consider this more an issue of safety than of performance; I simply don't trust most JavaScript from most sites to not do things that will make me unhappy. These days my strategy for dealing with most JavaScript based websites that I actually want to use is my Chrome incognito hack.

    (NoScript also takes care of blocking Flash, Java, and so on. I also have Firefox's own preferences set to 'ask before running' for them, which NoScript seems to usually override when I tell it to go ahead with Flash stuff. This doesn't worry me because the Firefox setting is a fallback precaution in case something sneaks past NoScript.)

  • CookieSafe 3.0.5, with the actual addon here. I browse through a filtering proxy and it blocks ordinary cookies, but it can't do anything about cookies I get over HTTPS or via JavaScript. I use CookieSafe to block those (there's some more explanation here). For me, CookieSafe 3.1a10 has an explosive interaction with NoScript that hangs Firefox in some sort of infinite JavaScript loop, so I am still on 3.0.5 aka the 2011-12-10 version of CookieSafe.

CookieSafe hasn't been updated recently and apparently still has issues. If I didn't browse behind a filtering proxy I'd probably switch to Self-Destructing Cookies, which I'm actually experimenting with in my testing browser. Honestly, one of the reasons I don't switch anyway is just the hassle of figuring out which of my current cookies I want to whitelist permanently.

I've experimented with Ghostery. It works and I like the idea, but I don't have much use for it in my main browser (which is already protected against basically everything Ghostery would block), and I don't completely trust the business model involved in its development, and thus the extension itself (I worry about the Chrome extensions problem). The EFF's Privacy Badger sounds nice but it's only just been announced and it's in alpha (and again I don't have much use for it in my main browser).

User interface:

  • FireGestures. I believe the only customizations I've made have been to change what gestures map to what (for reasons described here). It's worked great and has been free of both trouble and memory bloat. I really like how it can export and import your configured gestures, making it very easy to have the same set of gestures on all of the copies of Firefox that I wind up using (home, office desktop, office laptop, sometimes Firefox on servers).

Improving my life:

  • It's All Text! handily deals with how browsers make bad editors. The more I have it available the more I use it (and the longer comments and so on I wind up leaving, because I can actually edit them sensibly; this may not be a plus, all things considered).

  • Open in Browser. You know all of those annoying websites that insist that patches are not plain text that the browser can show or are sure that you want to download that PDF instead of viewing it in the browser? This fixes them. It's so discreet an addition that you may even forget that you have it installed (this happened to me once and was vaguely embarrassing).

I view PDFs in the browser with Mozilla's PDF Viewer, which I believe is now normally packaged with Firefox.

Miscellaneous:

  • CipherFox gives me access to more information about TLS connections. It's not perfect (partly because Firefox plain doesn't make certain information available to extensions) but I'll take what I can get here. The one thing I really miss is simple information about whether the connection has perfect forward secrecy.

I've experimented with a number of the SSL/TLS certificate monitoring extensions like Certificate Patrol and Perspectives. None of them worked well in my environment with the amount of work I was willing to put into tending them.
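
(As a side note on the forward secrecy question, you can answer it outside the browser easily enough. Here is a rough sketch using Python's ssl module, with example.com standing in for whatever site you care about; cipher names containing ECDHE or DHE are the forward secrecy ones.)

    import socket, ssl

    def connection_cipher(host, port=443):
        # Returns (cipher name, protocol version, secret bits) for the
        # TLS connection negotiated with the server.
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port)) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return tls.cipher()

    print(connection_cipher("example.com"))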

This set of extensions is stable and doesn't lead to memory bloat. I can and do leave my main Firefox running for weeks without any problems (on my home machine I started the current instance on April 17th, when I last rebooted). In fact I should probably restart more frequently than I do; I have a couple of pending extension updates at the moment that have been patiently waiting for me to get around to quitting and restarting.

Although I would still like to be able to use GreaseMonkey and Stylish, they still seem to cause memory bloat problems for me as before. I haven't tested things on Firefox 29, but I did retest with a previous version of Firefox a couple of months ago and still saw my old problems. Maybe someday.

Firefox29Extensions written at 00:47:36

2014-04-29

Static sites are stable sites

The usually cited advantage of generating static HTML for your website is the performance you easily get. But there's another, slightly less obvious advantage of a static site: it's easy to preserve and to maintain in operation with basically no attention and effort.

Dynamic sites need code. That means you need a web host where you can run your kind of code and your programs, and it also means that you need to do all of the ongoing work to keep your site's code running. If it's code from other people you'll probably need to apply security updates and other changes from upstream (and perhaps do wholesale updates). If it's your code you at least need to make sure that it works with new versions of programming language X or package Y or the like. If you walk away from a dynamic site it's likely to wind up with security holes and may fall over entirely.

A static site has none of this. Pretty much anyone and everyone can host static files and they're basically never going to stop working. Your current HTML may not have the latest hotness in five or ten years but it's almost certain to render in browsers, probably decently (browsers are pretty good at backwards compatibility, especially if you avoid weird layout tricks). Keeping a static site up requires essentially no effort, it's feasible to walk away from it for years at a time (or even forever), and anyone can keep your site up. Migrating from host to host is also essentially a zero effort thing if you ever need to do that; just copy the files.

(In fact I have a static site that I haven't touched in more than a decade (by now it's basically a historical artifact) and apart from massive link-rot in external links I believe that everything still works fine. If I tried that with a dynamic site of any complexity I'm pretty sure I'd have an ugly mess on my hands. Just thinking about the shifts in language versions over the time since I last updated the site is scary.)

(This is of course related to my language longevity issue, in that with a static site you don't need to worry about that (among other issues).)

StaticHTMLStability written at 00:17:56

2014-04-09

Pragmatic reactions to a possible SSL private key compromise

In light of the fact that the OpenSSL 'heartbleed' issue may have resulted in someone getting a copy of your private keys, there are at least three possible reactions that people and organizations can take:

  • Do an explicit certificate revocation through your SSL CA and get a new certificate, paying whatever extra certificate revocation cost the CA requires for this (some do it for free, some normally charge extra).

  • Simply get new SSL certificates from whatever certificate vendor you prefer or can deal with and switch to them. Don't bother to explicitly revoke your old keys.

  • Don't revoke or replace SSL keys at all, based on an assessment that the actual risk that your keys were compromised is very low.

These are listed in declining order of theoretical goodness and also possibly declining order of cost.

Obviously the completely cautious approach is to assume that your private keys have been compromised and also that you should explicitly revoke them so that people might be protected from an attacker trying man in the middle attacks with your old certificates and private keys (if revocation actually works this time). The pragmatic issue is that this course of action probably costs the most money (if it doesn't, well, then there's no problem). If your organization has a lot riding on the security of your SSL certificates (in terms of money or other things) then this extra expense is easy to justify, and in many places the actual cost is small or trivial compared to other budget items.

But, as they say, there are places where this is not so true, where the extra cost of certificate revocations will hurt to some degree or require a fight to get approved. Given that certificate revocation may not actually do much in practice, there is a real question of whether you're actually getting anything worthwhile for your money (especially since you're probably doing this as merely a precaution against potential key compromise). If certificate revocation is an almost certainly pointless expense that's going to hurt, the pragmatics push people away from paying for it and towards one of the other two alternatives.

(If you want more depressing reading on browser revocation checking, see Adam Langley (via).)

Getting new certificates is the intermediate caution option (especially if you believe that certificate revocation is ineffective in practice), since it closes off future risks that you can actually do something about yourself. But it still probably costs you some money (how much money depends on how many certificates you have or need).

Doing nothing with your SSL keys is the cheapest and easiest approach and is therefore very attractive for people on a budget, and there are a number of arguments towards a low risk assessment (or at least away from a high one). People will say that this position is obviously stupid, which is itself obviously stupid; all security is a question of risk versus cost and thus requires an assessment of both risk and cost. If people feel that the pragmatic risk is low (and at this point we do not have evidence that it isn't for a random SSL site), or if they cannot convince decision makers that it is not low and the cost is perceived as high, well, there you go. Regardless of what you think, the resulting decision is rational.

(Note that there is at least one Certificate Authority that offers SSL certificates for free but normally charges a not insignificant cost for revoking and reissuing certificates, which can swing the various costs involved. When certificates are free it's easy to wind up with a lot of them to either revoke or replace.)

In fact, as a late-breaking update as I write this, Neel Mehta (the person who found the bug) has said that private key exposure is unlikely, although of course unlikely is nowhere near the same thing as 'impossible'. See also Thomas Ptacek's followup comment.
Update: But see Tomas Rzepka's success report on FreeBSD for bad news.

Update April 12: It's now clear from the results of the CloudFlare challenge and other testing by people that SSL private keys can definitely be extracted from servers that are vulnerable to Heartbleed.

My prediction is that pragmatics are going to push quite a lot of people towards at least the second option and probably the third. Sure, if revoking and reissuing certificates is free a lot of people will take advantage of it (assuming that the message reaches them, which I would not count on), but if it costs money there will be a lot of pragmatic pressure towards cheap options.

(Remember the real purpose of SSL certificates.)

Sidebar: Paths to high cost perceptions

Some people are busy saying that the cost of new SSL certificates is low (or sometimes free), so why not get new ones? There are at least three answers:

  • The use of SSL is for a hobby thing or personal project and the person involved doesn't feel like spending any more money on it than they already have or are.

  • There are a significant number of SSL certificates involved, for example for semi-internal hosts, and there's no clear justification for replacing only a few of their keys (except 'to save money', and if that's the justification you save even more money by not replacing any of them).

  • The people who must authorize the money will be called on to defend the expense in front of higher powers or to prioritize it against other costs in a fixed budget or both.

These answers can combine with each other.

SSLPragmaticKeyCompReactions written at 00:59:49

2014-04-07

Giving in: pragmatic If-Modified-Since handling for Tiny Tiny RSS

I wrote yesterday about how Tiny Tiny RSS drastically mishandles generating If-Modified-Since headers for conditional GETs, but I didn't say anything about what my response to it is. DWiki insists on strict equality checking between If-Modified-Since and the Last-Modified timestamp (for good reasons), so Tiny Tiny RSS was basically doing unconditional GETs all the time.

I could have left the situation like that, and I actually considered it. Given the conditional GET irony I was never saving any CPU time on successful conditional GETs, only bandwidth, and I'm not particularly bandwidth constrained (either here or potentially elsewhere; 'small' bandwidth allocations on VPSes seem to be in the multiple TBs a month range by now). On the other hand, these requests were using up quite a lot of bandwidth because my feeds are big and Tiny Tiny RSS is quite popular, and that unnecessary bandwidth usage irritated me.

(Most of the bandwidth that Wandering Thoughts normally uses is in feed requests, eg today 87% of the bandwidth was for feeds.)

So I decided to give in and be pragmatic. Tiny Tiny RSS expects you to be doing timestamp comparisons for If-Modified-Since, so I added a very special hack that does just that if and only if the user agent claims to be some version of Tiny Tiny RSS (and various other conditions apply, such as no If-None-Match header being supplied). Looking at my logs this appears to have roughly halved the bandwidth usage for serving feeds, so I'm calling it worth it at least for now.
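
(For the curious, the shape of the hack looks roughly like the following sketch. This is an illustration of the idea rather than DWiki's actual code, and it assumes that Tiny Tiny RSS identifies itself with 'Tiny Tiny RSS' somewhere in its User-Agent header.)

    from email.utils import parsedate_to_datetime

    def ims_matches(environ, last_modified):
        # Does this request's If-Modified-Since 'match' our Last-Modified?
        ims = environ.get("HTTP_IF_MODIFIED_SINCE")
        if not ims:
            return False
        agent = environ.get("HTTP_USER_AGENT", "")
        # The special case: Tiny Tiny RSS expects a timestamp comparison,
        # so give it one, but only if it didn't also send If-None-Match.
        if "Tiny Tiny RSS" in agent and "HTTP_IF_NONE_MATCH" not in environ:
            try:
                return (parsedate_to_datetime(ims) >=
                        parsedate_to_datetime(last_modified))
            except (TypeError, ValueError):
                return False
        # Everyone else gets the strict equality check.
        return ims == last_modified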

I don't like putting hacks like this into my code (and it doesn't fully solve Tiny Tiny RSS's problems with over-fetching feeds either), but I'm probably going to keep it. The modern web is a world full of pragmatic tradeoffs and is notably lacking in high-minded purity of implementation.

MyIfModifiedSinceHack written at 01:06:39

2014-04-06

How not to generate If-Modified-Since headers for conditional GETs

Recently I looked through my syndication feed stats (as I periodically do) and noticed that the Tiny Tiny RSS program was both responsible for quite a lot of feed fetching and also didn't seem to ever be successfully doing conditional GETs. Most things in this situation aren't even attempting conditional GETs, but investigation showed that Tiny Tiny RSS was consistently sending an If-Modified-Since header with times that were generally just a bit after the actual Last-Modified timestamp of the syndication feed. For good reasons I require strict equality of If-Modified-Since values, so this ensured that Tiny Tiny RSS never made a successful conditional GET.

Since I was curious, I got a copy of the current Tiny Tiny RSS code and dug into it to see where this weird If-Modified-Since value was coming from and if there was anything I could do about it. The answer was worse than I was expecting; it turns out that the I-M-S timestamp that Tiny Tiny RSS sends has absolutely nothing to do with the Last-Modified value that I sent it. Where it comes from is that whenever Tiny Tiny RSS adds a new entry from a feed to its database it records the (local) time at which it did this; the most recent such entry timestamp then becomes the If-Modified-Since value that Tiny Tiny RSS sends during feed requests.

(You can see this in update_rss_feed in include/rssfuncs.php in the TT RSS source. Technically the time recorded for new entries is when TT RSS started processing the updated feed, not the moment it added the database record for a new entry.)

This is an absolutely terrible scheme, almost as bad as simply generating random timestamps. There are a cascade of things that can go wrong with it:

  • It implicitly assumes that the clocks on the server and the client are in sync, since If-Modified-Since must be in the server's time yet the timestamp is generated from client time.

  • Tiny Tiny RSS loses if a feed publishes a new entry, TT RSS pulls the feed, and then the feed publishes a second entry before TT RSS finishes processing the first new entry. TT RSS's 'entry added' timestamp and thus the If-Modified-Since timestamp will be after the revised feed's date, so the server will 304 further requests. TT RSS will only pick up the second entry when a third entry is published or the feed is otherwise modified so that its Last-Modified date moves forward enough.

  • If the feed deletes or modifies an entry and properly updates its overall Last-Modified timestamp as a result of this, Tiny Tiny RSS will issue what are effectively unconditional GETs until the feed publishes a completely new entry (since the last time that TT RSS saw a new entry will be before the feed's new Last-Modified time).

There are probably other flaws that I'm not thinking of.

(I don't think it's a specification violation to send an If-Modified-Since header if you never got a Last-Modified header, but if it is that's another flaw in this scheme, since Tiny Tiny RSS will totally do that.)

This scheme's sole virtue is that on a server which uses timestamp comparisons for If-Modified-Since (instead of equality checks) it will sometimes succeed in getting 304 Not Modified responses. Some of these responses will even be correct and when they aren't really correct, it's not the server's fault.
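
(For contrast, here is a rough sketch of what a well-behaved feed fetcher does: treat Last-Modified and ETag as opaque strings from the server and echo them back verbatim on the next poll. The function is illustrative rather than code from any particular feed reader.)

    import urllib.request, urllib.error

    def fetch_feed(url, last_modified=None, etag=None):
        # Conditionally fetch a feed, echoing back the server's own
        # validators instead of synthesizing our own timestamps.
        req = urllib.request.Request(url)
        if last_modified:
            req.add_header("If-Modified-Since", last_modified)
        if etag:
            req.add_header("If-None-Match", etag)
        try:
            resp = urllib.request.urlopen(req)
        except urllib.error.HTTPError as e:
            if e.code == 304:
                return None, last_modified, etag   # not modified
            raise
        # Store these opaque strings as-is for the next poll.
        return (resp.read(), resp.headers.get("Last-Modified"),
                resp.headers.get("ETag"))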

IfModifiedSinceHowNot written at 02:19:46

2014-03-23

Differences in URL and site layout between static and dynamic websites

One of the big but subtle differences between a statically rendered site and a dynamically rendered one is simply how you design the URL structure for both of them. One example here is dynamically rendered versus statically rendered blogs.

Broadly speaking, in a statically rendered site you want to have a minimum number of URLs and for each chunk of core content to appear in a relatively minimal number of places, because you have to pre-generate every URL. The more you let content and URLs propagate, the more pages you have to re-render any time you change something (or simply in general), even if they contain mostly redundant information or may never be requested or both. This is going to drive you towards very simple site layouts; in a blog you might have only the individual entries, a front page with recent entries, and then a relatively simple scheme for archives.

In a dynamically rendered site more URLs are almost free and so you often casually grow URLs and even entire URL hierarchies that offer alternative ways of accessing your core content. After all, often all you need to procedurally generate an entire new URL hierarchy is a single chunk of parameterized code (and 'pagination' of large results is often provided for free by your framework, adding more URLs). Provided that this generation is reasonably efficient you might as well create as many ways of accessing your core content as you can think of. On a blog you might support looking at things by any combination of date, category, tag, author, and so on. All you need is some dispatch rules and some lookup filtering.
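
(A rough sketch of what 'dispatch rules plus lookup filtering' means in practice; the route patterns and entry fields here are made up for illustration and aren't taken from any particular blog engine.)

    import re

    ROUTES = [
        (re.compile(r"^/(\d{4})/(\d{2})/$"), "by_month"),
        (re.compile(r"^/tag/([\w-]+)/$"),    "by_tag"),
        (re.compile(r"^/author/([\w-]+)/$"), "by_author"),
    ]

    def dispatch(path, entries):
        # One chunk of parameterized code backs several URL hierarchies.
        for pattern, kind in ROUTES:
            m = pattern.match(path)
            if not m:
                continue
            if kind == "by_month":
                year, month = int(m.group(1)), int(m.group(2))
                return [e for e in entries
                        if (e["year"], e["month"]) == (year, month)]
            if kind == "by_tag":
                return [e for e in entries if m.group(1) in e["tags"]]
            if kind == "by_author":
                return [e for e in entries if e["author"] == m.group(1)]
        return None   # no route matched; a 404 in practice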

(You need to create the same core content in both static and dynamic sites; the difference is how many URLs it is visible under.)

The corollary of this is that you may not have a very happy time if you try to go from a dynamic site to a static site while keeping more or less the same URL structure. In a shift like this you probably want to rethink how things are indexed, which basically means rethinking the overall site design and URL structure.

(The corollary to the corollary is that if you're not sure whether you're going to wind up generating things statically or dynamically you should start out by designing your site as a static site, with a URL layout that works for that. As a bonus you'll likely get a simpler, more focused URL structure.)

StaticVsDynamicSiteLayout written at 02:13:27

2014-03-10

The problem of conditional GET and caches for dynamic websites

For reasons beyond the margins of this entry, Aristotle Pagaltzis recently noticed an oddity with syndication feeds for this blog. To paraphrase his message, he made an initial plain feed request with no If-None-Match header, got back something with an ETag header, redid the request as a conditional GET with the same tag in If-None-Match, and got back a different result with a different ETag. On the surface this sounds like my caching is broken, but what is really going on is that the traditional irony of conditional GET for dynamic sites is interacting with the desire to reduce load through caching.

The dynamic site conditional GET problem is that in many dynamic environments you need to more or less build the entire page in order to determine its ETag and Last-Modified information. If you want to have a full page cache to reduce your load under some circumstances and you don't have explicit cache invalidation (which is very hard in a file based engine), you don't necessarily have fully accurate ETag values; reducing load implies relying on the ETag cached in the page cache even though the actual page and its ETag may have changed since then. If you serve an old cached version of a page, expire it, and then regenerate it, the newly generated version may well be different. This is basically the traditional conflict between a desire for more cache hits and a desire for absolutely current information.

(You can try to track dependency information in the page cache and revalidate it before you use a cache entry and its ETag, but the general problem there is that the more you revalidate, the slower a cache hit is. This is especially acute in a file-based engine because the validators are harder to compute and often less efficient.)

However this is not a full explanation of what Aristotle Pagaltzis saw (not unless the cache entry for the feed expired between the two requests, which it probably didn't). What is also going on is that DWiki is doing some special hacks in certain circumstances in order to reduce the impact of generating syndication feeds. This is relatively important here because feeds are requested quite often and they're one of the most expensive things to generate (partly because I set the number of entries in feeds quite high).

What I found when I started looking at my conditional GET logs at some point was that I was getting a significant number of requests that were not conditional GETs, ie they lacked both an If-None-Match and an If-Modified-Since header. Overcome by grumpiness, I decided that if these people could not be bothered to do conditional GET I was not going to go out of my way to (re)generate current content for them, so what my DWiki setup does is serve them syndication feeds from the page cache for much longer than it does for people who make conditional GETs. This means that if you do what Aristotle did, your first request may get served from an old cache entry but then your second one is recomputed from scratch (now that it's a proper conditional GET). Under the right circumstances this will result in a changed ETag.

(It looks like roughly one seventh of my successful syndication feed requests are being sent without conditional GET at the moment. This doesn't count things like Googlebot that are getting their requests refused outright.)
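
(The shape of that hack, as a rough sketch; the cache lifetimes and the generate_feed function here are placeholders for illustration, not DWiki's actual code or numbers.)

    import time

    TTL_CONDITIONAL = 5 * 60       # how long a cached feed is considered fresh
    TTL_UNCONDITIONAL = 60 * 60    # non-conditional requesters get staler copies

    feed_cache = {}   # url -> (body, etag, generated_at)

    def serve_feed(url, environ):
        is_conditional = ("HTTP_IF_NONE_MATCH" in environ or
                          "HTTP_IF_MODIFIED_SINCE" in environ)
        ttl = TTL_CONDITIONAL if is_conditional else TTL_UNCONDITIONAL
        cached = feed_cache.get(url)
        if cached and time.time() - cached[2] < ttl:
            body, etag = cached[0], cached[1]    # possibly stale, ETag included
        else:
            body, etag = generate_feed(url)      # hypothetical expensive renderer
            feed_cache[url] = (body, etag, time.time())
        if environ.get("HTTP_IF_NONE_MATCH") == etag:
            return 304, None, etag
        return 200, body, etag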

Conceptually this is tilting the balance between cache hits and avoiding staleness in the direction of more cache hits under some circumstances. I don't think there's anything wrong with this as long as you're doing it deliberately and with your eyes open (and ideally based on numbers).

With that said, now that I've had my attention drawn to this I'm probably going to rethink how I want to handle caching for various sorts of syndication feed requests. My initial syndication feed caching was set up rather a long time ago and there have been several generations of overall cache improvements since then. It's quite likely that the relative cost of generating syndication feeds has shifted in favour of caching them less and generating them more often.

(One of the things that has happened on Wandering Thoughts is that syndication feeds are requested so often that they're almost always in the page cache. I actually routinely flush them from cache by hand any time I publish or revise an entry, which is probably a warning sign I should have paid attention to some time ago.)

ConditionalGETAndCaching written at 23:08:43

