You don't need to bake your site to static files to be fast

April 11, 2011

Recently (for my value of recently) a bunch of people have been pushing rendering your website to static files as the way to make it stand up to surprise load; for example, Tim Bray's More on Baking (which has links to others). I disagree with this approach, because it's not necessary and it has some significant downsides.

You don't need static files to go fast when you're hit with load; you just need software that doesn't suck. A little caching helps a lot and is generally very easy to add to decent software, and honestly these days any blog or framework with pretensions of quality should have some sort of option for this. As I've discussed before, the load surges from popular links aren't the same as being heavily loaded in general, and thus they can be handled with much simpler techniques.
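To illustrate just how little is needed, here is a minimal sketch (in Python, and not any particular blog engine's actual code) of the kind of crude time-based cache I mean: serve a cached rendering of a page if it's less than a few seconds old, otherwise render it and cache the result. The serve() and render() names and the TTL value here are my own illustrative choices. Under a load surge, almost every request hits the cache, so the expensive rendering happens at most once every few seconds per URL no matter how hard you're being hammered.

    import time

    CACHE_TTL = 5.0   # seconds; even a tiny TTL collapses a surge
    _cache = {}       # url -> (rendered_page, timestamp)

    def serve(url, render):
        """Return the page for url, calling render() only on a cache miss."""
        now = time.time()
        hit = _cache.get(url)
        if hit is not None and now - hit[1] < CACHE_TTL:
            return hit[0]
        page = render(url)
        _cache[url] = (page, now)
        return page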

Since everyone likes anecdotal evidence, I will point to DWiki (the software behind this blog) as an existence proof of my thesis. DWiki is not what you could call small and efficient, and it still manages to hold up against load with only a relatively crude cache. It's not running on a big dedicated server, either; this machine is relatively modest and this blog shares it with lots of other things. I've never been linked to by any of the big traffic sources, but I have been pounded by spambots and the campus search engine without anyone really noticing.

The big downside of static rendering is the problem that Tim Bray glosses over: cache invalidation. Your static rendering is a cache of your real website, so when your real website changes (for example, someone leaves a comment or you modify an entry) you need to invalidate all of the static renderings for all of the URLs where the updated content appears. Tim Bray makes this sound easy because he has cleverly arranged to not have anything that he needs to do cache invalidation on, but he has done so by being aggressively minimalistic (for example, he doesn't really do tagging or categories). This is, to put it one way, very unusual. Most blog software that you want to use is all about having multiple views of the same basic information; you have the entry page and the main page and the category or tag pages and various archival views, and you may have syndication feeds for some or all of them. All of this multiplies the number of URLs involved in your site quite a bit.
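To make the bookkeeping concrete, here is a hedged sketch of what invalidation requires once you have multiple views (again, illustrative names, not any real engine's API): you have to track every baked URL that includes an entry's content, and rebake all of them whenever that entry changes.

    from collections import defaultdict

    # entry -> set of baked URLs that include its content
    appears_in = defaultdict(set)

    def record_render(url, entries_shown):
        """Note that this baked URL includes these entries' content."""
        for entry in entries_shown:
            appears_in[entry].add(url)

    def invalidate(entry, rebake):
        """An entry changed (edit, new comment): rebake every URL showing it."""
        for url in appears_in.pop(entry, ()):
            rebake(url)

One new comment on an entry that appears on its own page, the front page, two category pages, and a couple of syndication feeds forces a half dozen rebakes, and you have to get this tracking right everywhere or you serve stale pages.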

(This URL multiplication also increases the cost of baking your site, of course. If you have a heavily baked site and more than a modest amount of content, you probably aren't going to have many alternate views of your content, or at least not very many alternate views of content that you expect to change very often. For example, you might make all of your archive and category pages just have the titles of entries so that you don't have to re-render them if you modify an entry.)
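(The titles-only trick above amounts to tracking dependencies at the field level instead of the entry level, so that editing an entry's body doesn't invalidate pages that only show its title. A small sketch, with made-up URLs and field names:

    depends_on = {
        "/2011/04/entry.html":   {("entry", "title"), ("entry", "body")},
        "/archive/2011/04.html": {("entry", "title")},  # titles only
    }

    def stale_pages(changed):
        """Yield baked URLs whose dependencies intersect the changed fields."""
        for url, deps in depends_on.items():
            if deps & changed:
                yield url

    # Editing only the body leaves the archive page alone:
    print(list(stale_pages({("entry", "body")})))  # ['/2011/04/entry.html']

This buys you cheaper rebakes at the cost of less informative archive pages, which is the tradeoff being made.)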

Ultimately it is the usual tradeoff; baked sites run faster at the cost of more work for you. I think that this is a bad tradeoff for most people, since most people do not have heavily loaded sites and an occasional load surge is quite easy to deal with (provided that you have software that doesn't suck).

PS: possibly I am overly optimistic about the quality of common blogging and framework software.
