You don't need to bake your site to static files to be fast

April 11, 2011

Recently (for my value of recently) there have been a bunch of people pushing rendering your website to static files as the way to make it able to stand up to surprise load; for example, Tim Bray's More on Baking (which has links to others). I disagree with this approach, because it's not necessary and it has some significant downsides.

You don't need static files to go fast when you're hit with load; you just need software that doesn't suck. A little caching helps a lot and is generally very easy to add to decent software, and honestly these days any blog or framework with pretensions of quality should have some sort of option for this. As I've discussed before, the load surges from popular links aren't the same as being heavily loaded in general, and thus they can be dealt with using much simpler techniques.
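To illustrate the idea (this is a sketch, not DWiki's actual code): a crude whole-page cache that serves a stored rendering for a few seconds is enough to turn a load surge of thousands of hits into a handful of actual renders. Here `render_page` stands in for however your software generates a page:

```python
import time

# Crude whole-page cache: serve a stored rendering if it is younger
# than TTL seconds, otherwise re-render the page and store it.
TTL = 5.0
_cache = {}  # url -> (timestamp, rendered_html)

def serve(url, render_page):
    now = time.time()
    entry = _cache.get(url)
    if entry is not None and now - entry[0] < TTL:
        # A recent rendering exists; hand it back without doing any work.
        return entry[1]
    html = render_page(url)
    _cache[url] = (now, html)
    return html
```

During a surge, at most one real render per URL happens every TTL seconds no matter how hard you're being hit, and the worst staleness a visitor can see is TTL seconds.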

Since everyone likes anecdotal evidence, I will point to DWiki (the software behind this blog) as an existence proof of my thesis. DWiki is not what you could call small and efficient, and it still manages to hold up against load with only a relatively crude cache. It's not running on a big dedicated server, either; this machine is relatively modest and this blog shares it with lots of other things. I've never been linked to by any of the big traffic sources, but I have been pounded by spambots and the campus search engine without anyone really noticing.

The big downside of static rendering is the problem that Tim Bray glosses over: cache invalidation. Your static rendering is a cache of your real website, so when your real website changes (for example, someone leaves a comment or you modify an entry) you need to invalidate all of the static renderings for all of the URLs where the updated content appears. Tim Bray makes this sound easy because he has cleverly arranged to not have anything that he needs to do cache invalidation on, but he has done so by being aggressively minimalistic (for example, he doesn't really do tagging or categories). This is, to put it one way, very unusual. Most blog software that you want to use is all about having multiple views of the same basic information; you have the entry page and the main page and the category or tag pages and various archival views, and you may have syndication feeds for some or all of them. All of this multiplies the number of URLs involved in your site quite a bit.

(This URL multiplication also increases the cost of baking your site, of course. If you have a heavily baked site and more than a modest amount of content, you probably aren't going to have many alternate views of your content, or at least not very many alternate views of content that you expect to change very often. For example, you might make all of your archive and category pages just have the titles of entries so that you don't have to re-render them if you modify an entry.)
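A hypothetical sketch of why this multiplication makes invalidation painful (the URL scheme here is purely illustrative, not any particular blog's): one changed entry forces you to discard the renderings of every view it appears in.

```python
def dependent_urls(entry_path, tags):
    """Return every URL whose cached or baked rendering must be
    discarded when the given entry changes. Illustrative only."""
    urls = ["/" + entry_path,       # the entry's own page
            "/",                    # the front page
            "/atom/index"]          # the main syndication feed
    for tag in tags:
        urls.append("/tag/" + tag)          # each tag/category page
        urls.append("/tag/" + tag + "/atom")  # and its feed, if any
    return urls

def invalidate(cache, entry_path, tags):
    # Drop every stale rendering; with a baked site, each of these
    # would instead be a file to re-render on disk.
    for url in dependent_urls(entry_path, tags):
        cache.pop(url, None)
```

With a dynamic site and a short-TTL cache you can skip this bookkeeping entirely and just let stale renderings expire; with a baked site, every one of these URLs is a file you must track and rebuild.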

Ultimately it is the usual tradeoff; baked sites run faster at the cost of more work for you. I think that this is a bad tradeoff for most people, since most people do not have heavily loaded sites and an occasional load surge is quite easy to deal with (provided that you have software that doesn't suck).

PS: possibly I am overly optimistic about the quality of common blogging and framework software.

Comments on this page:

From at 2011-04-12 05:21:21:

And god forbid you change your site template/layout, you'll have to rebuild everything. :)

From at 2011-04-12 08:00:52:

You don't need to edit everything for layout changes if you use SSI. Even if you have duplicate stuff in each page it should be easy to sed or whatever. Personally I think I've saved enormous amounts of time not worrying about the foibles of blogging platforms, by just using static HTML files (with SSI). Details here:

From at 2011-04-12 08:03:47:

More work for who? Your link links to a page where it says static is more work for the author of the software, but the context in which you're linking it talks about dynamic being better as long as the software you're using doesn't suck--which sounds like the author of the software isn't you?

And if you're writing from scratch, writing a static baked system is surely easier than writing a dynamic system with caching. So: better performance, and easier to write.

Obviously I'm missing a point of yours here since you reached the opposite conclusion.


From at 2011-04-12 10:35:51:

Brent Simmons started this recent trend about baked sites because so many Wordpress sites go down when John Gruber links to them. He has been banging on this drum since. The reality is that this is solely a fault of Wordpress, which runs oodles of SQL queries with huge result sets on each and every hit. Or in short, it’s an utter pig.

You don’t need to bake your weblog to not go down when Gruber links you. You just need the code to not suck rocks. The whole meme is a tempest in a tea cup.

Aristotle Pagaltzis

From at 2011-04-12 18:02:25:

Where do you draw the distinction between a baked site and a cached site? They're both a snapshot of a dynamic site. They both suffer from potentially stale cache. They both require an invalidation mechanism for the publishers.

I think you're trivializing what is a complex problem regardless of your performance optimization strategy. I'm quite curious as to what your solution is.

By cks at 2011-04-14 00:13:57:

My thoughts on the distinction turned out to be long enough that I made them into an entry, BakingVersusSpeedII.
