Wandering Thoughts archives

2011-04-11

You don't need to bake your site to static files to be fast

Recently (for my value of recently) a bunch of people have been pushing the idea of rendering your website to static files as the way to make it stand up to surprise load; for example, Tim Bray's More on Baking (which has links to others). I disagree with this approach, because it's not necessary and it has some significant downsides.

You don't need static files to go fast when you're hit with load; you just need software that doesn't suck. A little caching helps a lot and is generally very easy to add to decent software, and honestly these days any blog or framework with pretensions of quality should have some sort of option for this. As I've discussed before, the load surges from popular links aren't the same as being heavily loaded in general, and thus they can be handled with much simpler techniques.
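To make 'a little caching' concrete, here is a minimal sketch of a brief time-based page cache in Python. This is not DWiki's actual cache; the BriefCache class, its parameters, and the 30-second lifetime are all made up for illustration. The point is only that during a surge almost every request for a popular URL is answered from memory instead of being re-rendered.

```python
import time


class BriefCache:
    """Keep rendered pages for a short, fixed time (hypothetical helper)."""

    def __init__(self, render, ttl=30.0, clock=time.monotonic):
        self.render = render   # the expensive page generator, url -> page text
        self.ttl = ttl         # seconds that a cached page stays valid
        self.clock = clock     # injectable clock, so the sketch is testable
        self.store = {}        # url -> (expiry time, page text)

    def get(self, url):
        now = self.clock()
        hit = self.store.get(url)
        if hit is not None and hit[0] > now:
            return hit[1]      # still fresh: no rendering work at all
        page = self.render(url)
        self.store[url] = (now + self.ttl, page)
        return page


if __name__ == "__main__":
    renders = []

    def slow_render(url):
        renders.append(url)
        return "<p>page for %s</p>" % url

    cache = BriefCache(slow_render, ttl=30.0)
    for _ in range(1000):
        cache.get("/blog/popular-entry")
    print("requests: 1000, real renders:", len(renders))
```

Because entries simply expire after a short time, there is nothing to invalidate; the cost is that a change can take up to the lifetime to show up, which is usually fine for riding out a load surge.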

Since everyone likes anecdotal evidence, I will point to DWiki (the software behind this blog) as an existence proof of my thesis. DWiki is not what you could call small and efficient, and it still manages to hold up against load with only a relatively crude cache. It's not running on a big dedicated server, either; this machine is relatively modest and this blog shares it with lots of other things. I've never been linked to by any of the big traffic sources, but I have been pounded by spambots and the campus search engine without anyone really noticing.

The big downside of static rendering is the problem that Tim Bray glosses over: cache invalidation. Your static rendering is a cache of your real website, so when your real website changes (for example, someone leaves a comment or you modify an entry) you need to invalidate all of the static renderings for all of the URLs where the updated content appears. Tim Bray makes this sound easy because he has cleverly arranged to not have anything that he needs to do cache invalidation on, but he has done so by being aggressively minimalistic (for example, he doesn't really do tagging or categories). This is, to put it one way, very unusual. Most blog software that you want to use is all about having multiple views of the same basic information; you have the entry page and the main page and the category or tag pages and various archival views, and you may have syndication feeds for some or all of them. All of this multiplies the number of URLs involved in your site quite a bit.

(This URL multiplication also increases the cost of baking your site, of course. If you have a heavily baked site and more than a modest amount of content, you probably aren't going to have many alternate views of your content, or at least not very many alternate views of content that you expect to change very often. For example, you might make all of your archive and category pages just have the titles of entries so that you don't have to re-render them if you modify an entry.)
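As a rough illustration of this URL multiplication, here is a small Python sketch that lists every baked file that would need re-rendering when a single entry changes. The URL layout and the fields of the entry are invented for the example; real blog software will have its own set of views, and often more of them.

```python
def affected_urls(entry):
    """Return every URL that shows this entry (hypothetical site layout)."""
    year, month = entry["date"][:4], entry["date"][5:7]
    pages = {
        "/",                                 # main page
        "/entry/" + entry["slug"],           # the entry's own page
        "/archive/%s/" % year,               # yearly archive view
        "/archive/%s/%s/" % (year, month),   # monthly archive view
    }
    for tag in entry["tags"]:
        pages.add("/tag/%s/" % tag)          # one page per tag
    # Assume every view except the entry page also has a syndication feed.
    feeds = {
        url.rstrip("/") + "/atom"
        for url in pages
        if not url.startswith("/entry/")
    }
    return pages | feeds


if __name__ == "__main__":
    entry = {"slug": "BakingVersusSpeed", "date": "2011-04-11",
             "tags": ["web", "performance"]}
    for url in sorted(affected_urls(entry)):
        print(url)
```

Even in this small made-up layout, one modified entry with two tags touches eleven files, and a baked site has to get this list exactly right every time or it serves stale pages.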

Ultimately it is the usual tradeoff; baked sites run faster at the cost of more work for you. I think that this is a bad tradeoff for most people, since most people do not have heavily loaded sites and an occasional load surge is quite easy to deal with (provided that you have software that doesn't suck).

PS: possibly I am overly optimistic about the quality of common blogging and framework software.

web/BakingVersusSpeed written at 22:52:29

The importance of test suites for standards

A while back I noted that very few of the web's standards have a test suite and that this could be a problem. You might reasonably ask why this matters, especially since so few standards in general have a test suite.

My answer is that having an official test suite for the standard does a lot of things:

  • it lets you know if the implementation you've created actually conforms. Without this you're left with various sorts of ad-hoc tests that may be hard to set up and run (eg, do you interoperate with as many of the other implementations as possible in as many situations as possible).

  • it means that everyone has the same idea of conformance and what the correct behavior is. Ideally this includes odd and unconventional behaviors, because the people who created the test suite looked for areas in the standard that could be missed or misunderstood and added tests to cover them.

  • a test suite forces standard writers to be unambiguous about what the standard says. When people write tests, they also have to come up with what the results of the tests should be.

  • the process of creating a test suite exercises the standard and thus helps to ensure that it doesn't have subtle contradictions and that it is complete. These issues will also be discovered by attempting to implement the standard, but the advantage of a test suite is that it discovers these issues before the standard is frozen.

  • the process of creating the test suite also makes sure that the standard's authors understand at least some of the implications of the standard's requirements before the standard is finalized. This is not as good as requiring an implementation, but it will at least find some of the problems.

All of these are good and praiseworthy things, but there's another way to look at the situation. The reality is that every standard needs a test suite and is going to get at least one. The only question is whether the 'test suite' will be written independently by every sane implementor, using whatever potentially mistaken ideas of the proper standard-compliant behavior that the implementor gathered from reading the standard a few times, or if the test suite will be created by people who know exactly what the standard requires because they wrote it.

(Every sane implementor needs a test suite because they need to test whether their implementation actually works right.)
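Here is a toy Python sketch of the difference. The 'standard' is invented for the example: a field name is normalized by stripping surrounding spaces and tabs (and nothing else) and then lowercasing it. The CASES table stands in for an official test suite, and both implementations are hypothetical.

```python
# Shared cases as the standard's authors might publish them: (input, expected).
CASES = [
    ("Content-Type", "content-type"),
    ("  Host\t", "host"),
    ("X-Id\n", "x-id\n"),   # the odd corner: a newline is NOT stripped
]


def run_suite(normalize):
    """Run one implementation over CASES; return (input, expected, got) failures."""
    failures = []
    for given, expected in CASES:
        got = normalize(given)
        if got != expected:
            failures.append((given, expected, got))
    return failures


def careful(name):
    # Strips only spaces and tabs, as the toy standard says.
    return name.strip(" \t").lower()


def hasty(name):
    # A plausible misreading: strips all whitespace, including newlines.
    return name.strip().lower()


if __name__ == "__main__":
    print("careful:", run_suite(careful))
    print("hasty:  ", run_suite(hasty))
```

The hasty implementor's own tests would almost certainly have encoded the same misreading and passed. Only a shared table written by people who know that the newline case matters catches it.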

(Yes, all of this is obvious and well known. I just feel like writing it down, partly to fix it in my mind.)

tech/TestSuiteImportance written at 00:05:59
