More on baking websites to static files and speed
A commenter on my first entry on this asked a good question:
Where do you draw the distinction between a baked site and a cached site? They're both a snapshot of a dynamic site. They both suffer from potentially stale cache. They both require an invalidation mechanism for the publishers.
I think there are two important differences.
First, a baked site is effectively a permanent cache. The word 'permanent' is the important thing: it means that you absolutely have to get invalidation right, because there is nothing else that will save you if the wrong data gets into the baked site.
(Any permanent cache has much the same problem: invalidation must be completely correct.)
A temporary cache can do invalidation on heuristics because if the heuristics don't work out, bad data will time out 'soon enough' anyways. The ultimate version of this is to have no invalidation heuristics at all, just timeouts, and accept temporarily stale pages or data. This makes the problem of cache invalidation (or validation) much simpler, especially in extreme cases. Such caches are good enough to let you survive Slashdot-style load surges, so for many people that's all they need.
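As a rough illustration of the 'just timeouts' approach, here is a minimal sketch of a cache with no invalidation logic at all. The names (TimeoutCache, render, ttl) are made up for this example and the injectable clock is only there to make it easy to experiment with; a real site would more likely put this in a caching proxy than in application code.

```python
import time


class TimeoutCache:
    """A cache with no invalidation heuristics: entries simply expire."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl          # seconds a page may be served before re-rendering
        self.clock = clock      # hypothetical hook so the clock can be faked
        self.entries = {}       # path -> (expiry time, page)

    def get(self, path, render):
        now = self.clock()
        hit = self.entries.get(path)
        if hit is not None and hit[0] > now:
            # Possibly stale, but never for longer than ttl seconds.
            return hit[1]
        # Missing or expired: regenerate the page dynamically.
        page = render(path)
        self.entries[path] = (now + self.ttl, page)
        return page
```

The worst case here is a page that is wrong for ttl seconds, which is exactly the tradeoff being accepted in exchange for not having to think about invalidation.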
Second, a cache still works if there is a cache miss; a baked site generally does not. This means that you have a big hammer to deal with cache problems: you simply flush the entire cache. Your site is suddenly slow until the cache rebuilds, but it still works and, more importantly, it is instantly guaranteed correct and current. There is no equivalent with typical implementations of baked sites (although there are implementation tricks that give you this); the software may let you force a full rebuild, but it won't give you a correct site on the spot, since 'populating' the 'cache' is an asynchronous process.
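To make the 'big hammer' concrete, here is a small sketch under the assumption of a simple in-process page cache sitting in front of a dynamic render function. PageCache and its method names are hypothetical, not taken from any particular software.

```python
class PageCache:
    """A page cache in front of a dynamic renderer (illustrative only)."""

    def __init__(self, render):
        self.render = render    # the dynamic site: path -> page
        self.pages = {}

    def get(self, path):
        # A miss is not an error; it just means doing the slow dynamic render.
        if path not in self.pages:
            self.pages[path] = self.render(path)
        return self.pages[path]

    def flush(self):
        # The big hammer: throw everything away. The very next request for
        # any page is rendered fresh, so it is correct and current at once.
        self.pages.clear()
```

After flush() every request is slow until the cache refills, but nothing is ever missing or wrong in the meantime.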
This also means that your site still works completely if something didn't make it into the cache or if the cache is malfunctioning for some reason. Pre-baked sites have no similar mechanism; if something doesn't get baked for some reason or gets removed somehow, well, it's a 404 until you (or software) notice and fix it. The advanced version of this is that it's quite easy and natural to deliberately have a partially cached dynamic site, instead of caching everything. There's no such natural equivalent for baked sites (although once again it can be done with implementation tricks).
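The contrast can be sketched roughly as follows. Both functions and all of their parameters are invented for illustration; the baked site is modelled as a plain mapping of paths to already-generated pages, and cacheable stands in for whatever policy decides which pages are worth caching.

```python
def serve_baked(baked, path):
    """Baked site: only pages that were actually written out exist."""
    if path in baked:
        return 200, baked[path]
    # Nothing to fall back on; this stays a 404 until someone rebakes.
    return 404, "Not Found"


def serve_partially_cached(cache, cacheable, render, path):
    """Dynamic site that caches only the pages its policy selects."""
    if not cacheable(path):
        # Deliberately uncached: always rendered dynamically.
        return 200, render(path)
    if path not in cache:
        # A missing cache entry just costs one dynamic render.
        cache[path] = render(path)
    return 200, cache[path]
```

In the second function an empty or broken cache degrades speed only, while in the first a missing page is simply gone as far as visitors are concerned.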