2013-02-04
Dynamic web sites and blogs need not be slow, illustrated
There is a common myth or perception that dynamic websites and especially dynamic blogs are slow. I've tilted at this particular windmill before, but today I feel like providing some actual numbers for DWiki (partly because I just finished some performance tuneups). DWiki can run either as a CGI (under light load) or as a SCGI server (under heavier load). The interesting load test is for the SCGI server version, complete with its Rube Goldberg hookup. So let me give you some numbers for that.
The full SCGI stack, including the C CGI program that talks to the SCGI server, can serve this blog's front page in roughly 50 milliseconds a request and a random entry's page in 25 milliseconds a request. Actually those numbers aren't realistic because I turned off two layers of caching in order to slow things down. Under a Slashdot-style load directed primarily at a small number of URLs, those times would drop to 7 milliseconds for the front page and 6 milliseconds for the random entry.
(That time includes starting a C program for every request and having it communicate with the SCGI server. Today's moral is apparently that starting small C programs is really fast on a multiprocessor system, although this is a bit artificial since I haven't tried to run a bunch of them at once as would be happening in a real load situation. The other moral for today is that performance numbers really depend on your assumptions; all of mine here are 'single request at a time'.)
Giving you SCGI numbers is actually a bit misleading, because this blog spends most of its time running as a CGI. So what are the numbers like there? Not as impressive, I'm afraid; DWiki averages 250 milliseconds (ie a quarter of a second) to serve the front page and 215 milliseconds to serve the random entry. Most of that time is the overhead of starting a rather substantial Python program as a CGI. I will note that sane people do not do that. While a quarter of a second is not great response time, I also feel that it's not particularly terrible.
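(You can measure the floor of that CGI overhead yourself by timing how long it takes just to start a Python interpreter and exit. This little sketch is my own illustration, not part of DWiki; a real CGI pays this cost plus the cost of importing all of its modules.)

```python
import subprocess
import sys
import time

# Time bare interpreter startup: run 'python -c pass' several times
# and take the best result. A Python CGI pays at least this much on
# every single request, before it does any useful work.
def interpreter_startup_ms(runs=5):
    best = float("inf")
    for _ in range(runs):
        start = time.monotonic()
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        best = min(best, time.monotonic() - start)
    return best * 1000.0

if __name__ == "__main__":
    print(f"best of 5 runs: {interpreter_startup_ms():.1f} ms")
```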
(Sane people would also use a more efficient SCGI gateway than my CGI, and maybe a more efficient SCGI server implementation than DWiki's.)
These timings show two things. The first is that running a dynamic website in the right way is important; running DWiki as a CGI costs about a fifth of a second. The second is the vast importance of caches, because even though I turned off two layers of caching there's a third layer that's vital for decent performance. Running as a SCGI server but with no caches at all, DWiki averages 320 milliseconds per request for the front page and 245 milliseconds for a random entry. A completely uncached DWiki request served via a CGI would take over half a second (and use a lot of CPU and do a lot of disk IO).
I'm not going to claim that getting DWiki fast was easy and I'm not going to claim that DWiki was fast from the start. Rather the opposite; many years ago, DWiki started out life as the uncached, CGI-only version that would have rather bad performance today. It took many iterations of both adding caching layers and tuning performance to get it up to its current full speed under load (and I still somewhat regret that it takes so much caching to go fast). On the other hand, DWiki is really a worst case for a blogging engine at this point; it has many core points of inefficiency and waste that a better blogging engine would avoid. If DWiki can be made to go fast despite this, my view is that anything can. Slow dynamic blog engines are slow because they are slow, not because they are dynamic.
(To put DWiki's problems one way, it turns out that filesystem hierarchies make poor substitutes for a real database once things scale up.)
Sidebar: The hardware involved
These timings are still somewhat misleading because I haven't told you what hardware is involved. All of them were taken on a 2007 or so vintage Dell SC 1430 with a single Xeon 5000 (3 GHz) and one GB of memory. And it's an active multi-user machine that spends much of its time running relatively heavyweight spam filtering processes and so on (although my timing runs were done at a relatively quiet time).
One of my dreams is to bring up an instance of DWiki and this blog on a small virtual machine (512 MB RAM, say) to see how fast I can get it to go in that sort of minimal environment and how I'd have to configure things. But, sadly, too many potential projects, too little time.
What good cryptography error messages need to include
Suppose that you have a system that involves cryptography and at some point your cryptography indicates that there is a problem, one where something isn't verifying properly. Many programs fumble what happens next in a way that significantly degrades their security; they give people bad error messages, ones that make it hard for people to understand what's wrong.
Crypto error messages are very important because crypto error messages are your one chance to convince people not to go on anyways. People really want to do what they were planning to do, so their default action is to override your warnings or errors and go on anyways (in whatever way that takes). If there is a real problem (as opposed to a false alert) you desperately want people to stop, so you desperately want to convince them that there is a problem.
Generic error messages do not do this. Cryptic error messages do not do this. What both communicate to most people is nothing more than 'something went wrong in the magic black box' and that is not going to stop people from going on.
So my view is that a good crypto error message needs to include four things (at least). It needs to tell me exactly what is wrong, what it happened to, what it might mean, and what the minimal workaround is. As much as possible it should cover all of this in plain language because most people will not understand technical jargon. A good error message should also be comprehensive, in that the program should check everything it can instead of just stopping at the first error.
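(To make this concrete, here is a hypothetical sketch of assembling an error message from those four elements. The labels, wording, and example scenario are all my own invention, not from any real program.)

```python
# Build an error message with the four elements a good crypto error
# needs: what exactly failed, what it affects, what it might mean,
# and the minimal workaround. All wording here is illustrative.
def crypto_error(what_failed, affected, likely_meaning, workaround):
    return (
        f"PROBLEM: {what_failed}\n"
        f"AFFECTS: {affected}\n"
        f"THIS MAY MEAN: {likely_meaning}\n"
        f"IF YOU MUST PROCEED: {workaround}"
    )

msg = crypto_error(
    "The server's certificate is signed by an authority we don't trust",
    "your connection to mail.example.com",
    "someone may be intercepting your traffic, or the server may be misconfigured",
    "contact the server's administrators to confirm the certificate fingerprint",
)
print(msg)
```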
(It goes without saying that a good error message should also be accurate. Inaccurate error messages are a great way of training people to ignore them entirely.)
Although it's tempting to say that the goal of a crypto error message is to give people enough information that they can make an informed decision, that's not really it. The real goal is to give people enough information to convince them that you're right and something really is wrong. Allowing them to make an informed decision is secondary, although important in reality (because in reality it's probably more likely that you're wrong). If people override your warning, you want to have given them enough information so that they can confidently say why you're wrong.
(Well, so that people who care and pay attention can confidently say so. I'm a realistic idealist about this.)