What makes DWiki and other dynamic file based blog engines slow
In yesterday's entry I mentioned that DWiki (the software behind this blog) is pretty much a worst case for a blog engine as far as speed goes. Today I feel like talking about what makes DWiki slow, and by extension the things that can slow down any dynamic file based blog engine. Part of why I'm writing this is so that you (if you're considering writing such a thing) can avoid the mistakes that I made.
(Some of the slowness is because chunks of DWiki's code are not exactly the best that they could be, but the issues there are generally dwarfed by the general ones I'm about to discuss.)
For basic background, DWiki is about as pure a dynamic file based blog engine as you could ask for; conceptually it is purely a bunch of views of a filesystem hierarchy (actually of two of them). Each entry and each comment is stored in a separate file in a directory hierarchy (entries are files in category subdirectories and comments are files in a per-entry subdirectory that is itself in a mirror of the entry's regular hierarchy). Entries (and comments) are written and stored in DWiki's wikitext dialect, not HTML, and the time of an entry (or a comment) is the modification time of its file.
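The core of this design can be sketched in a few lines. This is a minimal illustration, not DWiki's actual code, and the layout in the comments is a hypothetical example of the scheme described above:

```python
import os

def entry_mtime(path):
    # In a DWiki-style engine an entry's time is simply its file's
    # modification time; there is no separate metadata store to
    # consult (or keep in sync).
    return os.stat(path).st_mtime

# Hypothetical on-disk layout following the scheme described above:
#   blog/tech/my-entry             <- entry wikitext, in a category subdir
#   blog-comments/tech/my-entry/   <- per-entry directory of comment files,
#                                     in a mirror of the entry hierarchy
```

The appeal is that publishing an entry is just writing a file; the cost, as we'll see, is that answering almost any other question requires looking at the filesystem.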
This gives DWiki two main slow points. The most obvious one is converting DWikiText to HTML. At the level of a single entry it isn't a terribly bad process, taking about 6 milliseconds to render yesterday's entry (and then about 4 milliseconds to render the sidebar text, which is also wikitext in a file). But at the level of the blog front page this adds up fast; ten entries is already over 60 milliseconds (although per-entry rendering varies by a few milliseconds depending on what's in them). Still, 60 milliseconds is not a terrible killer.
(In retrospect, one of the reasons to use Markdown or some other popular wikitext format is that other people may well write fast HTML converters for you. With a private wikitext, you're on your own.)
The less obvious but much larger slow point is that DWiki has to walk
the filesystem any time it needs to know the relationship between
entries, or just to find them all. The obvious case is the blog's front
page, which needs to find the N most recent entries; in a file based
engine like DWiki you do this by walking the filesystem to find all the
entries, stat()ing them to find their timestamps, sorting the list,
and taking the top N. More subtly, DWiki also needs to do this walk when
displaying individual entries in order to figure out what the next and
previous entries are so that it can generate links to them. And if you
want to display some sort of calendar of what days or weeks or months
have entries? Again you need a walk.
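The front page case can be sketched as follows. This is a simplified stand-in for what such an engine has to do, not DWiki's code; the key point is that the cost is proportional to the total number of entries, not to the N you actually display:

```python
import os

def recent_entries(root, n):
    # Walk the whole tree, stat() every file for its mtime, sort,
    # and take the top N. Every step before the final slice scales
    # with the total number of entries in the blog.
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            found.append((os.stat(path).st_mtime, path))
    found.sort(reverse=True)
    return [path for mtime, path in found[:n]]
```

The 'next and previous entry' links and the calendar views need essentially this same full walk, just with different post-processing of the sorted list.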
(Comments are usually less of a problem because the filesystem walks to find them are smaller and more focused. The exception is if you do something crazy like 'show N most recent comments'.)
This filesystem walk is not a big issue for a small blog (which will
have a modest number of files). But when your blog gets more and more
entries, well, things scale up and slow down. Rendering the front page
of WanderingThoughts without any caches currently takes 3,299
lstat()s and scans 18 directories; rendering yesterday's entry also
requires its own pile of lstat()s and scans 13 directories. This takes
a while
even if everything is in the kernel's caches.
(You can optimize the walking code as much as you want, but you still
have to stat() every file no matter what you do. For scale, a raw
filesystem walk over all WanderingThoughts entries currently takes
about 200 milliseconds with hot kernel caches (in Python, but find
takes similar amounts of time).)
The way around these problems is to cache or pregenerate this information, which is why if I was doing a file based blog design again there would be an explicit 'publish entry' step (among other changes).
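A minimal sketch of what that 'publish entry' step could look like follows. This is my illustration of the idea, not anything DWiki does; the index file name and JSON format are assumptions made up for the example:

```python
import json
import os

INDEX = "entry-index.json"  # hypothetical index file name

def publish(root, entry_path):
    # An explicit publish step appends the entry to a pregenerated,
    # sorted index. The front page then reads one small file instead
    # of walking and stat()ing the entire entry tree.
    index_file = os.path.join(root, INDEX)
    entries = []
    if os.path.exists(index_file):
        with open(index_file) as f:
            entries = json.load(f)
    entries.append({"path": entry_path,
                    "time": os.stat(entry_path).st_mtime})
    entries.sort(key=lambda e: e["time"], reverse=True)
    with open(index_file, "w") as f:
        json.dump(entries, f)

def recent_from_index(root, n):
    # O(index read) instead of O(full tree walk).
    with open(os.path.join(root, INDEX)) as f:
        return [e["path"] for e in json.load(f)[:n]]
```

The same index trivially answers the next/previous-entry and calendar questions, since it is already a sorted list of everything.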
(DWiki is as weirdly limited as it is because its initial design was to run purely read only, with no write access to anything. Comments and on-disk caching still haven't fundamentally changed that attitude.)
Sidebar: two other DWiki performance-related design mistakes
DWikiText allows bare words (in the usual WikiWord format) to be links
if and only if the target of the link exists. This turns out to be a bad
idea if you want to cache the rendered HTML, because suddenly changes
elsewhere in the filesystem (not just changes to the page itself) can
invalidate the HTML; a file appearing or disappearing can create or
remove a WikiWord link. This adds a couple of extra checks every
time DWiki loads a cached HTML rendering.
(This is not just a performance issue. It means that you can't have a simple model of 'compile the HTML of an entry when it's published and you're done'; you have to worry that publishing a new entry will need an old entry to suddenly be regenerated. The headaches are just not worth it; use a wikitext that requires explicit markup for links and then makes them always be links, whether or not the target exists.)
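The problem can be seen in miniature in a sketch like this (my illustration of the general issue, not DWiki's rendering code):

```python
import os

def wikiword_html(word, root):
    # With existence-dependent links, the HTML for *this* page depends
    # on whether *another* file exists right now. The output can change
    # without the page itself being touched, so any cached rendering
    # has to re-verify every link target when it's loaded.
    target = os.path.join(root, word)
    if os.path.exists(target):
        return '<a href="/%s">%s</a>' % (word, word)
    return word
```

With explicit, unconditional link markup, the rendered HTML depends only on the page's own text, so a cached rendering stays valid until the page itself changes.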
DWiki has an authentication and permission system that controls things like who can see or comment on an entry. Cleverly I made two terrible decisions when designing it; permissions are embedded in the DWikiText markup and permissions can be per file not just per directory hierarchy. In short, DWiki kind of has to render each file to find out if it can render each file. This is saved only by the fact that generally you're going to render a file anyways any time you need to check its permissions (if it's accessible), but if I was doing it again I would not do this; it could be pretty bad if there were a lot of access-restricted pages.
(DWiki caches this permission information along with the rendered HTML, which helps. The actual code model for doing this is in retrospect kind of terrible, partly because it evolved in multiple steps and was never refactored to be sane.)