2013-08-10
The importance of names, illustrated through my mistake
A while back I refactored chunks of DWiki's core DWikiText to HTML rendering code to fix a fundamental error that I'd made in the original design. Today I discovered (entirely by accident) that in the process I had accidentally broken most of DWiki's on-disk caching of that rendering. Well, 'broken' is a strong word and not really an adequate description. What I'd done was stop using it.
In the new refactored code there were (and are) three important
functions. _render()
is the low-level function that does
the actual rendering and returns a render results object,
_render_cached()
layers caching on top of _render()
, and
finally _render_html()
calls _render_cached()
and then gives
you the full HTML from the resulting RRO.
DWiki generates HTML by expanding templates that wind up invoking
high-level HTML generation functions (you can see the huge list here); a number of them take DWikiText and generate
various HTML forms of it (with or without the title, only the first
paragraph, only the title, and so on). Several of those wanted the full
HTML so they used _render_html()
and got caching. Others wanted
some subset of the HTML and so couldn't use _render_html()
. What
they should have used was _render_cached()
; instead, I coded them
all to use _render()
. As a consequence none of them used any
on-disk caching; every time they were used they re-rendered the wikitext
from scratch.
(The really embarrassing one was wikitext:cache
, which exists in large
part simply to cache stuff.)
I can see more or less how I made the mistake. After all, the name of
the function I had them call is exactly what I wanted when I was writing
the code (and it even had a nice short and direct name). It's just that
it has (or really lacks) some side effects and I completely overlooked
those side effects because the name didn't shove them in front of
me. In hindsight a much better name for _render()
would actually
have been _render_uncached()
; if it had had that name I would have
been directly confronted by what it didn't do any time I put it in code
and as a result I wouldn't have done so.
Sometimes I learn my lessons the hard way. I hope that this one sticks. If I'm smart I'll take the time and annoyance to rename the functions to better names, lest I recreate this mistake the next time I revisit this area of DWiki code.
(This bugfix is not yet in the Github version but will be in a day or so, when it doesn't blow up here.)
Sidebar: what did and didn't get cached, and how I missed this initially
When I first did this refactoring everything that Wandering Thoughts used got cached (and I watched the caches get populated when I initially made the change here). But that was because I didn't immediately take advantage of the new features I'd also added. When I revised Atom feeds to omit the title from the generated HTML and changed several aspects of how entries look (eg, adding dates in between the entry title and its main text), I changed to using HTML generators that didn't cache. Of course by then I wasn't looking at the cache because I knew it all worked.
The net result was that the only thing that was using the on-disk cache was looking at Wandering Thoughts through the front page, the sub-pages for categories, or paging back through these indexes. Neither generating Atom feeds nor looking at individual entries was using the on-disk cache.
(It turns out that enough things (I think mostly web spiders) walk through Wandering Thoughts using the index pages to fill up the on-disk cache enough that I didn't notice them being unusually empty.)