The anatomy of a DWiki bug

August 13, 2005

Yesterday's entry mentioned in passing fixing a bug in DWiki. Today's entry is about the anatomy of that bug (and how I found it).

This starts with DWiki's processing model, which displays pages in a number of different formats called 'views'. (More about this is in ProcessingModel.) Not all views are valid for all pages (for example, the 'history' view isn't valid for a file without a version history).

If a URL doesn't explicitly mention a view, DWiki uses a default. Files always default to the 'normal' view (which shows their DWikiText as HTML), but directories can specify that they want to default to something else, like the 'blog' directory view that creates WanderingThoughts.
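To make the default-view rule concrete, here is a minimal sketch in Python. It is not DWiki's actual code: the DIRECTORY_DEFAULTS table, the '/blog/' path, and both function names are assumptions made up for illustration. It only assumes, as the URLs in this entry suggest, that the view is given as a bare query string such as '?normal'.

```python
from urllib.parse import urlsplit

# Hypothetical table of directories that override the default view.
# The path here is illustrative, not DWiki's real configuration.
DIRECTORY_DEFAULTS = {"/blog/": "blog"}


def default_view(path):
    """Return the default view for a path; directories end in '/'."""
    if path.endswith("/"):
        return DIRECTORY_DEFAULTS.get(path, "normal")
    # Files always default to the 'normal' view.
    return "normal"


def requested_view(url):
    """Return (path, view), falling back to the default when no view is named."""
    parts = urlsplit(url)
    view = parts.query or default_view(parts.path)
    return parts.path, view
```

With this sketch, a request for '/blog/' resolves to the 'blog' view, while '/blog/?normal' resolves to the 'normal' view of the same directory.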

DWiki puts a toolbar at the bottom to give you access to the alternate views of the page that you're viewing. This raises a little issue for links that the page shows: what view of the target do they take you to? (For technical reasons this is mostly relevant for directory pages.)

I decided that the best answer was that non-default views should be modes, so links would show the target in the current view if the target could be displayed in it. This meant that if you visited a non-default view of a directory and then went into a subdirectory, you saw the subdirectory in the same way.
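The 'views as modes' rule can be sketched roughly as follows. This is an illustration under assumptions, not DWiki's real implementation: the Page class, its fields, and link_url_original are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    # Hypothetical stand-in for a DWiki page or directory.
    path: str
    default_view: str
    valid_views: frozenset


def link_url_original(current, current_view, target):
    """Original rule: a non-default view of the current page is carried along."""
    in_mode = current_view != current.default_view
    if in_mode and current_view in target.valid_views:
        # The target can be shown in the current view, so say so explicitly.
        return f"{target.path}?{current_view}"
    # Either we are in the default view or the target can't use this view.
    return target.path
```

Under this rule, a directory viewed in a non-default view links to its subdirectories with that view named in the URL, and links to anything that cannot be displayed in that view stay plain.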

The actual problem

This logic turns out to have a little problem, made visible through the following sequence:

  • visit a directory with a non-normal default view.
  • switch to the normal view of this directory.
  • because this is not the default view, links to files are now made with an explicit view-setting '?normal' on the end of the URL.

This is bad for two reasons: it is redundant and, worse, it makes these links look like new pages when in fact they may be pages you've already visited.

The latter is especially important for search engines crawling a DWiki site, since I want them to index the canonical URL for each page and not wind up thinking that I have a lot of URLs with duplicate content. (I suspect that this causes search engines to dislike one's site, since one winds up looking like a search engine spammer. And even if it doesn't, it increases the total number of URLs in a DWiki that they have to crawl, slowing down the overall process.)

The fix

The fix is pretty simple: if we're generating a link to a page in an explicitly specified view but that view is already the page's default view, just leave out explicitly setting the view.
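In sketch form, the fix is one extra clause in the link-generation condition. As before, this is an assumed illustration rather than DWiki's actual code; Page and link_url_fixed are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    # Hypothetical stand-in for a DWiki page or directory.
    path: str
    default_view: str
    valid_views: frozenset


def link_url_fixed(current, current_view, target):
    """Carry a non-default view along, unless it is already the target's default."""
    in_mode = current_view != current.default_view
    if (in_mode
            and current_view in target.valid_views
            and current_view != target.default_view):
        return f"{target.path}?{current_view}"
    # Naming the target's own default view would be redundant, so leave it out.
    return target.path
```

With this version, the 'normal' view of a directory that defaults to some other view links to files by their plain URLs, since 'normal' is already each file's default.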

This has a little downstream problem: if you go into a non-default view in a directory, go down into a subdirectory for which this is the default view, and then go down into a sub-sub-directory for which it is not, you no longer stay in the same view. Instead you wind up in the sub-sub-directory's default view.

Fortunately this is a matter of taste, so I'm comfortable with the result.

How I found the issue

I found this problem by looking at my server logs and noticing that the MSN spider was crawling file pages in Software with URLs that included an explicit '?normal' specifier. This made me scratch my head; while they were valid URLs, I was puzzled where the MSN spider was getting them from.
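A check like this is easy to script. The sketch below assumes the logs use the common Apache-style format, where the request appears as a quoted "GET /path HTTP/1.x" string; the function name and the regular expression are mine, not part of DWiki or any log tool.

```python
import re

# Matches the quoted request in a common-log-format line (an assumed format).
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def explicit_normal_requests(log_lines):
    """Return requested URLs that end with an explicit '?normal' view."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(1).endswith("?normal"):
            hits.append(match.group(1))
    return hits
```

Feeding it the lines of an access log (for example, an open file object) gives the list of requests worth a second look.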

Software is a directory that uses a non-normal default view. Suddenly a little light went on in my head as I thought about the path from visiting Software, following its 'View Normal' link, and perhaps seeing links to regular pages with an explicit '?normal' in the URLs. Following the path by hand showed me that DWiki was generating such URLs; mystery solved.

An ironic side note

In the process of writing this blog entry, I just found another bug (again one of logic). This one was an incorrectly parenthesized multi-clause if statement that caused DWiki to explode when I put two or more '[[/Software/]]' links into this article. (It had to have the trailing slash, but two or more links to a nonexistent page would have done it too.)

Again I'm not at all sure if unit tests would have found this issue, because I'm not sure if I'd ever have thought to write a test for this specific case.
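The actual condition isn't reproduced here, but as a generic illustration of this class of bug, here is how grouping changes the meaning of a multi-clause condition in Python. The function and parameter names are invented for the example and have nothing to do with DWiki's code.

```python
def as_written(is_dir, exists, seen):
    # 'and' binds tighter than 'or', so this groups as:
    #   is_dir or (exists and not seen)
    return is_dir or exists and not seen


def as_intended(is_dir, exists, seen):
    # Explicit parentheses make the intended grouping unambiguous.
    return (is_dir or exists) and not seen
```

The two versions agree on many inputs and only diverge on particular combinations (here, an already-seen directory), which is why a bug like this can sit unnoticed until an unusual page triggers it.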

Written on 13 August 2005.