The wiki trap (that we've fallen into)

April 14, 2012

A number of years ago our support website was not very good; it was basically a bunch of HTML pages with very little organization or navigation. In late 2009, when things reached the point where the support site absolutely had to be improved somehow, we made what has turned out to be a bad mistake: we turned it into a wiki. I say that this was a bad mistake because about a year later, we discovered that our wiki software was effectively abandoned. Oh, there's still something by the same name being developed, but it uses a different wikitext format and there's no automatic migration from the old wikitext to the new one. As far as we're concerned, it's different software.

This has left us seriously stuck. It's clear that we have to migrate but it's equally clear that doing so will be a significant amount of work because all of the content is 'locked up' inside the wiki's custom markup (and its file organization). Any means of getting an HTML version of the content will require stripping out what is probably a lot of standard markup and styling that the wiki adds, and we may have to resort to spidering our own site just to get all pages.
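
To make the 'stripping out' step concrete, here is a minimal sketch in Python of pulling just the page content out of a rendered wiki page. It assumes that the wiki wraps the real content in a single element with a known id (the id 'content' here is purely an assumption, not something taken from our wiki), that the rendered HTML is reasonably well formed, and that only a handful of attributes are worth keeping. The names extract_content and KEEP_ATTRS are made up for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# Attributes worth keeping; everything else (class, style, id, ...) is
# treated as wiki-added styling and dropped. This list is an assumption.
KEEP_ATTRS = {"href", "src", "alt", "title"}


def _format_tag(tag, attrs):
    """Re-emit a start tag with only the whitelisted attributes."""
    kept = "".join(
        f' {name}="{escape(value, quote=True)}"'
        for name, value in attrs
        if name in KEEP_ATTRS and value is not None
    )
    return f"<{tag}{kept}>"


class ContentExtractor(HTMLParser):
    """Collect the markup found inside one container element.

    Depth tracking relies on every non-void element being closed, so
    badly formed HTML (for example unclosed <p> tags) will confuse it.
    """

    def __init__(self, container_id="content"):
        super().__init__(convert_charrefs=True)
        self.container_id = container_id
        self.depth = 0          # 0 means we are outside the container
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            # Look for the (assumed) content container.
            if tag not in VOID_TAGS and dict(attrs).get("id") == self.container_id:
                self.depth = 1
            return
        self.parts.append(_format_tag(tag, attrs))
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth > 0:      # do not emit the container's own close tag
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(escape(data, quote=False))


def extract_content(page_html, container_id="content"):
    """Return the inner HTML of the container, minus wiki-added attributes."""
    parser = ContentExtractor(container_id)
    parser.feed(page_html)
    parser.close()              # flush any buffered text
    return "".join(parser.parts)
```

Feeding each spidered page through extract_content() would give roughly the 'dumb HTML' version of its content. Internal links would still point at wiki URLs and would need rewriting separately.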

(We aren't even thinking about trying to extract the historical record of content modifications. That too is locked up inside the wiki, somehow, but it's not important enough to justify the effort to get it out. So we're going to lose stuff in this migration and no, that does not make me happy.)

In the meantime, the wiki software has flaws like bad searching, the content has various issues, and the overall navigation needs to be improved, but doing anything in the current wiki is pretty much a dead end. So we mostly haven't touched it; we make minimal changes to update the content when we change our systems and every so often some especially dedicated person does a little bit more. Deeper issues of structure and wiki features remain completely untouched, partly because modifying the software ourselves is just not going to happen.

(Maybe we could do some of these changes without too much work. But every time we contemplate spending some time learning how to improve the current wiki, the inevitable question is why we don't use that time to start figuring out how to migrate and what to migrate to instead of wasting it on work we'll throw away.)

This is what I'll call the wiki trap: once you put your content into a wiki you're probably stuck with the wiki, for better or worse, because migrating away from a wiki you've adopted is generally quite difficult. A wiki is a one-way ratchet, a place where content checks in but it never checks out.

(Technically you can have a wiki that stores content in HTML form. This wiki would not have this issue, assuming that you can get at the raw content. The other way out is for your wiki to support exporting content to raw HTML. By the way, I'm still ignoring the edit history here.)

I don't have an answer to this situation. In fact this situation makes me very unhappy. I like wikis as a general rule: they give you a bunch of convenient features in one place, I really like simple markup, and there's a lot of appeal to allowing a broad bunch of people edit access to our support documentation so they can improve it. But at the same time I can't deny that putting our content into a wiki has turned out to be a very painful experience in a way that could easily happen again with pretty much any other wiki software that we move to. It really looks like we'd be significantly better off with our content in dumb HTML in the filesystem (and assembled into web pages in any number of brute force ways).

(You may be tempted to say that our situation is an exceptional one caused by an unwise choice of wiki software. At one level this is true, but at another level how long is the life of the information that you're putting into your wiki now, and how sure are you that something like this could never happen to your wiki software over that lifetime? I'm not sure that I can bet on any wiki software that I want to use to have a fifteen year lifespan, for example, and some of our support documentation is likely to live that long.)


Comments on this page:

From 87.79.236.202 at 2012-04-14 17:36:07:

Something like Gollum would address this, where the wiki is a directory structure of plaintext files written in something like Markdown or Textile (hence no lock-in to the wiki markup) versioned using a Git repository (hence no lock-in to its storage backend).

That is how I would design any wiki, really. Data format is everything, code is nothing.

Aristotle Pagaltzis

From 188.226.51.71 at 2012-04-15 15:06:48:

Aristotle:

That was my first thought as well, but how many wikis use git+markdown? When the Gollum project gets abandoned and you want to migrate to, say, MediaWiki or DokuWiki, will it be easier to do with git-fast-export and a markdown converter plus custom scripts to

1. Parse markdown (probably "Gollum-flavored" markdown, not just a regular one) and convert to new wiki syntax

2. Push diffs/versions to whatever structure/db the new wiki uses

3. Probably fill in some metadata, not related to pages themselves

That doesn't seem much easier than whatever-custom-wiki at all. Am I missing something here?

From 78.35.25.18 at 2012-04-15 19:45:32:

The thing about using something like Git and Markdown to power a wiki is that there is lots of software that will be able to do something with your data, with no conversions. But sure, migrating to a horrible data format and locking up your data so that only one particular piece of software can read it is no easier than migrating away from it and liberating your data.

So the glib answer to your question is: you missed that I wouldn't want to migrate to Dokuwiki or Mediawiki in the first place. :-)

Aristotle Pagaltzis

From 78.35.25.22 at 2012-04-16 22:22:18:

Oh, I missed your first question.

Actually, plenty more wiki engines use Git + Markdown, mostly recent little ones, but there is also ikiwiki which is fairly widely adopted, aside of course from Gollum.

It’s unlikely you can take the data from any of them and just drop it into the other – but it seems much more feasible to port the data between them, amounting roughly to a git filter-branch with a script that moves/renames files, and maybe lightly munges them for wiki-link conventions.

And as mentioned above, you can still do something useful with the data even in complete absence of any wiki software.

Aristotle Pagaltzis

From 173.78.217.3 at 2012-04-17 23:01:08:

I don't quite get the point of this post, past it being a venting session, Chris.

Format Lock-in starts with the word Format. Don't ever want to be stuck? Don't mark anything up in any way. Roooiiight.

You got bit. It happens. Sucks. A wildly polar reaction from the pain isn't really reasonable.

Guess what? It's going to happen again.

By trs80 at 2012-04-18 07:52:21:

I still don't understand why you think it's not possible to upgrade from Moin 1.5 to 1.9.

By cks at 2012-04-18 11:33:08:

@trs80: It's possible to upgrade but the question is whether or not it's wise to do so, or more exactly whether it's wise to spend all of the time it would take on an upgrade to MoinMoin 1.9 instead of a migration to something else. In addition we have a general rule of 'once burned, twice shy' and MoinMoin has definitely burned us once.

(We have no particular attachment to or love of MoinMoin; we picked it because it was there, more or less.)

By cks at 2012-04-18 11:46:43:

@173.78.217.3: One way to put it is that most people, us included, think of putting things into a wiki as a way of freeing your content, not of locking it up. And it's really easy to think that the popular (free) software you pick is always going to be supported, so that you are making a choice that is just as safe as basic HTML; however, this is clearly wrong.

(Basic HTML is completely safe. The odds of it ceasing to be supported in, oh, the next 20 years are effectively nil and if something does happen either there will be huge support for tools to rewrite it well or you won't care about your current documentation anyways.)

Written on 14 April 2012.

Last modified: Sat Apr 14 01:38:28 2012