Wandering Thoughts archives


The right way to do wikitext transitions

Suppose that you have a wiki and for some reason you really need to change the wikitext dialect that it accepts (sadly this is not always hypothetical). As I once alluded to in a parenthetical aside, there is a right way to do this, one that will not make people swear off your software. As part of this you can make a small change to your wiki engine that will make all sorts of transitions much easier and thus make people happier with your markup language.

To put it simply, the wrong way to do wikitext transitions is anything that does not use your normal wikitext rendering engine. The right way is to use your regular wikitext rendering engine but instead of having it output HTML, have it output your new wikitext markup. Using your regular engine means that the conversion process interprets the wikitext exactly as it usually gets displayed; you never have a case where the conversion thinks the wikitext markup means one thing but it actually means another.

(You are also quite likely to have a complete conversion, since the rendering engine is itself a natural checklist of all of your markup. And if you miss some markup that's actually used, you can spot it from unexpected HTML in the output.)
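As a minimal sketch of the idea, assume a toy old dialect with only `*bold*` and `~~italic~~`, and a toy new dialect with `**bold**` and `//italic//`. Both dialects, and every name here (`parse`, `HTMLEmitter`, `NewWikitextEmitter`, `render`), are made up for illustration and are not taken from any real wiki engine. The point is only that one parser feeds both output backends, so the converter cannot disagree with the renderer about what the markup means.

```python
import html
import re

# Hypothetical old dialect: *bold* and ~~italic~~ inline markup.
TOKEN = re.compile(r"\*(?P<bold>[^*]+)\*|~~(?P<ital>.+?)~~")

def parse(text):
    """Parse old-dialect text into a list of (kind, content) nodes."""
    nodes = []
    pos = 0
    for m in TOKEN.finditer(text):
        if m.start() > pos:
            nodes.append(("text", text[pos:m.start()]))
        if m.group("bold") is not None:
            nodes.append(("bold", m.group("bold")))
        else:
            nodes.append(("ital", m.group("ital")))
        pos = m.end()
    if pos < len(text):
        nodes.append(("text", text[pos:]))
    return nodes

class HTMLEmitter:
    """The normal backend: nodes become HTML."""
    def text(self, s): return html.escape(s)
    def bold(self, s): return "<strong>" + html.escape(s) + "</strong>"
    def ital(self, s): return "<em>" + html.escape(s) + "</em>"

class NewWikitextEmitter:
    """The transition backend: nodes become the (hypothetical) new dialect."""
    def text(self, s): return s
    def bold(self, s): return "**" + s + "**"
    def ital(self, s): return "//" + s + "//"

def render(text, emitter):
    # The same parse() is used no matter which backend is plugged in.
    return "".join(getattr(emitter, kind)(content) for kind, content in parse(text))

print(render("a *b* and ~~c~~", HTMLEmitter()))         # a <strong>b</strong> and <em>c</em>
print(render("a *b* and ~~c~~", NewWikitextEmitter()))  # a **b** and //c//
```

A real converter would also have to escape plain text that happens to look like markup in the new dialect; this sketch skips that.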

So why don't wiki authors routinely do this? My guess is that many wikis don't actually have rendering engines with real parsers, but instead mostly use regular expression based progressive rewrites of the input text. Such progressive rewrites are relatively easy for wikitext to HTML because your output format is generally hard to confuse with your input format (which means that you don't run the risk of accidentally reprocessing already fully processed output). They are not as easy with wikitext to wikitext, because here your output format is easily confused with as yet unprocessed input.

(This is the old general regular expression problem of wanting to rename A to B at the same time that you rename B to A.)
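The rename problem is easy to demonstrate. In this small sketch (the function names are invented for the example), two progressive rewrites clobber each other, while a single pass that rewrites each match exactly once and never rescans its own output does the swap correctly.

```python
import re

def swap_sequential(text):
    # Progressive rewrite: the second pass cannot tell original B's
    # from the B's that the first pass just produced.
    text = text.replace("A", "B")
    text = text.replace("B", "A")
    return text

def swap_single_pass(text):
    # One pass over the input; each match is rewritten exactly once
    # and the output is never fed back in as input.
    mapping = {"A": "B", "B": "A"}
    return re.sub(r"[AB]", lambda m: mapping[m.group(0)], text)

print(swap_sequential("A-B"))   # A-A
print(swap_single_pass("A-B"))  # B-A
```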

A closely related way to make people happy with you is to have some way to dump out raw (untemplated) HTML for wikitext pages. People like this because it makes migrating away from your wikitext engine much simpler. Content in plain HTML is extremely portable and relatively easy to put into something else; the HTML that your wiki outputs for actual pages is not so much, because it is ornamented with navigation, sidebars, and so on. Also, when you have a specific 'output plain HTML' mode you can easily make it walk all wikitext pages for people instead of forcing them to crawl their site.
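A sketch of what such a mode might look like, assuming pages are stored as plain files under one directory and that the output directory is somewhere else. Here `render_plain_html` is only a stand-in for the engine's real untemplated renderer, and both it and `dump_all_pages` are hypothetical names.

```python
import html
from pathlib import Path

def render_plain_html(wikitext):
    # Stand-in for the real engine's untemplated renderer (hypothetical):
    # blank-line separated paragraphs become <p> elements, nothing else.
    paras = [p.strip() for p in wikitext.split("\n\n") if p.strip()]
    return "\n".join("<p>" + html.escape(p) + "</p>" for p in paras)

def dump_all_pages(page_root, out_root, render=render_plain_html):
    """Write one bare .html file per page under page_root; return the paths written."""
    page_root, out_root = Path(page_root), Path(out_root)
    written = []
    for src in sorted(page_root.rglob("*")):
        if not src.is_file():
            continue
        # Mirror the page hierarchy and append .html to the page name.
        dest = out_root / src.relative_to(page_root)
        dest = dest.with_name(dest.name + ".html")
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(render(src.read_text(encoding="utf-8")), encoding="utf-8")
        written.append(dest)
    return written
```

Because the walk happens inside the engine, nobody has to crawl the live site and strip navigation and sidebars back out of the result.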

(This is on my mind lately because we are staring at this issue; we have a MoinMoin wiki that we need to turn into something else, and extracting the content in some usable form is clearly going to be a pain.)

I understand that some wikitext engines can import sufficiently plain and straightforward HTML and turn it into wiki markup (eg, I believe there is software to do this for Markdown). I consider this going above and beyond the call of duty for a wiki, but if you want to do it and can do it well it'll certainly be appreciated. If you support both simple HTML output and simple HTML input, try to make sure that doing a round trip doesn't change the markup (because sooner or later some joker will try it, just to see what happens).
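The round trip is cheap to check mechanically. This is a sketch under heavy assumptions: the two converters below are deliberately naive toys that only know a made-up `**bold**` markup, and `round_trip_failures` is a hypothetical helper, not part of any real wiki.

```python
import re

def to_html(wikitext):
    # Toy exporter: only understands **bold**.
    return re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", wikitext)

def from_html(html_text):
    # Toy importer: only understands <strong>...</strong>.
    return re.sub(r"<strong>(.+?)</strong>", r"**\1**", html_text)

def round_trip_failures(pages, to_html=to_html, from_html=from_html):
    """Return the names of pages whose markup changes after wikitext -> HTML -> wikitext."""
    return [name for name, text in pages.items() if from_html(to_html(text)) != text]

pages = {
    "ok": "a **b** c",
    "raw": "a <strong>b</strong> c",  # literal HTML typed into the wikitext
}
print(round_trip_failures(pages))  # ['raw']
```

The "raw" page survives export untouched, but import then rewrites its literal HTML as wiki markup, which is exactly the sort of silent change a round trip test is meant to catch.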

web/GoodWikiTextTransitions written at 21:34:45

Configuration management is not documentation, at least not of intentions

In this Sysadvent paean to configuration management I read the following little bit:

[...] In using a config management system, you are implicitly documenting the system's "desired state" - Why is the system configured this way? [...]

No, you aren't. If you use very well named configuration management classes, variables, and so on, you may at best have started to document the why of your configuration. Otherwise, configuration management documents the how of your configuration but can only lightly touch on the much bigger picture of why.

(Here I want to distinguish between a CM configuration itself and any comments that you add to the CM configuration. Using a CM system doesn't require writing comments, and writing documentation on the why's of a configuration doesn't require putting that documentation into a CM configuration file as comments.)

Documentation on why needs to cover two aspects of the system, neither of which CM captures. The first aspect is why the system exists at all: the high-level picture of what the system and the services running on it do, and how they relate to other machines and services. Your CM system can tell you that this system runs Apache, but there's nothing in the CM configuration itself that will necessarily tell you why. The second aspect is why this system is set up the way it is, things like why you chose a particular daemon and why its configuration is the way it is. There may be vitally important information buried in these decisions, for example the painfully acquired knowledge that on machines with X memory you cannot set Apache parameter Y larger than value Z, but a CM configuration is again silent on all of that.

(And there are also things like why you didn't use some attractive setting.)

There is also a meta-issue, which is that a CM configuration is usually an incomplete specification of the real system. Using a CM system is all about telling it what you do to a target system, ie more or less what you change on it. If you don't need to change something, if the system comes set up for it correctly from the start, it's quite likely that this knowledge will not be in your CM configuration. This is great for compact CM setups, except that once again it means that your CM configuration is missing important information about the system.

(You can use a CM system to redundantly specify everything about a system's configuration, carefully telling it to do things like enable all of the Apache modules that you need even though they're all already enabled in the default install. But I really suspect that most people writing CM configurations are not that bloody-minded and determined; instead they specify the changes and additions that they needed to make to the base system to get things working.)

sysadmin/ConfigMgmtIsNotDocumentation written at 00:06:16
