2011-05-31
The programmer's problem with WikiText systems
Here is a confession: I have been sitting on an update to DWiki that adds some things to its dialect of wikitext for going on four years now. The core problem boils down to two issues specific to wikis.
The first is that in a wiki, additions to your markup language are more or less forever. Once people actually start writing pages that use them, you have three choices; you can support the addition forever, you can drop the addition and live with the resulting broken pages, or you can go through your entire page database and try to rewrite all of the pages to use some new equivalent. It is possible to do the last option well, but people usually don't and then sysadmins hate you.
(If you want to do a good job, reuse your real wikitext parser and just have it emit the correct new version of the wikitext instead of HTML. This way you guarantee that your conversion process parses the old wikitext in exactly the same way that page rendering does.)
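(To make that concrete, here is a stripped-down sketch of the idea in Python. The toy dialect, the node representation, and the 'new' markup are all invented for illustration; DWiki's real parser looks nothing like this.)

    # A stripped-down sketch of converting wikitext with the real parser:
    # parse a page into nodes once, then render those nodes either to HTML
    # (normal page views) or back out as the new wikitext dialect
    # (conversion). The dialect here is a toy one where *bold* becomes
    # **bold**.
    import re

    def parse(wikitext):
        # Returns a list of ('text', s) and ('bold', s) nodes.
        pieces = re.split(r'\*([^*]+)\*', wikitext)
        return [('bold' if i % 2 else 'text', s) for i, s in enumerate(pieces)]

    def to_html(nodes):
        return ''.join('<b>%s</b>' % s if kind == 'bold' else s
                       for kind, s in nodes)

    def to_new_wikitext(nodes):
        # The same walk as to_html(), so conversion and page rendering can
        # never disagree about how the old markup parses.
        return ''.join('**%s**' % s if kind == 'bold' else s
                       for kind, s in nodes)

    page = "This is *important* text."
    assert to_html(parse(page)) == "This is <b>important</b> text."
    assert to_new_wikitext(parse(page)) == "This is **important** text."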
The second is the eternal wikitext lament: all of the good formatting characters (and character sequences) are taken. You never have enough formatting markup to go around, and every bit of markup that you use is a bit of plain text that people can't use (this is especially so for the good markup). When combined with the first problem, this means that you want to be really sure that some particular bit of markup is the right use for its particular character sequence before committing to it because you're mostly stuck if it later turns out that you made a bad choice; a bad choice can be really painfully bad, forever locking off an otherwise attractive bit of markup.
It's interesting (well, to me) to think about why HTML doesn't suffer from this issue. Part of it is that HTML adds features slowly, but I think that a large part of it is that HTML's markup is not in short supply the way that wikitext markup is. The goal of good wikitext markup is to be unintrusive and small. HTML doesn't have this concern since it's already made the decision that its markup will be clearly intrusive, and thus it has a much wider range of decent markup to choose from.
(The CS geek way to put this is that HTML has made the decision to put all of its markup in its own namespace, separate from your document's actual text, whereas wikitext markup is trying to live in the same namespace as your document. One of the times that HTML becomes unusually irritating is exactly when the two namespaces can't be kept separate because your document text is riddled with <'s and &'s.)
2011-05-30
My recent experience with Firefox's speed
I used to look a bit oddly at people who talked about Firefox being slow, because that wasn't my experience at all. My Firefox worked fast and stayed that way even when I left it running for months, and I usually attributed this to not allowing Javascript to run (I wrote about this a long time ago and I still stand by it). The idea of migrating to another browser for faster speed always sounded a bit odd; how much closer to instant responses could you really get?
(My Firefox was not the speediest thing going when running heavy Javascript, but then my machine itself is not the most modern and I wasn't too surprised by that.)
Then recently I upgraded from my personally compiled vintage 2006 Firefox (which was more or less some 3.0 alpha or beta version, built from the then-current CVS source code) to a current Firefox 3.6. This has given me a whole new appreciation of the issue, because my new Firefox is, shall we say, not speedy. In fact it's frequently decidedly pokey, with clear pauses every so often when I want it to do things like open new windows, follow links, or even scroll text. Many things seem to stutter or otherwise not work anywhere near as well as my old Firefox did.
I'm not sure what the cause of this bad performance is, but my suspicion falls on Firefox's new SQLite-based history database, which is known to have problems on Linux to start with. I keep a perpetual browsing history, so I have a very large history database (the SQLite file for my home history is 185 MBytes); I can believe that even checking it is kind of slow (especially if it involves actual disk IO). By contrast, Firefox's old history database basically kept everything in memory and seems to have worked fine for me despite its size.
(The obvious experiment is to temporarily throw away my history or drastically reduce its size, but that has certain downsides. I keep my large history because I find it very useful, after all.)
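(A less drastic check would be to time history queries directly against a copy of the database, along the lines of the rough Python sketch below. The moz_places table and its url column come from Firefox's Places schema, and the query is just a crude stand-in for a history lookup, not what Firefox itself runs.)

    # Rough timing of a history lookup against a *copy* of places.sqlite,
    # to see whether a large history database is plausibly the bottleneck.
    # This is a diagnostic guess, not how Firefox itself queries history.
    import sqlite3
    import time

    conn = sqlite3.connect("places-copy.sqlite")
    start = time.time()
    # A crude stand-in for 'have I visited anything like this URL?' lookups;
    # the LIKE forces a scan, which is roughly the worst case.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM moz_places WHERE url LIKE ?",
        ("%example.org%",)).fetchone()
    print("matched %d places in %.3f seconds" % (count, time.time() - start))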
Sidebar: why I finally upgraded my old Firefox
The short answer is 'it seemed about time'. Two things pushed me over the edge; first, my old Firefox was crashing fairly frequently when I browsed LiveJournal, and second, it looked increasingly like I was going to have to give up xfs on Fedora because using xfs crashes the X server and the bug did not look like it was going to get fixed. My main reason for still using my old Firefox was that it was the last Firefox version that could use the fonts I like, but it needed xfs for other fonts; if I was going to have to give up xfs, I was effectively going to have to give it up anyways.
2011-05-25
More not supporting random query parameters in URLs
A while ago I wrote Websites should not accept random parameters in requests, arguing that your web server or application should not accept query parameters that it doesn't expect. Recently there was a comment there that argued:
Postel's law does apply in the case. Consider the aspect of client and server versioning. Moreover, links may be recorded and followed at a later date. Your proposal of not serving a page with unexpected URLs will break links as soon as the server is modified.
I feel that this incorrectly merges the two different cases of query parameters that you once used but no longer do and query parameters that you never used at all.
Query parameters that you've used in the past are a case of Cool URIs don't change. Any time you consider changing URLs, you should be making a conscious and deliberate decision. Ideally you'll carefully evaluate what URLs you want to stay and what URLs you want to stop working. And yes, sometimes I think that the right decision is to remove old URLs, because cool URLs are hard.
While it's tempting to 'support' old query parameters by accepting them and then ignoring them, I think that this is often a mistake. Mapping old URLs to new results is not necessarily doing anyone any favours; you really do need to return the same sort of thing that people are expecting. You can do a certain amount of remapping, but something like turning an Atom syndication feed into an HTML page is not actually any better than removing the URL entirely and returning a 404 result. If you want to really continue supporting the query parameters, you can't necessarily ignore them; you may well need to generate some redirects to the appropriate new versions of the pages.
Or in short: you don't want to just serve up any old page in response to old URLs, you want to serve up a useful result for the people who are using those old URLs.
(And if you can't serve up a useful result, the last thing you want to do is preserve the URL as something useful. Either return a 404 or serve up a redirect to some URL that you actually want people to use now.)
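(As a concrete sketch of what that looks like, here is one way to handle a formerly supported query parameter, using Flask with made-up parameter names and URLs: redirect the variants that still have a current equivalent, and give a clear 410 for the ones that don't.)

    # A sketch (Flask, with invented parameter names and URLs) of genuinely
    # supporting an old query parameter instead of silently ignoring it.
    from flask import Flask, abort, redirect, request

    app = Flask(__name__)

    @app.route("/blog/")
    def blog_index():
        fmt = request.args.get("fmt")
        if fmt == "atom":
            # The feed moved; send feed readers to its current URL.
            return redirect("/blog/atom/", code=301)
        if fmt == "rss091":
            # A format we deliberately dropped; say so instead of serving
            # an HTML page that no feed reader wants.
            abort(410)
        return "the normal HTML index page"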
For query parameters that you have never used, the logic of my original argument applies in full force. You have no idea what effect or result the client expects to get from the query parameter, so robustness argues that you should not guess. That you might use some query arguments in the future is a weak argument; why is the client generating them now? The time to add support for them is when you actually use them (either in URLs you generate on the server or in a version of software that you push to clients).
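(And a matching sketch of the original position, again in Flask with invented names: compare the submitted parameter names against the ones this page actually understands and refuse to guess about the rest.)

    # A sketch of refusing query parameters this view has never used; the
    # allowed set and the 400 response are illustrative choices.
    from flask import Flask, abort, request

    app = Flask(__name__)

    EXPECTED_ARGS = {"page", "fmt"}

    @app.route("/blog/")
    def blog_index():
        unexpected = set(request.args) - EXPECTED_ARGS
        if unexpected:
            # We have no idea what the client expects these to do, so
            # don't guess; make the mismatch visible instead.
            abort(400)
        return "the normal index page"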