It's a multi-protocol world after all

September 24, 2005

I just fixed a wee bug in DWiki's Atom syndication feeds. The bug was that https:// URLs (such as references to Red Hat's Bugzilla) got mangled in Atom feeds, and only in Atom feeds, to be prefixed with the web site's URL.

DWiki normally generates shortened URLs that have full paths but omit the 'http://website/' bit (for various reasons). But when it generates Atom feed entries, DWiki needs to generate only absolute, fully qualified URLs (the Atom spec calls for this, among other reasons). This means that it needs to be able to recognize which URLs are already fully qualified (because they refer to external websites) and which ones aren't. To tell if a URL was already fully qualified, DWiki was just looking for 'http://' at the start of the URL. So DWiki thought https:// URLs weren't fully qualified and 'helpfully' qualified them in the Atom feed entries.
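To make the failure concrete, here is a minimal sketch of that kind of check. It is not DWiki's actual code; SITE_ROOT and naive_absolute_url are hypothetical names, and I'm assuming site-local URLs are plain paths.

```python
# Hypothetical site root; stands in for the 'http://website/' bit.
SITE_ROOT = "http://website"

def naive_absolute_url(url):
    """Qualify a URL, treating only 'http://' as already absolute."""
    if url.startswith("http://"):
        return url
    # Everything else is assumed to be a site-local path.
    return SITE_ROOT + "/" + url.lstrip("/")

# A local path is qualified correctly...
print(naive_absolute_url("/blog/entry"))
# ...but an https:// URL gets the site's URL stuck on the front.
print(naive_absolute_url("https://bugzilla.redhat.com/"))
```

The second call is the mangling described above: the https:// URL fails the 'http://' test, so it is treated as a local path.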

When I wrote that code, I had forgotten that it's a multi-protocol world (technically, a multi-scheme world). And in a multi-scheme world, checking for just one scheme is almost certainly a bug. In this case, I should have been checking to see if the URL had any scheme at all (which takes somewhat more code; DWiki now goes to this effort).
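Checking for any scheme at all might look something like the following. This is a sketch under my own assumptions, not what DWiki actually does: the names has_scheme, absolute_url and SITE_ROOT are made up for illustration, and the pattern follows the scheme syntax from RFC 3986 (a letter, then letters, digits, '+', '-' or '.', then a colon).

```python
import re

# RFC 3986 scheme syntax, anchored at the start of the URL.
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")

# Hypothetical site root; stands in for the 'http://website/' bit.
SITE_ROOT = "http://website"

def has_scheme(url):
    """Return True if the URL starts with any scheme, not just http."""
    return _SCHEME_RE.match(url) is not None

def absolute_url(url):
    """Leave URLs with a scheme alone; qualify everything else."""
    if has_scheme(url):
        return url
    return SITE_ROOT + "/" + url.lstrip("/")
```

One caveat with this approach: a relative path whose first segment contains a colon (say 'foo:bar') would be mistaken for a URL with a scheme. Since the shortened URLs here carry full paths, I'm assuming they start with '/' and so never match.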

As a result, I have a new mantra: if my code is looking for http:// and I'm not about to connect to a web server, I probably have a bug. (The magnitude of the bug may vary, but at a minimum all my code should look for https:// too.)

(The wonder of having a blog and talking about my own code bugs is that I can display my stupid programming moments in public. Perhaps it'll goad me into writing higher quality code from the start.)



Last modified: Sat Sep 24 03:56:43 2005