2010-03-20
Web analytics versus GET parameter security
I have recently run into an interesting collision between typical web analytics practices (specifically, applying them to arbitrary URLs) and good security and robustness. The straightforward manifestation is that links to WanderingThoughts entries from the Planet Sysadmin Twitter feed don't work; trying to follow one gets you remarkably terse error messages from DWiki (the software behind WanderingThoughts).
DWiki is very cautious. One of the ways that this manifests is that it doesn't accept random query parameters on requests; it knows what query parameters each URL accepts, and anything else is an error. I maintain that this is both secure and robust; certainly my logs have a constant parade of attempts to exploit the willingness of bad PHP applications to accept additional random (and, as it turns out, dangerous) query parameters. The abrupt error messages are happening because of extra query parameters.
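(To make the idea concrete, here is a minimal sketch of this sort of strict checking. It is not DWiki's actual code; the paths, the parameter names in ALLOWED_PARAMS, and the check_query function are all made up for illustration.)

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical table of which query parameters each path accepts.
# Anything not listed for a path is treated as an error.
ALLOWED_PARAMS = {
    "/blog/entry": {"showcomments", "atom"},
    "/blog/": {"page"},
}


def check_query(url):
    """Return the set of query parameter names that the URL's path
    does not accept. An empty set means the request is acceptable."""
    parts = urlsplit(url)
    allowed = ALLOWED_PARAMS.get(parts.path, set())
    # keep_blank_values so that 'name=' still counts as a parameter.
    given = parse_qs(parts.query, keep_blank_values=True)
    return set(given) - allowed


if __name__ == "__main__":
    bad = check_query("/blog/entry?utm_source=twitterfeed&utm_medium=twitter")
    if bad:
        print("error: unexpected query parameters:", ", ".join(sorted(bad)))
```

A real application would turn a non-empty result into an error response instead of printing it; the point is only that the set of acceptable parameters is known in advance for every URL.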
The extra query parameters aren't directly visible in the URLs in the Twitter feed, which uses bit.ly to shorten the URLs, and they aren't in the original form of the entries on Planet Sysadmin. Instead the shortening process is adding them on.
The extra query parameters are always '?utm_source=twitterfeed&utm_medium=twitter'. Some web searching suggests that this sort of query parameter is added to URLs so that JavaScript-based, on-page web analytics packages can track the source of inbound links (I think especially advertising-based stuff, but I'm not sure). This hijacking of query parameters for web analytics requires that the target application ignore random extra query parameters. As this incident nicely illustrates, this is not an assumption that you can or should make in general for other people's URLs (and you'll want to test it for your own web application).
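(If you want to test your own application, one approach is to generate tagged versions of your URLs and see what comes back. The following is a rough sketch of the tagging step only; add_tracking is a made-up helper, example.org stands in for a real site, and I'm assuming the parameters are simply appended to whatever query string is already there, which is my guess at what the tagging service does.)

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The two parameters seen in this incident.
TRACKING = [("utm_source", "twitterfeed"), ("utm_medium", "twitter")]


def add_tracking(url, extra=TRACKING):
    """Return url with the tracking parameters appended to its query
    string, keeping any query parameters that were already there."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True) + list(extra)
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder URLs; substitute ones from your own site.
    for url in ("http://example.org/blog/entry",
                "http://example.org/blog/?page=2"):
        print(add_tracking(url))
```

Fetching each tagged URL and comparing the response to the untagged one will tell you whether your application ignores the extra parameters, rejects them, or does something odder.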
(I suspect that the direct culprit is twitterfeed's analytics features, which I further suspect are enabled by default.)
PS: I've let the Planet Sysadmin people know about this, so it'll presumably get fixed at some point, assuming that twitterfeed and all of the other moving parts involved allow you to turn it off.
2010-03-14
Space and content
One of the things that's been driven into me in the process of writing WanderingThoughts is that the amount of horizontal and vertical space that your content sits in does affect its readability (even when it is not absurdly small or absurdly large), and in turn this affects how you write your content. Some things only look right when inside narrow margins and look horrible when expanded outside of that, and vice versa. The same is true in many ways for vertical space and how much is visible at once.
More concretely, my entries for WanderingThoughts often have relatively short paragraphs. One reason for this is that I both preview and read WanderingThoughts entries with relatively narrow margins. Short paragraphs wind up looking right for me in this environment and long paragraphs often look wrong, and this holds true even though the longer paragraphs look perfectly fine in my editor, sometimes even preferable; often it feels somewhat unnatural to write paragraphs that are only two or three sentences long, although I know they'll look right in the end.
(What I think happens is that when my horizontal margins shrink, paragraphs look bulkier and longer because they take up more vertical space. Similarly, when horizontal margins widen, paragraphs look smaller and shorter, eventually reaching a point of unreadable absurdity.)
What I take from this and from related experiences is that you cannot really divorce content from its presentation. Your knowledge of how things look (and how they will look) will affect how you put your content together, and so your website's layout affects how you structure your writing. If you revise your website's fundamental layout, you may well wind up structuring your writing differently.
The side effect of writing this down is that I now feel somewhat more sympathy for people who try to create fixed-size, fixed-font website layouts. In this view, they're responding to the link between layout and content with an attempt to make sure that everyone will see the content the same way.
(I still think it's a misguided attempt. Even if you control the width of the content area you can't necessarily control the size of the user's fonts, and it is the interrelationship between the two that matters.)
One obvious corollary is that a blog's support for draft entries should let you preview them in something that is as close to the real site layout as possible, complete with your usual sidebars and so on, so that you can see how everything fits together.
(How important this is depends on how intrusive your sidebars and header and so on are.)