The HTML tax (in Python, and in general)

January 23, 2009

The HTML tax is my name for all of those bits of verbosity that you have to include when you write straight HTML, as opposed to something more compact. What do I mean by that? Well, consider all of the things that you need for a well-formed, standards-compliant basic HTML web page these days.

A relatively minimal page needs a doctype, a <head> section with a <title> and ideally a <meta> declaration for the charset, and then the boilerplate of the <body> tag. If sensibly formatted (by my standards) that is at least six lines before I get to do anything more interesting than pass in the title text. After that comes a steady drizzle of closing tags in the actual content, most of which are just a distraction from what actually matters.

Does the formatting of the HTML matter? Yes, absolutely; because the HTML is directly embedded in your source code, it needs to be readable just like the rest of your source code. One of my issues with the HTML tax in Python specifically is that I think sensible HTML formatting does not look very much like natural Python code formatting, so you have an appearance clash in your code; things change abruptly from one style to the other, interrupting the visual flow of your source. (Well, of my source.)
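To make the tax and the style clash concrete, here is a rough sketch of what inlining a minimal page in Python might look like. The function name minimal_page is made up for illustration, and the short HTML5-style doctype and the use of html.escape from Python 3 are my assumptions, not anything the entry prescribes.

```python
from html import escape

def minimal_page(title, body):
    """Return a minimal HTML page as a string.

    The title is escaped; body is assumed to already be valid HTML.
    """
    # Everything up to and including <body> is boilerplate that has to be
    # there before any real content appears.
    return """<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>%s</title>
</head>
<body>
%s
</body>
</html>
""" % (escape(title), body)

page = minimal_page("Results", "<p>Everything worked.</p>")
```

Note how the HTML runs flush against the left margin inside an otherwise indented function; that is the sort of abrupt change in visual style at issue here.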

This isn't the end of it, because six lines is too much verbosity to inline every time you want to produce a different HTML page. So you rapidly start coming up with some way to pass around strings to be put together with these six or so lines to make up your full error pages or results pages. I think that this inevitably winds up being templating via functions, where you call error("Thing A went wrong"), error() wraps its string argument in your standard error blurb and calls stdpage(bodyconts), stdpage() wraps its string argument in your standard page boilerplate, and a large blob of HTML eventually comes out the end.
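As a minimal sketch of templating via functions, the chain described above might look something like this. The names error(), stdpage() and bodyconts come from the description; the PAGE template, the title parameter, the wording of the error blurb and the use of html.escape are assumptions made up for the example.

```python
from html import escape

# Standard page boilerplate (hypothetical; any real site would have its own).
PAGE = """<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>%(title)s</title>
</head>
<body>
%(body)s
</body>
</html>
"""

def stdpage(bodyconts, title="Results"):
    """Wrap already-formed HTML body content in the standard page."""
    return PAGE % {"title": escape(title), "body": bodyconts}

def error(msg):
    """Wrap a plain-text message in the standard error blurb."""
    blurb = ("<h1>Error</h1>\n"
             "<p>%s</p>\n"
             "<p>Please go back and try again.</p>") % escape(msg)
    return stdpage(blurb, title="Error")

html_out = error("Thing A went wrong")
```

Each layer only knows how to wrap a string, which keeps things simple, but it also means that every error comes out in exactly the same shape.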

Templating via functions is not bad, exactly (as long as you keep it simple enough), but the problem is that the HTML tax (and all of the structure that you've built to get around it) serves as a strong disincentive to deviate from the canned structure, even when deviating would make for better messages to the user. In other words, once you have a canned error routine, everything becomes a canned error. By contrast, simple HTML generation systems reduce the tax overhead, and so encourage you to create specialized HTML pages where they would be useful instead of relying on generic ones with some blanks filled in.

(To rephrase a previous entry, templating systems don't solve this problem because they (generally) don't make it much easier to create a new specialized 'template' than writing straight HTML.)
