2009-01-23
The HTML tax (in Python, and in general)
The HTML tax is my name for all of those bits of verbosity that you have to include when you write straight HTML, as opposed to something more compact. What do I mean by that? Well, consider all of the things that you need for a well-formed, standards-compliant basic HTML web page these days.
A relatively minimal page needs a doctype, a <head> section with
a <title> and ideally a <meta> declaration for the charset, and
then the boilerplate of the <body> tag. If sensibly formatted (by
my standards) that is at least six lines before I get to do anything
more interesting than pass in the title text. After that comes a steady
drizzle of closing tags in the actual content, most of which are just
a distraction from what actually matters.
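To make the overhead concrete, here is a minimal sketch of what such a page looks like when embedded in Python. The names PAGE_TOP, PAGE_BOTTOM, and minimal_page() are hypothetical, and the doctype and charset are just illustrative choices; the point is how many lines go by before any real content appears.

```python
from html import escape

# Hypothetical boilerplate; the doctype and charset are illustrative choices.
PAGE_TOP = """\
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>%s</title>
</head>
<body>
"""

PAGE_BOTTOM = """\
</body>
</html>
"""

def minimal_page(title, body):
    """Return a complete page; body is assumed to already be valid HTML."""
    # Only the title is escaped here, since it is plain text.
    return PAGE_TOP % escape(title) + body + "\n" + PAGE_BOTTOM

print(minimal_page("Hello", "<p>The only line that actually matters.</p>"))
```

Everything except the title and the one paragraph is tax, and the triple-quoted blocks already look nothing like the Python around them.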
Does the formatting of the HTML matter? Yes, absolutely; because this is directly embedded in your source code, it needs to be readable just like the rest of your source code. One of my issues with the HTML tax in Python specifically is that I think that sensible HTML formatting does not look very much like natural Python code formatting, so you have an appearance clash in your code; things change abruptly from style to style, interrupting the visual flow of your source. (Well, of my source.)
This isn't the end of it, because six lines is too much verbosity to
inline every time you want to produce a different HTML page. So you
rapidly start coming up with some way to pass around strings to be
put together with these six or so lines to make up your full error
pages or results pages. I think that this inevitably winds up being
templating via functions: you call error("Thing A went wrong"),
error() wraps its string argument in your standard error blurb and
calls stdpage(bodyconts), and stdpage() in turn wraps its string
argument in your standard page boilerplate. A large blob of HTML
eventually comes out the end.
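A rough sketch of that chain might look like the following. The names error() and stdpage() come from the description above; the optional title argument, the wording of the error blurb, and the exact boilerplate are assumptions made up for illustration.

```python
from html import escape

def stdpage(bodyconts, title="Untitled"):
    """Wrap already-formed body HTML in the standard page boilerplate.

    The title parameter and the specific boilerplate are hypothetical.
    """
    return "\n".join([
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
        '"http://www.w3.org/TR/html4/strict.dtd">',
        "<html>",
        "<head>",
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">',
        "<title>%s</title>" % escape(title),
        "</head>",
        "<body>",
        bodyconts,
        "</body>",
        "</html>",
        "",
    ])

def error(msg):
    """Wrap a plain-text message in a canned error blurb, then in a page."""
    # The blurb text is invented; msg is escaped because it is plain text.
    body = "<h1>Error</h1>\n<p>%s</p>\n<p>Please go back and try again.</p>" % escape(msg)
    return stdpage(body, title="Error")

print(error("Thing A went wrong"))
```

Note how little room error() leaves for anything but a single sentence: every failure gets the same heading and the same closing advice, whether or not they fit.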
Templating via functions is not bad, exactly (as long as you keep it simple enough), but the problem is that the HTML tax (and all of the structure that you've built to get around it) serves as a strong disincentive to deviate from the canned structure, even when deviating would make for better messages to the user. In other words, once you have a canned error routine, everything becomes a canned error. By contrast, simple HTML generation systems reduce the tax overhead, and so encourage you to create specialized HTML pages wherever they would be useful, instead of relying on generic ones with some blanks filled in.
(To rephrase a previous entry, templating systems don't solve this problem because they (generally) don't make it much easier to create a new specialized 'template' than writing straight HTML.)