XHTML on the web is for masochists

July 30, 2006

Web design purists like to talk up XHTML at the moment, but as far as I can tell almost everyone who is trying to do XHTML today is a masochist (or ignorant).

First, Internet Explorer does not support XHTML. Not even IE7 will support XHTML, which means that for all practical purposes you cannot serve only XHTML to visitors; some of them need to get an HTML version instead.

The usual dodge is to serve the same XHTML document as XHTML to browsers that can handle it but as text/html to everyone else. The problem here is that XHTML and HTML have different rules in several areas; creating an XHTML page that will render the same in HTML requires painstaking and awkward contortions.

Changing the Content-Type of a URL on a request-by-request basis means that your web server has to look at each request's headers and decide dynamically what to send, even for what would otherwise be static files.
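As a rough illustration of the per-request decision involved, here is a minimal sketch in Python. It assumes the server decides from the request's Accept header; the function name choose_content_type is made up for this example, and real content negotiation handles more cases than this does.

```python
def choose_content_type(accept_header):
    """Pick a Content-Type from the request's Accept header.

    Hypothetical helper: serves XHTML only to clients that explicitly
    list application/xhtml+xml with a non-zero q value.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # a media type without a q parameter defaults to q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable q as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"
    # Everyone else, including clients that only send */*, gets HTML.
    return "text/html"
```

The point is less the parsing than the fact that something like this has to run for every request, static file or not.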

Since the Content-Type varies from person to person, I believe that you need to mark your pages as non-cacheable, to avoid having web caches serve a cached version with the wrong Content-Type to a browser that can't handle it.
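To make the caching worry concrete, here is a toy model in Python. Everything in it is hypothetical: NaiveCache stands in for a shared web cache that keys only on the URL, and origin stands in for a server that picks the Content-Type from the Accept header.

```python
class NaiveCache:
    """A toy shared cache that keys stored responses on the URL alone."""

    def __init__(self, origin):
        self.origin = origin  # callable: (url, accept) -> (content_type, body)
        self.store = {}

    def get(self, url, accept):
        # The Accept header is ignored when looking things up.
        if url not in self.store:
            self.store[url] = self.origin(url, accept)
        return self.store[url]


def origin(url, accept):
    # Hypothetical origin server: sends XHTML only to clients that ask for it.
    if "application/xhtml+xml" in accept:
        return ("application/xhtml+xml", "<html>...</html>")
    return ("text/html", "<html>...</html>")


cache = NaiveCache(origin)
first = cache.get("/page", "application/xhtml+xml,text/html;q=0.9")
second = cache.get("/page", "image/gif, */*")  # a browser that can't handle XHTML
```

In this sketch the second client is handed the response that was stored for the first one, labelled application/xhtml+xml, even though it never asked for that.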

And for all of this extra work, what you get is basically equivalent to writing HTML 4.01 strict; it's not as if XHTML gives you more layout power or is easier to write.

(Actually most people are probably ignorant of these issues. This also explains the huge collection of web pages that claim to be valid XHTML but aren't, which would have catastrophic effects if browsers actually believed them, since an XML parser is supposed to refuse to do anything with a document that isn't well-formed.)
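The strictness is easy to see with an ordinary XML parser. The following is a small sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample markup are made up for illustration. Strictly speaking, what it checks is well-formedness rather than validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False if it bails out."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


xhtml_style = "<p>Hello<br/>world</p>"  # every element is closed
html_style = "<p>Hello<br>world</p>"    # fine as HTML, fatal as XML
```

Here is_well_formed(xhtml_style) is true and is_well_formed(html_style) is false: a single unclosed br, which an HTML parser shrugs off, is enough to make an XML parser reject the whole document.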


Comments on this page:

From 87.79.236.202 at 2008-12-16 05:24:49:

You don't need to mark your pages truly non-cacheable, but for all intents and purposes the result will be the same. Specifically, you need to send Vary: User-Agent or Vary: Accept (or the combination) in your headers. Due to the huge variations of user agent strings, the former is a tacit statement of non-cacheability, whereas the latter fares a lot better. Eric Bowman described an insanely clever hack on rest-discuss: he uses Vary: Cookie to be able to set a server-controlled cache key.

But note that the only Vary value that Internet Explorer assigns meaning to is User-Agent, which it will treat as absent, since the browser itself always sends the same user agent string. For all other values, IE will behave as for Vary: * and re-request the page on every access.

Aristotle Pagaltzis
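
The effect of the Vary header described in the comment above can be sketched roughly as follows. This is a simplified model in Python, not how any particular cache is implemented; the function name cache_key and the sample header values are assumptions made for the example.

```python
def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary.

    Simplified model: real caches also normalize header values and treat
    "Vary: *" specially, which this sketch does not attempt.
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((n, lowered.get(n, "")) for n in sorted(names))


# Two hypothetical clients that differ only in their user agent strings.
a = {"Accept": "application/xhtml+xml,text/html;q=0.9", "User-Agent": "BrowserA/1.0"}
b = {"Accept": "application/xhtml+xml,text/html;q=0.9", "User-Agent": "BrowserB/2.0"}

same_under_accept = cache_key("/page", a, "Accept") == cache_key("/page", b, "Accept")
same_under_ua = cache_key("/page", a, "User-Agent") == cache_key("/page", b, "User-Agent")
```

Under Vary: Accept the two clients share one cache entry, while under Vary: User-Agent each distinct user agent string gets its own, which is why the latter amounts to turning caching off.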

Written on 30 July 2006.
Last modified: Sun Jul 30 13:25:40 2006