There are two different situations for content-types

January 9, 2008

It's struck me that there are two different situations for content-type sniffing in web agents: when the web browser knows that it is requesting a specific sort of thing, and when it can have no expectations about what it will get. The first situation happens when the browser is doing things like loading CSS stylesheets or retrieving inlined images; the second situation usually happens when the user clicks on a link.

When the browser knows what it is supposed to be getting, it already knows what it's trying to do with the data, and the data is either valid for that purpose or it's not (which the browser has to check anyway). Because the situation is already unambiguous, insisting on the web server sending the right content type as well as valid data is just being legalistic.

(Even with inlined images, a given browser has a relatively small number of image formats it supports and generally all of them have good validity and sanity checks.)

As a result, I believe that in practice pretty much all web agents are fairly forgiving of content types in this situation, and may outright ignore them no matter how technically incorrect this is. (In a sense this is a rerun of the strict validation versus loose validation arguments, and we know how those ended.)
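To make the first situation concrete, here is a minimal sketch of how an agent might handle an inlined image, assuming it only supports PNG, GIF, and JPEG. The function names (`sniff_image`, `load_inline_image`) are hypothetical, and real browsers do far more validation than checking a leading signature; the point is only that the declared content type never has to enter into the decision.

```python
from typing import Optional


def sniff_image(data: bytes) -> Optional[str]:
    """Guess the image format from its leading signature bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return None


def load_inline_image(declared_type: str, data: bytes) -> str:
    """Decide how to decode data that was requested as an image.

    declared_type is the server's Content-Type. It is deliberately
    ignored here (an assumption of this sketch): we asked for an image,
    so the data is either a supported image or it is not.
    """
    fmt = sniff_image(data)
    if fmt is None:
        raise ValueError("data is not a supported image format")
    return fmt
```

With this approach a PNG served as text/plain still loads as a PNG, while an HTML page served as image/png is rejected.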

When the browser does not know what it is supposed to be getting, the options are many, the ambiguity is huge (as is the potential harm), and many of the potential formats lack good validity checks. Here, browsers use the server content type to disambiguate the situation, and if the results are wrong, well, at least it's not their fault.

(Sometimes the ambiguity is unresolvable. If I send you a valid HTML document as text/plain, is it because I want you to read the HTML source or because I screwed up?)
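The second situation can be sketched the same way. This is an illustration under assumptions, not how any particular browser works: `choose_handling` is a hypothetical name, the handling labels are made up, and the list of media types is far from complete. What it shows is a decision that rests entirely on the server's Content-Type and never looks at the body.

```python
def choose_handling(content_type_header: str) -> str:
    """Pick what to do with a followed link, using only the Content-Type."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()

    if media_type in ("text/html", "application/xhtml+xml"):
        return "render"
    if media_type == "text/plain":
        return "show-as-text"
    if media_type.startswith("image/"):
        return "show-image"
    # Anything unrecognized is treated as opaque data (an assumption).
    return "download"
```

Under this sketch, a valid HTML document sent as text/plain is shown as source text. The code has no way to tell whether that was what the sender wanted, which is exactly the unresolvable case.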

Written on 09 January 2008.

Last modified: Wed Jan 9 23:03:42 2008