There are two different situations for content-types
It's struck me that there are two different situations for content-type sniffing in web agents: when the web browser knows that it is requesting a specific sort of thing, and when it can have no expectations about what it will get. The first situation happens when the browser is doing things like loading CSS stylesheets or retrieving inlined images; the second situation usually happens when the user clicks on a link.
When the browser knows what it is supposed to be getting, it already knows what it's trying to do with the data, and either it's good data or it's not (and the browser has to check). Because the situation is already unambiguous, insisting on the web server sending the right content type as well as valid data is just being legalistic.
(Even with inlined images, a given browser has a relatively small number of image formats it supports and generally all of them have good validity and sanity checks.)
As a result, I believe that in practice pretty much all web agents are fairly forgiving of content types in this situation, and may outright ignore them no matter how technically incorrect this is. (In a sense this is a rerun of the strict validation versus loose validation arguments, and we know how those ended.)
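To make the "known context" case concrete, here is a minimal sketch of loading an inlined image by checking the data and ignoring the declared content type. The function names and the short signature table are my own illustration, not any real browser's code. The magic numbers themselves are the standard leading bytes for PNG, GIF, and JPEG.

```python
# Hypothetical sketch: in a known context (an inlined image), decide from
# the bytes themselves and ignore the server's declared Content-Type.
# The names below are illustrative, not taken from any real browser.

IMAGE_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff_image(data: bytes):
    """Return the image type implied by the leading bytes, or None."""
    for magic, mime in IMAGE_SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


def load_inline_image(declared_type: str, data: bytes) -> str:
    """Accept the data if it looks like a supported image.

    declared_type is deliberately unused: the context already says
    'this is an image', so only the data's validity matters.
    """
    actual = sniff_image(data)
    if actual is None:
        raise ValueError("not a supported image, whatever the server claims")
    return actual
```

A real browser would go on to fully decode the image, which is the actual validity check. The signature test here only stands in for that step.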
When the browser does not know what it is supposed to be getting, the options are many, the ambiguity is huge (as is the potential harm), and many of the potential formats lack good validity checks. Here, browsers use the server's content type to disambiguate the situation, and if the results are wrong, well, at least it's not their fault.
(Sometimes the ambiguity is unresolvable. If I send you a valid HTML document as text/plain, is it because I want you to read the HTML source or because I screwed up?)
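The "unknown context" case can be sketched the same way. This is a hypothetical illustration under my own assumptions (the function name and the returned action strings are made up), showing a browser that simply believes the declared content type when the user follows a link.

```python
# Hypothetical sketch: in an unknown context (the user clicked a link),
# let the server's declared Content-Type pick the handling.
# The function name and action strings are illustrative only.

def handle_navigation(declared_type: str, data: bytes) -> str:
    """Pick a handling strategy from the declared type alone."""
    # Drop parameters such as '; charset=utf-8' and normalize case.
    media_type = declared_type.split(";", 1)[0].strip().lower()

    if media_type == "text/html":
        return "render as HTML"
    if media_type == "text/plain":
        return "show as plain text"
    if media_type.startswith("image/"):
        return "display as image"
    # Anything unrecognized is not interpreted at all.
    return "offer to download"


# The unresolvable case: perfectly valid HTML, declared as text/plain.
page = b"<!DOCTYPE html><html><body><p>Hello</p></body></html>"
action = handle_navigation("text/plain; charset=utf-8", page)
```

Here `action` ends up as the plain-text handling, so the user sees the HTML source. Whether that is what the sender wanted is exactly what the code cannot know.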