2006-07-30
XHTML on the web is for masochists
Web design purists like to talk up XHTML at the moment, but as far as I can tell almost everyone who is trying to do XHTML today is a masochist (or ignorant).
First, Internet Explorer does not support XHTML. Not even IE7 will support XHTML, which means that for all practical purposes you cannot serve only XHTML to visitors; some of them need to get an HTML version instead.
The usual dodge is to serve the same XHTML document as XHTML to browsers that can handle it but as text/html to everyone else. The problem here is that XHTML and HTML have different rules in several areas; creating an XHTML page that will render the same in HTML requires painstaking and awkward contortions.
Changing the Content-Type of a URL on a request by request basis means that your web server needs to do some dynamic stuff on every request, even requests for what would otherwise be static files.
Since the Content-Type varies from person to person, I believe that you need to mark your pages as non-cacheable, to avoid having web caches serve a cached version with the wrong Content-Type to a browser that can't handle it.
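To make the per-request work concrete, here is a minimal sketch of the decision a server would have to make on every request. It assumes the choice is driven by the request's Accept header; the function names are hypothetical and are not taken from any particular server. A browser that only sends wildcards such as `*/*` is deliberately treated as not handling XHTML.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def accepts_xhtml(accept_header):
    """Return True if the Accept header explicitly lists XHTML with q > 0."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != XHTML_TYPE:
            continue
        q = 1.0  # a media range with no q parameter defaults to q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable q as "not acceptable"
        return q > 0
    return False


def response_headers(accept_header):
    """Pick the Content-Type for one request and record what it depended on."""
    content_type = XHTML_TYPE if accepts_xhtml(accept_header) else HTML_TYPE
    return {
        "Content-Type": content_type,
        # Vary tells caches that the response depends on the Accept header.
        "Vary": "Accept",
    }
```

The sketch sets `Vary: Accept`, which is HTTP's standard way of telling caches that a response depends on that request header. If you don't trust every cache in the path to honor it, marking the pages non-cacheable is the blunter option.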
And for all of this extra work, what you get is basically equivalent to writing HTML 4.01 strict; it's not as if XHTML gives you more layout power or is easier to write.
(Actually most people are probably ignorant of these issues. This also explains the huge collection of web pages that claim to be valid XHTML but aren't, which would have catastrophic effects if browsers actually believed them, since with XML and XHTML you are supposed to refuse to do anything with the document if it's invalid.)
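The rule in question is XML's error handling: a conforming parser has to stop at the first well-formedness error instead of guessing what the author meant. The following sketch uses Python's standard library parsers as stand-ins for a browser's XML and HTML modes (an assumption for illustration; real browsers use their own parsers) and feeds both the same hypothetical sloppy fragment.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical fragment: the <br> is never closed, which HTML tolerates
# but which makes the text malformed as XML.
SLOPPY = "<p>one<br>two</p>"


def xml_accepts(text):
    """Return True if a strict XML parser is willing to parse the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Lenient HTML parser that just records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(text):
    """Parse the text as HTML and return the start tags encountered."""
    parser = TagCollector()
    parser.feed(text)
    parser.close()
    return parser.tags
```

Here `xml_accepts(SLOPPY)` is false while `html_start_tags(SLOPPY)` happily reports both tags; writing the break as `<br/>` is enough to make the XML parser accept the fragment.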
Some further reading
- Ian Hickson's Sending XHTML as text/html Considered Harmful
- Mark Pilgrim
- Anne van Kesteren's XHTML is invalid HTML and MIME types matter; DOCTYPEs don't.
- W3 XHTML compatibility guidelines
Weekly spam summary on July 29th, 2006
This week, we:
- got 12,284 messages from 218 different IP addresses.
- handled 17,177 sessions from 899 different IP addresses.
- received 152,193 connections from at least 48,479 different IP addresses.
- hit a highwater of 7 connections being checked at once.
Most of these are up somewhat from last week, although they're within the levels that I've come to think of as 'normal variation'. The day to day figures were quite variable:
Day | Connections | different IPs |
Sunday | 15,872 | +6,909 |
Monday | 22,221 | +6,672 |
Tuesday | 26,190 | +7,950 |
Wednesday | 22,421 | +6,288 |
Thursday | 23,553 | +7,173 |
Friday | 26,121 | +8,609 |
Saturday | 15,815 | +4,878 |
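As a cross-check, the per-day figures can be added up and compared with the weekly totals quoted earlier. This sketch assumes the '+' in the second column means IP addresses first seen on that day, which is my reading of the table and is not stated outright; the variable and function names are made up for the example.

```python
# Figures copied from the per-day table: (day, connections, new IPs).
DAILY = [
    ("Sunday", 15872, 6909),
    ("Monday", 22221, 6672),
    ("Tuesday", 26190, 7950),
    ("Wednesday", 22421, 6288),
    ("Thursday", 23553, 7173),
    ("Friday", 26121, 8609),
    ("Saturday", 15815, 4878),
]


def weekly_totals(rows):
    """Sum the connection and new-IP columns over all days."""
    connections = sum(conns for _, conns, _ in rows)
    new_ips = sum(ips for _, _, ips in rows)
    return connections, new_ips


# Compare against the weekly summary: 152,193 connections, 48,479 IPs.
print(weekly_totals(DAILY) == (152193, 48479))
```

Both columns sum exactly to the weekly figures, which is consistent with the second column counting only addresses not seen earlier in the week.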
Kernel level packet filtering top ten:
Host/Mask           Packets   Bytes
213.4.149.12           7102    369K
212.216.176.0/24       5702    287K
81.88.225.210          4275    235K
62.212.90.203          3960    195K
61.128.0.0/10          3900    197K
64.71.176.237          3685    221K
220.160.0.0/11         2800    140K
218.0.0.0/11           2529    126K
213.0.31.4             2496    120K
210.54.141.0/24        2395    115K
This is more or less around the expected levels.
- 213.4.149.12 and 81.88.225.210 reappear from last week.
- 62.212.90.203 has inconsistent reverse DNS, and we don't accept that from its network area. (It's also currently in bl.spamcop.net.)
- 64.71.176.237 tried to keep sending stuff with a MAIL FROM that had tripped our spamtraps.
- 213.0.31.4 uses a bad HELO name. Since that's Telefonica IP space and it has no reverse DNS, next week it will be banned for that.
- 210.54.141.0/24 is xtra.co.nz outgoing mail machines, which tried to keep sending stuff with a MAIL FROM that had tripped our spamtraps. Given that the username of the MAIL FROM is 'uk_winner', I think I can safely chalk up yet another badly managed webmail system.
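For readers unfamiliar with the term, here is a rough sketch of one common meaning of 'inconsistent reverse DNS': the IP address has a PTR name, but that name does not resolve back to the same address. This is an assumption about the check, not a description of the actual filtering setup used here, and the function names are hypothetical. The lookups are passed in as parameters so the logic can be exercised without touching the network.

```python
import socket


def _ptr_name(ip):
    """Look up the primary reverse DNS name for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward_addresses(name):
    """Look up the IPv4 addresses that a hostname resolves to."""
    return socket.gethostbyname_ex(name)[2]


def reverse_dns_status(ip, reverse_lookup=_ptr_name,
                       forward_lookup=_forward_addresses):
    """Classify an IPv4 address as 'none', 'inconsistent' or 'consistent'."""
    try:
        name = reverse_lookup(ip)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return "none"
    try:
        addresses = forward_lookup(name)
    except OSError:
        # The PTR name exists but does not resolve at all.
        return "inconsistent"
    return "consistent" if ip in addresses else "inconsistent"
```

Calling `reverse_dns_status("192.0.2.1")` with the default lookups does live DNS queries, so the answer depends on when and where it is run.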
Connection time rejection stats:
38282 total
19078 dynamic IP
15363 bad or no reverse DNS
 2583 class bl-cbl
  246 class bl-njabl
  165 class bl-sdul
  123 mailup.info
   80 class bl-sbl
   67 class bl-dsbl
   33 class bl-spews
   27 class bl-ordb
Out of the top 30 most rejected IP addresses, 7 were rejected more than 100 times; the champion is 82.89.202.5 (an interbusiness.it IP address) with 419 rejections. 18 of the top 30 are currently in the CBL and six are currently in bl.spamcop.net.
Hotmail's numbers got worse this week:
- no messages accepted.
- 11 messages rejected because they came from non-Hotmail email addresses.
- 15 messages sent to our spamtraps.
- 5 messages refused because their sender addresses had already hit our spamtraps.
- 1 message refused due to its origin IP address being a telkom.co.za IP address.
All of the 'non-Hotmail' addresses rejected were from either msn.com or one of the non-US Hotmail domains. However, almost all of the usernames are typical of advance fee fraud spam usernames (things like 'britishinternational_lottery04' and 'dr_charis_adam13'), so I don't think we're missing much.
And the final numbers:
what | # this week | (distinct IPs) | # last week | (distinct IPs) |
Bad HELOs | 528 | 44 | 307 | 45 |
Bad bounces | 38 | 26 | 38 | 34 |
The leading bad HELO source is 213.129.201.64, with 135 rejections.
In a surprise, this week we got no bounces to any of the three 38-character hex strings. We did get bounces to all of the other usual suspects, with the most-hit username being 'noreply' (5 bounces).