Wandering Thoughts archives

2006-07-30

XHTML on the web is for masochists

Web design purists like to talk up XHTML at the moment, but as far as I can tell almost everyone who is trying to do XHTML today is a masochist (or ignorant).

First, Internet Explorer does not support XHTML. Not even IE7 will support XHTML, which means that for all practical purposes you cannot serve only XHTML to visitors; some of them need to get an HTML version instead.

The usual dodge is to serve the same XHTML document as XHTML to browsers that can handle it but as text/html to everyone else. The problem here is that XHTML and HTML have different rules in several areas; creating an XHTML page that will render the same in HTML requires painstaking and awkward contortions.

Changing the Content-Type of a URL on a request by request basis means that your web server needs to do some dynamic stuff on every request, even requests for what would otherwise be static files.

Since the Content-Type varies from person to person, I believe that you need to mark your pages as non-cacheable, to avoid having web caches serve a cached version with the wrong Content-Type to a browser that can't handle it.
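A minimal sketch of the per-request decision described above, in Python. The names `accepts_xhtml` and `response_headers` are hypothetical and this is not how any particular web server does it. It assumes that a client should only get `application/xhtml+xml` if its `Accept` header lists that type explicitly with a non-zero q-value, and that `Cache-Control: no-store` is an acceptable way to mark the page as non-cacheable.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def accepts_xhtml(accept_header):
    """Return True if the Accept header explicitly lists XHTML with q > 0.

    A bare */* is deliberately not treated as XHTML support (an assumption
    of this sketch), since it says nothing about what the client can render.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != XHTML:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q > 0
    return False


def response_headers(accept_header):
    """Pick the Content-Type for one request and mark it non-cacheable."""
    ctype = XHTML if accepts_xhtml(accept_header or "") else HTML
    return {"Content-Type": ctype, "Cache-Control": "no-store"}
```

Even this small amount of logic has to run on every request, including requests for what would otherwise be static files.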

And for all of this extra work, what you get is basically equivalent to writing HTML 4.01 strict; it's not as if XHTML gives you more layout power or is easier to write.

(Actually most people are probably ignorant of these issues. This also explains the huge collection of web pages that claim to be valid XHTML but aren't, which would have catastrophic effects if browsers actually believed them, since with XML and XHTML you are supposed to refuse to do anything with the document if it isn't well-formed.)
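To illustrate how unforgiving XML parsing is, here is a small sketch using Python's standard `xml.etree.ElementTree`. It only checks well-formedness, not validity against the XHTML DTD, and the helper name `is_well_formed` and the two sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False if it gives up."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Fine as XML: the empty element is explicitly closed.
good = "<p>line one<br/>line two</p>"

# Typical HTML habit: an unclosed <br>. An XML parser rejects the whole thing.
bad = "<p>line one<br>line two</p>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

An HTML parser would shrug off the second string; an XML parser stops at the first error instead of guessing.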

Some further reading

web/XHTMLMasochism written at 13:25:40;

Weekly spam summary on July 29th, 2006

This week, we:

  • got 12,284 messages from 218 different IP addresses.
  • handled 17,177 sessions from 899 different IP addresses.
  • received 152,193 connections from at least 48,479 different IP addresses.
  • hit a highwater of 7 connections being checked at once.

Most of these are up somewhat from last week, although they're within the levels that I've come to think of as 'normal variation'. The day to day figures were quite variable:

Day         Connections   different IPs
Sunday           15,872          +6,909
Monday           22,221          +6,672
Tuesday          26,190          +7,950
Wednesday        22,421          +6,288
Thursday         23,553          +7,173
Friday           26,121          +8,609
Saturday         15,815          +4,878
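As a quick cross-check, the daily figures can be summed and compared with the weekly totals given in the summary at the top. The sketch below assumes the '+' column counts IP addresses first seen on that day, which is a reading of the table rather than something stated outright; the variable names are illustrative.

```python
# (connections, newly seen IP addresses) per day, copied from the table.
daily = {
    "Sunday": (15872, 6909),
    "Monday": (22221, 6672),
    "Tuesday": (26190, 7950),
    "Wednesday": (22421, 6288),
    "Thursday": (23553, 7173),
    "Friday": (26121, 8609),
    "Saturday": (15815, 4878),
}

total_connections = sum(conns for conns, _ in daily.values())
total_new_ips = sum(new_ips for _, new_ips in daily.values())

print(total_connections)  # 152193
print(total_new_ips)      # 48479
```

Both sums match the weekly summary (152,193 connections from at least 48,479 different IP addresses).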

Kernel level packet filtering top ten:

Host/Mask           Packets   Bytes
213.4.149.12           7102    369K
212.216.176.0/24       5702    287K
81.88.225.210          4275    235K
62.212.90.203          3960    195K
61.128.0.0/10          3900    197K
64.71.176.237          3685    221K
220.160.0.0/11         2800    140K
218.0.0.0/11           2529    126K
213.0.31.4             2496    120K
210.54.141.0/24        2395    115K

This is more or less around the expected levels.

  • 213.4.149.12 and 81.88.225.210 reappear from last week.
  • 62.212.90.203 has inconsistent reverse DNS, and we don't accept that from its network area. (It's also currently in bl.spamcop.net.)
  • 64.71.176.237 tried to keep sending stuff with a MAIL FROM that had tripped our spamtraps.
  • 213.0.31.4 uses a bad HELO name. Since that's Telefonica IP space and it has no reverse DNS, next week it will be banned for that.
  • 210.54.141.0/24 is xtra.co.nz outgoing mail machines, which tried to keep sending stuff with a MAIL FROM that had tripped our spamtraps. Given that the username of the MAIL FROM is 'uk_winner', I think I can safely chalk up yet another badly managed webmail system.
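For readers who haven't met the term, 'inconsistent reverse DNS' is usually taken to mean that looking up an IP address gives a hostname that does not resolve back to that IP address. The sketch below shows that general check in Python; it is an assumption about what the check amounts to, not the actual code used here, and `reverse_dns_is_consistent` and its lookup parameters are hypothetical names.

```python
import socket


def _reverse(ip):
    """Hostname from a PTR lookup for the IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward(name):
    """List of IPv4 addresses that the hostname resolves to."""
    return socket.gethostbyname_ex(name)[2]


def reverse_dns_is_consistent(ip, reverse_lookup=_reverse, forward_lookup=_forward):
    """Return True if ip -> name -> addresses leads back to the same ip.

    The lookups are parameters so the logic can be exercised without
    touching the network. Any lookup failure counts as inconsistent.
    """
    try:
        name = reverse_lookup(ip)
        addresses = forward_lookup(name)
    except OSError:  # socket.herror and socket.gaierror are both OSErrors
        return False
    return ip in addresses
```

With the default lookups, `reverse_dns_is_consistent("192.0.2.1")` would do real DNS queries; passing in your own lookup functions lets you test the logic offline.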

Connection time rejection stats:

  38282 total
  19078 dynamic IP
  15363 bad or no reverse DNS
   2583 class bl-cbl
    246 class bl-njabl
    165 class bl-sdul
    123 mailup.info
     80 class bl-sbl
     67 class bl-dsbl
     33 class bl-spews
     27 class bl-ordb

Out of the top 30 most rejected IP addresses, 7 were rejected more than 100 times; the champion is 82.89.202.5 (an interbusiness.it IP address) with 419 rejections. 18 of the top 30 are currently in the CBL and six are currently in bl.spamcop.net.

Hotmail's numbers got worse this week:

  • no messages accepted.
  • 11 messages rejected because they came from non-Hotmail email addresses.
  • 15 messages sent to our spamtraps.
  • 5 messages refused because their sender addresses had already hit our spamtraps.
  • 1 message refused due to its origin IP address being a telkom.co.za IP address.

All of the 'non-Hotmail' addresses rejected were from either msn.com or one of the non-US Hotmail domains. However, almost all of the usernames are typical of advance fee fraud spam usernames (things like 'britishinternational_lottery04' and 'dr_charis_adam13'), so I don't think we're missing much.

And the final numbers:

what          # this week   (distinct IPs)   # last week   (distinct IPs)
Bad HELOs             528               44           307               45
Bad bounces            38               26            38               34

The leading bad HELO source is 213.129.201.64, with 135 rejections.

In a surprise, this week we got no bounces to any of the three 38-character hex strings. We did get bounces to all of the other usual suspects, with the most-hit username being 'noreply' (5 bounces).

spam/SpamSummary-2006-07-29 written at 00:29:00;



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.