2008-12-31
Certificate authorities seem to be a real weakness in SSL
My main reaction to recent events is that they show how certificate authorities are one of the real practical weaknesses in SSL. All of the theoretical security in the world can be trivially thrown away by shoddy practices on the part of a single trusted certificate authority, and we've recently seen not one but two such CAs exposed.
The case of the improperly issued mozilla.com certificate is a clear-cut procedural failure: failing to vet your resellers, failing to properly vet a signing request, or both at once, depending on your perspective. My view is that it is both, since Comodo appears to have been happy both to take money from what could generously be called an extremely dodgy reseller and to delegate customer verification to its resellers, which I am not convinced is at all consistent with a CA's responsibilities.
While the RapidSSL CA exploit is primarily an attack based on the weaknesses of MD5, note that it was only made possible by the sloppy security habits of RapidSSL, namely sequential serial numbers and fast, predictable signing speeds. The latter is especially significant as a CA security issue, because it means that the ability to sign certificates with RapidSSL's CA certificate is very 'close' to the public-facing website and is fully accessible by automated means. Even if an attacker who can compromise the website does not get the CA private key, they have probably gained the ability to get a valid certificate for any host.
(I am going to generously assume that RapidSSL's internal software won't auto-sign CA delegation certificates or wildcarded certificates, and that RapidSSL's private key is held in a hardened dedicated crypto processor, not in something that an attacker could compromise.)
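To make the 'predictable' part concrete, here is a small illustrative sketch (in Python, with entirely made-up numbers; this is not the actual attack code): if a CA hands out sequential serial numbers at a roughly steady rate, an attacker can estimate the serial number that will land in a certificate they plan to request at a chosen time, which is exactly the sort of to-be-signed field that a collision attack against MD5 needs to know in advance.

```python
# Hypothetical sketch: why sequential serial numbers plus a fast, predictable
# signing pipeline help an attacker.  Observe the serial number of one
# certificate issued at a known time, estimate the CA's issuance rate, and
# you can guess the serial of a certificate you plan to request later.
# All numbers here are invented for the example.

from datetime import datetime, timedelta

def predict_serial(observed_serial, observed_at, planned_at, rate_per_hour):
    """Guess the serial a sequential CA will assign to a request submitted
    at planned_at, given one observed (serial, time) data point."""
    hours = (planned_at - observed_at).total_seconds() / 3600.0
    return observed_serial + int(round(hours * rate_per_hour)) + 1

observed_serial = 643015                        # made-up serial from a test certificate
observed_at = datetime(2008, 12, 29, 20, 0)     # when that certificate was issued
planned_at = observed_at + timedelta(hours=3)   # when we plan to submit our own request
print("predicted serial:", predict_serial(observed_serial, observed_at,
                                           planned_at, rate_per_hour=100))
```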
It would be nice if we could believe that these were isolated incidents. But I can't; I think that they are the inevitable end product of the kind of economic incentives that CAs operate under. There is simply no money to be made from security and from vetting people; security costs money, and vetting your resellers and customers not only costs money, it also loses you money as you have to turn down more resellers and customers instead of taking their money. In the absence of actual costs to this behavior (as opposed to occasional bad press), we should expect that there are far more weak CAs out there that just haven't been exposed yet.
2008-12-19
Comments and dialogues
As Author writes, you don't strictly speaking need to have blog comments in order to have a (blog) dialog. Well, sort of (at least in my opinion). One of the strong advantages of blog comments, and I think why would-be commentators keep asking for them, is that they are the easiest way to make sure that the person who wrote the original entry sees your reaction to their entry. Fairly often the primary audience for your reaction is the original author, so not getting their attention renders your entire effort pointless.
(They can ignore your comment or pretend that it doesn't exist, for example by deleting it, but usually even this means that they've seen it. And yes, comments don't always guarantee that your reaction is seen, but I think that they are a lot more sure than the alternatives in practice.)
In practice I think it's somewhat more than that. Comments (at least comments that don't get moderated or deleted) offer a winning combination of being public, attracting the attention of the original author, and being the most likely way to get other people who read the original entry to see your reaction. Nothing else really comes close; email is private, and your own blog entry may not attract the attention of either the original author or other readers, especially if your blog is less widely read and influential.
(Given that people are social, I think that having an audience is important to people. Writing into the void is much less motivating. Also, at some level I think that it is impossible to have an actual dialog without attracting the other party's attention; otherwise you just have a bare reaction. I do think that you can have a dialog with the overall audience without the participation of the original author, though.)
What this suggests to me is that blogging needs better ways for people to discover and track such cross-blog references. In the specific case of the anime blogging community, my impression is that it is still small enough that someone could build a manually maintained 'planet anime' that pulled blog feeds in order to automatically discover such cross-references and assemble them for people to peruse. You'd probably want to structure things so that people could follow references both to specific blogs and to specific postings (ideally with syndication feeds for everything).
(Ignoring the censorship issue, trackbacks offer the possibility of achieving this without having to put your reaction in a comment, but they've been killed by spam. Things like Technorati searches might also do it, if they worked, but unfortunately they don't, at least not reliably.)
(I touched on this overall idea in passing in an earlier entry.)
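For what it's worth, the discovery part of such a 'planet' does not need to be fancy. Here is a rough sketch of the idea (it uses the third-party feedparser module, and the feed URLs and the blog-level granularity are my own assumptions for illustration): pull each tracked feed and note which entries link to which other tracked blogs.

```python
# Minimal sketch of planet-style cross-reference discovery: fetch a set of
# blog feeds and record which entries link to which other tracked blogs.
# Requires the third-party 'feedparser' module; URLs are placeholders.

import re
from collections import defaultdict

import feedparser

TRACKED_FEEDS = {
    "http://blog-a.example.com/feed/": "blog-a.example.com",
    "http://blog-b.example.com/feed/": "blog-b.example.com",
}

def find_cross_references(feeds):
    """Map each tracked blog to (entry title, entry link) pairs from other
    tracked blogs whose content links to it."""
    refs = defaultdict(list)
    for feed_url, own_host in feeds.items():
        parsed = feedparser.parse(feed_url)
        for entry in parsed.entries:
            content = " ".join(c.value for c in entry.get("content", [])) \
                      or entry.get("summary", "")
            for target_host in feeds.values():
                if target_host == own_host:
                    continue  # skip links to the entry's own blog
                if re.search(r"https?://" + re.escape(target_host), content):
                    refs[target_host].append((entry.get("title", ""),
                                              entry.get("link", "")))
    return refs

for target, entries in find_cross_references(TRACKED_FEEDS).items():
    print(target)
    for title, link in entries:
        print("   ", title, "->", link)
```

(A real planet would also want per-posting references and syndication feeds of the results, but that is bookkeeping on top of the same basic loop.)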
2008-12-16
Why XHTML is doomed, at least in its strict validation form
Right now, if you are doing 'XHTML' there are basically three options for what is actually going on:
- you are smart and clever and knowledgeable about XHTML (and masochistic).
- your pages are actually text/html pages and thus not real XHTML (regardless of what the validators may tell you).
- your pages are not displaying in IE.
The vast majority of people doing 'XHTML' are in the second category. (The third category is not popular for the obvious reason.)
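Figuring out which category a given page is really in doesn't require arguing about doctypes; you can just ask the server what Content-Type it actually serves the page with. A quick sketch (the URL is a placeholder):

```python
# Check the Content-Type a server actually sends for a page.  Pages that
# declare XHTML doctypes but are delivered as text/html are in the second
# category above.

import urllib.request

def served_content_type(url):
    """Return the Content-Type header the server sends for url."""
    with urllib.request.urlopen(url) as resp:
        return resp.headers.get("Content-Type", "")

url = "http://www.example.com/some-xhtml-page"
ctype = served_content_type(url)
if "application/xhtml+xml" in ctype:
    print(url, "is really being served as XHTML")
else:
    print(url, "is served as", ctype, "- not real XHTML in practice")
```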
Almost all of the pages in the second category either don't validate as XHTML or wouldn't actually work as intended if they were actually interpreted as XHTML. And there are a lot of them, because 'XHTML' has become a technical superstition, a kind of prophylactic good housekeeping seal of approval that is invoked to bless your pages with 'modern web standards'.
In practice this huge number of existing invalid XHTML pages means that it is too late to introduce strict validation. If you introduce strict validation, either almost no one would use it (people keep doing 'XHTML', not real XHTML), or almost no one would pass it, creating a huge pressure to give in on strict validation. Believing that people will rewrite their pages to pass strict validation in any volume is a fantasy; most people simply do not do all that much pointless work (and yes, it is pointless work).
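To see what 'wouldn't actually work as intended' means in practice, feed a typical piece of tag-soup era markup to a strict XML parser. An unescaped '&' or an unclosed element that every text/html browser shrugs off is a fatal well-formedness error the moment the page is treated as real XHTML. A small demonstration:

```python
# Markup that any text/html browser renders without complaint is fatal to a
# strict XML parser: here an unescaped '&' in a URL plus unclosed elements.

import xml.etree.ElementTree as ET

page = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p>See <a href="/search?q=xhtml&strict=1">this search</a><br>
</body>
</html>"""

try:
    ET.fromstring(page)
    print("well-formed XML - would survive strict XHTML handling")
except ET.ParseError as err:
    print("fatal as XHTML:", err)
```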
(But we have strict validation now, you cry. Not really. What we have right now is an illusion that is sustainable because IE does not do XHTML, which creates the excuse to serve pages as text/html so that IE can see them, which lets people not have their noses very forcefully rubbed in all the validation failures. If you make IE do XHTML, you remove the excuse, destroy the illusion, and things come tumbling down into one of the two options above. I expect the first option, with a lot of denial and plain ignorance about it, because we already have that today.)
The net result is that strict XHTML is doomed in either case. In the first case, it is doomed to demonstrated irrelevance; in the second case it is just plain doomed.
(And if you remove strict validation from XHTML, I think that what you are left with is more or less HTML5 plus namespaces with some syntactic differences. Or possibly no syntactic differences, as I haven't been keeping up with the latest news.)
2008-12-13
The pragmatic problem with strict XHTML validation
There is a pragmatic problem with strict XHTML validation (well, several, but I'm only going to pick on one right now). It goes like this:
Strict XHTML validation in the browser clearly punishes users. If there is more than a trace amount of actual XHTML problems, this means that not doing strict validation is significantly more user friendly and thus a significant advantage for any browser that is not XHTML strict.
Given that you are punishing people by failing strictly, you are effectively engaged in a giant game of chicken with all of the other browser vendors. A policy of passing up the user friendliness advantage of non-strict XHTML validation is sustainable only as long as everyone passes it up; the first significant browser vendor to break ranks will necessarily cause a massive stampede for the 'graceful failure' exit. And sooner or later someone is going to break ranks.
(This game of chicken is made more unsustainable by the fact that Microsoft IE is not even playing yet.)
I don't think that hoping for only a trace amount of XHTML validation failures is realistic. Even with the most optimistic view of content generation (where all XHTML content is automatically generated or checked), there are bugs and oversights in automatic page generators, and for that matter bugs and oversights in validating parsers. My pessimism says that someone is going to get something wrong sooner or later, even in widely used software.
(In fact my personal view is that strict XHTML validation has survived until now only because almost no content actually is XHTML. In the real world of the web the only commonly used 'XML' formats are syndication feeds, which are often invalid and are never parsed with strict XML error handling by any feed reader author who wants actual users.)
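As an illustration of how lenient feed handling works in practice, the third-party feedparser module (as I understand its behaviour) will happily return usable entries from a feed that is not well-formed XML, merely setting its 'bozo' flag to record that something was wrong:

```python
# A feed with an unescaped '&' is not well-formed XML, but a lenient feed
# parser still recovers the entries instead of rejecting the whole feed.

import feedparser

broken_feed = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example & broken feed</title>
<item><title>First post</title><link>http://example.com/1</link></item>
</channel></rss>"""

parsed = feedparser.parse(broken_feed)
print("well-formed?", not parsed.bozo)
print("entries recovered:", [e.title for e in parsed.entries])
```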
2008-12-12
Two-step updates: the best solution to the valid XHTML problem
Let us suppose that you want to create an environment that ensures that XHTML stays valid XHTML. The corollary of how validation failures should punish the person actually responsible for them is that the site author should be punished for invalid XHTML, and you need to punish them directly.
So let's make the web server itself validate your XHTML; if it's not valid, it doesn't get served. Does this punish the site author? Not necessarily, because in order for the site author to notice they have to read their site; otherwise, again, the only people this is punishing are the readers, just like before, but this time the big BZZT is coming from the server instead of from the browser. (Arguably this makes it worse.)
So we have a couple of goals and needs. Clearly we need direct feedback to the author about invalid XHTML, and we want to give them no choice about getting it, which means that we need to force them to use our tool at some point when they update their website. We also want to make sure that readers are not punished due to invalid content (or at least are punished as little as possible), which means that we always need to serve up valid XHTML, which means that we can't allow in-place editing of the live page data.
Thus we want a two-step update process, where the author first prepares the update using whatever tools they want to use and then publishes it by running a tool provided by the web server. The tool validates all of the update and only applies it if validation succeeds; if it fails, the tool complains and stops, leaving the web server serving up the old (and valid) data.
(The author can still ignore the error and the lack of updates if they really want to. Ultimately there is nothing you can do to guarantee that people pay attention.)
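As a concrete illustration, here is a minimal sketch of such a publish tool (the directory layout, file extension, and symlink scheme are all invented for the example): validate everything in the staging area as XML and, only if that succeeds, atomically repoint a 'live' symlink at a fresh copy. A full XHTML validator would check more than well-formedness, but the shape is the same.

```python
# Sketch of a two-step publish tool: validate the staging tree, then
# atomically swap a 'live' symlink to a new versioned copy.  Until the swap,
# the web server keeps serving the old, known-valid tree.

import os
import shutil
import sys
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

STAGING = Path("/home/author/site-staging")   # where the author edits (assumed)
PUBLISH_ROOT = Path("/srv/www/site")          # versioned copies + 'live' symlink (assumed)

def validate_tree(root):
    """Return (file, error) pairs for pages that are not well-formed XML."""
    failures = []
    for page in root.rglob("*.xhtml"):
        try:
            ET.parse(page)
        except ET.ParseError as err:
            failures.append((page, err))
    return failures

def publish():
    failures = validate_tree(STAGING)
    if failures:
        for page, err in failures:
            print("REJECTED:", page, "-", err, file=sys.stderr)
        sys.exit(1)                              # old content stays live
    # copy the validated tree into a fresh versioned directory
    new_copy = Path(tempfile.mkdtemp(prefix="site-", dir=PUBLISH_ROOT))
    shutil.copytree(STAGING, new_copy, dirs_exist_ok=True)
    # atomically swap the 'live' symlink to point at the new copy
    tmp_link = PUBLISH_ROOT / "live.new"
    if tmp_link.is_symlink():
        tmp_link.unlink()
    tmp_link.symlink_to(new_copy)
    os.replace(tmp_link, PUBLISH_ROOT / "live")
    print("published", new_copy)

if __name__ == "__main__":
    publish()
```

The important property is that the swap is atomic: readers see either the old valid tree or the new validated one, never a half-published mix.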
A two-step update process can be implemented in a number of ways. The most straightforward is probably using a version control system with checkin guards, such that you can only check in a new version if it passes validation. 'Publishing' is then checking in and signaling the web server to get the new version from the VCS.
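For the version control variant, the checkin guard can be something as small as a Git pre-commit hook; here is a sketch under the assumption that pages live in the repository as .xhtml files (a server-side pre-receive hook would enforce the same thing more strictly):

```python
# Sketch of a Git pre-commit hook (an executable at .git/hooks/pre-commit)
# that refuses the commit if any staged page fails to parse as XML.

import subprocess
import sys
import xml.etree.ElementTree as ET

def staged_xhtml_files():
    """List staged (added/copied/modified) files ending in .xhtml."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True).stdout
    return [f for f in out.splitlines() if f.endswith(".xhtml")]

def staged_content(path):
    """Read the staged version of path (not the working-tree copy)."""
    return subprocess.run(["git", "show", ":" + path],
                          capture_output=True, text=True, check=True).stdout

bad = []
for path in staged_xhtml_files():
    try:
        ET.fromstring(staged_content(path))
    except ET.ParseError as err:
        bad.append((path, err))

if bad:
    for path, err in bad:
        print("not well-formed XML:", path, "-", err, file=sys.stderr)
    sys.exit(1)   # a non-zero exit aborts the commit
```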
(However, if you want to make sure that users can't possibly edit the live data, you'll want to keep it in some sort of opaque database. Finding a justification for this is up to you.)
2008-12-09
What sort of user interfaces the web is good for
Here is a corollary to what standard interfaces are good for, in the form of a thesis:
The web excels at standardized interfaces, but requires increasingly heroic amounts of work for excellent customized ones (once you want to step outside of what the browser gives you). This implies that web applications are a natural fit for occasional-use things, where any awkwardness of the web interface is outweighed by not having to learn yet another custom interface that you use for ten minutes once a month, and for things where the natural web interface is a great fit for the task; they are not a good fit for anything else.
(For example, Google Maps is very nice, but in a sense it is a bear dancing. As Google Earth demonstrates.)
This raises the obvious question: what things are a natural fit for the normal web interface? I think that the answer is filling in not too large forms, or at least forms with relatively little interaction with the system (eg, for immediate error checking on some fields), and navigating through information in relatively simple ways that have only a few choices. Usefully, I think it turns out that there are a lot of applications that don't want to do much more than that.
(This suggests that one reason that people actively like web applications over their old non-web versions is that they were sick and tired of programmers coming up with yet more interfaces for what was fundamentally filling in forms and doing basic information navigation. Possibly this is an obvious corollary.)