2005-09-12
The annoyance of arbitrary limits
In theory, WanderingThoughts has been syndicated on LiveJournal as the LJ user cks_techblog (here) for some time. In practice, LiveJournal has an undocumented, hard-coded limit on how large syndication feeds can be; go over the limit and LJ refuses to process your feed. Because I dislike small limits, DWiki defaults to putting lots of entries in syndication feeds, more than enough to exceed LiveJournal's size limit given how I write.
(LiveJournal compounds the problem by not allowing a syndication feed
to be modified after it's created. If the feed URL could be changed,
it could have been pointed at a URL that used a smaller size limit,
using the /latest/<NN> VirtualDirectory form.
And had LiveJournal documented the limit, I might have known to
point the person who created the feed at a URL like that to
start with.)
For a long time I ignored the issue, on the grounds that it was LiveJournal's problem and not mine. But it always nagged at me, one of those little background irritations about CSpace. Today I finally gave in and created an ugly (although generalized) workaround to size-limit the syndication feeds DWiki creates for certain IP addresses.
I'm not fond of the workaround, but I'm less fond of the situation without the workaround. I've found that that's life in the real world, where clean software runs up against irritating situations and has to get its hands dirty.
Moral: please avoid arbitrary limits in your software. They are going to irritate someone sooner or later. Probably sooner than you think.
Of course, had LiveJournal avoided an arbitrary limit, or handled it better, everything could have just worked. Since LiveJournal was transferring the full syndication feed only to drop it later, it could have done something smarter, like only parse the first N kilobytes or the first 20 entries (especially since it only pays attention to the first 20 entries anyways).
(Then LiveJournal added insult to two injuries by reordering the 20 entries it picked (fortunately the top 20 entries) into some completely wacky order that is not feed order, reverse feed order, or even chronological order.)
But at least the irritation that LiveJournal was fetching a useless feed from me is now gone. That's progress, right?
2005-09-06
Two sides of Internet identity
The issue of identity on the Internet is a tricky one, and one of the things that makes it so confusing is that people often want it to do two different things:
- prove who someone is: the web site really is EBay's.
- prove who someone isn't: that email isn't from a well-known spammer.
This is confusing because the issue mostly doesn't come up in the real world, where physical identities are generally one to a person. Checking drivers' licenses at the door is all you need, whether you want to enforce a guest list or keep your creepy stalker ex-boyfriend out.
But no feasible Internet identity scheme is one to a person. This means that while Internet identity schemes (ranging from Microsoft Passport to LiveJournal user names to OpenID) can provide positive identification (you are someone I know), they cannot provide negative identification.
In turn, this means that if you need negative identification on the Internet the only way to create it is manually, with a 'whitelist' of positive identification. If you want to make sure your creepy stalker ex-boyfriend is not reading your LiveJournal, you have to restrict who can read it to only people you've approved (and hope that you haven't been fooled by one of them).
Even positive identities mean less than people would like them to, because it is very hard to reliably link one identity to another, especially to real world identities. VeriSign itself was once fooled into issuing an SSL certificate to a 'Microsoft' that wasn't the one headquartered in Redmond, Washington (one story on this is here).
Any time someone proposes an Internet scheme that relies on negative identities (including but not limited to 'we can kill spam by requiring everyone to digitally sign their email, then refuse email signed by known spammers'), you should run for the hills.
2005-09-03
Varying interpretations of improper CIDRs
What I mean by a 'CIDR' is a network address specification in CIDR notation (more at the Wikipedia entry for Classless Inter-Domain Routing).
A 'proper' CIDR is one where the host address portion is all zero. It's easiest to see this for /8's, /16's, and /24's; for example, 128.100.0.0/16 is a 'proper' CIDR, with the last two octets zero, but 128.100.128.0/16 is not.
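As a rough illustration of that definition, here is a minimal Python sketch that checks whether the host portion of an IPv4 CIDR is all zero. The function name is_proper_cidr is made up for this example, and it assumes the address is written out as a full dotted quad.

```python
import ipaddress

def is_proper_cidr(text: str) -> bool:
    """Return True if the host address portion of an IPv4 CIDR is all zero.

    Hypothetical helper for illustration; expects 'a.b.c.d/len'.
    """
    addr_text, prefix_text = text.split("/")
    addr = int(ipaddress.IPv4Address(addr_text))
    prefix = int(prefix_text)
    if not 0 <= prefix <= 32:
        raise ValueError("prefix length out of range: %d" % prefix)
    # The low (32 - prefix) bits are the host address portion.
    host_mask = (1 << (32 - prefix)) - 1
    return (addr & host_mask) == 0

print(is_proper_cidr("128.100.0.0/16"))    # True
print(is_proper_cidr("128.100.128.0/16"))  # False
```

Python's own ipaddress.ip_network() makes the same distinction: by default it raises ValueError for a CIDR with host bits set, and only accepts one if you pass strict=False.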
Every so often people argue for flexible interpretations of CIDRs that allow for 'improper' ones. This is a bad idea. Assuming that your software accepts something like '128.100.128.0/16' at all, what IP address range does it mean? There are at least three possibilities:
1. 128.100.0.0 to 128.100.255.255 - The proper /16 that contains 128.100.128.0.
2. 128.100.128.0 to 128.101.127.255 - A /16 sized address range starting at 128.100.128.0.
3. 128.100.128.0 to 128.100.255.255 - The portion of the proper /16 starting at 128.100.128.0.
All three are plausible answers. Which one any particular piece of software uses depends on the implementation details of how it parses CIDRs. And of course this means that different programs you have, or your programs and my programs, may well have different views on what such a CIDR covers.
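To make the three readings concrete, here is a small Python sketch that computes all of them for a given CIDR. It is not how any particular program parses CIDRs; the function name and the labels are invented for this illustration, and ranges are clamped at 255.255.255.255 so that the second reading can't run off the end of the IPv4 address space.

```python
import ipaddress

def improper_cidr_readings(text: str) -> dict:
    """Return the three plausible (first, last) address ranges for a CIDR.

    Hypothetical helper for illustration; expects 'a.b.c.d/len'.
    """
    addr_text, prefix_text = text.split("/")
    addr = int(ipaddress.IPv4Address(addr_text))
    prefix = int(prefix_text)
    if not 0 <= prefix <= 32:
        raise ValueError("prefix length out of range: %d" % prefix)
    size = 1 << (32 - prefix)      # number of addresses in a block this size
    base = addr & ~(size - 1)      # start of the proper block containing addr

    def span(lo: int, hi: int) -> tuple:
        hi = min(hi, 0xFFFFFFFF)   # clamp to the top of IPv4 space
        return (str(ipaddress.IPv4Address(lo)), str(ipaddress.IPv4Address(hi)))

    return {
        "containing block": span(base, base + size - 1),
        "same size, starting at address": span(addr, addr + size - 1),
        "rest of containing block": span(addr, base + size - 1),
    }

for name, (first, last) in improper_cidr_readings("128.100.128.0/16").items():
    print("%-32s %s to %s" % (name, first, last))
```

For a proper CIDR all three readings collapse to the same range, which is the argument for rejecting improper ones outright instead of picking a reading. (For what it's worth, Python's ipaddress.ip_network() with strict=False uses the first reading.)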
Sidebar: it's probably best to write CIDRs out in full
It's popular to abbreviate CIDRs by leaving off zero octets that are part of the host address portion, for example writing '200/8' instead of '200.0.0.0/8'. Unfortunately, as someone found recently, not all software accepts the short form. And worse, not all software that doesn't really accept the short form will tell you about it; sometimes it will try to guess what it thinks you really meant and get it wrong.
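If you do want to accept the short form, one defensive approach is to expand it explicitly and refuse anything ambiguous instead of guessing. The sketch below is one way to do that in Python; expand_short_cidr is a made-up name, and the rule it applies (the octets given must cover the whole network portion) is my assumption about what a safe short form looks like, not a standard.

```python
def expand_short_cidr(text: str) -> str:
    """Expand an abbreviated IPv4 CIDR like '200/8' to '200.0.0.0/8'.

    Hypothetical helper for illustration. Raises ValueError rather than
    guessing when the input is malformed or leaves network octets out.
    """
    addr_text, sep, prefix_text = text.partition("/")
    if not sep or not (prefix_text.isascii() and prefix_text.isdigit()):
        raise ValueError("not in address/length form: %r" % text)
    prefix = int(prefix_text)
    if prefix > 32:
        raise ValueError("prefix length out of range: %r" % text)

    octets = addr_text.split(".")
    if not 1 <= len(octets) <= 4:
        raise ValueError("wrong number of octets: %r" % text)
    for octet in octets:
        if not (octet.isascii() and octet.isdigit()) or int(octet) > 255:
            raise ValueError("bad octet %r in %r" % (octet, text))

    # Only host-portion octets may be left off: the octets that were
    # written must cover every bit of the network portion.
    needed = (prefix + 7) // 8
    if len(octets) < needed:
        raise ValueError("too few octets for /%d: %r" % (prefix, text))

    octets += ["0"] * (4 - len(octets))
    return ".".join(octets) + "/" + str(prefix)

print(expand_short_cidr("200/8"))       # 200.0.0.0/8
print(expand_short_cidr("128.100/16"))  # 128.100.0.0/16
```

The explicit expansion matters because a general-purpose address parser may not agree with you about what a lone number means; the traditional inet_aton() rules, for example, read '200' by itself as the address 0.0.0.200.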