2011-03-28
The problem with contributing documentation to projects
Every so often, a well-intentioned person will suggest that you should help out open source projects by contributing documentation (there are several variants of how this suggestion is made, but this is what all of them boil down to). The problem is that this is much harder to do than you might think, especially for significant technical documentation like API details.
Setting aside all of the other difficulties of creating documentation, the core issue is that official documentation must above all be accurate. It is not enough to describe something that works (for you); you need to be correct about what you are describing and how it operates, you need to be complete, and you need to be describing the correct way to do something (instead of, say, an odd hackish workaround that you stumbled over or an internal interface). In order to produce this sort of accurate documentation, you generally need to be an expert with a deep understanding of the code, the history, and the philosophy of the project.
(An additional issue is that you generally need to be an expert with the current bleeding-edge development version, because that is usually what the project is working on. Any expertise with older released versions is much less useful; even if the project is going to maintain them for some time, projects are generally not enthused about having older versions be better documented than new ones.)
Trying to contribute documentation that is not necessarily accurate is just as dangerous as overly-detailed bug reports and patches, and for the same reasons. At a minimum, you force a project expert to double-check your documentation (and even for an expert this may take work and investigation). At the worst, your inaccurate work is good enough to fool people on superficial examination and is accepted into the project's documentation, where its subtle problems may fester for some time.
By contrast, writing a blog entry to note down some things you've worked out is much easier. You don't need to be complete, and no one with any sense expects a third-party blog entry to be accurate in the way that the official documentation is.
2011-03-19
The two sides of (PPPoE) DSL service
(A quick disclaimer: this is how it works in Toronto. I think it's how it works pretty much anywhere where you have a choice of DSL providers, but I'm not sure.)
There are two parts to your DSL service: the basic DSL signal tone (what lights up the '(A)DSL' light on your DSL modem), and your actual PPPoE-based 'DSL ISP service' that connects you to the Internet through your ISP. What a lot of people don't realize is that these parts are not tied together except in a business sense (you can't normally buy one without the other). There is nothing that says 'this DSL ISP credential can only be used on this line' or 'this line will only connect to this ISP'; instead it is more like plugging into a cloud and then bringing up a VPN to your ISP.
(This is basically why (and how) the 'test@test' special testing DSL username that I mentioned yesterday works. It's just another available DSL 'ISP' and credentials.)
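As an illustration of how decoupled this is at the protocol level, here is a minimal sketch that broadcasts a PPPoE PADI discovery packet and waits for a PADO answer. It assumes Linux, root privileges, and that eth0 is the interface facing your DSL modem (all of those are my assumptions, not anything ISP-specific); the point is that any access concentrator on the line will answer, no matter whose credentials you eventually plan to log in with.

    # Minimal sketch: probe a DSL line for a PPPoE access concentrator,
    # independent of any ISP credentials. Assumes Linux, root, and that
    # 'eth0' is the interface plugged into the DSL modem.
    import socket
    import struct

    IFACE = "eth0"
    ETH_P_PPPOE_DISC = 0x8863          # PPPoE discovery stage EtherType

    s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                      socket.htons(ETH_P_PPPOE_DISC))
    s.bind((IFACE, ETH_P_PPPOE_DISC))
    src_mac = s.getsockname()[4]

    # PADI frame: broadcast destination, PPPoE ver/type 0x11, code 0x09,
    # session 0, and a single empty Service-Name tag (any service will do).
    payload = struct.pack("!HH", 0x0101, 0)
    pppoe = struct.pack("!BBHH", 0x11, 0x09, 0, len(payload)) + payload
    s.send(b"\xff" * 6 + src_mac + struct.pack("!H", ETH_P_PPPOE_DISC) + pppoe)

    s.settimeout(3.0)
    try:
        reply = s.recv(1518)
        if reply[15] == 0x07:          # PADO: an access concentrator answered
            print("PADO received: live PPPoE service on this line")
    except socket.timeout:
        print("no PADO: nothing answered the PADI probe")
    finally:
        s.close()

(This is only the discovery stage; actually getting online means carrying on through PADR/PADS and then PPP authentication, which is what pppd or your router's firmware normally handles for you.)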
This decoupling opens up a number of tricks. The most useful one is that you can bring up your regular DSL connection on any line that has DSL signal (either because your line broke but you have access to some other one, or just because you're at someone else's place). You can also connect to multiple DSL services at once on the same line, from either the same computer or (I think) different ones.
This also means that it's possible to test known working elements of DSL service independently. For instance, assume that you are trying to help a friend get their DSL connection working on their computer, and you already have your own DSL connection. You can troubleshoot their computer, their DSL ISP credentials, and their DSL line separately; you should be able to bring up their connection on your computer and your line, their computer ought to be able to bring up their connection on your line, and your computer ought to be able to bring up your connection on their line. This can be very useful and reassuring if you seem to be fighting multiple issues at once.
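If it helps to see the combinations laid out, here is a trivial sketch of that swap-testing matrix (the entries are purely illustrative):

    # The swap-testing matrix described above: each test replaces one element
    # with a known-good one, so a single faulty piece stands out.
    tests = [
        # (computer, credentials, line, did the PPPoE session come up?)
        ("yours",  "yours",  "yours",  True),   # baseline: known good
        ("yours",  "theirs", "yours",  None),   # isolates their credentials
        ("theirs", "theirs", "yours",  None),   # isolates their computer
        ("yours",  "yours",  "theirs", None),   # isolates their line
    ]
    status = {True: "worked", False: "failed", None: "not yet tested"}
    for computer, creds, line, result in tests:
        print(f"{computer} computer + {creds} credentials + {line} line: "
              f"{status[result]}")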
The most interesting trick is that it is theoretically possible to have a 'bare' DSL ISP, one that simply gives you DSL PPPoE credentials and leaves it up to you to get DSL signal tone somehow. At one point the University of Toronto effectively was such a creature (through a special agreement with Bell); if you already had a DSL connection, you could authenticate with your UofT identifier and you would wind up being directly connected on the UofT network. Among other things, this generally resulted in better performance when you were talking to UofT machines since you didn't have to go through the Internet to do so (and at the time the Internet was slower than it is today).
(As far as I know, Bell will not sell you bare DSL signal tone at all and I don't think any ISP here offers bare DSL credentials as a normal service. It's possible that there are contractual restrictions between Bell and the DSL ISPs that prevent them from doing so, although I doubt there are any technical restrictions that would prevent a very, very friendly ISP from setting this up as a temporary favour. Note that 'bare DSL signal tone' here is different from naked DSL.)
As a side note, I don't think that this was a conscious design goal of the local DSL environment so much as just the easiest way to implement a system where the telco gives multiple ISPs access to the basic DSL infrastructure. Restricting what line can talk to what ISP (possibly with what credentials) is simply more work than designing and running a cloud-like architecture that's a free-for-all, and there's certainly less to go wrong (at the telco's level, at least).
2011-03-14
Why growing IPv6 usage is going to be fun, especially for sysadmins
Recently at home, my testing Firefox started hanging when I tried to visit a particular website that I browse every so often (experimentation eventually showed that it would display the site, but very very slowly). I could browse it from work, my syndication feed reader at home was still talking to it, and in fact my regular Firefox instance could still see it. It was rather odd, or even outright mysterious.
My home machine has a native IPv6 address and connectivity (not just a 6to4 setup); my work machine only has 6to4 connectivity. The particular website I was having problems with has both IPv4 and IPv6 addresses, and, I believe, has had them for some time. And, you guessed it, there is some sort of connectivity problem between me and their IPv6 address (or addresses, since they have various sub-sites for static media and so on).
This IPv6 connectivity issue affects only my testing Firefox because only that Firefox runs in an environment that allows it to make IPv6 connections; my regular Firefox and my syndication feed reader both run through setups that force IPv4-only traffic. I don't have a general IPv6 connectivity problem, since I can visit places like ipv6.google.com and test sites such as this one say that my setup is fine.
Now, I'm an experienced system administrator and I know that I have IPv6 enabled (because I configured it specifically). It still took me a fair while to make the leap from 'weird inability to browse this website in Firefox' to 'IPv6 related issue', and I still don't know exactly what's wrong between hither and yon (although testing here suggests that it is a routing reachability issue).
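To make that concrete, here is roughly the kind of quick test that separates 'this site is down' from 'this site's IPv6 addresses are unreachable from here'. The hostname is just a placeholder for whatever site is acting up.

    # Rough diagnostic sketch: try a site's IPv4 and IPv6 addresses separately,
    # so an IPv6-only reachability problem stands out. The hostname is a
    # placeholder, not the actual site I had trouble with.
    import socket

    HOST, PORT = "www.example.com", 80

    def try_family(family, label):
        try:
            results = socket.getaddrinfo(HOST, PORT, family, socket.SOCK_STREAM)
        except socket.gaierror as e:
            print(f"{label}: no addresses ({e})")
            return
        for fam, stype, proto, _canon, addr in results:
            s = socket.socket(fam, stype, proto)
            s.settimeout(5.0)
            try:
                s.connect(addr)
                print(f"{label} {addr[0]}: connected")
            except OSError as e:
                print(f"{label} {addr[0]}: failed ({e})")
            finally:
                s.close()

    try_family(socket.AF_INET, "IPv4")
    try_family(socket.AF_INET6, "IPv6")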
This is the world we have to look forward to as IPv6 becomes more widely deployed and used. In a sense this is a good development, because it will smoke out a lot of configuration issues and so on. But in another sense it is a terrible world, because what users see is things breaking when you (or they) turn on IPv6.
One consequence of this is that I don't feel very optimistic about World IPv6 Day; I expect it to be notable mostly for mysterious problems and lots of complaints. I am frankly amazed that Google and other major websites could be talked into it.
(My bad workaround for my problem was to tell my system that the IPv6 addresses associated with this website were unreachable. This causes an immediate connection failure and the browser immediately falls back to IPv4.)
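For what it's worth, on a Linux machine with iproute2 that workaround looks roughly like this (run as root; the hostname is again a placeholder, and this is a sketch rather than a polished tool):

    # Sketch of the workaround: resolve the problem site's IPv6 addresses and
    # add 'unreachable' routes for them, so connection attempts fail instantly
    # and the browser falls back to IPv4. Linux + iproute2, run as root.
    import socket
    import subprocess

    HOST = "www.example.com"           # placeholder for the problem site

    addrs = {res[4][0] for res in
             socket.getaddrinfo(HOST, None, socket.AF_INET6, socket.SOCK_STREAM)}
    for addr in sorted(addrs):
        subprocess.run(["ip", "-6", "route", "add", "unreachable", f"{addr}/128"],
                       check=False)
        print(f"marked {addr} as unreachable")

(The obvious downside is that this has to be redone if the site's IPv6 addresses change, and undone once the real problem is fixed.)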
2011-03-09
Why feed readers have been a geek flash in the pan
People have been bemoaning the slow decline of syndication feed readers for a while now. While I think that there are many causes of this, in my view one of them is that syndication feed readers are inherently an interface with a limited, geeky appeal.
To see why, you only have to describe the sales pitch for feed readers: they present a minimal, essentially unstyled and un-designed interface (or at least a generic one) for browsing, not skimming, through a lot of information. This is a great pitch for specialists, but really, does this sound like something that ordinary people are going to want to use if they have alternatives?
Ordinary people actually like how websites style themselves, and they don't deal with the Internet information overload this way; instead of trying to follow a river of (personalized) news, they dabble in entertainment. As I've written before, a conventional browser and a source of URLs is a much better interface for this than the typical feed reader, including web-based ones such as Google Reader.
How did so many people miss the boat on this? (After all, Google built Google Reader, although they have apparently since abandoned it due to lack of use.)
While I'm sure that part of it was geek enthusiasm for new, geekish stuff that fits our needs, I think that the unforeseen growth of Twitter and Facebook is a decent part of it. Twitter and Facebook created a way to get a more or less personalized stream of URLs that you might want to take a look at; you effectively get to mine what your personal social network thinks is notable (which is how much news gets around in real life to start with).
(Reddit and Digg sort of filled the same niche, but they are nowhere near as personalized. Your social network's stream of URLs is effectively curated, and curated by specific people to boot, which makes it easier to filter. We hairless apes are very good at deciding how much we should trust what some specific person that we know tells us about something; it has been a core survival ability for quite a while.)
2011-03-08
My personal hard drive capacity curve inflection point
It's not news that hard drives have been rapidly growing in capacity for, oh, the last two decades or more, all the while dropping in price. This affected everyone differently, but for me the inflection point came around 2006, or somewhat before it, and it changed things in a really concrete way.
When I got a new machine in 2006, it was the first time that I wound up with enough hard drive space to swallow my old machine's data wholesale, without even having to think about it. In all of my prior machine transitions, I didn't have enough disk space to just take a full copy of the old machine and also fit everything I wanted into the new machine, so I wound up keeping the old machine running through an increasingly baroque series of lashups.
(At one point I had an Ultrix machine, an SGI Indy, and a Linux machine all running, each of them with some portion of my files. The resulting contortions were awkward and still ripple through to my current workstation environment.)
Part of this change was the steady march of ever cheaper consumer drives. But another part of it was that by 2006, consumer drive technology had finally reached the point where I was willing to trust it. All of my pre-2006 machines were built with SCSI drives, both because SCSI drives were more reliable and trustworthy than IDE drives and because IDE itself was kind of a beast to get running well on Linux; for a long time the received wisdom was that if you wanted good reliable performance, you paid extra for SCSI and accepted lower drive capacities. But by 2006, SATA and Linux support for it was far enough along that I could spec SATA drives without significant qualms and finally hop on the consumer drive bandwagon.
(It helped that SATA drives were cheap enough that I could spec two drives and then mirror them.)
Hard drive capacities have only gotten better since 2006, of course, and my home machine now has more disk space than I really have a good use for. But that 2006 machine was my personal Rubicon, the point where I finally had enough disk space that I could do things differently than I ever had before.