Wandering Thoughts archives

2016-01-10

Updating software to IPv6 is often harder than you might think

A while back D.J. Bernstein wrote what is now a famous rant about IPv6. For various reasons this DJB article is on my mind, and today I want to talk about one part of it that DJB casually handwaves: updating all software to support IPv6.

The obvious problem with software is that most of the traditional system APIs have specified IP addresses in fixed-size objects and with explicit, fixed types. Very little software has been written using generic APIs and variable-sized addresses, where you could just drop the bigger IPv6 addresses in without trouble; instead a lot of software knows that it is talking IPv4 with addresses that take up 4 bytes. Such software cannot just be handed IPv6 addresses, because they overflow the space and various things would malfunction. Instead systems have been required to define an entirely new and larger 'address family' for IPv6, and then software has had to be updated to support it alongside IPv4.

The first complication emerges here: not only do you need a new address family, you need new APIs that can accept and return the new address family. Sometimes you need the new APIs because old APIs were defined only as returning 4-byte IPv4 addresses; sometimes you need new APIs because tons of people wrote tons of code that just assumed old APIs only ever return 4-byte IPv4 addresses.
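As a small illustration of this API split, here is a sketch using Python's standard socket module (my choice of example, not something from DJB's article). The old inet_aton() only understands IPv4 and always hands back four bytes, while the newer inet_pton() takes the address family as an explicit argument. The helper old_api_accepts() is a made-up name for this example.

```python
import socket

# The classic API: IPv4 only, always returns a 4-byte packed address.
v4_packed = socket.inet_aton("192.0.2.1")

# The newer API takes the address family explicitly, so it can return
# either 4 bytes (AF_INET) or 16 bytes (AF_INET6).
v6_packed = socket.inet_pton(socket.AF_INET6, "2001:db8::1")


def old_api_accepts(text):
    """Report whether the IPv4-only inet_aton() can parse this address."""
    try:
        socket.inet_aton(text)
        return True
    except OSError:
        return False
```

Code written against the first call has no way to receive an IPv6 address; it has to be moved over to the second style of API by hand.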

(You could break all that code, but that would be a recipe for a ton of bugs for years. Let's not go there.)

But the larger problem is that IP addresses don't confine themselves just to the networking layer of programs where they get handled by generic system APIs that you can make cope with the new IPv6 addresses. Instead, in many important programs IP(v4) addresses ripple through all sorts of other code. This code may represent them in all sorts of ways and it may do all sorts of things to manipulate them, things that 'know' various facts that are only true for IPv4 addresses. For instance, I have several sets of code in several different languages that know how to make DNS blocklist queries given an 'IP address'. Depending on the language, it may split a string at every '.' or it may take four bytes in a specific order from a 32-bit integer, or even a byte addr[4] array.
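To make the DNS blocklist case concrete, here is a minimal sketch of both versions. The zone name dnsbl.example.org and both function names are placeholders invented for this example. The IPv6 branch assumes the nibble-reversed query format described in RFC 5782; a particular blocklist may do something different.

```python
import ipaddress


def naive_ipv4_query_name(addr, zone="dnsbl.example.org"):
    """The IPv4-only approach: split the string on '.' and reverse it."""
    return ".".join(reversed(addr.split("."))) + "." + zone


def dnsbl_query_name(addr, zone="dnsbl.example.org"):
    """Build the DNS name to look up for addr in a blocklist zone.

    The zone is a hypothetical placeholder, not a real blocklist.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        # IPv4: reverse the four dotted octets.
        labels = str(ip).split(".")[::-1]
    else:
        # IPv6: expand to all 32 hex digits, then reverse them with
        # one hex digit (nibble) per DNS label.
        labels = list(ip.exploded.replace(":", ""))[::-1]
    return ".".join(labels) + "." + zone
```

Handing an IPv6 address to the naive version doesn't fail; it quietly produces a meaningless name, which is arguably worse than an error.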

(Some of this code may be at some distance from actual network programs. Consider code that attempts to process web server log files and do things like aggregate traffic by network regions, or even just tell when the logs have IP addresses instead of hostnames.)
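The 'is this log field an IP address or a hostname' check is a good example of how small these assumptions can be. The following is a sketch with made-up function names; the regular expression stands in for the sort of pattern IPv4-only log processing code tends to contain.

```python
import ipaddress
import re

# A typical IPv4-only test: four dot-separated groups of digits.
IPV4_ONLY = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def is_ip_naive(field):
    """True only for things that look like dotted-quad IPv4 addresses."""
    return IPV4_ONLY.match(field) is not None


def is_ip(field):
    """True if the field parses as either an IPv4 or an IPv6 address."""
    try:
        ipaddress.ip_address(field)
        return True
    except ValueError:
        return False
```

With the naive check, every IPv6 client in the logs gets silently treated as a hostname.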

All of this code needs to be revised for IPv6. Some of the revisions are simple. Others take more work and need to know things about, say, the typical and canonical ways of representing IPv6 addresses. Other code may need to be completely rethought from the ground up; for example, I have code that represents IP address ranges as pairs of '(start, end)' integers and supports various operations on them, including 'give me all of the IP addresses in this range set'. This works fine for IPv4 addresses, but the entire data structure may need to be totally redone for IPv6 and certain parts of the API might not make sense any more.
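Here is a rough sketch of why the '(start, end)' representation runs into trouble. It is not my actual code, and the names are invented for the example. The data structure itself survives the move to 128-bit integers in Python; it is the 'give me every address' operation that stops making sense.

```python
import ipaddress
from itertools import islice


def range_size(start, end):
    """Number of addresses in an inclusive (start, end) integer range."""
    return end - start + 1


def addresses(start, end, version=4):
    """Lazily yield every address in the range, one at a time."""
    cls = ipaddress.IPv4Address if version == 4 else ipaddress.IPv6Address
    for n in range(start, end + 1):
        yield cls(n)


v4net = ipaddress.ip_network("192.0.2.0/24")
v6net = ipaddress.ip_network("2001:db8::/64")

v4_range = (int(v4net.network_address), int(v4net.broadcast_address))
v6_range = (int(v6net.network_address), int(v6net.broadcast_address))

# A /24 is 256 addresses and can simply be listed in full.
v4_all = list(addresses(*v4_range))

# A single /64 is 2**64 addresses; the most we can do is peek at a few.
v6_first = list(islice(addresses(*v6_range, version=6), 3))
```

Any caller that does the equivalent of list(addresses(...)) on an IPv6 range will never finish, so that part of the API has to be rethought rather than just widened.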

(And then there are the cases where IP addresses are stored in files and retrieved later. They are probably not being stored in large or arbitrary-sized fields, so the existing storage format may not be able to hold IPv6 addresses. So we're looking at database restructuring here, and also restructuring of field validators and so on.)
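A sketch of the storage problem, assuming a hypothetical record format that keeps addresses in a 32-bit binary field (the function names are made up). It also works out the widest text form of each kind of address, which matters if the field is a fixed-width string column instead.

```python
import ipaddress
import struct


def pack_fixed32(addr):
    """Store an address in a 4-byte big-endian field, IPv4-era style."""
    return struct.pack("!I", int(ipaddress.ip_address(addr)))


def fits_fixed32(addr):
    """Report whether the address can be stored in the 4-byte field."""
    try:
        pack_fixed32(addr)
        return True
    except struct.error:
        return False


# Widest plain textual forms, for sizing a string field.
MAX_V4_TEXT = len("255.255.255.255")
MAX_V6_TEXT = len(ipaddress.IPv6Address(2 ** 128 - 1).exploded)
```

A text column sized for the 15 characters of an IPv4 address is well short of the 39 that a fully written out IPv6 address needs.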

Then you have all of the stuff that knows how to talk about IP addresses, for example in configuration files for programs. Much of this is likely specific to IPv4 addresses, so both code and specifications will need to be revised for IPv6 addresses. In turn this may ripple through to cause difficulties or require changes to the configuration file language; you may need to make IPv6 addresses accepted with some sort of quoting if your language treats ':' in words specially, for example. All of this involves far more than mechanical code changes and code updates; we're well into system architecture and design, with messy tradeoffs between backwards compatibility and well-supported IPv6 addresses.

(Exim famously has a certain amount of heartburn with lists of IPv6 addresses in its configuration files because long ago ':' was chosen as the default list separator character.)
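As an illustration of the quoting problem, here is a sketch of parsing a 'host:port' setting in a hypothetical configuration language. It borrows the square bracket quoting that URLs use for IPv6 addresses; this is one possible design, not how Exim or any particular program does it, and split_hostport() is an invented name.

```python
def split_hostport(text):
    """Split 'host:port' into (host, port), allowing '[v6addr]:port'.

    Unbracketed text with more than one ':' is rejected, because we
    can't tell which colon separates the port.
    """
    if text.startswith("["):
        host, sep, rest = text[1:].partition("]")
        if not sep or not rest.startswith(":"):
            raise ValueError("malformed bracketed address: %r" % text)
        return host, int(rest[1:])
    host, sep, port = text.rpartition(":")
    if not sep or ":" in host:
        raise ValueError("ambiguous or missing port: %r" % text)
    return host, int(port)
```

Existing IPv4 settings keep working unchanged, but everyone who writes an IPv6 address has to learn the new quoting, and every tool that reads or generates the configuration file has to learn it too.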

Of course, IP addresses are just the start of the problem; it spirals off in several directions from them. One direction is IPv6 netblocks and address ranges; there's kind of a new syntax there, and people have to rethink configuration files that currently designate ranges via syntax like '127.100.0.'. Another one is that there are various special sorts of IPv6 addresses that your systems may need to be aware of, like link-local addresses. A third is the broad issue of per-source ratelimits; a simple limit that's per IPv6 address may not work very well in an IPv6 environment where people have relatively large subnets pushed down to their home connections or whatever.
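One possible approach to the ratelimit issue is to count IPv6 sources by their enclosing network instead of by individual address. The sketch below assumes a /64 is the right grouping, which is a guess that a real deployment would need to tune; ratelimit_key() is a made-up name.

```python
import ipaddress
from collections import Counter


def ratelimit_key(addr, v6_prefix=64):
    """Return the bucket that a source address is counted under.

    IPv4 sources are counted per address. IPv6 sources are grouped by
    their enclosing prefix; the /64 default is only an assumption.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return str(ip)
    net = ipaddress.ip_network("%s/%d" % (ip, v6_prefix), strict=False)
    return str(net)


# Three different IPv6 addresses from one subnet land in one bucket.
hits = Counter(ratelimit_key(a) for a in [
    "192.0.2.1",
    "2001:db8:0:1::10",
    "2001:db8:0:1::20",
    "2001:db8:0:1:aaaa:bbbb:cccc:dddd",
])
```

The ipaddress module also exposes some of the special address types directly; for example, ipaddress.ip_address("fe80::1").is_link_local is true.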

All of this can be done, but it all adds up to a significant amount of work, both in raw programming and in design and architecture to make the right decisions about how systems should look and work in an IPv6 enabled environment. It should be no surprise that progress has been slow overall (and occasionally buggy) and people continue to design, build, and hack together systems that are implicitly or explicitly IPv4 only.

(If you only have to deal with IPv4 today, some of the high level issues may be effectively invisible to you.)

(I've written other stuff about the problems I see with DJB's IPv6 migration ideas in earlier entries.)

IPv6SoftwareUpdatePain written at 03:30:04

2016-01-02

I've realized I need to change how I read Twitter

People talk a lot about modern social web sites being designed to be sticky, to encourage you to keep them open and to interact with them all of the time. For a long time, this was not my experience with them; I felt no reason to stick around Facebook, for example. Then I got on Twitter with a fortuitous choice of client.

I am the kind of person who has historically had a 'gotta read them all' mindset about, well, basically everything on the Internet that I've wound up following (and Usenet, too). If you are this kind of person, Twitter is a terrible trap (especially with a client that lets you see read versus unread tweets). Once you follow enough people, there will be new tweets to read more or less every refresh interval. I may not read them right away, but there they are, tugging at my attention, and they will take time to read eventually. And of course there's the ever present temptation to take a (nominally short) break by reading some of the pending tweets.

For me, the inevitable result of following Twitter in my current way is a fractured attention and a slow but constant drain of time away from other things. It's especially pernicious because it doesn't feel like much time, since individual bursts of reading may be short. But the cumulative effect adds up and adds up.

(This should not be surprising, and really it isn't. We've long known the effects of breaking concentration and how little interruptions can have outsized effects; people write about various aspects of this all the time, and I've read them and nodded along with it. Yet here I was, quietly walking into doing exactly this to myself. One can draw various lessons here.)

At one level, how I need to treat Twitter is straightforward. Rather than seeing it as something that I read all of, I need to treat it as a stream that I dip my toe into every so often (and only every so often). At another level, there's a vast difference between knowing a theoretical answer and being able to change my habits to carry it out in practice. It's going to take me time to work out how to do this in a way that works for me, and willpower to not keep backsliding into old 'read it all' habits.

(And I'll miss reading all of my Twitter feed; there's really nice stuff there that I enjoyed following. That's what makes it hard: I know I'm going to be missing things that I want to read.)

It's been quite interesting to be sucked into Twitter this way, bit by bit, and then realize that I was being pulled in and working out what the effects on me were. I have some views on why Twitter worked on me where other 'social web' sites haven't, but that's going to be another entry.

TwitterBreakingAddiction written at 01:07:47


