Wandering Thoughts archives


A humbling experience of misreading some simple (Go) code

Every so often, I get to have a humbling experience, sometimes in public and sometimes in private. Recently I was reading Go Range Loop Internals (via) and hit its link to this Damian Gryski (@dgryski) tweet:

Today's #golang gotcha: the two-value range over an array does a copy. Avoid by ranging over the pointer instead.


I ran the code on the playground, followed it along, and hit a 'what?' moment where I felt I had a mystery where I didn't understand why Go was doing something. Here is the code:

func IndexValueArrayPtr() {
  a := [...]int{1, 2, 3, 4, 5, 6, 7, 8}

  for i, v := range &a {
    a[3] = 100
    if i == 3 {
      fmt.Println("IndexValueArrayPtr", i, v)

Usefully, I have notes about my confusion, and I will put them here verbatim:

why is the IndexValueArrayPtr result '3 100'? v should be copied before a[3] is modified, and v is type 'int', not a pointer.

This is a case of me reading the code that I thought was there instead of the code that was actually present, because I thought the code was there to make a somewhat different point. What I had overlooked in IndexValueArrayPtr (and in fact in all three functions) is that a[3] is set on every pass through the loop, not just when i == 3.

Misreading the code this way makes no difference to the other two examples (you can see this yourself with this variant), but it's crucial to how IndexValueArrayPtr behaves. If the a[3] assignment was inside the if, my notes would be completely true; v would have copied the old value of a[3] before the assignment and this would print '3 4'. But since the assignment happens on every pass of the loop, a[3] has already been assigned to be 100 by the time the loop gets to the fourth element and makes v a copy of it.

(I think I misread the code this way partly because setting a[3] only once is more efficient and minimal, and as noted the other two functions still illustrate their particular issues when you do it that way.)

Reading an imaginary, 'idealized' version of the code instead of the real one is not a new thing and it's not unique to me, of course. When you do it on real code in a situation where you're trying to find a bug, it can lead to a completely frustrating time where you literally can't see what's in front of your eyes and then when you can you wonder how you could possibly have missed it for so long.

(I suspect that this is a situation where rubber duck debugging helps. When you have to actually say things out loud, you hopefully get another chance to have your brain notice that what you want to say doesn't actually correspond to reality.)

PS: The reason I have notes on my confusion is that I was planning to turn explaining this 'mystery' into a blog entry. Then, well, I worked out the mystery, so now I've gotten to write a somewhat different sort of blog entry on it.

programming/GoMisreadingSomeCode written at 23:59:29; Add Comment

The IPv6 address lookup problem (and brute force solution)

In Julia Evans' article Async IO on Linux: select, poll, and epoll, she mentioned in passing that she straceed a Go program making a HTTP request and noticed something odd:

Then [the Go program] makes 2 DNS queries for example.com (why 2? I don’t know!), and uses epoll_wait to wait for replies [...]

It turns out that this is all due to IPv6 (and the DNS standards), and it (probably) happens in more than Go programs (although I haven't straced anything else to be sure). So let's start with the problem.

Suppose that you have a 'dual-stack' machine, one with both IPv4 and IPv6 connectivity. You need to talk to a wide variety of other hosts; some of them are available over IPv4 only, some of them are available over IPv6 only, and some of them are available over both (in which case you traditionally want to use IPv6 instead of IPv4). How do you look up their addresses using DNS?

DNS currently has no way for a client to say 'give me whatever IPv4 and IPv6 addresses a host may have'. Instead you have to ask specifically for either IPv4 addresses (with a DNS A record query) or IPv6 addresses (with a DNS AAAA record query). The straightforward way for a dual-stack machine to find the IP addresses of a remote host would be to issue an AAAA query to get any IPv6 addresses, wait for it to complete (or error out or time out), and then issue an A query for IPv4 addresses if necessary. However, there are a lot of machines that have no IPv6 addresses, so a lot of the time you'd be adding the latency of an extra DNS query to your IP address lookups. Extra latency (and slower connections) doesn't make people happy, and DNS queries are not necessarily the fastest thing in the world in the first place for various reasons.

(Plus, there are probably some DNS servers and overall DNS systems that will simply time out for IPv6 AAAA queries instead of promptly giving you a 'no' answer. Waiting for a timeout adds substantial amounts of time. Properly operated DNS systems shouldn't do this, but there are plenty of DNS systems that don't operate properly.)

To deal with this, modern clients increasingly opt to send out their A and AAAA DNS queries in parallel. This is what Go is doing here and in general (in its all-Go resolver, which is what the Go runtime tries to use), although it's hard to see it in the net package's source code until you dig quite far down. Go waits for both queries to complete, but there are probably some languages, libraries, and environments that immediately start a connection attempt when they get an answer back, without waiting for the other protocol's query to finish too.

(There is a related algorithm called Happy Eyeballs which is about trying to make IPv6 and IPv4 connections in parallel and using whichever completes first. And there is a complicated RFC on how you should select a destination address out of the collection that you may get from your AAAA and A DNS queries.)

Sidebar: DNS's lack of an 'all types of IP address' query type

I don't know for sure why DNS doesn't have a standard query type for 'give me all IP addresses, either IPv4 or IPv6'. Per Wikipedia, DNS itself was created in the mid 1980s, well before IPv6 was designed. However, IPv6 itself is decades old at this point, which is lots of time to add such a query type to DNS and have people adopt it (although it might still not be universally supported, which would leave you falling back to explicit A queries at least). My best guess for why such a query type was never added is a combination of backwards compatibility worries (since initially not many DNS servers would support it, so clients would mostly be making an extra DNS query for nothing) and a general belief on the part of IPv6 people that IPv4 was going to just go away entirely any day soon, really.

(We know how that one turned out. It's 2017, and IPv4 only hosts and networks remain very significant.)

tech/IPv6AddressLookupProblem written at 00:27:36; Add Comment

Page tools: See As Normal.
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.