2017-06-06
A humbling experience of misreading some simple (Go) code
Every so often, I get to have a humbling experience, sometimes in public and sometimes in private. Recently I was reading Go Range Loop Internals (via) and hit its link to this Damian Gryski (@dgryski) tweet:
Today's #golang gotcha: the two-value range over an array does a copy. Avoid by ranging over the pointer instead.
I ran the code on the playground, followed it along, and hit a 'what?' moment where I felt I had a mystery where I didn't understand why Go was doing something. Here is the code:
func IndexValueArrayPtr() {
    a := [...]int{1, 2, 3, 4, 5, 6, 7, 8}
    for i, v := range &a {
        a[3] = 100
        if i == 3 {
            fmt.Println("IndexValueArrayPtr", i, v)
        }
    }
}
Usefully, I have notes about my confusion, and I will put them here verbatim:
why is the IndexValueArrayPtr result '3 100'? v should be copied before a[3] is modified, and v is type 'int', not a pointer.
This is a case of me reading the code that I thought was there instead of the code that was actually present, because I thought the code was there to make a somewhat different point. What I had overlooked in IndexValueArrayPtr (and in fact in all three functions) is that a[3] is set on every pass through the loop, not just when i == 3.
Misreading the code this way makes no difference to the other two examples (you can see this yourself with this variant), but it's crucial to how IndexValueArrayPtr behaves. If the a[3] assignment was inside the if, my notes would be completely true; v would have copied the old value of a[3] before the assignment and this would print '3 4'. But since the assignment happens on every pass of the loop, a[3] has already been assigned to be 100 by the time the loop gets to the fourth element and makes v a copy of it.
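To make the difference concrete, here is a minimal sketch that puts the real code, the misread version, and a plain range over the array side by side. The function names are invented for illustration and are not from the tweet; each function returns the value of v seen at index 3 instead of printing it.

```go
package main

import "fmt"

// valueEveryPass mirrors the real code: a[3] is assigned on every pass,
// so it is already 100 by the time the loop reads element 3 through &a.
func valueEveryPass() int {
	a := [...]int{1, 2, 3, 4, 5, 6, 7, 8}
	seen := 0
	for i, v := range &a {
		a[3] = 100
		if i == 3 {
			seen = v
		}
	}
	return seen
}

// valueInsideIf is the misread version: the assignment happens only
// after v has already been copied from the old a[3].
func valueInsideIf() int {
	a := [...]int{1, 2, 3, 4, 5, 6, 7, 8}
	seen := 0
	for i, v := range &a {
		if i == 3 {
			a[3] = 100
			seen = v
		}
	}
	return seen
}

// valueArrayCopy ranges over the array itself. The two-value range copies
// the array up front, so the assignment is never visible through v.
func valueArrayCopy() int {
	a := [...]int{1, 2, 3, 4, 5, 6, 7, 8}
	seen := 0
	for i, v := range a {
		a[3] = 100
		if i == 3 {
			seen = v
		}
	}
	return seen
}

func main() {
	fmt.Println("every pass, range &a:", valueEveryPass()) // 100
	fmt.Println("inside if, range &a: ", valueInsideIf())  // 4
	fmt.Println("every pass, range a: ", valueArrayCopy()) // 4
}
```

Only the first function sees 100, because it is the only one where the write to a[3] both happens before the fourth iteration and is visible to the range expression.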
(I think I misread the code this way partly because setting a[3] only once is more efficient and minimal, and as noted the other two functions still illustrate their particular issues when you do it that way.)
Reading an imaginary, 'idealized' version of the code instead of the real one is not a new thing and it's not unique to me, of course. When you do it on real code while you're trying to find a bug, it can lead to a completely frustrating time where you literally can't see what's in front of your eyes, and then when you finally can, you wonder how you could possibly have missed it for so long.
(I suspect that this is a situation where rubber duck debugging helps. When you have to actually say things out loud, you hopefully get another chance to have your brain notice that what you want to say doesn't actually correspond to reality.)
PS: The reason I have notes on my confusion is that I was planning to turn explaining this 'mystery' into a blog entry. Then, well, I worked out the mystery, so now I've gotten to write a somewhat different sort of blog entry on it.
The IPv6 address lookup problem (and brute force solution)
In Julia Evans' article Async IO on Linux: select, poll, and epoll, she mentioned in passing that she straced a Go program making an HTTP request and noticed something odd:
Then [the Go program] makes 2 DNS queries for example.com (why 2? I don’t know!), and uses epoll_wait to wait for replies [...]
It turns out that this is all due to IPv6 (and the DNS standards), and it (probably) happens in more than Go programs (although I haven't straced anything else to be sure). So let's start with the problem.
Suppose that you have a 'dual-stack' machine, one with both IPv4 and IPv6 connectivity. You need to talk to a wide variety of other hosts; some of them are available over IPv4 only, some of them are available over IPv6 only, and some of them are available over both (in which case you traditionally want to use IPv6 instead of IPv4). How do you look up their addresses using DNS?
DNS currently has no way for a client to say 'give me whatever IPv4 and IPv6 addresses a host may have'. Instead you have to ask specifically for either IPv4 addresses (with a DNS A record query) or IPv6 addresses (with a DNS AAAA record query). The straightforward way for a dual-stack machine to find the IP addresses of a remote host would be to issue an AAAA query to get any IPv6 addresses, wait for it to complete (or error out or time out), and then issue an A query for IPv4 addresses if necessary. However, there are a lot of machines that have no IPv6 addresses, so a lot of the time you'd be adding the latency of an extra DNS query to your IP address lookups. Extra latency (and slower connections) doesn't make people happy, and DNS queries are not necessarily the fastest thing in the world in the first place for various reasons.
(Plus, there are probably some DNS servers and overall DNS systems that will simply time out for IPv6 AAAA queries instead of promptly giving you a 'no' answer. Waiting for a timeout adds substantial amounts of time. Properly operated DNS systems shouldn't do this, but there are plenty of DNS systems that don't operate properly.)
To deal with this, modern clients increasingly opt to send out their A and AAAA DNS queries in parallel. This is what Go is doing here and in general (in its all-Go resolver, which is what the Go runtime tries to use), although it's hard to see it in the net package's source code until you dig quite far down. Go waits for both queries to complete, but there are probably some languages, libraries, and environments that immediately start a connection attempt when they get an answer back, without waiting for the other protocol's query to finish too.
(There is a related algorithm called Happy Eyeballs which is about trying to make IPv6 and IPv4 connections in parallel and using whichever completes first. And there is a complicated RFC on how you should select a destination address out of the collection that you may get from your AAAA and A DNS queries.)
Sidebar: DNS's lack of an 'all types of IP address' query type
I don't know for sure why DNS doesn't have a standard query type for 'give me all IP addresses, either IPv4 or IPv6'. Per Wikipedia, DNS itself was created in the mid 1980s, well before IPv6 was designed. However, IPv6 itself is decades old at this point, which is lots of time to add such a query type to DNS and have people adopt it (although it might still not be universally supported, which would leave you falling back to explicit A queries at least). My best guess for why such a query type was never added is a combination of backwards compatibility worries (since initially not many DNS servers would support it, so clients would mostly be making an extra DNS query for nothing) and a general belief on the part of IPv6 people that IPv4 was going to just go away entirely any day soon, really.
(We know how that one turned out. It's 2017, and IPv4-only hosts and networks remain very significant.)