The modern trend of variable DNS results and its effects on troubleshooting

March 21, 2022

I tweeted:

It's DNS. Of course it's DNS, it's always DNS. (Okay, it involves DNS, because the modern Internet loves to give different people different DNS answers for the same thing and then all your other troubleshooting goes out the window until you notice.)

That parenthetical bit is important and increasingly irritating. The story is somewhat straightforward; a co-worker reported that people in his group were having problems downloading things from Dropbox from inside our networks, although sometimes it would work for a while. He managed to find a specific canary URL inside the Dropbox download process that would fail. For us, the host of the URL is a DNS CNAME to 'edge-block-www-env.dropbox-dns.com', which we could get the IPv4 address for, and so we dug into various troubleshooting based on both the name and the IP address from various network points we have access to.

Wait, did I say 'the' IP address? That turned out to be a lie. If you queried the authoritative nameservers for dropbox-dns.com from almost all of the university's network and various other networks in Toronto, the dropbox-dns.com hostname I mentioned resolved to 162.125.11.15. If you queried the authoritative nameservers from the specific /24 of our networks that our resolving DNS servers sit on, you got the IP address 162.125.3.15 (and now you get 162.125.4.15). The IP address that most everyone gets works fully. The IP address that we got doesn't work from within the university, although it works from outside.
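One way to notice this sort of split sooner is to write down the answers you get at each vantage point and compare them before doing anything else. What follows is only a minimal sketch of that comparison, not what we actually ran; the function name and the vantage point labels are made up for illustration, and the addresses are the ones reported above.

```python
from collections import defaultdict


def group_vantages_by_answer(observations):
    """Group vantage points by the set of addresses each one was given.

    observations maps a vantage point label to an iterable of IP strings.
    Returns a dict mapping frozenset-of-IPs to a sorted list of labels.
    """
    groups = defaultdict(list)
    for vantage, addresses in observations.items():
        groups[frozenset(addresses)].append(vantage)
    return {answer: sorted(labels) for answer, labels in groups.items()}


# Answers as reported in this entry; the labels are hypothetical shorthand.
observations = {
    "most of the university": ["162.125.11.15"],
    "other Toronto networks": ["162.125.11.15"],
    "our resolvers' /24": ["162.125.3.15"],
}

groups = group_vantages_by_answer(observations)
if len(groups) > 1:
    print("DNS answers differ by vantage point:")
    for answer, labels in groups.items():
        print("  ", ", ".join(sorted(answer)), "<-", "; ".join(labels))
```

More than one group means that an IP address copied from one place may not be the one that another place is actually using.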

(And all of these IP addresses have a 60 second TTL, so one theory about why things worked some of the time is that every so often the DNS servers gave our resolvers the good IP.)
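That theory could be checked by sampling the answer repeatedly, with each sample spaced a bit more than the TTL apart, and tallying what comes back. The following is a rough sketch under assumptions: the function names are hypothetical, it asks whatever resolver the local machine is configured to use (through `socket.getaddrinfo`) rather than the authoritative servers directly, and any local caching will blur the results.

```python
import socket
import time
from collections import Counter


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def sample_answers(hostname, rounds, delay, resolve=resolve_ipv4, sleep=time.sleep):
    """Resolve hostname `rounds` times, `delay` seconds apart, and tally answers.

    The resolve and sleep parameters are injectable so the tallying logic
    can be exercised without touching the network.
    """
    tally = Counter()
    for i in range(rounds):
        tally[tuple(resolve(hostname))] += 1
        if i + 1 < rounds:
            sleep(delay)
    return tally


# Example use (not run here): ten samples, 61 seconds apart, so that each
# sample falls after the 60 second TTL has expired.
# sample_answers("edge-block-www-env.dropbox-dns.com", rounds=10, delay=61)
```

A tally with more than one key would be consistent with the resolvers sometimes being handed a different address, although it would not say why.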

All of this created a marvelous matrix of troubleshooting confusion. If you ran a verbose curl command or did a ping or DNS lookup from outside our network, the IP address you got would also work from inside it. If you did the lookup from inside our network instead, the IP address you got wouldn't work when you tested it from inside the university, but it did work when you tested it from outside. If you didn't realize you were getting different IP addresses and just copied them back and forth in the process of troubleshooting (because the IP addresses were much, much shorter than the hostname of the URL), things got confusing (I certainly confused myself).
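One way out of this confusion is to never handle a bare IP address: always record where it was resolved and where it was tested. The sketch below is an illustration only. The `Probe` record, the `tls_connects` helper, and the 'inside'/'outside' labels are all hypothetical, and treating a successful TLS handshake on port 443 as 'works' is an assumption (the real failure was in a download, which a handshake alone does not prove). The recorded results are the ones described above.

```python
import socket
import ssl
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    ip: str
    resolved_from: str  # where the DNS lookup that produced ip was done
    tested_from: str    # where the connection attempt was made
    ok: bool


def tls_connects(hostname, ip, port=443, timeout=5.0):
    """Try a TLS handshake with ip while presenting hostname via SNI.

    This pins the address being tested, so the result does not depend on
    whatever DNS answer the testing machine would get for itself.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((ip, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=hostname):
                return True
    except OSError:  # includes timeouts and ssl.SSLError
        return False


def failures(probes):
    """Return (ip, resolved_from, tested_from) for every failed probe."""
    return sorted((p.ip, p.resolved_from, p.tested_from) for p in probes if not p.ok)


# Results as described in this entry, written down with their provenance.
probes = [
    Probe("162.125.11.15", "outside", "inside", True),
    Probe("162.125.11.15", "outside", "outside", True),
    Probe("162.125.3.15", "inside", "inside", False),
    Probe("162.125.3.15", "inside", "outside", True),
]

for ip, resolved_from, tested_from in failures(probes):
    print(f"{ip} (resolved {resolved_from}) fails when tested from {tested_from}")
```

With the provenance attached, the one failing combination stands out instead of hiding among addresses that look almost the same.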

Clearly there is more of a problem than just the different DNS results, but the different DNS results certainly didn't help. And increasingly, that is the reality of DNS lookup results. They aren't constant in any way, either over time or in what you see from network location to network location (even very fine-grained ones; it's really just one /24 of ours).

Written on 21 March 2022.

Last modified: Mon Mar 21 22:06:00 2022