Wandering Thoughts

2016-07-25

An irritating systemd behavior when you tell it to reboot the system

For reasons well beyond the scope of this entry, I don't use a graphical login program like gdm; I log in on the text console and start X by hand through xinit (which is sometimes annoying). When I want to log out, I cause the X server to exit and then log out of the text console as normal. Now, I don't know how gdm et al handle session cleanup, but for me this always leaves some processes lingering around that just haven't gotten the message to give up.

(Common offenders are kio_http_cache_cleaner and speech-dispatcher and its many friends. Speech-dispatcher is so irritating here that I actually chmod 700 the binary on my office and home machines.)

Usually the reason I'm logging out of my regular session is to reboot my machine, and this is where systemd gets irritating. Up through at least the Fedora 24 version of systemd, when it starts to reboot a machine and discovers lingering user processes still running, it will wait for them to exit. And wait. And wait more, for at least a minute and a half based on what I've seen printed. Only after a long timer expires will systemd send them various signals, ending in SIGKILL, and force them to exit.

(Based on reading manpages it seems that systemd sends user processes no signals at all at the start of a system shutdown. Instead it probably waits TimeoutStopSec, sends a SIGTERM, then waits TimeoutStopSec again before sending a SIGKILL. If you have a program that ignores everything short of SIGKILL, you're going to be waiting two timeout intervals here.)
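
If you're curious what timeout will actually apply to your own login session, one way is to ask systemd directly. This is just a sketch; 'session-2.scope' is an example name, and you'd substitute whatever session loginctl reports for you:

# find the name of your session
loginctl
# then ask for the stop timeout on that session's scope unit
systemctl show session-2.scope -p TimeoutStopUSec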

At one level, this is not crazy behavior. Services like database engines may take some time to shut down cleanly, and you do want them to shut down cleanly if possible, so having a relatively generous timeout is okay (and the timeout can be customized). In fact, having a service have to be force-killed is (or should be) an exceptional thing and means that something has gone badly wrong. Services are supposed to have orderly shutdown procedures.

But all of that is for system services and doesn't hold for user session processes. For a start, user sessions generally don't have a 'stop' operation that gets run explicitly; the implicit operation is the SIGHUP that all the processes should have received as the user logged out. Next, user sessions are anarchic. They can contain anything, not just carefully set up daemons that are explicitly designed to shut themselves down on demand. In fact, lingering user processes are quite likely to be badly behaved. They're also generally considered clearly less important than system services, so there's no good reason to give them much grace period.

In theory systemd's behavior is perhaps justifiable. In practice, its generosity with user sessions simply serves to delay system reboots or shutdowns for irritatingly long amounts of time. This isn't a new issue with systemd (the Internet is full of complaints about it), but it's one that the systemd authors have let persist for years.

(I suspect the systemd authors probably feel that the existing ways to change this behavior away from the default are sufficient. My view is that defaults matter and should not be surprising.)

When I started writing this entry I expected it to just be a grump, but in fact it looks like you can probably fix this behavior. The default timeout for all user units can be set in /etc/systemd/user.conf with the DefaultTimeoutStopSec setting; set this down to less than 90 seconds and you'll get a much faster timeout. However I'm not sure if systemd will try to terminate a user scope other than during system shutdown, so it's possible that this setting will have other side effects. I'm tempted to try it anyways, just because it's so irritating when I slip up and forget to carefully kill all of my lingering session processes before running reboot.
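
For illustration, the change I'm tempted by would look something like this in /etc/systemd/user.conf (the ten seconds is my arbitrary pick, not a value systemd recommends):

[Manager]
# how long to wait on 'stop' before escalating towards SIGKILL;
# the stock default is 90 seconds
DefaultTimeoutStopSec=10s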

(You can also set KillUserProcesses in /etc/systemd/logind.conf, but that definitely will have side effects you probably don't want, even if some people are trying to deal with them anyways.)

linux/SystemdRebootIrritation written at 23:35:47; Add Comment

I should learn more about Grub2

I have a long-standing dislike of Grub2 (eg, and). Ever since I started having to deal with it I've felt that it's really overcomplicated, and this complexity makes it harder to deal with. There's a lot more to know and learn with Grub2 than there is with the original Grub, and I resent the added complexity for what I feel should be a relatively simple process.

You know what? The world doesn't care what I think. Grub2 is out there and it's what (almost) everyone uses, whether or not I like it. And recent events have shown me that I don't know enough about how it works to really troubleshoot problems with it. As a professional sysadmin, it behooves me to fix this sort of a gap in my knowledge for the same reason that I should fix my lack of knowledge about dpkg and apt.

I'm probably never going to learn enough to become an expert at Grub 2 (among other things, I don't think there's anything we do that requires that much expertise). Right now what I think I should learn is twofold. First, the basic operating principles, things like where Grub 2 stores various bits of itself, how it finds things, and how it boots. Second, a general broad view of the features and syntax it uses for grub.cfg files, to the point where I can read through one and understand generally how it works and what it's doing.

(I did a little bit of this at one point, but much of that knowledge has worn off.)
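
To give a concrete idea of that second level, here's a stripped down version of the sort of menuentry stanza that grub2-mkconfig generates (the kernel version, UUIDs, and options here are invented for illustration):

menuentry 'Fedora (4.x.y-300.fc24.x86_64) 24' {
        insmod gzio
        insmod part_msdos
        insmod ext2
        # locate the filesystem holding /boot by its UUID
        search --no-floppy --fs-uuid --set=root 0123abcd-feed-face-dead-beef00000001
        linux  /vmlinuz-4.x.y-300.fc24.x86_64 root=UUID=0123abcd-feed-face-dead-beef00000002 ro quiet
        initrd /initramfs-4.x.y-300.fc24.x86_64.img
}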

Unfortunately there's a third level I should also learn about. Grub2 configurations are so complex that they're actually mostly written and updated by scripts like grub2-mkconfig. This means that if I want to really control the contents of my grub.cfg on most systems, I need to understand broadly what those scripts do and what they get controlled by (and thus where they may go wrong). Since I don't think this area is well documented, I expect it to be annoying and thus probably the last bit that I tackle.

(If I cared about building custom grub2 configurations, it should be the first area. But I don't really; I care a lot more about understanding what Grub2 is doing when it boots our machines.)

linux/Grub2ShouldLearn written at 00:55:21; Add Comment

2016-07-24

My view on people who are assholes on the Internet

A long time ago, I hung around Usenet in various places. One of the things that certain Usenet groups featured was what we would today call trolls: people who were deliberately nasty and obnoxious. Sometimes they were nasty in their own newsgroups; sometimes they took gleeful joy in going out to troll other newsgroups full of innocents. Back in those days there were also sometimes gatherings of Usenet people so you could get to meet and know your fellow posters. One of the consistent themes that came out of these meetups was reports of 'oh, you know that guy? he's actually really nice and quiet in person, nothing like his persona on the net'. And in general, one of the things that some of these people said when they were called on their behavior was that they were just playing an asshole on Usenet; they weren't a real asshole, honest.

Back in those days I was younger and more foolish, and so I often at least partially bought into these excuses and reports. These days I have changed my views. Here, let me summarize them:

Even if you're only playing an asshole on the net, you're still an asshole.

It's simple. 'Playing an asshole on the net' requires being an asshole to people on the net, which is 'being an asshole' even if you're selective about it. Being a selective asshole, someone who's nasty to some people and nice to others, doesn't somehow magically make you not an asshole, although it may make you more pleasant for some people to deal with (and means that they can close their eyes to your behavior in other venues). It's certainly nicer to be an asshole only some of the time than all of the time, but it's even better if you're not an asshole at all.

This is not a new idea, of course. It's long been said that the true measure of someone's character is how they deal with people like waitresses and cashiers; if they're nasty to those people, they've got a streak of nastiness inside that may come out in other times and places. The Internet just provides another venue for that sort of thing.

In general, it's long since past time that we stopped pretending that people on the Internet aren't real people. What happens on the net is real to the people that it happens to, and nasty words hurt even if one can mostly brush off a certain amount of nasty words from strangers.

(See also, which is relevant to shoving nastiness in front of people on the grounds that they were in 'public'.)

tech/InternetAssholes written at 00:03:35; Add Comment

2016-07-23

My current set of Chrome extensions (as of July 2016)

I know, I've said bad things about Chrome extensions before. Still, I seem to have slowly accumulated a number of extensions that I use, some of which I even trust, and that means it's worth actually documenting them so I can keep track (and easily put them into a new Chrome instance someday).

Unsurprisingly, my Chrome extensions are broadly similar in spirit to my normal set of Firefox extensions. Some of these Chrome extensions date back to years ago when I made a serious attempt to set up a Chrome equivalent of my Firefox environment so I could give Chrome a fair trial.

The obvious cases:

  • uBlock Origin works just as well in Chrome as it does in Firefox, and is good for the same things and the same reason.

  • HTTPS Everywhere because I feel that this is a good idea and why not.

User interface issues:

  • The bluntly named Don't Fuck With Paste (also, and, via) makes it so that websites can't stop you from pasting things into form input fields. These websites are basically wrong and they make me very annoyed, since how I manage website passwords requires paste to work.

    (The Firefox equivalent is disabling dom.event.clipboardevents.enabled in about:config.)

  • MiddleButtonScroll makes it so that you can scroll the page around with the middle mouse button. I'm extremely used to this behavior in Firefox, so I was happy to be able to get it in Chrome too.

  • CLEAN crxMouse Gestures is a giant experiment that makes me nervous. There have historically been no good, non-spyware gestures extensions for Chrome (when one existed, it wound up corrupted). This extension at least claims and appears to be a 'clean' (ie non-spyware) version of the relatively good but spyware crxMouse Gestures extension. I'm not sure if I trust it, but I really, really want mouse gestures so I'm willing to take the chance as an experiment. If someone tells me that it too is bad, I will be sad but not surprised.

    (Adding this extension is what pushed me into writing this entry.)

Security, sort of:

  • ScriptBlock is what I'm currently using as my NoScript equivalent on Chrome. Because my only major usage of Chrome is my hack use of incognito mode, I don't actually get much exposure to this extension so I don't really have any opinions on how well it works.

  • FlashBlock for Chrome is theoretically there for the same reasons that I have it in Firefox, but again in practice I mostly use Chrome in a way that deliberately disables it so I don't really have many opinions. Plus, Chrome is increasingly disabling Flash all on its own.

Things that I should probably remove:

  • Stylish is the Chrome version of the Firefox extension of the same name, which I used to make significant use of before I sadly discovered that it was part of my big Firefox memory leaks. In practice I don't use this in Chrome, but it seems harmless (and at the time I initially set up Chrome many years ago, it seemed like something I was going to want).

  • Protect My Choices seemed like a good idea at the time that I ran across it, but the more I look at it the more I'm not sure I should have this extension sitting around doing things.

    (It turns out that I only have this installed in my work Chrome, not my home Chrome. So it's going to get removed the next time I'm in the office.)

Since Chrome's incognito mode is mostly how I use Chrome, I have a number of these extensions enabled in it. Right now, the list is uBlock Origin, MiddleButtonScroll, Don't Fuck With Paste, and CLEAN crxMouse Gestures (because there isn't much point in adding a mouse gestures extension I don't entirely trust if I'm not going to use it).

In Firefox, I consider It's All Text! to be essential. Unfortunately Chrome's very different extension model means that the Chrome equivalents have always had to do terribly awkward things that made them unattractive to me.

Since incognito mode discards cookies when I close it down, I haven't tried to find any sort of cookie management extension. As usual, this might be a bit of a mistake, as I do use non-incognito Chrome just a bit and so I've probably picked up a certain amount of cookie lint in the process.

(For me, Linux Chrome is significantly faster than Linux Firefox for Flickr. Since logging in all the time is annoying, I use the regular Chrome in order to retain the necessary login cookies.)

web/ChromeExtensions2016-07 written at 02:18:12; Add Comment

2016-07-22

Ubuntu 16.04's irritatingly broken MySQL updates

I tweeted:

So, Ubuntu 16.04 can't apply MySQL server updates if you have the server installed but have disabled it running. Good show, you lot.

We install the MySQL server package on a few machines but deliberately don't start the daemon. In older versions of Ubuntu, this worked reasonably well; you could do it, you could keep the daemon from starting on boot, and you could apply updates (although doing so generally started the daemon up, so you had to remember to then go stop it). In 16.04, if you've disabled the daemon your attempts to apply updates will error out:

mysql_upgrade: Got error: 2002: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) while connecting to the MySQL server
Upgrade process encountered error and will not continue.
mysql_upgrade failed with exit status 11

The direct cause of this problem is that the mysql-server-5.7 postinstall script needs to run mysql_upgrade, which requires the server to be running. Perhaps at this point you sigh, run 'service mysql start', and try the upgrade again. It'll still fail, because the postinstall script is more complicated and more wrong than that.

The postinstall script needs to stop the MySQL daemon, do some things, and then start the daemon again and run mysql_upgrade (and then restart the daemon yet again). It does all of this starting and restarting by running invoke-rc.d, and invoke-rc.d specifically refuses to start disabled daemons. In the grand Unix tradition, this behavior is buried in an innocuous phrasing in the invoke-rc.d manpage:

invoke-rc.d is a generic interface to execute System V style init script /etc/init.d/name actions, obeying runlevel constraints as well as any local policies set by the system administrator.

Via a complex chain of actions, what 'obeying runlevel constraints' translates to here is that if you do 'systemctl disable <whatever>', invoke-rc.d will decide that <whatever> is specifically blocked from running and not start it.

(Invoke-rc.d in general is the wrong tool in Ubuntu 16.04, because it's actually fairly tied to the System V init framework. The system goes through ugly hacks in order to make it work 'right' on things that are actually native systemd .service units, as the MySQL daemon is.)

This selectivity is the wrong approach, or at least it's in the wrong place. What the postinst script should really be doing is unconditionally shutting down the server, unconditionally starting it to run mysql_upgrade, unconditionally shutting it down again, and only then using invoke-rc.d to conditionally start it again. This would achieve the twin goals of upgrading MySQL while not leaving the daemon running if it's disabled. This would actually be an improvement over the 14.04 situation, instead of a massive headache.

(Of course I expect that the real answer is simply that no one thought about this possibility, and that if we were to file a bug report we'd be told that disabling the daemon is not a supported configuration.)

The workaround is simple. Before you try to apply a MySQL server package update, do 'systemctl enable mysql'. After it's done, do 'systemctl disable mysql; systemctl stop mysql' to return to the original state.
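
In concrete terms, the sequence is something like this (a sketch; apply the update however you normally would):

# let invoke-rc.d start the daemon during the upgrade
systemctl enable mysql
# apply the pending updates in whatever way you usually do
apt-get upgrade
# return to 'installed but not running'
systemctl disable mysql
systemctl stop mysql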

Sidebar: 'Runlevel constraints' and invoke-rc.d

Invoke-rc.d checks to see whether something is enabled or disabled by looking for S* and K* symlinks in /etc/rc<runlevel>.d. In 16.04, the 'runlevel' is arbitrary and is reported as '5', so we're looking at /etc/rc5.d. When you do 'systemctl disable' or 'systemctl enable' on an Ubuntu 16.04 system and the service also has an /etc/init.d file, systemctl helpfully maintains rcN.d S* and K* symlinks for you. So running 'systemctl disable mysql' also creates a /etc/rc5.d/K02mysql, which invoke-rc.d will then see as saying that mysql is specifically constrained to not start in runlevel 5, and so should not be started.

(If there was no /etc/rc5.d symlink at all, invoke-rc.d would also conclude that it shouldn't start the MySQL daemon.)
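
If you want to see what invoke-rc.d is going to look at on your own machine, checking the symlinks before and after flipping the unit makes it visible (the S* name for the enabled state will be whatever systemctl generates; the disabled state is the K02mysql mentioned above):

ls -l /etc/rc5.d/*mysql*
systemctl enable mysql
ls -l /etc/rc5.d/*mysql*
systemctl disable mysql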

linux/Ubuntu1604MySQLUpdatePain written at 02:20:33; Add Comment

2016-07-21

My current set of essential extensions for Firefox profiles

For reasons beyond the scope of this entry, I recently set up a new Firefox profile. As a new profile it needs to be (re) customized for my preferences, including extensions. I can't just use my regular browser's set of extensions because this profile is for staying logged in to Twitter and needs to be set up so that Twitter is usable. Also, I don't want to spend a lot of time maintaining it, because I'm only going to use it once in a while.

Since there may be more single-use profiles like this in my future, I'm writing down my current set of essential Firefox extensions for this sort of environment. They are:

Although I've considered it, I'm not currently using NoScript in this profile. In theory NoScript shouldn't be blocking anything because I only use this profile on Twitter and I'd have to whitelist Twitter's JavaScript; in practice, it's likely to need at least periodic attention to whitelist some new domain that now has JavaScript that's necessary to keep Twitter working.

(If I follow links from Twitter to other sites in this profile, I may regret this choice. But I generally don't do that, since I read Twitter almost entirely through choqok and choqok is set to open links in my regular browser session.)

A number of my regular extensions are okay here and potentially useful, but aren't compelling enough to bother installing in a new profile that I'm using this way. Part of this is that Twitter allows you to control video autoplay in your settings, so I have it turned off; if that changed, FlashStopper or some equivalent might become important.

(Things would be a bit different if you could build bundles of extensions and extension preferences that would all install together. Then I'd probably roll in more of my standard extensions, because each additional extension wouldn't be a little additional drip of hassle to get and customize.)

web/FirefoxProfilesCoreExtensions written at 00:23:27; Add Comment

2016-07-20

Official source release builds should not abort on (compilation) warnings

I will put my conclusion up front:

Official source releases should not build software with -Werror or the equivalent.

(By official source releases I mean things like 'we are releasing version 1.x.y of our package today'.)

Perhaps you disagree. Then the rest of this entry is for you.

As I write this, the current release version of Rust is 1.10.0. This version (and probably all previous ones) won't build on the recently released Fedora 24 because of a C compiler issue. This isn't because of a bug in the Fedora 24 version of gcc, and it's not due to the Fedora 24 gcc uncovering a previously unrecognized bug in Rust. Instead it's because of, well, this:

src/rt/miniz.c: In function ‘tinfl_decompress’:
src/rt/miniz.c:578:9: error: this ‘for’ clause does not guard... [-Werror=misleading-indentation]
         for ( i = 0; i <= 143; ++i) *p++ = 8; for ( ; i <= 255; ++i) *p++ = 9; for ( ; i <= 279; ++i) *p++ = 7; for ( ; i <= 287; ++i) *p++ = 8;
         ^~~
src/rt/miniz.c:578:47: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘for’
         for ( i = 0; i <= 143; ++i) *p++ = 8; for ( ; i <= 255; ++i) *p++ = 9; for ( ; i <= 279; ++i) *p++ = 7; for ( ; i <= 287; ++i) *p++ = 8;
                                               ^~~
[...]
cc1: all warnings being treated as errors

Rust has opted to compile much or all of the C in its source tree with the gcc options -Wall -Werror, which mean 'emit more or less all warnings that you can, and if you emit any warnings consider this an error and stop'. Fedora 24 is one of the first Linux distributions to ship with gcc 6, and gcc 6 has added some more warnings, which now pick up more nits in the C code, and now Rust doesn't build.

It would be one thing if this was pointing out a previously undiscovered error. But it's not. The code in the Rust 1.10.0 release is merely not ideal, and this example neatly illustrates the problem with making 'not ideal' a fatal error in official source releases. Put simply, the compiler's definition of 'not ideal' changes over time.

When you set your official releases to build with -Wall -Werror or the equivalent, you're putting yourself on quicksand. It's basically guaranteed that future compiler versions will have different opinions on what to warn about, which means that official releases of your software are going to stop building at some point in the future for no good reason. Having your software stop building for no good reason is not helping people.

(I say 'for no good reason' because if it actually built, the release would be no more broken than it was before the new nits were reported. The nits were always there, after all, even on all the systems that didn't report them.)

I think it's fine to use -Werror in development if you want to. But the moment you make a release, my very strong sysadmin opinion is that the job of that release is to build correctly in as many environments as possible, including future environments. An attempt to build a release should fail on a system only if the end result would be incorrect in a new way.

(If a release is incorrect on both system X and system Y but this is only uncovered on system Y, that should not be a fatal error. It's sufficient for release builds to be bug for bug equivalent to each other. This is probably too tricky to do in most cases, although maybe you should provide a 'I don't care, just force it to build' configuration option that suppresses as many compiler errors as possible.)
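
To make the development versus release distinction concrete, one way to arrange it (a GNU make sketch, and not what Rust itself does) is to make -Werror something developers explicitly opt in to:

# developers build with 'make DEVELOPER=1'; release builds leave it off
WERROR =
ifeq ($(DEVELOPER),1)
WERROR = -Werror
endif
CFLAGS += -Wall $(WERROR)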

Another way to put this is that -Wall -Werror is a service for the developers; it surfaces nits and forces developers to fix them. However, releases are not made for developers, they're made for outside people. Forcing developer issues on outside people is both futile (since outside people are not developers) and annoying. Thus actual releases should have things like compiler options reoriented to serve the needs of outside people instead of developers.

programming/ReleaseBuildsNoAbortOnWarnings written at 01:51:08; Add Comment

2016-07-19

How not to set up your DNS (part 23)

Presented in the traditional illustrated form, more or less:

; dig ns megabulkmessage218.com @a.gtld-servers.net.
[...]
megabulkmessage218.com. IN NS ns1.megabulkmessage218.com.
megabulkmessage218.com. IN NS ns2.megabulkmessage218.com.
[...]
ns1.megabulkmessage218.com. IN A 5.8.32.218
ns2.megabulkmessage218.com. IN A 8.8.8.8
[...]

One of these two listed nameservers is not like the other.

8.8.8.8 is of course the famous open resolving DNS server that Google operates. It is in no way an authoritative DNS server for anyone, even if you try to use it as one. Lookups will probably fail, because I believe that most DNS resolvers set the 'no recursion' flag in their queries to what they believe are authoritative DNS servers, and when 8.8.8.8 sees that it doesn't answer even when it almost certainly has the data in cache (instead it returns a SERVFAIL).

(This is thus an extreme case of an informal secondary, although I suppose it was probably inevitable and there are likely plenty of other people using 8.8.8.8 this way with other domains. After all, it appears to work if you test it by hand, since tools like dig normally set the recursive flag on their queries.)
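
You can see the difference yourself by explicitly clearing the recursion desired flag, which dig lets you do. Based on the behavior described above, I'd expect the first query here to get an answer and the second to come back SERVFAIL:

; dig a gadswoonsg.megabulkmessage218.com. @8.8.8.8
; dig +norecurse a gadswoonsg.megabulkmessage218.com. @8.8.8.8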

Since this is a spammer's DNS server (as you might have guessed from the domain name), things are a little bit peculiar with its results.

; dig ns megabulkmessage218.com. @5.8.32.218
[nothing; we get the standard 'no such data' response]
; sdig a gadswoonsg.megabulkmessage218.com. @5.8.32.218
178.235.61.115
; sdig mx gadswoonsg.megabulkmessage218.com. @5.8.32.218
10 mail.megabulkmessage218.com.
; sdig a mail.megabulkmessage218.com. @5.8.32.218
149.154.64.43

(The MX target is SBL295728, the A record is in the SBL CSS and listed in the CBL and so on. Basically, you name a DNS blocklist and 178.235.61.115 is probably in it. And the domain name is currently in the Spamhaus DBL.)

But:

; dig a randomname.megabulkmessage218.com. @5.8.32.218
[nothing; we get the standard 'no such data' response]

So this spammer is clearly making up random names for their spam run and running a very custom nameserver that only responds to them. Anything else gets a no such data response, including SOA and NS queries for the domain itself. Since there's nothing entirely new under the sun, we've seen this sort of DNS server cleverness before.

It's interesting that trying to get the NS records for the domain from your local resolving DNS server will fail even after you've looked up the A record for the hostname. The NS records (and glue) from the .com nameservers don't have particularly low TTLs, and given that the A record resolves, your local DNS server was able to get and use them. But these days it clearly throws them away again immediately to avoid cache poisoning attacks (or at least won't return them for direct queries).

sysadmin/HowNotToDoDNSXXIII written at 14:24:05; Add Comment

An interesting (and alarming) Grub2 error and its cause

I upgraded my office workstation from Fedora 23 to Fedora 24 today, following my usual procedure of doing a live upgrade with dnf. Everything went smoothly, which is normal, and it was pretty fast, which isn't my normal experience but was probably because my root filesystem is now on SSDs. After the updates finished, I ran the grub2-install command that you're instructed to run and rebooted. My machine made it into Grub's menu, but trying to start any kernel immediately halted with an error about the symbol grub_efi_secure_boot not being found (as in this Fedora 24 bug report or this old Ubuntu one).

This could politely be called somewhat alarming. Since it seemed to involve (U)EFI booting in some way, I went through the BIOS settings for my motherboard to see if I could turn that off and force a pure BIOS boot to make things work. Naturally I wound up looking through the boot options screen, at which point I noticed that the boot order looked a little odd. The BIOS's boot list didn't have enough room to display full names for drives, but the first and second drives had names that started with 'ST31000', and things called 'Samsung ...' were way down the list at the bottom.

At this point the penny dropped: my BIOS was still booting from my older hard drives, from before I'd moved the root filesystem to the SSDs. The SSDs were definitely considered sda and sdb by Linux and they're on the first two SATA links, but the BIOS didn't care; as far as booting went, it was sticking to its old disk ordering. When I'd updated the Grub2 boot blocks with grub2-install, I'd of course updated the SSD boot blocks because that's what I thought I was booting from; I hadn't touched the HD boot blocks. As a result the old Fedora 23 Grub boot blocks were trying to load Fedora 24 Grub modules, which apparently doesn't work very well and is a classic cause of these Grub 'undefined symbol' errors.

Once I realized this the fix was pleasantly simple; all I had to do was put the SSDs in their rightful place at the top of the (disk) boot priority list. Looking at the dates, this is the first Fedora version upgrade I've done since I added the SSDs, which explains why I didn't see it before now.

There's an argument that the BIOS's behavior here is sensible. If I'm correct about what's going on, it has essentially adopted a 'persistent boot order' in the same way that Linux (and other Unixes) are increasingly adopting persistent device names. I can certainly see people being very surprised if they add an extra SSD and suddenly their system fails to boot or boots oddly because the SSD is on a channel that the BIOS enumerates first. However, it's at least surprising for someone like me; I'm used to BIOSes cheerfully renumbering everything just because you stuck something into a previously unused SATA channel. A BIOS that doesn't do that for boot ordering is a bit novel.

(This may be especially likely on motherboards with a mix of 6G and 3G SATA ports. You probably want the 6G SATA ports enumerated first, and even if HDs live there for now, they're going to wind up being used for SSDs sooner or later.)

In the process of writing this entry I've also discovered that while I moved my root filesystem over to the SSDs, I seem to never have moved /boot; it's still a mirrored partition on the HDs. I'm not sure if this was something I deliberately planned, if I was going to move /boot later but forgot, or if I just plain overlooked the issue. I have some notes from my transition planning, but they're silent on this.

(Since /boot is still on the HDs, I'm now uncertain both about how the BIOS is actually numbering my drives and how Grub2 is finding /boot. Maybe the Grub boot blocks (technically the core image) have a hard-coded UUID for /boot instead of looking at specific BIOS disks.)
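
One way to poke at the second question is to ask GRUB's own probing tool what it makes of /boot (a sketch using Fedora's grub2-probe; I haven't verified how much this tells you about what the core image actually embeds):

# the filesystem UUID that a 'search --fs-uuid' would look for
grub2-probe --target=fs_uuid /boot
# the GRUB-style drive name that /boot's disk maps to
grub2-probe --target=drive /boot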

linux/GrubDiskMismatchError written at 00:02:06; Add Comment

2016-07-17

A good solution to our Unbound caching problem that sadly won't work

In response to my entry on our Unbound caching problem with local zones, Jean Paul Galea left a comment with the good suggestion of running two copies of Unbound with different caching policies. One instance, with normal caching, would be used to resolve everything but our local zones; the second instance, with no caching, would simply forward queries to either the authoritative server for our local zones or the general resolver instance, depending on what the query was for.

(Everything would be running on a single host, so the extra hops queries and replies take would be very fast.)
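
To make the idea concrete, here is a minimal sketch of what the frontend instance's configuration might look like; the ports and addresses are made up for illustration, cs.toronto.edu stands in for our local zones, and a real setup would need more care:

server:
    # the frontend itself should not cache anything
    msg-cache-size: 0
    rrset-cache-size: 0
    # we forward to other instances on this same host
    do-not-query-localhost: no

# our own zones go straight to the authoritative server
stub-zone:
    name: "cs.toronto.edu"
    stub-addr: 127.0.0.1@5301

# everything else goes to the caching resolver instance
forward-zone:
    name: "."
    forward-addr: 127.0.0.1@5302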

In many organizational situations, this is an excellent solution. Even in ours, at first glance it looks like it should work perfectly, because the issue we'd have is pretty subtle. I need to set the stage by describing a bit of our networking.

In our internal networks we have some machines with RFC 1918 addresses that need to be publicly reachable, for example so that research groups can expose a web server on a machine that they run in their sandbox. This is no problem; our firewalls can do 'bidirectional NAT' to expose each such machine on its own public IP. However, this requires that external people see a different IP address for the machine's official name than internal people do, because internal people are behind the BINAT step. This too is no problem, as we have a full 'split horizon' DNS setup.

So let's imagine that a research group buys a domain name for some project or conference and has the DNS hosted externally. In that domain's DNS, they want to CNAME some name to an existing BINAT'd server that they have. Now have someone internally do a lookup on that name, say 'www.iconf16.org':

  1. the frontend Unbound sees that this is a query for an external name, not one of our own zones, so it sends it to the general resolver Unbound.
  2. the general resolver Unbound issues a query to the iconf16.org nameservers and gets back a CNAME to somehost.cs.toronto.edu.
  3. the general resolver must now look up somehost.cs itself and will wind up caching the result, which is exactly what we want to avoid.

This problem happens because DNS resolution is not segmented. Once we hand an outside query to the general resolver, there's no guarantee that it stays an outside query and there's no mechanism I know of to make the resolving Unbound stop further resolution and hot-potato the CNAME back to the frontend Unbound. We can set the resolving Unbound instance up so that it gives correct answers here, but since there's no per-zone cache controls we can't make it not cache the answers.

This situation can come up even without split horizon DNS (although split horizon makes it more acute). All you need is for outside people to be able to legitimately CNAME things to your hosts for names in DNS zones that you don't control and may not even know about. If this is forbidden by policy, then you win (and I think you can enforce this by configuring the resolving Unbound to fail all queries involving your local zones).
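
The Unbound knob I have in mind for that enforcement is a local-zone declaration on the resolving instance, along these lines (again just a sketch):

server:
    # make the general resolver refuse anything in our own zones;
    # only the frontend path should ever answer for them
    local-zone: "cs.toronto.edu." refuse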

sysadmin/UnboundZoneRefreshProblemII written at 23:05:07; Add Comment

DNS resolution cannot be segmented (and what I mean by that)

Many protocols involve some sort of namespace for resources. For example, in DNS this is names to be resolved and in HTTP, this is URLs (and distinct hosts). One of the questions you can ask about such protocols is this:

When a request enters a particular part of the namespace, can handling it ever require the server to go back outside that part of the namespace?

If the answer is 'no, handling the request can never escape', let's say that the protocol can be segmented. You can divide the namespace up into segments, have different segments handled by different servers, and each server only ever deals with its own area; it will never have to reach over to part of the namespace that's really handled by another server.

General DNS resolution for clients cannot be segmented this way, even if you only consider the answers that have to be returned to clients and ignore NS records and associated issues. The culprit is CNAME records, which both jump to arbitrary bits of the DNS namespace and force that information to be returned to clients. In a way, CNAME records act similarly to symlinks in Unix filesystems. The overall Unix filesystem is normally segmented (for example at mount points), but symlinks escape that; they mean that looking at /a/b/c/d can actually wind up in /x/y/z.
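
To illustrate with made up names and a documentation placeholder address, a single lookup drags in data from a completely different part of the namespace:

; dig a www.iconf16.org.
www.iconf16.org. IN CNAME somehost.cs.toronto.edu.
somehost.cs.toronto.edu. IN A 192.0.2.10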

(NS records can force outside lookups but they don't have to be returned to clients, so you can sort of pretend that their information doesn't exist.)

Contrasting DNS with HTTP is interesting here. HTTP has redirects, which are its equivalent of CNAMEs and symlinks, but it still can be segmented because it explicitly pushes responsibility for handling the jump between segments all the way back to the original client. It's as if resolving DNS servers just returned the CNAME and left it up to client libraries to issue a new DNS request for information on the CNAME's destination.

(HTTP servers can opt to handle some redirects internally, but even then there are HTTP redirects which must be handled by the client. Clients don't ever get to slack on this, which means that servers can count on clients supporting redirects. Well, usually.)

I think this protocol design decision makes sense for DNS, especially at the time that DNS was created, but I'm not going to try to justify it here.

tech/DNSResolutionIsNotSegmented written at 01:03:58; Add Comment
