Wandering Thoughts archives

2010-01-30

I'm puzzled about DNS glue records in the modern world

I recently wound up reading a message about upcoming behavior changes to the .com/.net/.edu nameservers (via Dawn Keenan). The short summary of that message is that as of March 1st, those nameservers will mostly stop returning information about out of zone glue records.

Up until I read the message, I had not realized that these nameservers were still carrying (and in fact returning) out of zone glue records, and that it was just modern DNS systems not paying attention to them; I had innocently thought that out of zone glue was entirely gone, the era of glue record hell being long over. In fact, it turns out that the nameservers were returning glue in a way that let people create circular loops of nameservers, loops which are going to break on March 1st.

(A longer explanation of the situation is in the message; see there for the details, which make my head hurt.)

Some experimentation shows that the out of zone glue records that get returned do not have to come from the required in-zone glue for the nameservers' own domain. As an example, suppose you have A.com, with NS records n1.a.com and n2.a.com, and B.com, with NS records n1.a.com and n3.a.com. If you do an NS lookup for B.com at the .com nameservers, you'll get back IP addresses for both n1.a.com, which is in-zone glue for A.com and thus necessary to carry in the .com zone, and n3.a.com, which is not. This means that something in the whole nameserver and zone management process knows or is looking up n3.a.com's IP address, and is adding it to the .com DNS information.
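
You can poke at this yourself with the dnspython library. Here's a minimal sketch, assuming dnspython is installed; the query for the hypothetical B.com and the choice of a.gtld-servers.net are just for illustration, so substitute a domain and gTLD server that you actually care about:

    import dns.message
    import dns.query
    import dns.rdatatype

    # Ask a .com gTLD server directly for B.com's NS records, without
    # following the referral anywhere else.
    GTLD_SERVER = "192.5.6.30"   # a.gtld-servers.net

    query = dns.message.make_query("b.com", dns.rdatatype.NS)
    response = dns.query.udp(query, GTLD_SERVER, timeout=5)

    # Any glue comes back in the additional section of the referral.
    for rrset in response.additional:
        if rrset.rdtype in (dns.rdatatype.A, dns.rdatatype.AAAA):
            for rdata in rrset:
                print(rrset.name, rdata.address)

Which addresses show up in that additional section (and whether any show up at all) is the behaviour that is due to change.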

All of this leaves me puzzled about what the rules are for out of zone glue records in the modern world, although it probably varies by DNS zone (and maybe even registrar).

(The client-side resolver rules are clearer; apparently you pretty much always ignore any out of zone A records that you get back.)

PuzzlingModernDNSGlue written at 02:40:15

2010-01-20

The argument for not managing systems via packages

Although I haven't changed my mind in general, I think that there are arguments for configuration management systems like Cfengine and Puppet over packaging systems. Apart from their actual existence in usable form, one of those arguments is that CM systems are inherently more agile than packaging systems.

The drawback of packaging systems is that they fundamentally work on packages. This means that doing things via them requires a multi-step process: you must assemble your files, build one or more packages from them, and finally propagate and install the packages. Because they work on files directly, CM systems generally require less overhead to propagate a file (or to add a file to be propagated); you put the file in the appropriate place, modify your master configuration, and the change goes out immediately.

(Because it does not have to build static packages, a CM system can also pull various tricks to make your life simpler, such as supporting not just static files but various sorts of templated files that are expanded for each specific system.)
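
To make the templating idea concrete, here's a small sketch of the sort of per-system expansion a CM system might do, using nothing more than Python's standard string.Template; the file contents and variable names are invented for illustration:

    import socket
    from string import Template

    # An invented config file template; $hostname and $syslog_server get
    # filled in differently for each system being managed.
    TEMPLATE = Template("""\
    # managed file - do not edit by hand
    hostname = $hostname
    log_host = $syslog_server
    """)

    def render_for_this_host(syslog_server="syslog.example.com"):
        return TEMPLATE.substitute(hostname=socket.getfqdn(),
                                   syslog_server=syslog_server)

    print(render_for_this_host())

A packaging system could only ship the already-expanded result, which means rebuilding a package every time the expansion changes.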

A CM system also gives you a natural place to express the meta-information about what files get to which machines, letting you do so directly and without adding any extra mechanisms. A packaging system can express all of this (for example, you can have meta-packages for various sorts of machines and create dependencies and conflicts), but you have to do so indirectly, which involves increasing amounts of somewhat shaky magic.

WhyNotManageWithPackages written at 23:04:43

2010-01-16

Different visions of what packaging systems are for

I think I should write down one of my background views on what packaging systems are for, and what jobs they do, because there are competing (or somewhat clashing) visions about what is within their scope.

One view is that the job of a packaging system is more or less confined to properly installing, removing, and upgrading packages. This need not exclude things like apt-get; if you take a broad view of 'installing packages', this can easily include fetching packages and dependencies over the Internet. Additional features (such as asking the packaging system about file state) can be supported if they don't require any deep changes in the system.

(A packaging system that supports removing files needs to keep track of what files belong to what packages, and at that point it might as well let you look at that data.)

Another view is that packaging systems have at least two jobs: they should both deal with packages and keep track of the system state. Features such as querying file state are not 'if you can support it' extras; they are part of the fundamental mission, and need to be supported even if they require deep changes to the system.
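
As a concrete illustration of the kind of file state querying I mean, here is a rough sketch for an RPM-based system; it simply shells out to the real 'rpm -qf' and 'rpm -Vf' query and verify operations (a dpkg-based system would use different commands), and the details are deliberately simplified:

    import subprocess
    import sys

    def owning_package(path):
        """Ask the RPM database which package owns a file (rpm -qf)."""
        res = subprocess.run(["rpm", "-qf", path],
                             capture_output=True, text=True)
        return res.stdout.strip() if res.returncode == 0 else None

    def file_differences(path):
        """Verify the file against the database (rpm -Vf); empty means it matches."""
        res = subprocess.run(["rpm", "-Vf", path],
                             capture_output=True, text=True)
        return res.stdout.strip()

    for path in sys.argv[1:]:
        print(path, "->", owning_package(path) or "(no owning package)")
        diffs = file_differences(path)
        if diffs:
            print("  differs from what the package database expects:")
            print(" ", diffs)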

As you can tell from the previous entries, I am fairly strongly of the latter view. To put it one way, I think that something should be in charge of keeping track of the system state, and the packaging system is already doing this work for most of the files on the system. Doing it in anything other than the packaging system means redoing a lot of that work (especially in packaging systems that do not expose APIs for this).

To stereotype it, the first view is the developer view and the second is the sysadmin view. Developers mostly care about packaging things and installing them (and somewhat about clean removals); sysadmins spend relatively little time actually installing and removing packages but quite a lot of time trying to manage and sort out systems, so they care more about querying and fixing things that have gone wrong in existing package installations.

PackagingSystemVisions written at 03:35:14

2010-01-15

Packaging systems should support overlays

One of the very common things that we sysadmins do when dealing with more than one system is to develop a common set of files that we slap on every system (or in large environments, on subsets of our systems). Sometimes this is done by hand, sometimes with hand-built tools, and sometimes with things like Cfengine or Puppet.

If you accept that packaging systems should be comprehensive, then packaging systems should also support this common need. Specifically, packaging systems should have the idea of 'overlays', packages that replace files from other packages; this would allow sysadmins to build one or more overlay packages containing their various necessary customizations and localizations.

(Ideally overlays would be slightly more general, allowing you to at least remove files as well.)

There are several advantages of doing this through the packaging system. The first is that all of the package manager querying and verification tools will now work on your files as well; you can see that a particular file comes from this version of your localizations and has not been changed from what it should be, you can see what all of your localizations on this system are, and so on.

(As far as I know, Puppet and Cfengine do not support this sort of state querying.)

The next is that upgrading system packages can now be much more aware of your local customizations; the package manager does not have to guess what to do when upgrading things you've changed (or just ignore your changes). It becomes trivial to maintain your local overlays across base package upgrades and to block these upgrades if your local overlays aren't ready for them. Equally, you can use the power of the package manager when changing your local overlays, doing them as overlay package upgrades. Thus you can require base package upgrades, add and change dependencies, detect conflicts, and so on.

(The implication of this is that the package manager can also show you the differences between the overlay-supplied version of a file and the base version, since the package manager needs to keep the base version around to restore it in case you remove your overlay or the next version of the overlay package does not overlay this particular file any more.)
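
Since no packaging system I know of actually works this way, here is a purely hypothetical sketch of the bookkeeping an overlay-aware package manager might do when installing or removing an overlay file: save the base version aside so that it can be diffed against and restored later. The paths and names are invented:

    import hashlib
    import shutil
    from pathlib import Path

    # Invented location where the base versions of overlaid files get saved.
    SAVE_DIR = Path("/var/lib/overlay-demo/saved")

    def _saved_copy(target):
        return SAVE_DIR / str(Path(target)).lstrip("/").replace("/", "_")

    def install_overlay_file(target, overlay_source):
        """Replace target with the overlay's version, keeping the base copy."""
        target = Path(target)
        SAVE_DIR.mkdir(parents=True, exist_ok=True)
        saved = _saved_copy(target)
        if target.exists() and not saved.exists():
            shutil.copy2(target, saved)
        shutil.copy2(overlay_source, target)

    def remove_overlay_file(target):
        """Put the saved base version back when the overlay goes away."""
        target = Path(target)
        saved = _saved_copy(target)
        if saved.exists():
            shutil.copy2(saved, target)
            saved.unlink()

    def differs_from_base(target):
        """Does the installed file differ from the saved base version?"""
        saved = _saved_copy(target)
        if not saved.exists():
            return False
        digest = lambda p: hashlib.sha256(Path(p).read_bytes()).hexdigest()
        return digest(target) != digest(saved)

A real implementation would of course record all of this in the package database rather than in an ad hoc directory, which is exactly what lets the querying and verification tools work on overlay files too.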

PackagingOverlays written at 00:17:03

2010-01-14

Packaging systems should be comprehensive

I have a number of peculiar opinions about packaging systems in general. One of them is that if you have a good packaging system, you should be able to manage your machine entirely through it; every modification that you need to make should be doable as part of a package.

(Okay, not quite every modification; I'll exclude files that are in constant flux outside of your control, like /etc/passwd and /etc/shadow, and operations like rebooting the machine due to new kernel installations.)

This is a large systems attitude. If you're running a lot of machines, you want to manage them through one mechanism, not many, and it needs to be a mechanism that can be mostly or entirely automated. You almost always have to deal with the packaging system to some degree; the logical conclusion is that you should be able to do everything with the packaging system. Further, if you are managing a lot of machines you have a great many configuration elements that are common across all of them (or subsets of them), and this sounds exactly like a package that you install on all of those machines.

The other large systems choice is to have nothing to do with the packaging system at all: to do everything outside of it, controlling the machine with your real management system, and to never interact with the packaging system. There are several serious drawbacks to this approach, since not only are you throwing out all of the work the packaging system does to maintain state information and make operations safe, but you're actively going behind its back and fighting it. When you and the packaging system disagree about who is in charge, sooner or later both of you lose.

(However, it is much easier to take this approach, especially since it doesn't require the cooperation of the package management system. Hence it is quite popular for system management programs.)

Accepting this idea has significant implications for the features your packaging system needs. For that matter, so does thinking about this issue but disagreeing; if you reject comprehensiveness but accept the problem, you need to design a package management system that (to put it one way) does not believe that it's in charge.

(My strong impression is that current packaging systems have not really thought about this; instead, they have opted to believe that they live in a world where they are more or less in charge and that they can fix up the gaps with spackle and hand-waving.)

ComprehensivePackaging written at 01:36:54

2010-01-12

How systems get created in the real world

My previous entry on our DHCP portal may have left you with the impression that it sprang into existence one day, fully formed and ready to go. I regret to inform people that this was not at all the case; instead, our DHCP portal is the end stage of a long chain of evolutionary steps.

It goes like this:

  • in the beginning, there were not even Points of Contact; there was just a general computing support organization. Users contacted us directly to get their machines on the laptop network, and we edited the DHCP config files by hand. We had a large section of our support website devoted to how to find out your machine's Ethernet address.

  • when the Points of Contact model came in, users now contacted their Point of Contact, who worked with them to handle things like getting their Ethernet address and then emailed us the details. We still hand-edited the information into the DHCP config files.

  • users strongly advocated for something that had better out-of-hours service and was faster. The Points of Contact already had access to the DHCP server so they could troubleshoot things for their users; as a quick fix, we decided that it would be simple enough to write a script to add entries and let them run it via sudo, which would at least get us out of the loop. As a bonus, we could use the same script most of the time and stop having to hand-edit the DHCP config files ourselves.

    The script is very, very cautious because it is directly editing the DHCP configuration files and restarting the DHCP server; it verifies all its parameters, does locking, and so on.

  • we realized that we could use an existing authenticated local documents area on our internal https server to host a simple web form and CGI that asked the user for all of the necessary information (getting the user's identity itself from the HTTP authentication), then ssh'd off to the DHCP server to run the script (via sudo) to actually add the entry. (A rough sketch of this sort of CGI is below, after this list.)

    The script's paranoia meant that we didn't have to worry too much about the CGI getting things wrong. (The CGI still did validation itself, if only because it could give the users much better error messages.)

  • we cannibalized that CGI to run on another system that had direct access to the laptop network, so that it could just pluck the Ethernet address from the ARP cache instead of having to ask the user for it. Everything else was just cloned for the new system, including both the Apache configuration (clunky HTTP Basic authentication over https and all) and the ssh-based access to the DHCP server to run the script.
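
To give a flavour of the flow described above, here is a purely illustrative sketch of such a CGI; the host name, script path, and form field names are all invented, and our real portal does not necessarily look anything like this:

    #!/usr/bin/env python3
    # Purely illustrative; host name, script path, and form fields are invented.
    import cgi
    import os
    import re
    import subprocess

    DHCP_HOST = "dhcp-server.example.com"
    ADD_SCRIPT = "/usr/local/sbin/add-dhcp-entry"
    MAC_RE = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$")

    def main():
        print("Content-Type: text/plain")
        print()
        user = os.environ.get("REMOTE_USER", "")   # from the HTTP authentication
        form = cgi.FieldStorage()
        mac = form.getfirst("mac", "").strip().lower()
        host = form.getfirst("hostname", "").strip()

        # Validation here mostly exists to give better error messages; the
        # script on the DHCP server re-checks everything and does the locking.
        if not user or not MAC_RE.match(mac) or not re.match(r"^[a-z0-9-]+$", host):
            print("Sorry, that doesn't look like valid information.")
            return

        res = subprocess.run(["ssh", DHCP_HOST, "sudo", ADD_SCRIPT, user, mac, host],
                             capture_output=True, text=True)
        if res.returncode == 0:
            print("Your machine has been registered.")
        else:
            print("Something went wrong; please try submitting your information again.")

    if __name__ == "__main__":
        main()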

And that was how our DHCP portal came into existence, at the end of all of these incremental steps. The result is not as clean as if we had started from scratch with what we wanted, but if we had tried to do that we might not have gotten there at all.

The result is not perfect. For example, the script on the DHCP server was originally written for sysadmins who were nervous about it failing, so it has quite verbose output that's completely unsuitable for showing to users; as a result, sometimes all the CGIs can say is 'something went wrong, try submitting your information again'. (Fortunately ssh passes the script's exit status back; otherwise the CGIs would be even less informative.)

This is something that I find typical for systems in the real world. Almost nothing ever appears from nowhere; things are almost always the end product of a long series of small changes and adaptations and tweaks, cannibalizations of earlier projects and programs, things adapted to different purposes, and so on. The result is imperfect and sometimes the cracks show, with little bits of obsolete functionality and odd gaps and limitations.

SystemEvolution written at 00:44:00

2010-01-09

Patch management is part of package management

There are two approaches to distributing updates: you can distribute entire new versions of packages, or you can use some sort of 'patches'. For essentially historical reasons, commercial Unix vendors generally use the latter approach.

Here is something important about this: if you have patches, patch management needs to be part of the packaging infrastructure, even if it is not done with the package management system and its commands. It cannot and should not be something that is just slapped on over the top.

Here's why:

From a sysadmin perspective, the purpose of package management is to keep track of what is on your system; what is there, where it comes from, what it should look like, and so on. (This applies whether or not the packaging system is used for extra software or is only for the base operating system.)

The purpose of patching is to change what's on your system. This is exactly what package management is supposed to keep track of, so that it can give you correct answers to your questions about what should be there and where it should have come from. Ergo, patching needs to be part of package management so that package management can continue to give you these answers, instead of out of date lies.
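
To make 'out of date lies' concrete, here is a hedged little sketch for an RPM-based system. Once files have been patched behind the package manager's back, the real 'rpm -V' verification starts flagging them as modified, even though they are exactly what they are now supposed to be (the package name is just an example):

    import subprocess

    def files_not_matching_database(package):
        """Return the files 'rpm -V' reports as differing from the package database."""
        res = subprocess.run(["rpm", "-V", package],
                             capture_output=True, text=True)
        changed = []
        for line in res.stdout.splitlines():
            # Lines look roughly like "S.5....T.  c /etc/example.conf";
            # the file name is the last field.
            fields = line.split()
            if fields:
                changed.append(fields[-1])
        return changed

    # After an out-of-band patch, these 'mismatches' may well be the correct,
    # patched files; the package database simply no longer knows about them.
    print(files_not_matching_database("openssl"))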

(Patching is not the only such thing, mind you. As a general rule, current package management systems are utterly lacking in ways for sysadmins to tell them about changes. If it's not a new version of a package, they don't know about it and they don't want to hear from you. I maintain that this is a mistake, but that's another entry.)

PatchesAndPackaging written at 02:25:24

2010-01-06

The department's model for providing computing support

In many university departments, there's a constant tension between working on the computing infrastructure that is used by the entire department, and working on things for individual professors and research groups. To put it more concretely, do you work on upgrading the departmental mailserver or do you set up Professor X's new cluster that just got dropped off at the loading dock?

(This tension is increased when Professor X's grant funding is helping pay for computing support, as it often is. Professor X may well feel that, well, she'd like some of what her money bought. Now.)

Locally, we (the Department of Computer Science) have evolved an interesting approach to deal with this problem that we call the 'Point of Contact' model. In this model, departmental computing support is split into two parts, Core and Points of Contact.

(There is a third and entirely separate group that supports undergraduate Computer Science students.)

The Core group is responsible for all of the common departmental infrastructure; we run the network, the firewalls, the general login servers and the fileservers, the general user webserver, the mailserver, and so on. It's funded from base budget funds, and does whatever the departmental computing committee thinks should be centrally provided.

By contrast, each Point of Contact works directly with specific research groups and researchers (and their grad students) to do, well, whatever those professors want them to, whether this is setting up compute clusters, configuring Windows machines, or building custom programs for grad students. Points of Contact are paid for out of the grant funding of their professors and research groups and effectively work for them. While it's a departmental mandate that everyone has a Point of Contact (and helps pay for theirs), it's up to the research groups to decide things like how senior a person they need, how much time they can pay for, and so on.

Points of Contact are called that partly because they are everyone's first level of contact for all computing support; the Core group is not supposed to deal directly with users. Instead, people go to their local person (most of the Points of Contact have office areas that are physically close to most of their professors and grad students), their Point of Contact does the initial troubleshooting and diagnosis, and only then is the distilled problem passed to the Core group if it turns out to be a problem involving Core stuff instead of a purely local issue.

(As you can imagine this works slightly better in theory than it does in practice, although it actually does mostly work in practice.)

From my perspective, I feel that this does two really useful things. First and obviously, it solves the 'what do we work on' problem; Core works on central things and Points of Contact work on whatever their professors want them to. As part of this, I think that it makes professors feel that they really do own and control what their grant money is buying; while they have to spend it, they get transparent and direct results for what they spend.

Second, it personalizes the computing environment as a whole. Computing and support doesn't come from some faceless or hard to remember revolving group of people that you deal with at a distance, mostly through email. Instead, it comes from a specific person that you know and that you deal with all of the time, frequently in person (and certainly it's expected that you can drop by their office and talk to them directly), and they're your agent (or at least your professor's agent) when something needs to get done beyond them. In a visceral way, they're your person.

Now, the necessary disclaimer: I had nothing to do with coming up with this model; it was developed before I came here. I'm just writing it up for WanderingThoughts because I think it's nifty (and a cool solution to the overall problem).

CSDeptSupportModel written at 01:01:51

