2013-06-19 Our approach to configuration management

A commentator on yesterday's entry suggested that we're already using automated configuration management, just a home-grown version of it. To explain why I mostly disagree I need to run down the different sorts of configuration management as I see them:
For the most part we have centralized configuration management, with the master copies of all configuration files living on our central administrative filesystem, but not automated configuration management. Only a few things like passwords and NFS mounts propagate automatically; everything else has to be copied around in an explicit step if we change it (sometimes by hand, sometimes with the details wrapped up in a script we run by hand).

(Actually now that I think about it we have a surprising amount of automatic propagation going on. It's just all in little special cases that we usually don't think about because, well, they're automated and they just work.)

I could give you a whole list of nominally good reasons why we aren't automatically propagating various things, but here's what it boils down to: if sysadmins are the only people changing whatever it is, it doesn't change very often, and it doesn't have to go everywhere, we haven't bothered to automate things because it doesn't annoy us too much to do it by hand. When one or more of those conditions changes we almost invariably automate away.

(That actually suggests a number of openings for a system like Puppet. For a start it can probably handle the actual propagation on command instead of having us manually copy around files.)
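As a rough illustration of the "explicit step wrapped up in a script we run by hand" style of propagation, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the function name `build_push_commands`, the master file path, the destination path, and the host names are hypothetical, and using rsync over ssh is just one plausible way to copy a master file out. It only builds and prints the commands; it does not run them.

```python
import shlex


def build_push_commands(master_path, dest_path, hosts):
    """Return one rsync command (as an argument list) per target host.

    master_path: the master copy on the central administrative filesystem.
    dest_path:   where the file should land on each target machine.
    hosts:       the machines that need the file (not necessarily all of them).
    """
    return [["rsync", "-a", master_path, f"{host}:{dest_path}"] for host in hosts]


if __name__ == "__main__":
    # Hypothetical master file and hosts, purely for illustration.
    commands = build_push_commands(
        "/admin/masters/ntp.conf", "/etc/ntp.conf", ["hosta", "hostb"]
    )
    for cmd in commands:
        # A dry run: show what would be executed instead of executing it.
        print(shlex.join(cmd))
```

Actually pushing the file would mean handing each argument list to something like `subprocess.run(cmd, check=True)`. The point of the sketch is that the propagation is centralized but still an explicit, sysadmin-triggered step.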
2013-06-18 What's in the way of us using automated configuration management

Every so often I poke at Puppet or Chef or one of the other automation systems and consider if we could really use it. And every time I do I find it a hard sell, even to myself (much less hypothetically to my co-workers). Today I've decided to try to write down my collection of technical reasons (to go with other reasons):
The short version is that we've already automated most everything we commonly do to more than one machine. I don't see much room for an automated configuration management system to come in and do new things for us; this means that it would have to replace existing, already developed and working automation. There's some benefit to using standard tools for automation but I'm not convinced that there's a lot.

It's possible that I'm missing things that Puppet, Chef, et al could do for us because the usual examples I read are bright cheerful 'let's deploy a canned web server configuration on to a random machine' ones and we don't have that problem. In a chicken and egg problem, I can't find the energy to read the documentation for Puppet because I suspect that I won't find anything we can use.
AutomationBadFitHere written at 01:05:16
2013-06-17 My job versus my career: some thoughts

One of the things I've said to people in the past is that while I have a job I didn't really have a career. Not in the sense that my job is unstable or unsettled (it's actually been rock-solid) but in the sense that I had no idea of where I wanted to go, no particular vision of my future. With no real view of what I wanted I've never had any strong basis to do things like evaluate my current situation, consider other options, or assess my progress towards, well, what should I be progressing towards? To be progressing towards something implies having some objective and I've never had any grounds to establish such a thing.

This has led to me having a huge inertia in my job (which has generally been both pleasant and interesting). I've changed jobs here at the university only once and even then it took a huge upheaval to do it; this puts me way out on the 'time at a single employer' and 'time at a single job' curves for computer people. One consequence that I've been thinking about off and on is that I don't even know what it would take to attract me away from my current job, since I don't have anything I'm aiming for and I'm not sure there's anything in particular missing or wrong about my current job (things where another job would simply make me happier).

(It's possible that I'm deluding myself here for various reasons. Universities are comfortable places but at the same time they're places that basically can't value IT as much as some places do out in the outside world.)

Writing Wandering Thoughts has made me somewhat more aware of this (because it's prompted a certain amount of self-reflection), but what's done more to bring this to mind is reading about the interesting sysadmin-related things that other organizations are doing (Twitter has been especially good for exposing me to this). I still don't know where (if anywhere) I want to go but at least it gives me more of an idea of what's out there.
By the way I have no idea if it's important to actually have a career as such, in this sense. If you're happy with your job (and paid well enough), do you really need anything more or would it just add extra stress to your life? And balancing the relative happiness of the known present versus uncertain potential futures is a hard problem.

(If you're bored or unhappy in your job it's another matter, of course. Then you at least want to figure out what'd make you happier and move towards it (which is easy to say but potentially hard to do).)

(This is one of the entries in which I ramble, partly writing to myself.)
MyJobVsMyCareer written at 01:01:23
2013-05-20 A serious potential danger with Exim host lists in ACLs

Suppose that you have an Exim installation and you want to support some sort of source host based blocking (selective or otherwise) of incoming connections. The obvious way is to create an ACL section that looks something like this:
(This one is a selective, per destination address host block list, hence the fun and games with

This looks great and generally works but you've just armed a ticking time bomb, one that can blow your incoming email up with permanent temporary deferrals.

The first problem is that Exim has no way in a host list to say 'this domain and any of its subdomains', in the way that the TCP wrappers '.host.com' will match both 'host.com' and 'fred.host.com'. If you want to match this case, the obvious way is to write two entries:

    *.host.com
    host.com

The first matches any subdomains of the domain; the second matches the domain itself. But you've just put the fuse in the bomb, because of just how plain host and domain names work in host lists. From the Exim specification with the emphasis being mine:

So here's what happens. You list '*.spammer.com' and 'spammer.com' in your blocklist. Spammer.com turns off their DNS (or their DNS server turns it off because hey, they're a spammer) but doesn't de-register their domain, so DNS queries to their nominal authoritative DNS servers either don't get answers or get non-authoritative 'look elsewhere' results. Although this is a permanent condition, it's considered a temporary failure in DNS resolution. Exim now defers all SMTP connections that consult this host blocklist, regardless of where they are from. For ever, or at least until you notice.

Now that I've read the Exim documentation in detail, it spells out that you can turn this behavior off with the special option

My feeling is that you want to do this for every host list anywhere except ones used for real, strong access control (which probably don't want to be using DNS names anyways). Consider, for example, a host list used for exceptions to greylisting; you probably don't want that ACL to defer the connection if you can't resolve a domain in it.

Sidebar: the other surprise in Exim host lists

Suppose that you have a host list like this:

    *.spammer.com
    192.168.0.0/16

Surprise: any connection from a host in 192.168/16 that does not have valid reverse DNS will not match the list. The moment you list a hostname wildcard in a host list, any IP address without a hostname automatically fails to match that entry or anything later in the list (or file if the list is in a file). It will match IP address patterns that are earlier in the list, though, so you get to remember to list all IP address patterns first.

This behavior is documented if you read the documentation carefully. Per the fine documentation this behavior can be turned off with

This is generally less dangerous than the host list defer time bomb, but it depends on what you're using the host list for. If you have a locked down configuration where you're using the host list for strong access control, well, you have potential issues here.
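To make the two surprises in this entry concrete, here is a toy Python model of how a host list gets walked. It is not Exim's code and it leaves out most of what Exim really does. The names `check_host_list`, `TemporaryDNSFailure`, and the `forward_lookup` callback are invented for the example, and so are the two boolean flags, which loosely stand in for Exim's `+ignore_defer` and `+ignore_unknown` list items (in real Exim those are written in the list itself and affect the items after them, not passed as arguments).

```python
import ipaddress


class TemporaryDNSFailure(Exception):
    """Stand-in for a DNS timeout or other temporary resolution failure."""


def is_ip_pattern(entry):
    """True if the entry parses as an IP address or CIDR network."""
    try:
        ipaddress.ip_network(entry, strict=False)
        return True
    except ValueError:
        return False


def check_host_list(client_ip, reverse_name, entries, forward_lookup,
                    ignore_defer=False, ignore_unknown=False):
    """Return 'match', 'no match', or 'defer' for a connecting client.

    reverse_name:   the client's hostname from reverse DNS, or None if it has none.
    forward_lookup: hypothetical callable mapping a name to a list of IP strings;
                    it raises TemporaryDNSFailure if the name can't be resolved now.
    """
    addr = ipaddress.ip_address(client_ip)
    for entry in entries:
        if is_ip_pattern(entry):
            # IP patterns never need DNS, so they are always safe to check.
            if addr in ipaddress.ip_network(entry, strict=False):
                return "match"
        elif entry.startswith("*"):
            # Wildcard names need the client's hostname.
            if reverse_name is None:
                if ignore_unknown:
                    continue
                return "no match"  # this entry and everything after it fails
            if reverse_name.lower().endswith(entry[1:].lower()):
                return "match"
        else:
            # Plain names are resolved to addresses and compared to the client.
            try:
                addresses = forward_lookup(entry)
            except TemporaryDNSFailure:
                if ignore_defer:
                    continue
                return "defer"  # the time bomb: every client gets deferred
            if client_ip in addresses:
                return "match"
    return "no match"
```

In this model, a resolver that always raises `TemporaryDNSFailure` for 'spammer.com' makes the list '*.spammer.com', 'spammer.com' return 'defer' for a completely unrelated client, and a client in 192.168/16 with no reverse DNS only matches '192.168.0.0/16' if that entry comes before the wildcard.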
2013-05-18 A little habit of our documentation: how we write logins

Over the years, we've developed a number of local conventions for our local documentation. One of them is that we always write Unix logins with <> around them, as '<login>'.

Writing logins this way does two things. The first is that they become completely unambiguous. This is not much of an issue with a login like 'cks', but we have any number of logins that are (or could be) people's first or last names, and vice versa. Consistently writing the login with <> around it removes that ambiguity and uncertainty. The second thing it does is that it makes it much easier to search for a particular login in old messages and documentation. Searching for 'chris' may get all sorts of hits that are not actually talking about the login, while searching for '<chris>' finds only the login.

(Well, sort of. The reality is that we sometimes wind up quoting various sorts of system messages and system logs in our messages and of course these messages generally don't use the '<login>' form. However, often excluding these messages from a later search is good enough because we're mostly interested in the record of active things we did to an account.)

There's a corollary to the convenience of <login>: right now we have no similar notation convention for Unix groups. We write less about Unix groups than about Unix logins (and groups generally have more distinct names), but it would still be nice to have some convention so we could do unambiguous searches and so on.
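The search benefit is easy to show with a small Python sketch. The sample messages and the helper names `bare_search` and `login_search` are made up for the example; the only thing taken from the entry is the '<login>' convention itself.

```python
def bare_search(word, lines):
    """Case-insensitive substring search, the way a quick grep -i would behave."""
    return [line for line in lines if word.lower() in line.lower()]


def login_search(login, lines):
    """Search only for the '<login>' form used in the documentation."""
    needle = f"<{login}>"
    return [line for line in lines if needle in line]


# Hypothetical old messages, for illustration only.
notes = [
    "Chris asked us to reset the password for <chris>.",
    "Talked to Chris about the new printer.",
    "Removed <christine> from the mailing list.",
]

print(len(bare_search("chris", notes)))   # every line mentions 'chris' somehow
print(login_search("chris", notes))       # only the line about the login itself
```

The bare search hits a person's name and a different login that merely starts with the same letters; the bracketed search hits only the one message that is about the account.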
2013-05-10 Disk IO is what shatters the VM illusion for me right now

I use VMs on my office workstation as a far more convenient substitute for real hardware. In theory I could assemble a physical test machine or a group of them, hook them all up, install things on them, and so on; in practice I virtualize all of that. This means that what I want is the illusion of separate machines and for the most part that's what I get.

However, there's one area where the illusion breaks down and exposes that all of these machines are really just programs on my workstation, and that's disk IO. Because everything is on spinning rust right now (and worse, most of it is on a common set of spinning rust), disk IO in a VM has a clear and visible impact on me trying to do things on my workstation (and vice versa but I generally don't care as much about that). Unfortunately doing things like (re)installing operating systems and performing package updates does a lot of disk IO, often random disk IO.

(In practice neither RAM nor CPU usage breaks the illusion, partly because I have a lot of both and VMs don't claim all that much of either. It also helps that the RAM is essentially precommitted the moment I start a VM.)

The practical effect is that I generally have to restrict myself to one disk IO intensive thing at once, regardless of where it's happening. This is not exactly a fatal problem, but it is both irritating and a definite crack in the otherwise pretty good illusion that those VMs are separate machines.

(The illusion is increased because I don't interact with them through their nominal 'hardware' console; I do basically everything by ssh'ing in to them. This always seems a little bit Ouroboros-recursive, especially since they have an independent network presence.)
ShatteringVMIllusion written at 02:26:02
2013-05-03 Virtual disks should be treated as 4k 'Advanced Format' drives

Here's something that's potentially very important, as it was for me today: if you're using an ordinary basic virtualization system (where you have guest OSes on top of a regular host OS), your virtual disks almost certainly have the performance characteristics of 4K physical sector size disks. (In some situations they may have even bigger effective physical sector sizes.)

This happens because in a standard basic virtualization system, the guest OS disks are just files in a host OS filesystem and that host OS filesystem almost certainly has at least a 4 Kbyte basic block size. Sure the files are byte-addressable, but writing less than 4 KB or writing things not aligned on 4 KB boundaries means that the host filesystem will generally have to do a read-modify-write cycle to actually write the guest's data out to its disk file (and then later do unaligned reads to get it back). Depending on how the virtualization system is implemented this can also require a whole bunch more memory copies as the VM hypervisor re-blocks and de-blocks data flowing between it and the host filesystem instead of just handing nice aligned 4 KB pages off to the filesystem, where they will flow straight through to the hardware disk driver.

Under a lot of circumstances this won't actually matter. Many (guest) filesystems have a 4 KB or bigger basic block size themselves and issue aligned IO in general, regardless of what they think the disk's (physical) block size is; if this is the case, the IO to the disk files in the host filesystem will generally wind up being in aligned 4 KB blocks anyways. But if you have a guest OS filesystem that does not necessarily issue aligned writes, well, then this can make a real difference. ZFS is such a filesystem; if it thinks it's dealing with 512 byte sector disks it will issue all sorts of unaligned writes and reads.
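The alignment arithmetic behind this can be sketched in a few lines of Python. The function name `partially_written_blocks` and the 4096-byte default are assumptions for the example (the entry only says the host block size is at least 4 Kbytes). Any host block that a guest write covers only in part is a block the host filesystem would generally have to read, modify, and write back.

```python
def partially_written_blocks(offset, length, block_size=4096):
    """Return the host block numbers that a write covers only partially.

    offset and length are in bytes, relative to the start of the disk file.
    An empty result means the write is made up of whole, aligned blocks.
    """
    if length <= 0:
        return []
    end = offset + length
    first_block = offset // block_size
    last_block = (end - 1) // block_size
    partial = []
    for block in range(first_block, last_block + 1):
        block_start = block * block_size
        block_end = block_start + block_size
        # The block is only partly covered if the write starts after the
        # block begins or stops before the block ends.
        if offset > block_start or end < block_end:
            partial.append(block)
    return partial


# A guest that thinks in 512-byte sectors writes one sector at sector 1.
print(partially_written_blocks(512, 512))     # one partial host block
# A guest that writes an aligned 4 KB block touches no partial blocks.
print(partially_written_blocks(8192, 4096))
# A 4 KB write that is misaligned by 2 KB straddles two host blocks.
print(partially_written_blocks(2048, 4096))
```

A full-sized write that is merely misaligned is the worst of these cases, since it turns one guest write into two partial host block updates.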
The punchline to this is that today I doubled the streaming write and read IO speeds for my ZFS on Linux test VM with a simple configuration change (basically telling it that it was dealing with 4K disks). The IO speeds went from kind of uninspiring to almost equal to ext4 on the same (virtual) disk.

(My Illumos/OmniOS test VM also had its ZFS IO speed jump, although not as much; it was faster to start with for some reason.)
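The entry doesn't say exactly which setting was changed. ZFS expresses the sector size it assumes for a device as an `ashift` value, the base-2 logarithm of the sector size, so assuming that is the knob involved, 'telling it that it was dealing with 4K disks' corresponds to an ashift of 12 rather than 9. The helper name `ashift_for` below is made up; the sketch only shows the arithmetic.

```python
def ashift_for(sector_size):
    """Return the base-2 exponent for a power-of-two sector size in bytes."""
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_size.bit_length() - 1


print(ashift_for(512))    # what a 512-byte-sector disk implies
print(ashift_for(4096))   # what a 4K 'Advanced Format' disk implies
```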
VirtualDisksAre4KDisks written at 01:29:09
2013-04-29 My practical problem with preconfigured virtual machine templates

In comments on my entry on some VM mistakes I made, people suggested setting up template VM images that I would clone or copy to create live images. With that, every time I wanted a new VM I'd at least have a chance to think about its settings and if cloning was easy enough I'd avoid the temptation to reuse an existing VM for some theoretically quick (and not very important) test.

As it happens I've sort of started to toy with this idea but I think there's a practical roadblock in our environment: OS package updates. Most of the actual machines that I deal with (and thus most of the test images I build) are not frozen at a point in time but instead are continuously kept up to date with Ubuntu's package updates. If I have base starter images I need to either keep those base starter images up to date or immediately apply all of the pending updates after I clone the base image to make a new working VM. Neither of those options seems entirely attractive, although I should probably give it a real try just to see.

(There is also the subtle issue that cloning a preconfigured base image and then updating the packages is not quite the same thing as a from-scratch rebuild. If I want to be absolutely sure that a machine can be rebuilt from scratch, I'm not sure I trust anything short of a from-scratch build. But I could probably save a bunch of time by doing the preliminary build testing with cloned images and only doing a from-scratch reinstall in the final test run.)

PS: every so often we make a meaningful change to the base install scripts and system; such changes would force me to rebuild all of the preconfigured images (well, strongly push me towards doing so). But I suppose those are relatively rare changes so I'm kind of making excuses to not try this.

(Which argues that I should try it, if only to understand what I really don't like about the idea instead of what I tell myself my concerns are. Yes, I'm sort of using my blog to talk out loud to myself.)

Sidebar: if you're working on VMs, give yourself more disk space

One of the smartest things I did recently for encouraging this sort of playing around with VMs is that I threw a third drive into my office workstation (to go along with the main mirrored system disks). Honestly, I should have done this ages ago; having to worry about how much disk space I had to spare for VMs is for the birds when 500 GB SATA drives are basically popcorn.

(The drive is unmirrored and un-backed-up, but if it dies I'll just lose expendable VM images and related bits and pieces. I keep the important VMs on my mirrored drives.)
2013-04-24 Two mistakes I made with VMs today

For reasons that kind of boil down to 'laziness', I only rarely delete or create VMs in my use of virtualization. Instead I mostly recycle or re-purpose already created VMs by reinstalling OSes on them, or sometimes not even reinstalling but just slapping some additional packages on to the existing VM image. When I'm in much doubt about the state of a VM or need it to be in a different one, I reinstall. Usually this works well, but today I discovered that I'd wound up with two accidents.

The smaller discovery was that both of my primary VMs (currently both being used for some testing) had lingering disk snapshots from months ago when each of them was being used for very different things. At a minimum this was taking up extra disk space. It may also have been slowing down disk IO due to copy-on-write issues, although after months of churn and OS reinstalls that may have wound up a non-issue.

The larger discovery was, well, let me put it this way: ZFS on Linux turns out not to be very happy when you try to run it on a VM with a 32-bit kernel and only 512 MB of RAM, especially if you're also using multipathed iSCSI. In retrospect I'm kind of impressed that the ZFS code didn't detonate on contact with that environment (although it did start producing kernel panics when I put it under enough load). Such is the drawback of repurposing existing VMs without paying much attention to their configuration.

(You might wonder how I could possibly get into that situation. The short answer is that it all started when I was doing some testing of low-memory web server setups and reused the same VM for a quick 'does it actually run' test of ZFS on Linux. Then later I came back to do more testing without actively noticing the VM's configuration and thinking about it.)
I don't have any clever ways of avoiding this in the future; it's just something that I'll have to keep an eye out for every so often, especially if I (temporarily) configure a VM into an unusual state (such as having low memory).
TwoVMOversights written at 01:01:52
2013-04-23 Goodbye, djb dnscache

I've been using djb's dnscache for years.

This weekend, I turned dnscache off on my home machine (it's been off on my office machine for some time). There wasn't any particular immediate reason to do so, no specific thing I cared about that dnscache was failing me at, no unpatched security hole (that I know about), nothing like that. My direct reason for making the switch was that I've been worried for some time about how dnscache was going to deal with the growing new worlds of IPv6 and DNSSEC, or more accurately I was pretty sure that it wasn't going to do so very well.

But the larger reason is that djb's software is effectively dead software, dnscache included. Perhaps there are some people hacking on it somewhere, but the canonical source (djb himself) has walked away from it. As I wrote about qmail, the reality is that software on the Internet rots if not actively maintained because the Internet itself keeps changing. It was clear to me that I could either wait quietly until dnscache blew up in some obvious way or I could change over to something else. The something else might not be as pure or as minimal as dnscache but it wouldn't be quietly rotting, and some of the minimal purity of dnscache no longer matters on today's machines.

On my office machine I made the switch in late 2010 (judging from the last timestamps on dnscache's query logs and, now that I look, this old entry). I dragged my feet on my home machine for various reasons, partly laziness, but finally decided that it was time this weekend. There's a part of me that regrets this because it likes the purity and minimalism of dnscache, but the greater part of me knows that this is the sensible course.

Still, I'll miss dnscache a bit. And it certainly served faithfully for all of these years.

(For those that are curious, I switched to Unbound, as suggested in that old entry.)

PS: I'm still running djb's tinydns for some primary DNS serving, but I suppose I should look into a replacement.
It's just that I've hated all of the primary DNS servers I've ever looked at even more than I hate the various recursive caching nameservers. And there are also the security issues. My recursive nameservers are not exposed to the Internet; my primary DNS servers necessarily are.
GoodbyeDnscache written at 00:41:16
These are my WanderingThoughts. This is part of CSpace, and is written by ChrisSiebenmann.