Wandering Thoughts archives


The temptation of LVM mirroring

One of the somewhat obscure things that LVM can do is mirroring. If you mention this, most people will probably ask why on earth you'd want to use it; mirroring is what software RAID is for, and then you can stack LVM on top if you want to. Well, yes (and I agree with them in general). But I have an unusual situation that makes LVM mirroring very tempting right now.

The background is that I'm in the process of migrating my office workstation from a pair of old 320 GB drives to a pair of somewhat newer 750 GB drives, and it's reached the time to move my LVM setup to the 750s (it's currently on a RAID-1 array on the two 320s). There are at least three convenient ways of doing this:

  1. add the appropriate partitions from the 750s as two more mirrors to the existing RAID-1 array. There are at least two drawbacks to this: I can't change the RAID superblock format, and growing the LVM volume afterwards so that I can actually use the new space will be something of a pain.

    (I suppose that a single pvresize is not that much of a pain, provided that it works as advertised.)

  2. create a new RAID-1 on the 750s, add it as a new physical volume, and pvmove from the old RAID-1 physical volume to the new RAID-1 PV.

    (I did pilot trials of pvmove in a virtual machine and it worked fine even with a significant IO load on the LVM group being moved, which gives me the confidence to think about this even after my bad experience many years ago.)

  3. as above, but set up LVM mirroring between the old and the new disks instead of immediately pvmove'ing to the new disks and using them alone.

    (Done with the right magic this might leave an intact, usable copy of all of the logical volumes behind on the 320 GB drives when I finally break the mirror.)
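
To make the three options concrete, here is a rough dry-run sketch that only builds the command lines involved and runs nothing. The volume group, array, and partition names (vg0, /dev/md0, /dev/md1, /dev/sdc1, /dev/sdd1) are hypothetical placeholders, not taken from the actual machine, and the exact lvconvert options you need (for instance, where a mirror log goes) may depend on your LVM version.

```python
import shlex

# Hypothetical names for illustration only.
VG = "vg0"
OLD_PV = "/dev/md0"                      # existing RAID-1 on the 320 GB drives
NEW_PV = "/dev/md1"                      # new RAID-1 on the 750 GB drives
NEW_PARTS = ["/dev/sdc1", "/dev/sdd1"]   # partitions on the 750 GB drives


def plan_extend_raid():
    """Option 1: add the new partitions as extra mirrors of the old array."""
    return [
        ["mdadm", OLD_PV, "--add", *NEW_PARTS],
        ["mdadm", "--grow", OLD_PV, "--raid-devices=4"],
    ]


def new_raid_pv():
    """Shared first steps of options 2 and 3: a new RAID-1 added to the VG."""
    return [
        ["mdadm", "--create", NEW_PV, "--level=1", "--raid-devices=2", *NEW_PARTS],
        ["pvcreate", NEW_PV],
        ["vgextend", VG, NEW_PV],
    ]


def plan_pvmove():
    """Option 2: move all extents to the new PV, then drop the old one."""
    return new_raid_pv() + [
        ["pvmove", OLD_PV, NEW_PV],
        ["vgreduce", VG, OLD_PV],
    ]


def plan_lvm_mirror(lv_names):
    """Option 3: mirror each logical volume onto the new PV."""
    return new_raid_pv() + [
        ["lvconvert", "-m1", f"{VG}/{lv}", NEW_PV] for lv in lv_names
    ]


def show(plan):
    """Render a plan as shell-quoted lines for review."""
    return "\n".join(shlex.join(cmd) for cmd in plan)


if __name__ == "__main__":
    print(show(plan_lvm_mirror(["home", "vms"])))
```

Option 1 would still need the old members removed and a pvresize at the end, which this sketch leaves out. Printing a plan like this before touching real disks is also a cheap way to rehearse in a virtual machine.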

The drawback of the second approach is that if the 750 GB drives turn out to be flaky or have errors, I don't have a quick and easy way to go back to the current situation; I would have to re-pvmove back in the face of disk problems. And, to make me nervous, I already had one 750 become flaky after it was just sitting in my machine for a bit of time.

(I've already changed to having the root filesystem on the new drives, but I have an acceptable fallback for that and anyway it's less important than my actual data.)

The drawback of the third approach is that I would have to trust LVM mirroring, which is undoubtedly far less widely used than software RAID-1. But it's temptingly easier (and better) than just adding two more mirrors to the current RAID-1 array. If it worked without problems, it would clearly be the best answer; it has the virtues of both of the other two solutions.

(This would be a terrible choice for a production server unless we really needed to change the RAID superblock format and couldn't afford any downtime. But this is my office workstation, so the stakes are lower.)

I suppose the right answer is to do a trial run of LVM mirroring in a virtual machine, just as I did a pilot run of pvmove. The drawback of that is having to wait longer to migrate to the 750s; ironically, a significant reason for the migration is so that I can have more space for virtual machine images.

linux/LVMMirroringTemptation written at 18:02:58

A downside of automation

Right now in the sysadmin world it probably qualifies as heresy to say bad things about the idea of automating your work. But unfortunately for us, there actually are downsides to doing so even if we don't notice them a lot of the time.

The one I'm going to talk about today is that when you automate something, you increase the number of things that people in your team need to know. Suppose that you get tired of maintaining your Apache configuration files by hand, so now you put them in a Chef configuration. You've gone from a situation where all you need to know to configure your Apache is Apache configuration itself to a situation where you need to know Apache configuration, using Chef, and how you're using Chef to configure your Apache. Any time you automate you go from needing to know just one thing, the underlying thing you're dealing with, to needing to know three or so; you still need to know the underlying thing, but now you also need to know the automation system in general and how you're using it specifically.

(You can condense this by one layer of knowledge if you're not using a general automation system, because then the last two bits condense to one. But you probably don't want to do that.)

This can of course be compounded on itself further. Are you auto-generating DHCP configurations from an asset database and then distributing them through Puppet? Well, you've got a lot of layers to know about.
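
As a small illustration of one such layer, here is a minimal sketch of generating DHCP host entries from asset records. The record fields and addresses are hypothetical (a real asset database has its own schema), and the output assumes ISC dhcpd style host declarations. Even in this toy form, someone debugging a wrong lease has to know the DHCP configuration syntax, the generator, and whatever distributes the result.

```python
# Hypothetical asset records, for illustration only.
ASSETS = [
    {"name": "ws2", "mac": "00:11:22:33:44:66", "ip": "192.0.2.11"},
    {"name": "ws1", "mac": "00:11:22:33:44:55", "ip": "192.0.2.10"},
]


def dhcp_host_stanza(asset):
    """Render one asset record as an ISC dhcpd style host declaration."""
    return (
        f"host {asset['name']} {{\n"
        f"  hardware ethernet {asset['mac']};\n"
        f"  fixed-address {asset['ip']};\n"
        f"}}\n"
    )


def generate_dhcp_hosts(assets):
    """Render all assets, sorted by name so the output is stable."""
    ordered = sorted(assets, key=lambda a: a["name"])
    return "\n".join(dhcp_host_stanza(a) for a in ordered)


if __name__ == "__main__":
    print(generate_dhcp_hosts(ASSETS))
```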

Some people will say that you don't need to really know all of these layers (especially once you reach the level of auto-generated things and other multi-layer constructs). The drawback of this is that not knowing all of the layers turns you into a push-button monkey; you don't actually understand your system any more, you can just push buttons to get results as long as everything works (or doesn't go too badly wrong).

All of this suggests a way to decide when automation is going to be worth it: just compare the amount of time that it'll take for people to learn the automation system and how you're using it with how much time they would spend doing things by hand. You can also compare more elaborate automation systems to less elaborate ones this way (and new elaborate 'best practices' systems to the simple ones you already have).
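
That comparison can be sketched as simple arithmetic. This is only a back-of-the-envelope model; the function name, its parameters, and the numbers in the example are all made up for illustration, and it ignores everything except learning time versus hands-on time.

```python
def break_even_years(learn_hours_per_person, people,
                     manual_hours_per_task, automated_hours_per_task,
                     tasks_per_year):
    """Years until the time saved covers the time spent learning.

    Returns None if the automation never saves any time per task.
    """
    learning_cost = learn_hours_per_person * people
    saved_per_year = (manual_hours_per_task - automated_hours_per_task) * tasks_per_year
    if saved_per_year <= 0:
        return None
    return learning_cost / saved_per_year


if __name__ == "__main__":
    # Three people each spend 20 hours learning the system; a task drops
    # from 1 hour by hand to 15 minutes, and it comes up 40 times a year.
    print(break_even_years(20, 3, 1.0, 0.25, 40))
```

With these made-up numbers the team spends 60 hours learning and saves 30 hours a year, so it takes two years to come out ahead. The same function can compare an elaborate system against a simple one by plugging in each one's learning time.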

(One advantage of using a well known automation system such as Chef or Puppet is that you can hope to hire people who already know the automation system in general, cutting out one of the levels of learning. This is also a downside of having your own elaborate automation system; you are guaranteed that new people will have to learn it.)

By the way (and as is traditional), the people who designed and built the automation system are in a terrible position to judge how complex it is and how hard it is to learn, or even to see this issue. You're usually not going to see the system as complex or hard to keep track of, because to you it isn't; as the builder, you're just too close to the system and too immersed in it to see it from an outside perspective.

PS: Automation can have other benefits in any particular situation that are strong enough to overcome this disadvantage (including freeing sysadmins from drudgery that will burn them out). But it's always something to remember.

(This is closely related to the cost of automation but is not quite the same thing; in that entry I was mostly talking about locally developed automation instead of using standard automation tools.)

sysadmin/AutomationDownside written at 01:17:14

