Wandering Thoughts archives


Shrinking the partitions of a software RAID-1 swap array

A few days ago I optimistically talked about my plans for a disk shuffle on my office workstation, which involved replacing my current pair of 1 TB drives (one of which had failed) with a pair of 1.5 TB drives. Unfortunately when I started putting things into action this morning, one of the 1.5 TB drives failed immediately. We don't have any more spare 1.5 TB drives (at least none that I trust), but we did have what I believe is a trustworthy 1 TB drive, so I pressed that into service and changed my plans around to be somewhat less ambitious and more lazy. Rather than make a whole new set of RAID arrays on the new disks (and go through the effort of adding them to /etc/mdadm.conf and so on), I opted to just move most of the existing RAID arrays over to the new drives by attaching and detaching mirrors.

This presented a little bit of a problem for my mirrored swap partition, which I wanted to shrink from 4 GB to 1 GB. Fortunately it turns out that it's actually possible to shrink a software RAID-1 array these days. After some research, my process went like this:

  • Create the new 1 GB partitions for swap on the new disks as part of partitioning them. We can't directly add these to the existing swap array, /dev/md14, because they're too small.

  • Stop using the swap partition because we're about to drop 3/4ths of it. This is just 'swapoff -a'.

  • Shrink the amount of space to use on each drive of the RAID-1 array down to an amount of space that's smaller than the new partitions:

    mdadm --grow -z 960M /dev/md14

    I first tried using -Z (aka --array-size) to shrink the array size non-persistently, but mdadm still rejected adding a too-small new array component. I suppose I can't blame it.

  • Add in the new 1 GB partitions and pull out the old 4 GB partition:

    mdadm --add /dev/md14 /dev/sdc3
    # (wait for the resync to finish)
    mdadm --add /dev/md14 /dev/sdd3
    mdadm --replace /dev/md14 /dev/sde4
    # (wait for the resync to finish)
    mdadm -r /dev/md14 /dev/sde4

  • Tell software RAID to use all of the space on the new partitions:

    mdadm --grow -z max /dev/md14
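The "wait for the resync to finish" steps above can be scripted instead of watched by hand. What follows is only a minimal sketch: the function names md_is_syncing and wait_for_md are made up for illustration, and it assumes the usual Linux /proc/mdstat layout, where each array's stanza begins with a 'mdNN : ...' line, ends with a blank line, and shows a 'resync =' or 'recovery =' progress line while work is in progress.

```shell
# Hypothetical helper: succeed if the named md array shows resync,
# recovery, reshape or check activity in /proc/mdstat-style text on stdin.
# Usage: md_is_syncing md14 </proc/mdstat
md_is_syncing() {
    awk -v md="$1" '
        $2 == ":"  { inarr = ($1 == md); next }   # header line of a stanza
        NF == 0    { inarr = 0 }                  # blank line ends the stanza
        inarr && /(resync|recovery|reshape|check) *=/ { busy = 1 }
        END        { exit(busy ? 0 : 1) }
    '
}

# Hypothetical helper: poll until the array is idle again.
# Assumes a Linux system with /proc/mdstat.
wait_for_md() {
    while md_is_syncing "$1" </proc/mdstat; do
        sleep 10
    done
}
```

With this, each wait in the sequence would become 'wait_for_md md14'. mdadm also has a --wait (-W) option that blocks until any resync, recovery or reshape on the given array has finished, which is probably the simpler choice when you don't need to inspect the state yourself.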

At this point I almost just swapon'd the newly resized swap partition. Then it occurred to me that it probably still had a swap label that claimed it was a 4 GB swap area, and the kernel would probably be a little bit unhappy with me if I didn't fix that with 'mkswap /dev/md14'. Indeed mkswap reported that it was replacing an old swap label with a new one.

My understanding is that the same broad approach can be used to shift a software RAID-1 array for a filesystem to smaller partitions as well. For a filesystem that you want to keep intact, you first need to shrink the filesystem safely below the size you'll shrink the RAID array to, then at the end grow the filesystem back up. All things considered I hope that I never have to shrink or reshape the RAID array for a live filesystem this way; there are just too many places where I could blow my foot off.
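To make the ordering concrete, here is a hedged dry-run sketch that only prints the commands in the order described above and runs nothing. It assumes an ext4 filesystem resized with resize2fs (which, as far as I know, will only shrink ext4 while it is unmounted); the function name plan_fs_shrink, the array /dev/md3 and the sizes are all hypothetical.

```shell
# Hypothetical dry-run planner: prints the commands, runs nothing.
# Usage: plan_fs_shrink ARRAY FS_MIB DEVICE_MIB
plan_fs_shrink() {
    md=$1 fs_mib=$2 dev_mib=$3
    # The filesystem must end up strictly smaller than the per-device
    # size the array will be shrunk to.
    if [ "$fs_mib" -ge "$dev_mib" ]; then
        echo "filesystem size must be below the per-device size" >&2
        return 1
    fi
    echo "resize2fs $md ${fs_mib}M"
    echo "mdadm --grow -z ${dev_mib}M $md"
    echo "# add the smaller partitions, wait for resync, remove the old ones"
    echo "mdadm --grow -z max $md"
    echo "resize2fs $md"
}
```

For example, 'plan_fs_shrink /dev/md3 900 960' prints a plan that shrinks the filesystem to 900M before the array goes to 960M per device, and ends with a resize2fs that has no size argument, which grows the filesystem back to fill the array.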

(Life is easier if the filesystem is expendable and you're going to mkfs a new one on top of it later.)

You might ask why it's worth going through all of this instead of just making a new software RAID-1 array. That's a good question, and for me it comes down to how much of a pain it often is to set up a new array. These days I prefer to change /etc/mdadm.conf, /etc/fstab and so on as little as possible, which means that I really want to preserve the name and MD UUID of existing arrays when feasible instead of starting over from scratch.

This is also where I have an awkward admission: for some reason, I thought that you couldn't use 'mdadm --detail --scan' on a single RAID array, to conveniently generate the new line you need for mdadm.conf when you create a new array. This is wrong; you definitely can, so you can just do things like 'mdadm --detail --scan /dev/mdNN >>/etc/mdadm.conf' to set it up. Of course you may then have to regenerate your initramfs in order to make life happy.
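One hazard of appending with '>>' is winding up with the same array listed twice if you run the command again. A small sketch of a guard against that follows; the helper name add_array_line is made up, and it assumes the generated line carries a 'UUID=' field made of hex digits and colons.

```shell
# Hypothetical helper: append an ARRAY line to a config file unless a line
# with the same UUID is already there.
# Usage: add_array_line 'ARRAY ...' /path/to/mdadm.conf
add_array_line() {
    line=$1 conf=$2
    # Pull out the value of the UUID= field (hex digits and colons).
    uuid=$(printf '%s\n' "$line" | sed -n 's/.*UUID=\([0-9a-fA-F:]*\).*/\1/p')
    if [ -z "$uuid" ]; then
        echo "no UUID= field found in: $line" >&2
        return 2
    fi
    # Already listed: leave the file alone.
    if [ -f "$conf" ] && grep -q "UUID=$uuid" "$conf"; then
        return 0
    fi
    printf '%s\n' "$line" >>"$conf"
}
```

Usage would be something like 'add_array_line "$(mdadm --detail --scan /dev/mdNN)" /etc/mdadm.conf', run as root.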

(I hope I never need to do this sort of thing again, but if I do I want to have some notes about it. Sadly someday we may need to use a smaller replacement disk in a software RAID mirror in an emergency situation and I may get to call on this experience.)

linux/ShrinkingSoftwareRAIDSwap written at 23:18:30

Link: Introduction to Certificate Transparency for Server Operators

Alex Gaynor's Introduction to Certificate Transparency for Server Operators (via) is what it says in the title, and taught me some things about certificate transparency in practice. Sadly, one of the things it taught me is that once again Lighttpd seems to be coming up short as far as modern TLS goes. I really should switch over my personal site to using Apache, even if it will kind of be a pain because Fedora fumbled good Apache configuration.

(I also hadn't known about Cert Spotter, which has the advantage that you don't need a Facebook login to use it and thus don't have to helpfully (for Facebook) tie one or more domains to your Facebook login. All you need is an email address and on the modern Internet, you already need a source of individualized disposable ones.)

links/CertTransForServerOps written at 13:50:48

What I mostly care about for speeding up our Python programs

There are any number of efforts and technologies around these days that try to speed up Python, starting with the obvious PyPy and going on to things like Cython and grumpy. Every so often I think about trying to apply one of them to the Python code I deal with, and after doing this a few times (and even making some old experiments with PyPy) I've come to a view of what's important to me in this area.

(This has been more and more on my mind because these days we run at least one Python program for every incoming email from the outside world. Sometimes we run more than that.)

What I've come around to caring about most is reducing the overall resource usage of short-running programs that mostly use the Python standard library and additional pure-Python modules. By 'resource usage' I mean a combination of both CPU usage and memory usage; in our situation it's not exactly great if I make a program run twice as fast but use four times as much memory. In fact for some programs I probably care more about memory usage than CPU, because in practice our Python-based milter system probably spends most of its time waiting for our commercial anti-spam system to digest the email message and give it a verdict.

(Meanwhile, our attachment logger program is probably very close to being CPU bound. Yes, it has to read things off disk, but in most cases those files have just been written to disk so they're going to be in the OS's disk cache.)

I'm also interested in making DWiki (the code behind Wandering Thoughts) faster, but again I actually want it to be less resource-intensive on the systems it runs on, which includes its memory usage too. And while DWiki can run in a somewhat long-running mode, most of the time it runs as a short-lived CGI that just serves a single request. DWiki's long-running daemon mode also has some features that might make it play badly with PyPy, for example that it's a preforking network server and thus that PyPy is probably going to wind up doing a lot of duplicate JIT translation.

I think that all of this biases me towards up-front approaches like Cython and grumpy over on-the-fly ones such as PyPy. Up-front translation is probably going to work better for short-running programs (partly because I pay the translation overhead only once, and in advance), and the results are at least reasonably testable; I can build a translated version and see in advance whether the result is probably worth it. I think this is a pity because PyPy is likely to be both the easiest to use and the most powerful accelerator, but it's not really aimed at my use case.

(PyPy's choice here is perfectly sensible; bigger, long-running programs that are actively CPU intensive for significant periods of time are where there's the most payoff for speeding things up.)

PS: With all of this said, if I was serious here I would build the latest version of PyPy by hand and actually test it. My last look and the views I formed back then were enough years ago that I'm sure PyPy has changed significantly since then.

python/FasterPythonInterests written at 02:05:16


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.