2017-04-17
Shrinking the partitions of a software RAID-1 swap partition
A few days ago I optimistically talked about my plans for a disk
shuffle on my office workstation,
by replacing my current 1 TB pair of drives (one of which had failed)
with a 1.5 TB pair. Unfortunately when I started putting things
into action this morning, one of the 1.5 TB drives failed immediately.
We don't have any more spare 1.5 TB drives (at least none that I
trust), but we did have what I believe is a trustworthy 1 TB drive,
so I pressed that into service and changed my plans around to be
somewhat less ambitious and more lazy. Rather than make a whole new
set of RAID arrays on the new disks (and go through the effort of
adding them to /etc/mdadm.conf
and so on), I opted to just move
most of the existing RAID arrays over to the new drives by attaching
and detaching mirrors.
This presented a little bit of a problem for my mirrored swap partition, which I wanted to shrink from 4 GB to 1 GB. Fortunately it turns out that it's actually possible to shrink a software RAID-1 array these days. After some research, my process went like this:
- Create the new 1 GB partitions for swap on the new disks as part
of partitioning them. We can't directly add these to the existing
swap array, /dev/md14, because they're too small.
- Stop using the swap partition because we're about to drop 3/4ths
of it. This is just 'swapoff -a'.
- Shrink the amount of space to use on each drive of the RAID-1
array down to an amount of space that's smaller than the new
partitions:
mdadm --grow -z 960M /dev/md14
I first tried using -Z (aka --array-size) to shrink the array size
non-persistently, but mdadm still rejected adding a too-small new
array component. I suppose I can't blame it.
- Add in the new 1 GB partitions and pull out the old 4 GB partition:
mdadm --add /dev/md14 /dev/sdc3
# (wait for the resync to finish)
mdadm --add /dev/md14 /dev/sdd3
mdadm --replace /dev/md14 /dev/sde4
# (wait for the resync to finish)
mdadm -r /dev/md14 /dev/sde4
- Tell software RAID to use all of the space on the new partitions:
mdadm --grow -z max /dev/md14
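(If you want to double-check that the array really did change size before
putting swap back on it, the obvious places to look are /proc/mdstat and
'mdadm --detail'; something like:)
cat /proc/mdstat
mdadm --detail /dev/md14 | grep 'Array Size'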
At this point I almost just swapon'd the newly resized swap
partition. Then it occurred to me that it probably still had a
swap label that claimed it was a 4 GB swap area, and the kernel
would probably be a little bit unhappy with me if I didn't fix
that with 'mkswap /dev/md14'. Indeed mkswap reported that it
was replacing an old swap label with a new one.
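(Spelled out, the last step is just the following. Note that mkswap
normally generates a new UUID, so if /etc/fstab identifies the swap area
by UUID instead of by /dev/md14, you'd want 'mkswap -U' with the old UUID
or an fstab update:)
mkswap /dev/md14
swapon /dev/md14    # or just 'swapon -a' again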
My understanding is that the same broad approach can be used to shift a software RAID-1 array for a filesystem to smaller partitions as well. For a filesystem that you want to keep intact, you first need to shrink the filesystem safely below the size you'll shrink the RAID array to, then at the end grow the filesystem back up. All things considered I hope that I never have to shrink or reshape the RAID array for a live filesystem this way; there are just too many places where I could blow my foot off.
(Life is easier if the filesystem is expendable and you're going to mkfs a new one on top of it later.)
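To make the broad approach concrete, here's a sketch of what it might look
like for a hypothetical ext4 filesystem on /dev/mdNN (the mount point and
sizes here are made up, and XFS is out entirely because it can't be shrunk):
umount /some/fs
e2fsck -f /dev/mdNN
resize2fs /dev/mdNN 900M          # shrink the filesystem below the new array size
mdadm --grow -z 960M /dev/mdNN    # shrink the array itself
# ... swap the mirrors over to the new, smaller partitions as above ...
mdadm --grow -z max /dev/mdNN     # let the array use all of the new partitions
resize2fs /dev/mdNN               # grow the filesystem back to fill the array
mount /some/fs
The touchy part is making sure the filesystem shrink lands comfortably
below the size you then give to 'mdadm --grow -z'.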
You might ask why it's worth going through all of this instead of
just making a new software RAID-1 array. That's a good question,
and for me it comes down to how much of a pain it often is to set
up a new array. These days I prefer to change /etc/mdadm.conf,
/etc/fstab and so on as little as possible, which means that I
really want to preserve the name and MD UUID of existing arrays
when feasible instead of starting over from scratch.
This is also where I have an awkward admission: for some reason, I
thought that you couldn't use 'mdadm --detail --scan' on a single
RAID array, to conveniently generate the new line you need for
mdadm.conf when you create a new array. This is wrong; you
definitely can, so you can just do things like 'mdadm --detail
--scan /dev/mdNN >>/etc/mdadm.conf' to set it up. Of course
you may then have to regenerate your initramfs in order to make
life happy.
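For illustration, the appended line comes out looking something like this
(the hostname and UUID here are invented), and on Fedora regenerating the
initramfs is typically just 'dracut -f':
mdadm --detail --scan /dev/md14 >>/etc/mdadm.conf
# appends a line along the lines of:
#   ARRAY /dev/md14 metadata=1.2 name=somehost:14 UUID=01234567:89abcdef:01234567:89abcdef
dracut -f    # on Debian or Ubuntu this would be 'update-initramfs -u'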
(I hope I never need to do this sort of thing again, but if I do I want to have some notes about it. Sadly someday we may need to use a smaller replacement disk in a software RAID mirror in an emergency situation and I may get to call on this experience.)
Link: Introduction to Certificate Transparency for Server Operators
Alex Gaynor's Introduction to Certificate Transparency for Server Operators (via) is what it says in the title, and taught me some things about certificate transparency in practice. Sadly, one of the things it taught me is that once again Lighttpd seems to be coming up short as far as modern TLS goes. I really should switch over my personal site to using Apache, even if it will kind of be a pain because Fedora fumbled good Apache configuration.
(I also hadn't known about Cert Spotter, which has the advantage that you don't need a Facebook login to use it and thus don't have to helpfully (for Facebook) tie one or more domains to your Facebook login. All you need is an email address and on the modern Internet, you already need a source of individualized disposable ones.)
What I mostly care about for speeding up our Python programs
There are any number of efforts and technologies around these days that try to speed up Python, starting with the obvious PyPy and going on to things like Cython and grumpy. Every so often I think about trying to apply one of them to the Python code I deal with, and after doing this a few times (and even making some old experiments with PyPy) I've come to a view of what's important to me in this area.
(This has come to be more and more on my thoughts because these days we run at least one Python program for every incoming email from the outside world. Sometimes we run more than that.)
What I've come around to caring about most is reducing the overall resource usage of short running programs that mostly use the Python standard library and additional pure-Python modules. By 'resource usage' I mean a combination of both CPU usage and memory usage; in our situation it's not exactly great if I make a program run twice as fast but use four times as much memory. In fact for some programs I probably care more about memory usage than CPU, because in practice our Python-based milter system probably spends most of its time waiting for our commercial anti-spam system to digest the email message and give it a verdict.
(Meanwhile, our attachment logger program is probably very close to being CPU bound. Yes, it has to read things off disk, but in most cases those files have just been written to disk so they're going to be in the OS's disk cache.)
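(If I wanted to actually put numbers on that CPU versus memory tradeoff
for one of these short-running programs, GNU time's verbose mode reports
both at once. This is just an illustration; the program and message names
here are made up:)
/usr/bin/time -v python3 some-mail-program.py test-message.eml
# 'User time (seconds)' covers the CPU side,
# 'Maximum resident set size (kbytes)' covers the memory side.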
I'm also interested in making DWiki (the code behind Wandering Thoughts) faster, but again I actually want it to be less resource-intensive on the systems it runs on, which includes its memory usage too. And while DWiki can run in a somewhat long-running mode, most of the time it runs as a short-lived CGI that just serves a single request. DWiki's long-running daemon mode also has some features that might make it play badly with PyPy, for example that it's a preforking network server and thus that PyPy is probably going to wind up doing a lot of duplicate JIT translation.
I think that all of this biases me towards up-front approaches like Cython and grumpy over on the fly ones such as PyPy. Up-front translation is probably going to work better for short running programs (partly because I pay the translation overhead only once, and in advance), and the results are at least reasonably testable; I can build a translated version and see in advance whether the result is probably worth it. I think this is a pity because PyPy is likely to be both the easiest to use and the most powerful accelerator, but it's not really aimed at my usage case.
(PyPy's choice here is perfectly sensible; bigger, long-running programs that are actively CPU intensive for significant periods of time are where there's the most payoff for speeding things up.)
PS: With all of this said, if I was serious here I would build the latest version of PyPy by hand and actually test it. My last look and the views I formed back then were enough years ago that I'm sure PyPy has changed significantly since then.