Wandering Thoughts archives

2019-03-31

Our likely ZFS fileserver upgrade plans (as of March 2019)

Our third generation of ZFS fileservers is now in full production, although we're less than halfway through migrating all of our filesystems from our second generation fileservers. As peculiar as it sounds, this makes me think ahead to what our likely upgrade plans are.

Our current generation ZFS fileservers are running Ubuntu 18.04 LTS with the Ubuntu version of ZFS (with a frozen kernel version). Given our past habits, it's unlikely that we'll want to upgrade them to Ubuntu 20.04 LTS when that comes out in a year or so, unless there's some important ZFS bugfix or feature that's only present in 20.04 (which is possible, although serious bugs will hopefully be fixed in the 18.04 version of ZFS). Instead, we'll only start looking at upgrades when 18.04 begins its end of life countdown, which happens when Ubuntu 22.04 LTS comes out. Going by Ubuntu's past release schedule, that will be in April of 2022, three years from now.

In 2022, our current server hardware and 2TB data SSDs will be about four years old; based on our past habits, this will not be old enough that we consider them in urgent need of replacement. I hope that we'll turn over the SSDs for new ones with larger capacity (and without four years of write wear), but we might not do it in 2022 at the same time as we execute an upgrade to 22.04. If we have money, we might refresh the servers with new hardware, but if so I think we'd mostly be doing it to have hardware that hadn't been used for four years, not to get more powerful hardware. In general our SuperMicro servers have been very reliable; our OmniOS generation of fileservers are now somewhere around five years old and show no signs of problems anywhere. The one exception is that maybe RAM prices will finally have gone down substantially by 2022, so that we can afford to put a lot more memory in a new generation of servers.

(We will definitely be upgrading from Ubuntu 18.04 when it starts going out of support, and it's probable that it will be to the current Ubuntu LTS instead of to, say, CentOS. Hardware upgrades are much more uncertain.)

Frankly, next time around I would like us not to have to move our ZFS pools and filesystems over to new fileservers; it takes a lot of work and a lot of time. An 'in place' upgrade for the ZFS pools is now at least possible and I hope that we do it, either by reusing the current servers and swapping in new system disks set up with Ubuntu 22.04, or by moving the data SSDs from one physical server to another and then re-importing the pools and so on.

(We did a 'swap the system disks' upgrade on our OmniOS fileservers when we moved from r151010 to r151014 and it went okay. It turns out that we also did this for a Solaris 10 upgrade many years ago.)

sysadmin/ZFSFileserverUpgradePlans written at 21:47:49

Erasing SSDs with blkdiscard (on Linux)

Our approach to upgrading servers by reinstalling them from scratch on new hardware means that we have a slow flow of previously used servers that we're going to reuse, and thus that need their disks cleaned up from their previous life. Some places would do this for data security reasons, but here we mostly care that lingering partitioning, software RAID superblocks, and so on don't cause us problems on new OS installs.

In the old days of HDs, we generally did this by zeroing out the old drives with dd (on a machine dedicated to the purpose which was just left running in the corner, since this takes some time with HDs), or sometimes with a full badblocks scan. When we started using SSDs in our servers, this didn't seem like such a good idea any more. We didn't really want to use up some of the SSD write endurance just to blank them out or, worse, to write over them repeatedly with badblocks.

Our current solution to this is blkdiscard, which basically sends a TRIM command to the SSD covering the whole device. Conveniently, the Ubuntu 18.04 server CD image that we use as the base for our install images contains blkdiscard, so we can boot a decommissioned server from install media, wait for the Ubuntu installer to initialize and find all the disks, and then switch over to a text console to blkdiscard its SSDs. In the process of doing this a few times, I have developed a process and learned some useful lessons.

First, just to be sure and in an excess of caution, I usually explicitly zero the very start of each disk with 'dd if=/dev/zero of=/dev/sdX bs=1024k count=128; sync' (the count can vary). This at least zeroes out the MBR partition table no matter what. Then when I use blkdiscard, I generally background it because I've found that it can take a while to finish and I may have more than one disk to blank out:

# blkdiscard /dev/sda &
# blkdiscard /dev/sdb &
# wait

I could do them one at a time, but precisely because it can take a while I usually wander away from the server to do other things. This gets everything done all at once, so I don't have to wait twice.
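The steps above can be sketched as a small shell function. This is only an illustration of the flow, not something we actually run; the function name wipe_disks and the DISCARD and ZERO_MIB variables are made up for this example (DISCARD exists only so the flow can be tried on scratch files without a real SSD). Unlike a bare 'wait', it waits on each background job separately so that a failed blkdiscard shows up in the exit status.

```shell
#!/bin/sh
# Hypothetical knobs for this sketch, not part of blkdiscard itself.
DISCARD=${DISCARD:-blkdiscard}   # command used to discard a whole device
ZERO_MIB=${ZERO_MIB:-128}        # how many MiB to zero at the start of each disk

wipe_disks() {
    # Step 1: zero the very start of every disk (MBR partition table and so on).
    # conv=notrunc changes nothing on a block device; it only matters for scratch files.
    for dev in "$@"; do
        dd if=/dev/zero of="$dev" bs=1024k count="$ZERO_MIB" conv=notrunc 2>/dev/null || return 1
    done
    sync

    # Step 2: start one discard per disk in the background, remembering each PID.
    pids=""
    for dev in "$@"; do
        "$DISCARD" "$dev" &
        pids="$pids $!"
    done

    # Step 3: wait for each discard individually and note any failure.
    rc=0
    for pid in $pids; do
        wait "$pid" || rc=1
    done
    return "$rc"
}
```

On a real machine this would be invoked as 'wipe_disks /dev/sda /dev/sdb', after double-checking the device names.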

Finally, after I've run blkdiscard and it's finished, I usually let the server sit there running for a while. This is probably superstition, but I feel like giving the SSDs time to process the TRIM operation before either resetting them with a system reboot or powering the server off (with a 'poweroff', which is theoretically orderly). If I had a bunch of SSDs to work through this would be annoying, but usually we're only recycling one server at a time.

I don't know if SSDs commonly return zeroed sectors when you read TRIM'd space, but for our purposes it's sufficient if what comes back is random garbage that won't be recognized as anything meaningful. I think that SSDs do at least that, so far, and that we can probably count on them to keep doing it.
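If you're curious what a particular SSD hands back after a discard, one rough way to look is to compare the start of the device against /dev/zero. This is a minimal sketch, assuming GNU cmp (for its -n option); the function name reads_as_zero is made up for this example, and it only examines the first N bytes, so it says nothing about the rest of the disk.

```shell
#!/bin/sh
# reads_as_zero TARGET NBYTES
# Succeeds if the first NBYTES bytes of TARGET (a device or a file) are all zero.
# Assumes GNU cmp, where -n limits how many bytes are compared and -s silences output.
reads_as_zero() {
    cmp -s -n "$2" /dev/zero "$1"
}
```

For example, 'reads_as_zero /dev/sda $((1024 * 1024 * 1024)) && echo zeros' checks the first GiB. A failure here is not a problem for our purposes, since garbage that nothing recognizes is just as good.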

(SSDs might be smart enough to recognize blocks of zeros and turn them into TRIM, but why take chances? If nothing else, blkdiscard is easier and faster, even with the waiting afterward.)

linux/ErasingSSDsWithBlkdiscard written at 00:58:06

