Erasing SSDs with blkdiscard (on Linux)

March 31, 2019

Our approach to upgrading servers by reinstalling them from scratch on new hardware means that we have a slow flow of previously used servers that we're going to reuse, and thus that need their disks cleaned up from their previous life. Some places would do this for data security reasons, but here we mostly care that lingering partitioning, software RAID superblocks, and so on don't cause us problems on new OS installs.

In the old days of HDs, we generally did this by zeroing out the old drives with dd (on a machine dedicated to the purpose which was just left running in the corner, since this takes some time with HDs), or sometimes with a full badblocks scan. When we started using SSDs in our servers, this didn't seem like such a good idea any more. We didn't really want to use up some of the SSD write endurance just to blank them out or worse, to write over them repeatedly with badblocks.

Our current solution to this is blkdiscard, which basically sends a TRIM command to the SSD. Conveniently, the Ubuntu 18.04 server CD image that we use as the base for our install images contains blkdiscard, so we can boot a decommissioned server from install media, wait for the Ubuntu installer to initialize and find all the disks, and then switch over to a text console to blkdiscard its SSDs. In the process of doing this a few times, I have developed a process and learned some useful lessons.

First, just to be sure and in an excess of caution, I usually explicitly zero the very start of each disk with 'dd if=/dev/zero of=/dev/sdX bs=1024k count=128; sync' (the count can vary). This at least zeroes out the MBR partition no matter what. Then when I use blkdiscard, I generally background it because I've found that it can take a while to finish and I may have more than one disk to blank out:

# blkdiscard /dev/sda &
# blkdiscard /dev/sdb &
# wait

I could do them one at a time, but precisely because it can take a while I usually wander away from the server to do other things. This gets everything done all at once, so I don't have to wait twice.

Finally, after I've run blkdiscard and it's finished, I usually let the server sit there running for a while. This is probably superstition, but I feel like giving the SSDs time to process the TRIM operation before either resetting them with a system reboot or powering the server off (with a 'poweroff', which is theoretically orderly). If I had a bunch of SSDs to work through this would be annoying, but usually we're only recycling one server at a time.

I don't know if SSDs commonly implement TRIM to return zero sectors for the TRIM'd space, but for our purposes it's sufficient if they're random garbage that won't be recognized as anything meaningful. And I think that SSDs do do that, at least so far, and that we can probably count on them to do it.

(SSDs might be smart enough to recognize blocks of zeros and turn them into TRIM, but why take chances and if nothing else, blkdiscard is easier and faster, even with the waiting afterward.)

Written on 31 March 2019.
« Our current approach for significantly upgrading or modifying servers
Our likely ZFS fileserver upgrade plans (as of March 2019) »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Sun Mar 31 00:58:06 2019
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.