A tiny systemd convenience: it can reboot the system from RAM alone

June 20, 2016

One of the things I do a fair bit of is building and testing from-scratch system installs. Not being crazy, I do this in virtual machines (it's much faster that way). When you do this sort of work, you live in a constant cycle of installing a machine from scratch, testing it, and then damaging the install enough so that when you reboot, your VM will repeat the 'install from scratch' part. Most of the time, the most convenient way to damage the install is with dd:

dd if=/dev/zero of=/dev/sda bs=1024k count=32; sync
reboot

(The sync can be important.)

Dd'ing over the start of the (virtual) disk makes sure that there isn't a partition table and a bootloader any more, and it also generally prevents the install CD environment from sniffing around and finding too many traces of your old installed OS.
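If you want to rehearse that dd step without touching a real device, here is a minimal sketch that does the same thing to a scratch image file and then checks the result. The function names zero_head and head_is_zero are made up for illustration, and conv=notrunc is only there so that a regular file keeps its size; on a real block device you would use the plain command above.

```shell
#!/bin/sh
# Sketch only: practice zeroing the start of a "disk" on a scratch file.
# zero_head and head_is_zero are hypothetical helper names.

# zero_head TARGET [MIB]: overwrite the first MIB mebibytes of TARGET with zeros.
zero_head() {
    target=$1
    mib=${2:-32}
    # conv=notrunc stops dd from truncating a regular file to MIB mebibytes.
    dd if=/dev/zero of="$target" bs=1024k count="$mib" conv=notrunc 2>/dev/null
    sync
}

# head_is_zero TARGET [MIB]: succeed if the first MIB mebibytes are all zero bytes.
head_is_zero() {
    target=$1
    mib=${2:-32}
    # Delete every NUL byte and count what is left over.
    nonzero=$(dd if="$target" bs=1024k count="$mib" 2>/dev/null | tr -d '\000' | wc -c)
    [ "$nonzero" -eq 0 ]
}
```

With these defined, something like 'zero_head /tmp/scratch.img 32 && head_is_zero /tmp/scratch.img 32' (the path is just an example) confirms that nothing non-zero survives in the wiped region.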

On normal System V init or Upstart based systems, this sequence has a minor little irritation: the reboot will usually fail. This is because the reboot process needs to read files off the filesystem, which you've overwritten and corrupted with the dd. Then you (I) get to go off to the VM menus and say 'power cycle the machine', which is just a tiny little interruption.

With systemd, at least in Ubuntu 16.04, this doesn't happen. Sure, a number of things run during the reboot process will spit out various errors, but systemd continues driving everything onwards anyways and will successfully reboot my virtual machine with no further activity on my part. The result is ever so slightly more convenient for my peculiar usage case.

I believe that systemd can do this for several reasons. First, systemd parses and loads all unit files into memory when it starts up (or you tell it 'systemctl daemon-reload'), which means that it doesn't have to read anything from disk in order to know what needs to be done to shut the system down. Second, systemd mostly terminates processes itself; it doesn't need to repeatedly get scripts to run kill and the like, which could fail if kill or other necessary bits have been damaged by that dd. Finally, I think that systemd can handle calling reboot() internally, instead of having to run an executable (which might not be there) in order to do this.

(Systemd clearly has internal support in PID 1 for rebooting the system under some circumstances. I'm not quite clear if this is the path that a normal reboot eventually takes; it's a bit tangled up in units and handling this and that and so on.)

PS: Possibly there is a better way to damage a system this way than dd. dd has the (current) virtue of being easy to remember and clearly sufficient. And small variants of this dd command work on any Unix, not just Linux (or a particular Linux).
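For init systems where the plain reboot gets stuck, the comments below suggest two ways to skip the normal shutdown sequence: 'reboot -f' and writing 'b' to /proc/sysrq-trigger. What follows is a rough sketch that combines either one with the dd from above. The names wipe_and_reboot, run and DRY_RUN are invented for this example, and DRY_RUN defaults to 1 so that the commands are only printed. Both methods need root, and 'reboot -f' still assumes the reboot binary itself is readable after the wipe.

```shell
#!/bin/sh
# Sketch only: wipe the start of a disk, then reboot without going through
# the shutdown scripts. wipe_and_reboot, run and DRY_RUN are hypothetical names.

# run CMD...: print the command when DRY_RUN is 1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# wipe_and_reboot DISK [METHOD]
# METHOD is "force" (reboot -f) or "sysrq" (write 'b' to /proc/sysrq-trigger).
wipe_and_reboot() {
    disk=$1
    method=${2:-force}
    # Check the method first so a typo never leaves a wiped but running system.
    case $method in
        force|sysrq) ;;
        *) echo "unknown method: $method" >&2; return 2 ;;
    esac
    run dd if=/dev/zero of="$disk" bs=1024k count=32
    run sync
    if [ "$method" = force ]; then
        run reboot -f
    else
        run sh -c 'echo b > /proc/sysrq-trigger'
    fi
}
```

Running 'wipe_and_reboot /dev/sda sysrq' as-is only lists what would happen; setting DRY_RUN=0 makes it real, so that is strictly for a VM you intend to reinstall.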


Comments on this page:

By Diego at 2016-06-21 08:54:59:

Systemd shutdown seems to require to execute an additional executable for normal shutdown (/usr/lib/systemd/systemd-shutdown in my system), not sure if they have . There is an interesting description of the systemd shutdown approach here https://plus.google.com/+LennartPoetteringTheOneAndOnly/posts/LjkLwkeDiLc

By Anon at 2016-06-21 14:22:39:

Why not just echo b into /proc/sysrq-trigger after doing a blkdiscard (assuming your VM's disk supports discard)?

(PS you can skip the sync if you use oflag=direct with your dd)

By Alan at 2016-06-22 14:32:49:

Fun stuff.

Bet you still rely on reboot^W systemctl not being on the first 32M of the disk. reboot -f should work about as well on any init system.

The landmine that is wipefs -a -f was created specifically for this purpose. (Brought to you by the same package the installer probably uses to detect the filesystems). It's supposed to only erase signatures ("magic numbers"), suggesting it won't affect the running system.

Obligatory WARNING: experiments with wipefs are best performed in a VM which doesn't contain any valuable data :).

By cks at 2016-06-22 15:07:42:

Alan: unfortunately wipefs seems to want to only work on one area at a time, ie if I run 'wipefs -a -f /dev/sda' it only wipes out the partition table signature, not the root filesystem's signature. I actively want to take out as many signatures as possible and relatively fast.

Your point about reboot et al not being early on the disk is a good one. I guess I've just been lucky so far (although it's odd that I've been lucky with reboot and not with shutdown scripts and so on).

As far as 'reboot -f' and using /proc/sysrq-trigger go, they both work and would fix this issue but in practice it's easier for me to not remember special steps for this situation (as shown by the fact that I didn't get around to doing anything special on pre-16.04 Ubuntu VMs, even though I often had to manually power cycle them).

(It's possible that I'll change my ways now, since writing about this has surfaced the whole issue and exposed me to other alternatives. Sometimes I just drift along even in the face of friction until something jars me out of my path.)

By darkfader at 2016-06-26 07:59:18:

Hi,

reboot -f was only so tricky on what you call "SysVInit" systems because the implementation of /sbin/reboot had been a very bad and naive piece of shell script wrapping around the reboot. I once worked on a traffic control-ish system, where we had a RHEL-style base. It didn't take long till we wondered why the heck a Linux system can't even hard-reboot if the root disk failed. Like, what is the darn point of a reboot -f if not that there's something wrong? So why the hell would you implement reboot in a way that doesn't work if something is wrong?

We rewrote it and that's it. Even more important if you consider kexec reboots that could work around almost any issue. It's nice that systemd has this now, but I would really like to emphasize that this isn't a systemV problem but just with the incredibly bad code that Linux calls SysV Init. Much of this is actually just contribs that they got in the mid-to-end 90s and RedHat never brought them into the shape production code should be in. All the real SysV unix flavours had higher quality code; Linux distros chose to go with subpar contributions instead of polishing them. (The rewrite was less than 4 hours work. But even the month-ish time you'd need for a full rewrite and QA would have been OK for this goal.) Nonetheless, it's nice that it's improved now and that SystemD does something better. I just hate when it's smeared on SystemV when it's really an original Linux issue.

By Anon at 2016-07-02 10:20:11:

The one advantage that wipefs has when destroying partition tables is that it can take out the backup at the end of the disk in addition to the area at the start when the partition is GPT...

Written on 20 June 2016.

Last modified: Mon Jun 20 22:08:22 2016