2014-08-13
Bind mounts with systemd and non-fstab filesystems
Under normal circumstances the way you deal with Linux bind mounts on a
systemd-based system is the same as always: you put them in /etc/fstab and
systemd makes everything work just like normal. If you can deal with your
bind mounts this way, I recommend that you do it and keep your life simple.
But sometimes life is not simple.
Suppose, not entirely hypothetically, that you are dealing with base
filesystems that aren't represented in /etc/fstab for one reason or another;
instead they appear through other mechanisms. For example, perhaps they
appear when you import a ZFS pool. You want to use these filesystems as the
source of bind mounts.
The first thing that doesn't work is leaving your bind mounts in /etc/fstab.
There is no way to tell systemd to not create them until something else
happens (eg your zfs-mount.service systemd unit finishes or their source
directory appears), so this is basically never going to do the right thing.
If you get bind mounts at all they are almost certainly not going to be
bound to what you want.
At this point you might be tempted to think 'oh, systemd makes /etc/fstab
mounts into magic <name>.mount systemd units, I can just put files in
/etc/systemd/system to add some extra dependencies to those magic units'.
Sadly this doesn't work; the moment you have a real <name>.mount unit file
it entirely replaces the information from /etc/fstab and systemd will tell
you that your <name>.mount file is invalid because it doesn't specify what
to mount.
In short, you need real .mount units for your bind mounts. You also need to
force the ordering, and here again we run into something that would be nice
but doesn't work. If you run 'systemctl list-units -t mount', you will see
that there are units for all of your additional non-fstab mounts.
It's tempting to make your bind mount unit depend on an appropriate mount
unit for its source filesystem, eg if you have a bind mount from
/archive/something you'd have it depend on archive.mount.
Unfortunately this doesn't work reliably because systemd doesn't actually
know about these synthetic mount units before the mount appears. Instead you
can only depend on whatever .service unit actually does the mounting, such
as zfs-mount.service.
(In an extreme situation you could create a service unit that just
used a script to wait for the mounts to come up. With a Type=oneshot
service unit, systemd won't consider the service successful until
the script exits.)
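As a rough sketch of that extreme option, a waiting service might look something like the following. The unit name wait-local-var.service, the one-second polling loop and the 60-second limit are all made up for illustration, and it assumes that /local/var is the mount point of one of the ZFS filesystems and that the mountpoint command from util-linux is available.

```ini
# /etc/systemd/system/wait-local-var.service  (hypothetical name)

[Unit]
Description=Wait for /local/var to be mounted
After=zfs-mount.service

[Service]
Type=oneshot
# Keep the unit 'active' after the loop exits so dependents see it as up.
RemainAfterExit=yes
# Poll once a second until something is mounted on /local/var.
ExecStart=/bin/sh -c 'until mountpoint -q /local/var; do sleep 1; done'
# Type=oneshot services have no start timeout by default, so set one
# explicitly to avoid waiting forever on a missing pool.
TimeoutStartSec=60
```

A bind mount unit would then list wait-local-var.service in its After= and Requires= settings. If the wait times out the service fails, and the Requires= keeps the bind mount from being attempted.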
The maximally paranoid set of dependencies and guards is something like this:
    [Unit]
    After=zfs-mount.service
    Requires=zfs-mount.service
    RequiresMountsFor=/var
    ConditionPathIsDirectory=/local/var/local
(This is for a bind mount from /local/var/local to /var/local.)
We can't use a RequiresMountsFor on /local/var, because as far as systemd is
concerned it's on the root filesystem and so the dependency would be
satisfied almost immediately. I don't think the Condition will cause systemd
to wait for /local/var/local to appear; it just stops the bind mount from
being attempted if the ZFS mounts happened but they didn't manage to mount a
/local/var for some reason (eg a broken or missing ZFS pool).
(Since my /var is actually on the root filesystem, the RequiresMountsFor is
likely gilding the lily; I don't think there's any situation where this unit
can even be considered before the root filesystem is mounted. But if it's a
separate filesystem you definitely want this and so it's probably a good
habit in general.)
I haven't tested using local-var.mount in just the Requires here but I'd
expect it to fail for the same reason that it definitely doesn't work
reliably in an After. This is kind of a pity, but there you go and the
Condition is probably good enough.
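As a sketch of how these pieces might fit together for the /local/var/local example, a complete unit could look like the following. The [Mount] section is the usual bind mount form, while the Description text and the [Install] section are assumptions rather than anything tested here. The file has to be called var-local.mount, because systemd requires a mount unit's name to match its mount point.

```ini
# /etc/systemd/system/var-local.mount
# The name is derived from Where= (/var/local -> var-local.mount).

[Unit]
# Description text is illustrative.
Description=Bind mount /local/var/local on /var/local
# Order after, and require, the service that mounts the ZFS filesystems.
After=zfs-mount.service
Requires=zfs-mount.service
# Make sure the filesystem holding the target directory is mounted.
RequiresMountsFor=/var
# Skip the mount if ZFS did not actually provide the source directory.
ConditionPathIsDirectory=/local/var/local

[Mount]
What=/local/var/local
Where=/var/local
Type=none
Options=bind

[Install]
# Assumption: treat this as an ordinary local filesystem mount.
WantedBy=local-fs.target
```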
(If you don't want to make a bunch of .mount files, one for each mount, you
could make a single .service unit that has all of the necessary dependencies
and runs appropriate commands to do the bind mounting (either directly or by
running a script). If you do this, don't forget to have ExecStop stuff to
also do the unmounts.)
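A minimal sketch of that single-service approach, using the same /local/var/local example, might be something like this. The unit name local-bind-mounts.service and the choice of multi-user.target are assumptions for illustration. The RemainAfterExit=yes matters here; without it systemd considers a oneshot service finished as soon as the mount command exits and would run the ExecStop commands right away.

```ini
# /etc/systemd/system/local-bind-mounts.service  (hypothetical name)

[Unit]
Description=Bind mounts from ZFS filesystems
After=zfs-mount.service
Requires=zfs-mount.service
ConditionPathIsDirectory=/local/var/local

[Service]
Type=oneshot
# Stay 'active' after the mounts are done so ExecStop runs only when the
# service is stopped (eg at shutdown).
RemainAfterExit=yes
# A oneshot service may have several ExecStart= lines, one per bind mount.
ExecStart=/bin/mount --bind /local/var/local /var/local
# Undo each bind mount when the service is stopped.
ExecStop=/bin/umount /var/local

[Install]
# Assumption: anything that needs these mounts earlier in boot would need
# different dependencies.
WantedBy=multi-user.target
```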
Sidebar: the likely non-masochistic way to do this for ZFS on Linux
If I was less stubborn, I would have set all of my ZFS filesystems to have
'mountpoint=legacy' and then explicitly mentioned and mounted them in
/etc/fstab. Assuming that it worked (ie that systemd didn't try to do the
mounts before the ZFS pool came up), this would have let me keep the bind
mounts in fstab too and avoided this whole mess.
How you create a systemd .mount file for bind mounts
One of the types of units that systemd supports is mount units (see 'man
systemd.mount'). Normally you set up all your mounts with /etc/fstab entries
and you don't have to think about them, but under some specialized
circumstances you can wind up needing to create real .mount unit files for
some mounts.
How to specify most filesystems is pretty straightforward, but it's
not quite clear how you specify Linux bind mounts.
Since I was just wrestling repeatedly with this today, here is what you need
to put in a systemd .mount file to get a bind mount:
    [Mount]
    What=/some/old/dir
    Where=/the/new/dir
    Type=none
    Options=bind
This corresponds to the mount command 'mount --bind /some/old/dir
/the/new/dir' and an /etc/fstab line of '/some/old/dir /the/new/dir none
bind'. Note that the type of the mount is none, not bind as you might
expect. This works because current versions of mount will accept arguments
of '-t none -o bind' as meaning 'do a bind mount'.
(I don't know if you can usefully add extra options to the Options setting
or if you'd need an actual script if you need to, eg, make a bind mountpoint
read-only. If you can do it in /etc/fstab you can probably do it here.)
A fully functioning .mount unit will generally have other stuff as well.
What I've wound up using on Fedora 20 (mostly copied from the standard
tmp.mount) is:
    [Unit]
    DefaultDependencies=no
    Conflicts=umount.target
    Before=local-fs.target umount.target

    [Mount]
    [[ .... whatever you need ...]]

    [Install]
    WantedBy=local-fs.target
Add additional dependencies, documentation, and so on as you need or want
them. For what it's worth, I've also had bind mount units work without the
three [Unit] bits I have here.
Note that this assumes a 'local' filesystem, not a network one. If you're
dealing with a network filesystem or something depending on one, you'll need
to change bits of the targets (systemd documentation suggests changing them
to remote-fs.target).
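Putting the [Mount] section and the surrounding boilerplate together, a complete unit for a local bind mount might look something like the following sketch. The paths are the placeholder ones used above and the Description line is an illustrative addition. One thing the fragments don't show is the file name: systemd insists that a mount unit be named after its Where= path, so a mount on /the/new/dir has to live in a file called the-new-dir.mount.

```ini
# /etc/systemd/system/the-new-dir.mount
# The name is derived from Where= (/the/new/dir -> the-new-dir.mount).

[Unit]
# Description text is illustrative.
Description=Bind mount of /some/old/dir on /the/new/dir
DefaultDependencies=no
Conflicts=umount.target
Before=local-fs.target umount.target

[Mount]
What=/some/old/dir
Where=/the/new/dir
Type=none
Options=bind

[Install]
WantedBy=local-fs.target
```

After creating the file, 'systemctl daemon-reload' followed by 'systemctl enable the-new-dir.mount' should hook it into boot, and 'systemctl start the-new-dir.mount' does the mount immediately.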