Wandering Thoughts archives

2015-03-12

My feelings about GRUB 1 versus GRUB 2

I have been dealing with CentOS 6 recently (since I didn't give in to temptation), which has been an interesting experience. In many ways CentOS 6 is a real blast from the past, with all sorts of packages that I just don't use any more. One of those blasts from the past is that CentOS 6 still uses GRUB 1 (which is really just plain 'GRUB') instead of GRUB 2, which basically all of our other Linux systems use.

Boy was fiddling with the machine's boot configuration an eye-opening experience, in a good way. I've become so used to GRUB 2's insane level of complications and opacity that I'd forgotten how pleasant and simple GRUB 1 is by comparison. You have menu entries. They say things. Normally it boots the first menu entry. Your entire GRUB file probably fits on a screen (certainly if you have only one or two kernels and a 50-line xterm window). There is not a shell language in sight.
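To illustrate the scale involved, here's a hypothetical but complete GRUB 1 grub.conf of the kind you'd find on a CentOS 6 machine (the disk layout and kernel version are made-up examples, not our actual setup):

```
# Hypothetical minimal GRUB 1 grub.conf; devices and kernel versions are examples.
default=0          # boot the first menu entry
timeout=5

title CentOS 6 (2.6.32-504.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-504.el6.x86_64 ro root=/dev/sda2
        initrd /initramfs-2.6.32-504.el6.x86_64.img
```

That really is more or less the whole thing; add another title stanza per extra kernel and you're done.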

GRUB 2? Well, I was going to quote a bit of the start of one of our grub.cfgs, but it's way too long. You don't edit GRUB 2 config files, or even look at them; they are simultaneously verbose and opaque, generated by scripts (often scripts that leave you lies in comments). GRUB 2 has an entire Bourne shell-like programming language (and the Bourne shell is not a good programming language), for what I'm sure are reasons that make sense to the GRUB 2 maintainers. The result is the traditional new Linux pile of mud where you make any changes (very) indirectly, magic happens, everything is supposed to work, and if it doesn't you are up the creek.

In case it's not obvious, I don't particularly like GRUB 2. No doubt it helps someone, but on all of our machines it just complicates my life (and this includes my desktops and my laptop). Using the original GRUB again was a breath of fresh air, one that I'll now be sad to give up when I work on our other machines.

(I was going to say that some of the complexity of GRUB 2 grub.cfg files was partly the fault of distributions but no, this appears to be from a standard config builder that's part of GRUB 2 itself.)

PS: Even if GRUB 1 is still available and supported on our Linux distributions and our hardware, it is not worth fighting city hall on this issue (and then finding out all of the things that are undoubtedly broken despite GRUB 1 theoretically being supported). This nicely illustrates how you lose by an inch at a time and then wind up with an entire collection of sprawling mudpiles.

Grub1VsGrub2 written at 00:02:14

2015-02-25

My current issues with systemd's networkd in Fedora 21

On the whole I'm happy with my switch to systemd-networkd, which I made for reasons covered here; my networking works and my workstation boots faster. But right now there are some downsides and limitations to networkd, and in the interests of equal time for the not so great bits I feel like running down them today. I covered some initial issues in my detailed setup entry; the largest one is that there is no syntax checker for the networkd configuration files and networkd itself doesn't report anything to the console if there are problems. Beyond that we get into a collection of operational issues.

What I consider the largest issue with networkd right now is that it's a daemon (as opposed to something that runs once and stops) but there is no documented way of interacting with it while it's running. There are two or three sides to this: information, temporary manipulation, and large changes. On the information front, networkd exposes no good way to introspect its full running state, including what network devices it's doing what to, or to wait for it to complete certain operations. On the temporary manipulation front, there's no way I know of to tell networkd to temporarily take down something and then later bring it back (the equivalent of ifdown and ifup). Perhaps you're supposed to do those with manual commands outside of networkd. Finally, on more permanent changes, if you add or remove or modify a configuration file in /etc/systemd/network and want networkd to notice, well, I don't know how you do that. Perhaps you restart networkd; perhaps you shut networkd down, modify things, and restart it; perhaps you reboot your machine. Perhaps networkd notices some changes on its own.

(Okay, it turns out that there's a networkctl command that queries some information from networkd, although it's not actually documented in the Fedora 21 version of systemd. This still doesn't allow you to poke networkd to do various operations.)
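For what it's worth, the basic queries look like this ('em0' is a placeholder device name, and the subcommands are from the systemd 219 documentation; I'm assuming the undocumented Fedora 21 version behaves the same):

```
networkctl list          # one line per link, with networkd's state for it
networkctl status em0    # addresses, driver, and so on for one device
networkctl status        # overall network state as networkd sees it
```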

This points to a broader issue: there's a lot about networkd that's awfully underdocumented. I should not have to wonder about how to get networkd to notice configuration file updates; the documentation should tell me one way or another. As I write this the current systemd 219 systemd-networkd manpage is a marvel of saying very little, and there are also omissions and lack of clarity in the manpages for the actual configuration files. All told networkd's documentation is not up to the generally good systemd standards.

The next issue is that networkd has forgotten everything that systemd learned about the difference between present configuration files and active configuration files. To networkd those are one and the same; if you have a file in /etc/systemd/network, it is live. Want it not to be live? Better move it out of the directory (or edit it, although there is no explicit 'this is disabled' option you can set). Want to override something in /usr/lib/systemd/network? I'm honestly not sure how you'd do that short of removing it or editing it. This is an unfortunate step backwards.

(It's also a problem in some situations where you have multiple configurations for a particular port that you want to swap between. In Fedora's static configuration world you can have multiple ifcfg-* files, all with ONBOOT=no, and then ifup and ifdown them as you need them; there is no networkd equivalent.)

I'm not going to count networkd's lack of general support for 'wait for specific thing <X> to happen' as an issue. But it certainly would be nice if systemd-networkd-wait-online was more generic and so could be more easily reused for various things.

I do think (as mentioned) that some of networkd's device and link configuration is unnecessarily tedious and repetitive. I see why it happened, but it's the easy way instead of the best way. I hope that it can be improved and I think that it can be. In theory I think you could go as far as optionally merging .link files with .network files to cover many cases much more simply, as the sections in each file today basically don't clash with each other.

In general I certainly hope that all of these issues will get better over time, although some of them will inevitably make networkd more complicated. Systemd's network configuration support is relatively young and I'm willing to accept some rough edges under the circumstances. I even sort of accept that networkd's priority right now probably needs to be supporting more types of networking instead of improving the administration experience, even if it doesn't make me entirely happy (but I'm biased, as my needs are already met there).

(To emphasize, my networkd issues are as of the state of networkd in Fedora 21, which has systemd 216, with a little bit of peeking at the latest systemd 219 documentation. In a year the situation may look a lot different, and I sure hope it does.)

SystemdNetworkdFlaws written at 23:04:06

My Linux container temptation: running other Linuxes

We use a very important piece of (commercial) software that is only supported on Ubuntu 10.04 and RHEL/CentOS 6, not anything later (and it definitely doesn't work on Ubuntu 12.04, we've tried that). It's currently on a 10.04 machine but 10.04 is going to go out of support quite soon. The obvious alternative is to build a RHEL 6 machine, except I don't really like RHEL 6 and it would be our sole RHEL 6 host (well, CentOS 6 host, same thing). All of this has led me to a temptation, namely Linux containers. Specifically, using Linux containers to run one Linux as the host operating system (such as Ubuntu 14.04) while providing a different Linux to this software.

(In theory Linux containers are sort of overkill and you could do most or all of what we need in a chroot install of CentOS 6. In practice it's probably easier and surer to set up an actual container.)

Note that I specifically don't want something like Docker, because the Docker model of application containers doesn't fit how the software natively works; it expects an environment with cron and multiple processes and persistent log files it writes locally and so on and so forth. I just want to provide the program with the CentOS 6 environment it needs to not crash without having to install or actually administer a CentOS 6 machine more than a tiny bit.

Ubuntu 14.04 has explicit support for LXC with documentation and appears to support CentOS containers, so that's the obvious way to go for this. It's certainly a tempting idea; I could play with some interesting new technology while getting out of dealing with a Linux that I don't like.

On the other hand, is it a good idea? This is certainly a lot of work to go to in order to avoid most of running a CentOS 6 machine (I think we'd still need to watch for eg CentOS glibc security updates and apply them). Unless we make more use of containers later, it would also leave us with a unique and peculiar one-off system that'll require special steps to administer. And virtualization has failed here before.

(I'd feel more enthused about this if I thought we had additional good uses for containers, but I don't see any other ones right now.)

ContainerOtherLinuxTemptation written at 01:39:40

2015-02-21

It turns out that I routinely use some really old Linux binaries

With all of the changes that Linux has gone through over the past ten or fifteen years and with them all of the things that have stopped working, it's easy to wind up feeling that Linux doesn't have a really good story about backwards compatibility. Certainly this is true to some degree and I have various programs that have broken over this time, sometimes to my significant irritation. But at the same time it turns out that some parts of the Linux binary world have been remarkably stable.

How stable? Well, let me give you two stories.

The oldest Linux binary that I'm sure I use on a routine basis was compiled on January 29th 1998. This was almost certainly very shortly after I installed my very first Linux machine, as the source code itself is older than that (it's a standard helper for my dotfiles that dates back to at least 1995). I've faithfully carried my $HOME forward from then onwards and with it this program, which has just kept on working. It's a 32-bit program of course, dynamically linked against glibc, and strings suggests it was compiled with GCC 2.7.2.3.

(I know it was compiled on Red Hat Linux, and given the date it would have been Red Hat 5.0.)

The second impressively old binary that I still use regularly is an X-based program that was compiled on June 12th 2000. As an X program it's dynamically linked against not just glibc but a whole series of additional X libraries. All of them have kept ABI compatibility and have not changed their .so versions. In fact now that I look I see that I routinely use an even older X-based program, compiled May 5th 1999, which is actually a core part of my automation to do things with the current X selection.

(My personal binaries directory is overgrown and many of the contents are utility programs used by scripts, so I don't always remember off the top of my head which programs are still in use by scripts that I still use. I really should weed it out a lot, but that would take time and energy.)

There are plenty of Linux binaries that would not and did not survive that long. Offhand, anything written in C++ (due to repeated C++ ABI shifts), anything using termcap and (n)curses (due to .so version changes), and anything using Berkeley DB (ditto) would have been lost at some point. And of course many high level GUI toolkits are hopeless; ABI compatibility is nil over time and distributions just don't carry old versions forward in compatibility packages. But apparently basic X is just low level enough that it hasn't been impacted.

(It turns out I have binaries from 2000 that use libXaw, the old Athena widget set. Once I actually fetched the 32-bit libXaw for Fedora 21, they still ran. I guess no one's been fiddling around with Athena widgets any more than they've been meddling with the core X libraries.)

BinaryLongevity written at 01:17:58

2015-02-07

Trying to move towards Ed25519 OpenSSH host keys: a stumbling block

Now that I've upgraded to Fedora 21 on my main machines and actually have it available, I've decided to start shifting my OpenSSH usage towards considering ed25519 my primary and preferred public key authentication system (out of all of the ones that OpenSSH offers). Moving towards Ed25519 for my own keypairs was and is simple; I generated some new keypairs (one for each master machine), loaded them into ssh-agent first, and started adding them to authorized_keys on machines that have a modern enough version of OpenSSH. I expect to be using RSA keys for a long time given that eg CentOS 7 is not Ed25519 enabled, but at least this transition is basically automatic from now onwards.

(Well, I can add my Ed25519 keys to authorized_keys pretty much anywhere but they won't do me any good except on modern machines.)

But I also want to use Ed25519 host keys where machines have them available, and this turns out to be more tricky than I was expecting (at least on the version of OpenSSH on Fedora 21, which is OpenSSH 6.6.1p1). If you read the ssh_config manpage you'll soon run across a description of the HostKeyAlgorithms option, which to quote the manual 'specifies the protocol version 2 host key algorithms that the client wants to use in order of preference'. This sounds like just the thing; I could specify it explicitly in .ssh/config with the Ed25519 options first and everything would work right.

Well, sadly, no. The manual also claims:

If hostkeys are known for the destination host then [the HostKeyAlgorithms] default is modified to prefer their algorithms.

In other words: if you say that you prefer Ed25519 keys, the host has both an Ed25519 key and an RSA key available, and you already have the host's RSA key in .ssh/known_hosts, ssh should authenticate the host with the existing RSA key.

This appears to be what they call 'inoperative' if you specify an explicit HostKeyAlgorithms setting, even if this setting just shuffles the priority order. If you put the Ed25519 options first and you know an RSA key for a host that also offers Ed25519, ssh complains about a host key mismatch between the Ed25519 key being offered and the RSA key you know and does various things in reaction (including turning off X forwarding, which is fatal in my environment at work).

As far as I can tell, the only way to get this to really work is to not set HostKeyAlgorithms. Instead, you have to manually gather Ed25519 keys for anything that has them (perhaps using 'ssh-keyscan -t ed25519'), add them to your known_hosts, and purge any other host key entries for those hosts. This works right in that anything with a known Ed25519 key will be verified against the key, but it won't remember Ed25519 keys for new hosts; instead you'll probably wind up with ECDSA keys for them. You'll want to periodically look for signs that new hosts support Ed25519 and upgrade your known host keys for them.
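Concretely, the gather-and-purge dance looks something like this ('somehost' is a placeholder; note that ssh-keygen -R removes all keys known for the host, which here is exactly what we want before fetching the new one):

```
# Drop every key currently known for the host, then fetch just its Ed25519 key.
ssh-keygen -R somehost -f ~/.ssh/known_hosts
ssh-keyscan -t ed25519 somehost >> ~/.ssh/known_hosts
```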

(Future versions of OpenSSH will apparently support recording multiple host keys for a remote host at once as part of host key rotation. That won't be on all of our Linux machines any time soon, much less other things in the corner.)

I'd say that hopefully this issue is or will be fixed in future versions of OpenSSH, but I'm honestly a little worried that people will say it's actually working as intended (and at most fix the manual page). Even if they do fix it I don't expect to see the fix appearing in Fedora any time soon, given the usual release process and delays.

(It's possible that setting HostKeyAlgorithms in /etc/ssh/ssh_config will work and that only .ssh/config is special here, but I haven't tested this, it's not always feasible, and I'm not holding my breath.)

OpenSSHEd25519HostKeys written at 01:36:55

2015-01-25

The long term problem with ZFS on Linux is its license

Since I've recently praised ZFS on Linux as your only real choice today for an advanced filesystem, I need to bring up the long term downside because, awkwardly, I do believe that btrfs is probably going to be the best pragmatic option in the long term and is going to see wider adoption once it works reliably.

The core of the problem is ZFS's license, which I've written about before. What I didn't write about back then because I didn't know enough at the time was the full effects on ZoL of not being included in distributions. The big effect is it will probably never be easy or supported to make your root filesystem a ZFS pool. Unless distributions restructure their installers (and they have no reason to do so), a ZFS root filesystem needs first class support in the installer and it will almost certainly be rather difficult (both politically and otherwise) to add this. This means no installer-created filesystem can be a ZFS one, and the root filesystem has to be created in the installer.

(Okay, you can shuffle around your root filesystem after the basic install is done. But that's a big pain.)

In turn this means that ZFS on Linux is probably always going to be a thing for experts. To use it you need to leave disk space untouched in the installer (or add disk space later), then at least fetch the ZoL packages from an additional repository and have them auto-install on your kernel. And of course you have to live with a certain amount of lack of integration in all of the bits (especially if you go out of your way to use a ZFS root filesystem).

(And as I've seen there are issues with mixing ZFS and non-ZFS filesystems. I suspect that these issues will turn out to be relatively difficult to fix, if they can be at all. Certainly things seem much more likely to work well if all of your filesystems are ZFS filesystems.)

PS: Note that in general having non-GPLv2, non-bundled kernel modules is not an obstacle to widespread adoption if people want what you have to offer. A large number of people have installed binary modules for their graphics cards, for one glaring example. But I don't think that fetching these modules has been integrated into installers despite how popular they are.

(Also, I may be wrong here. If ZFS becomes sufficiently popular, distributions might at least make it easy for people to make third party augmented installers that have support for ZFS. Note that ZFS support in an installer isn't as simple as the choice of another filesystem; ZFS pools are set up quite differently from normal filesystems and good ZFS root pool support has to override things like setup for software RAID mirroring.)

ZFSOnLinuxRootFSProblem written at 04:20:46

2015-01-23

A problem with gnome-terminal in Fedora 21, and tracking it down

Today I discovered that Fedora 21 subtly broke some part of my environment to the extent that gnome-terminal refuses to start. More than that, it refuses to start with a completely obscure error message:

; gnome-terminal
Error constructing proxy for org.gnome.Terminal:/org/gnome/Terminal/Factory0: Error calling StartServiceByName for org.gnome.Terminal: GDBus.Error:org.freedesktop.DBus.Error.Spawn.ChildExited: Process org.gnome.Terminal exited with status 8

If you're here searching for the cause of this error message, let me translate it: what it really means is that your session's dbus-daemon could not start /usr/libexec/gnome-terminal-server when gnome-terminal asked it to. In many cases, this is because your system's environment had not initialized $LC_CTYPE or $LANG to some UTF-8 locale at the time that your session was being set up (even if one of these environment variables gets set later, by the time you're running gnome-terminal). In the modern world, an increasing number of Gnome bits absolutely insist on being in a UTF-8 locale and fail hard if they aren't.
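If you want to check in advance whether a session will hit this, the deciding question is whether the character-type locale's charmap is UTF-8. A small sketch (check_charmap is my own helper for illustration, not anything from Gnome):

```shell
# Report whether a given charmap would satisfy gnome-terminal-server.
check_charmap() {
    case "$1" in
        UTF-8) echo "ok" ;;
        *)     echo "will fail: $1" ;;
    esac
}

check_charmap "$(locale charmap)"   # whatever your current session has
check_charmap "ANSI_X3.4-1968"      # the C/POSIX default that bit me
```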

Some of you may be going 'what?' here. What you suspect is correct; the modern Gnome 3 'gnome-terminal' program is basically a cover script rather than an actual terminal emulator. Instead of opening up a terminal window itself, it exists to talk over DBus to a master gnome-terminal-server process (which will theoretically get started on demand). It is the g-t-s process that is the actual terminal emulator, creates the windows, starts the shells, and all. And yes, one process handles all of your gnome-terminal windows; if that process ever hits a bug (perhaps because of something happening in one window) and dies, all of them die. Let's hope g-t-s doesn't have any serious bugs.

To find the cause of this issue, well, if I'm being honest a bunch of this was found with an Internet search of the error message. This didn't turn up my exact problem but it did turn up people reporting locale problems and also a mention of gnome-terminal-server, which I hadn't known about before. For actual testing and verification I did several things:

  • first I used strace on gnome-terminal itself, which told me nothing useful.

  • I discovered that starting gnome-terminal-server by hand before running gnome-terminal made everything work.

  • I used dbus-monitor --session to watch DBus messages when I tried to start gnome-terminal. This didn't really tell me anything that I couldn't have seen from the error message, but it did verify that there was really a DBus message being sent.

  • I found the dbus-daemon process that was handling my session DBus and used 'strace -f -p ...' on it while I ran gnome-terminal. This eventually wound up with it starting gnome-terminal-server and g-t-s exiting after writing a message to standard error. Unfortunately the default strace settings truncated the message, so I reran strace while adding '-e write=2' to completely dump all messages to standard error. This got me the helpful error message from g-t-s:
    Non UTF-8 locale (ANSI_X3.4-1968) is not supported!

    (If you're wondering if dbus-daemon sends standard error from either itself or processes that it starts to somewhere useful, ha ha no, sorry, we're all out of luck. As far as I can tell it specifically sends standard error to /dev/null.)

  • I dumped the environment of the dbus-daemon process with 'tr '\0' '\n' </proc/<PID>/environ | less' and inspected what environment variables it had set. This showed that it had been started without my usual $LC_CTYPE setting (cf).

With this in hand I could manually reproduce the problem by trying to start gnome-terminal-server with $LC_CTYPE unset, and then I could fix up my X startup scripts to set $LC_CTYPE before they ran dbus-launch.
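The /proc/<PID>/environ trick is worth remembering in general: the file is a NUL-separated list of NAME=value strings, and tr makes it line-oriented. A self-contained version using a sample file instead of a live process (the variable values are made up):

```shell
# /proc/<PID>/environ separates entries with NUL bytes; simulate one here.
printf 'LANG=C\0DISPLAY=:0\0' > /tmp/fake_environ

# The same conversion you'd run against a real /proc/<PID>/environ.
tr '\0' '\n' < /tmp/fake_environ
```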

(This entry is already long enough so I am going to skip my usual rant about Gnome and especially Gnome 3 making problems like this very difficult for even experienced system administrators to debug because there are now so many opaque moving parts to even running Gnome programs standalone, much less in a full Gnome environment. How is anyone normal supposed to debug this when gnome-terminal can't even be bothered to give you a useful error summary in addition to the detailed error report from DBus?)

GnomeTerminalUTF8Required written at 01:54:19

2015-01-22

How to set up static networking with systemd-networkd, or at least how I did

I recently switched my Fedora 21 office workstation from Fedora's old /etc/init.d/network init script based method of network setup to using the (relatively new) systemd network setup functionality, for reasons that I covered yesterday. The systemd documentation is a little bit scant and not complete, so in the process I accumulated some notes that I'm going to write down.

First, I'm going to assume that you're having networkd take over everything from the ground up, possibly including giving your physical network devices stable names. If you were previously doing this through udev, you'll need to comment out bits of /etc/udev/rules.d/70-persistent-net.rules (or wherever your system put it).

To configure your networking you need to set up two files for each network connection. The first file will describe the underlying device, using .link files for physical devices and .netdev files for VLANs, bridges, and so on. For physical links, you can use various things to identify the device (I use just the MAC address, which matches what I was doing in udev) and then set its name with 'Name=' in the '[Link]' section. Just to make you a bit confused, the VLANs set up on a physical device are not configured in its .link file.

The second file describes the actual networking on the device (physical or virtual), including virtual devices associated with it; this is done with .network files. Again you can use various things to identify which device you want to operate on; I used the name of the device (a [Match] section with Name=<whatever>). Most of the setup will be done in the [Network] section, including telling networkd what VLANs to create. If you want IP aliases on a given interface, specify multiple addresses. Although it's not documented, experimentally the last address specified becomes the primary (default) address of the interface, ie the default source address for traffic going out that interface.

(This is unfortunately reversed from what I expected, which was that the first address specified would be the primary. Hopefully the systemd people will not change this behavior but document it, and then provide a way of specifying primary versus secondary addresses.)

If you're setting up IP aliases for an interface, it's important to know that ifconfig will now be misleading. In the old approach, alias interfaces got created (eg 'em0:0') and showed the alias IP. In the networkd world those interfaces are not created and you need to turn to 'ip addr list' in order to see your IP aliases. Not knowing this can be very alarming, since in ifconfig it looks like your aliases disappeared. In general you can expect networkd to give you somewhat different ifconfig and ip output because it does stuff somewhat differently.

For setting up VLANs, the VLAN= name in your physical device's .network file is paired up with the [NetDev] Name= setting in your VLAN's .netdev file. You then create another .network file with a [Match] Name= setting of your VLAN's name to configure the VLAN interface's IP address and so on. Unfortunately this is a bit tedious, since your .netdev VLAN file basically exists to set a single value (the [VLAN] Id= setting); it would be more convenient (although less pure) if you could just put that information into a new [VLAN] section in the .network file that specified Name and Id together.
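To make the pairing concrete, here is a hypothetical set of files for an em0 with a VLAN 151 on it, modeled on my setup but with all of the specifics (MAC address, IP addresses) made up:

```
# /etc/systemd/network/em0.link  -- name the physical device
[Match]
MACAddress=00:16:3e:aa:bb:cc

[Link]
Name=em0

# /etc/systemd/network/em0.network  -- IP setup, plus what VLANs to create
[Match]
Name=em0

[Network]
Address=192.168.1.10/24
Gateway=192.168.1.1
VLAN=em0.151

# /etc/systemd/network/em0.151.netdev  -- the VLAN device itself
[NetDev]
Name=em0.151
Kind=vlan

[VLAN]
Id=151

# /etc/systemd/network/em0.151.network  -- IP setup for the VLAN interface
[Match]
Name=em0.151

[Network]
Address=10.1.51.10/24
```

As you can see, em0.151.netdev exists almost entirely to carry that one Id= line.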

If you're uniquely specifying physical devices in .link files (eg with a MAC address for all of them, with no wildcards) and devices in .network files, I believe that the filenames of all of these files are arbitrary. I chose to give my VLANs filenames of eg 'em0.151.netdev' (where em0.151 is the interface name) just in case. As you can see, there seems to be relatively little constraint on the interface names and I was able to match the names required by my old Fedora ifcfg-* setup so that I didn't have to change any of my scripts et al.

You don't need to define a lo interface; networkd will set one up automatically and do the right thing.

Once you have everything set up in /etc/systemd/network, you need to enable this by (in my case) 'chkconfig --del network; systemctl enable systemd-networkd' and then rebooting. If you have systemd .service units that want to wait for networking to be up, you also want to enable the systemd-networkd-wait-online.service unit, which does what it says in its manpage, and then make your units depend on it in the usual way. Note that this is not quite the same as setting your SysV init script ordering so that your init scripts came after network, since this service waits for at least one interface to be plugged in to something (unfortunately there's no option to override this). While systemd still creates the 'sys-subsystem-net-devices-<name>.device' pseudo-devices, they will now appear faster and with less configured than they did with the old init scripts.

(I used to wait for the appearance of the em0.151 device as a sign that the underlying em0 device had been fully configured with IP addresses attached and so on. This is no longer the case in the networkd world, so this hack broke on me.)
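A minimal sketch of making a unit wait this way (the service name and command here are placeholders, not anything real):

```
# /etc/systemd/system/myapp.service (hypothetical)
[Unit]
Description=Something that needs the network configured first
Wants=systemd-networkd-wait-online.service
After=systemd-networkd-wait-online.service

[Service]
ExecStart=/usr/local/sbin/myapp

[Install]
WantedBy=multi-user.target
```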

In another unfortunate thing, there's no syntax checker for networkd files and it is somewhat hard to get warning messages. networkd will log complaints to the systemd journal, but it won't print them out on the console during boot or anything (at least not that I saw). However I believe that you can start or restart it while the system is live and then see if things complain.

(Why yes I did make a mistake the first time around. It turns out that the Label= setting in the [Address] section of .network files is not for a description of what the address is and does not like 'labels' that have spaces or other funny games in them.)

On the whole, systemd-networkd doesn't cover all of the cases but then neither did Fedora ifcfg-* files. I was able to transform all of my rather complex ifcfg-* setup into networkd control files with relatively little effort and hassle and the result came very close to working the first time. My networkd config files have a few more lines than my ifcfg-* files, but on the other hand I feel that I fully understand my networkd files and will in the future even after my current exposure to them fades.

(My ifcfg-* files also contain a certain amount of black magic and superstition, which I'm happy to not be carrying forward, and at least some settings that turn out to be mistakes now that I've actually looked them up.)

SystemdNetworkdSetup written at 00:43:05

2015-01-21

Why I'm switching to systemd's networkd stuff for my networking

Today I gave in to temptation and switched my Fedora 21 office workstation from doing networking through Fedora's old /etc/rc.d/init.d/network init script and its /etc/sysconfig/network-scripts/ifcfg-* system to using systemd-networkd. Before I write about what you have to set up to do this, I want to ramble a bit about why I even thought about it, much less went ahead.

The proximate cause is that I was hoping to get a faster system boot. At some point in the past few Fedora versions, bringing up my machine's networking through the network init script became the single slowest part of booting by a large margin, taking on the order of 20 to 30 seconds (and stalling a number of downstream startup jobs). I had no idea just what was taking so long, but I hoped that by switching to something else I could improve the situation.

The deeper cause is that Fedora's old network init script system is a serious mess. All of the work is done by a massive set of intricate shell scripts that use relatively undocumented environment variables set in ifcfg-* files (and the naming of the files themselves). Given the pile of scripts involved, it's absolutely no surprise to me that it takes forever to grind through processing all of my setup. In general the whole thing has all of the baroque charm of the evolved forms of System V init; the best thing I can say about it is that it generally works and you can build relatively sophisticated static setups with it.

(While there is some documentation for what variables can be set hiding in /usr/share/doc/initscripts/sysconfig.txt, it's not complete and for some things you get to decode the shell scripts yourself.)

What systemd's networkd stuff brings to the table for this is the same thing that systemd brings to the table relative to SysV init scripts: you have a well documented way of specifying what you want, which is then directly handled instead of being run through many, many layers of shell scripts. As an additional benefit it gets handled faster and perhaps better.

(I firmly believe that a mess of fragile shell scripts that source your ifcfg-* files and do magic things is not the right architecture. Robust handling of configuration files requires real parsing and so on, not shell script hackery. I don't really care who takes care of this (I would be just as happy with a completely separate system) and I will say straight up that systemd-networkd is not my favorite implementation of this idea and suffers from various flaws. But I like it more than the other options.)

In theory NetworkManager might fill this ecological niche already. In practice NetworkManager has never felt like something oriented towards my environment; instead it feels like it targets machines and people who are going to do all of this through GUIs, and I've run into some issues with it. In particular I'm pretty sure that I'd struggle quite a bit to find documentation on how to set up a NM configuration (from the command line or in files) that duplicates my current network setup; with systemd, it was all in the manual pages. There is serious (re)assurance value in seeing what you want to configure clearly documented.
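(To give a concrete taste of what's in those manual pages, a basic static setup in systemd-networkd is a small ini-style file dropped into /etc/systemd/network/. The interface name and addresses below are invented examples, not my actual setup; systemd.network(5) documents the full set of options.)

```
# /etc/systemd/network/50-static.network -- hypothetical example
[Match]
Name=enp0s25

[Network]
Address=192.0.2.10/24
Gateway=192.0.2.1
DNS=192.0.2.1
```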

(My longer range reason for liking systemd's move here is that it may bring more uniformity to how you configure networking setups across various Linux flavours.)

SystemdNetworkdWhy written at 02:08:42

2015-01-16

Using systemd-run to limit something's RAM consumption on the fly

A year ago I wrote about using cgroups to limit something's RAM consumption, for limiting the resources that make'ing Firefox could use when I did it. At the time my approach with an explicitly configured cgroup and the direct use of cgexec was the only way to do it on my machines; although systemd has facilities for this in general, the systemd version I had couldn't apply them to ad hoc user-run programs. Well, I've upgraded to Fedora 21 and that's now changed, so here's a quick guide to doing it the systemd way.

The core command is systemd-run, which we use to start a command with various limits set. The basic command is:

systemd-run --user --scope -p LIM1=VAL1 -p LIM2=VAL2 [...] CMD ARG [...]

The --user makes things run as ourselves with no special privileges, and is necessary to get things to run. The --scope basically means 'run this as a subcommand', although systemd considers it a named object while it's running. Systemd-run will make up a name for it (and report the name when it starts your command), or you can use --unit NAME to give it your own name.

The limits you can set are covered in systemd.resource-control. Since systemd is just using cgroups, the limits you can set up are just the cgroup limits (and the documentation will tell you exactly what the mapping is, if you need it). Conveniently, systemd-run allows you to specify memory limits in GB (or MB), not just bytes. The specific limits I set up in the original entry give us a final command of:

systemd-run --user --scope -p MemoryLimit=3G -p CPUShares=512 -p BlockIOWeight=500 make

(Here I'm once again running make as my example command.)

You can inspect the parameters of your new scope with 'systemctl show --user <scope>', and change them on the fly with 'systemctl set-property --user <scope> LIM=VAL'. I'll leave potential uses of this up to your imagination. systemd-cgls can be used to show all of the scopes and find any particular one that's running this way (and show its processes).

(It would be nice if systemd-cgtop gave you a nice rundown of what resources were getting used by your confined scope, but as far as I can tell it doesn't. Maybe I'm missing a magic trick here.)

Now, there's a subtle semantic difference between what we're doing here and what I did in the original entry. With cgexec, everything that ran in our confine cgroup shared the same limit even if they were started completely separately. With systemd-run, separately started commands have separate limits; if you start two makes in parallel, each of them can use 3 GB of RAM. I'm not sure yet how you fix this in the official systemd way, but I think it involves defining a slice and then attaching our scopes to it.

(On the other hand, this separation of limits for separate commands may be something you consider a feature.)
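For what it's worth, my guess at the slice-based approach (untested; I'm going from systemd.slice(5) and systemd-run's --slice option) is a user slice unit that carries the limit, with each scope attached to it:

```
# ~/.config/systemd/user/confine.slice -- untested sketch
[Slice]
MemoryLimit=3G
```

Then each command would be started with 'systemd-run --user --scope --slice=confine make' and all of them should share the slice's overall 3 GB limit.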

Sidebar: systemd-run versus cgexec et al

In Fedora 20 and Fedora 21, cgexec works okay for me but I found that systemd would periodically clear out my custom confine cgroup and I'd have to do 'systemctl restart cgconfig' to recreate it (generally anything that caused systemd to reload itself would do this, including yum package updates that poked systemd). Now that the Fedora 21 version of systemd-run supports -p, using it and doing things the systemd way is just easier.

(I wrap the entire invocation up in a script, of course.)
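(A minimal version of such a wrapper might be a shell function that assembles the systemd-run command line; the limits are the ones from this entry, and the 'lmake_cmdline' name is my invention. Here it prints the command instead of exec'ing it, which makes it easy to inspect; a real script would just exec the result.)

```shell
#!/bin/sh
# lmake_cmdline: print the systemd-run invocation for a resource-limited
# make, with the limits collected in one easy-to-edit place. Any extra
# arguments (such as -j4) are passed through to make.
lmake_cmdline() {
    printf '%s ' systemd-run --user --scope \
        -p MemoryLimit=3G -p CPUShares=512 -p BlockIOWeight=500 \
        make "$@"
    printf '\n'
}

# Show the command line that would be run:
lmake_cmdline -j4
```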

SystemdForMemoryLimiting written at 02:00:50

