Wandering Thoughts archives


How to set up static networking with systemd-networkd, or at least how I did

I recently switched my Fedora 21 office workstation from Fedora's old /etc/init.d/network init script based method of network setup to using the (relatively new) systemd network setup functionality, for reasons that I covered yesterday. The systemd documentation is a little bit scant and not complete, so in the process I accumulated some notes that I'm going to write down.

First, I'm going to assume that you're having networkd take over everything from the ground up, possibly including giving your physical network devices stable names. If you were previously doing this through udev, you'll need to comment out bits of /etc/udev/rules.d/70-persistent-net.rules (or wherever your system put it).

To configure your networking you need to set up two files for each network connection. The first file will describe the underlying device, using .link files for physical devices and .netdev files for VLANs, bridges, and so on. For physical links, you can use various things to identify the device (I use just the MAC address, which matches what I was doing in udev) and then set its name with 'Name=' in the '[Link]' section. Just to make you a bit confused, the VLANs set up on a physical device are not configured in its .link file.
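A minimal sketch of such a .link file might look like the following. The filename and the MAC address are made-up placeholders, and em0 is simply the interface name used as an example later in this entry.

```ini
# /etc/systemd/network/10-em0.link  (hypothetical filename)
# Match the physical device by MAC address (placeholder value) ...
[Match]
MACAddress=00:11:22:33:44:55

# ... and give it a stable name.
[Link]
Name=em0
```

Comments go on their own lines here, since these files do not support trailing comments after a setting.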

The second file describes the actual networking on the device (physical or virtual), including virtual devices associated with it; this is done with .network files. Again you can use various things to identify which device you want to operate on; I used the name of the device (a [Match] section with Name=<whatever>). Most of the setup will be done in the [Network] section, including telling networkd what VLANs to create. If you want IP aliases on a given interface, specify multiple addresses. Although it's not documented, experimentally the last address specified becomes the primary (default) address of the interface, ie the default source address for traffic going out that interface.

(This is unfortunately reversed from what I expected, which was that the first address specified would be the primary. Hopefully the systemd people will not change this behavior but document it, and then provide a way of specifying primary versus secondary addresses.)
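As an illustration, a .network file for a physical interface with one IP alias and one VLAN could be sketched like this. The filename and all addresses are placeholders (taken from the documentation-only 192.0.2.0/24 range), and the ordering of the Address= lines follows the experimentally observed behavior described above, which may not hold for other networkd versions.

```ini
# /etc/systemd/network/em0.network  (hypothetical filename)
[Match]
Name=em0

[Network]
# Ask networkd to create this VLAN on top of em0.
VLAN=em0.151
# The alias comes first ...
Address=192.0.2.11/24
# ... and the intended primary address comes last.
Address=192.0.2.10/24
Gateway=192.0.2.1
```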

If you're setting up IP aliases for an interface, it's important to know that ifconfig will now be misleading. In the old approach, alias interfaces got created (eg 'em0:0') and showed the alias IP. In the networkd world those interfaces are not created and you need to turn to 'ip addr list' in order to see your IP aliases. Not knowing this can be very alarming, since in ifconfig it looks like your aliases disappeared. In general you can expect networkd to give you somewhat different ifconfig and ip output because it does stuff somewhat differently.

For setting up VLANs, the VLAN= name in your physical device's .network file is paired up with the [NetDev] Name= setting in your VLAN's .netdev file. You then create another .network file with a [Match] Name= setting of your VLAN's name to configure the VLAN interface's IP address and so on. Unfortunately this is a bit tedious, since your .netdev VLAN file basically exists to set a single value (the [VLAN] Id= setting); it would be more convenient (although less pure) if you could just put that information into a new [VLAN] section in the .network file that specified Name and Id together.
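Put together, the VLAN side might be sketched as the following pair of files. The filenames and the address are assumptions for illustration; the interface name em0.151 and VLAN id 151 are just the example used in this entry.

```ini
# /etc/systemd/network/em0.151.netdev  (hypothetical filename)
# Name= here must match the VLAN= setting in em0's .network file.
[NetDev]
Name=em0.151
Kind=vlan

[VLAN]
Id=151
```

```ini
# /etc/systemd/network/em0.151.network  (hypothetical filename)
[Match]
Name=em0.151

[Network]
# Placeholder address from a documentation-only range.
Address=198.51.100.10/24
```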

If you're uniquely specifying physical devices in .link files (eg with a MAC address for all of them, with no wildcards) and devices in .network files, I believe that the filenames of all of these files are arbitrary. I chose to give my VLANs filenames of eg 'em0.151.netdev' (where em0.151 is the interface name) just in case. As you can see, there seems to be relatively little constraint on the interface names and I was able to match the names required by my old Fedora ifcfg-* setup so that I didn't have to change any of my scripts et al.

You don't need to define a lo interface; networkd will set one up automatically and do the right thing.

Once you have everything set up in /etc/systemd/network, you need to enable this by (in my case) 'chkconfig --del network; systemctl enable systemd-networkd' and then rebooting. If you have systemd .service units that want to wait for networking to be up, you also want to enable the systemd-networkd-wait-online.service unit, which does what it says in its manpage, and then make your units depend on it in the usual way. Note that this is not quite the same as setting your SysV init script ordering so that your init scripts came after network, since this service waits for at least one interface to be plugged in to something (unfortunately there's no option to override this). While systemd still creates the 'sys-subsystem-net-devices-<name>.device' pseudo-devices, they will now appear faster and with less configured than they did with the old init scripts.

(I used to wait for the appearance of the em0.151 device as a sign that the underlying em0 device had been fully configured with IP addresses attached and so on. This is no longer the case in the networkd world, so this hack broke on me.)
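For the dependency on systemd-networkd-wait-online.service mentioned above, one common pattern is to order a unit after network-online.target, which the wait-online service is pulled in by once it is enabled. This is a sketch rather than a recipe; the unit name, description, and program path are hypothetical.

```ini
# /etc/systemd/system/myservice.service  (hypothetical unit)
[Unit]
Description=Example service that needs the network to be up
# Pull in and wait for network-online.target.
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/usr/local/bin/myservice
```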

In another unfortunate thing, there's no syntax checker for networkd files and it is somewhat hard to get warning messages. networkd will log complaints to the systemd journal, but it won't print them out on the console during boot or anything (at least not that I saw). However I believe that you can start or restart it while the system is live and then see if things complain.

(Why yes I did make a mistake the first time around. It turns out that the Label= setting in the [Address] section of .network files is not for a description of what the address is and does not like 'labels' that have spaces or other funny games in them.)

On the whole, systemd-networkd doesn't cover all of the cases but then neither did Fedora ifcfg- files. I was able to transform all of my rather complex ifcfg- setup into networkd control files with relatively little effort and hassle and the result came very close to working the first time. My networkd config files have a few more lines than my ifcfg-* files, but on the other hand I feel that I fully understand my networkd files and will in the future even after my current exposure to them fades.

(My ifcfg-* files also contain a certain amount of black magic and superstition, which I'm happy to not be carrying forward, and at least some settings that turn out to be mistakes now that I've actually looked them up.)

linux/SystemdNetworkdSetup written at 00:43:05


Why I'm switching to systemd's networkd stuff for my networking

Today I gave in to temptation and switched my Fedora 21 office workstation from doing networking through Fedora's old /etc/rc.d/init.d/network init script and its /etc/sysconfig/network-scripts/ifcfg-* system to using systemd-networkd. Before I write about what you have to set up to do this, I want to ramble a bit about why I even thought about it, much less went ahead.

The proximate cause is that I was hoping to get a faster system boot. At some point in the past few Fedora versions, bringing up my machine's networking through the network init script became the single slowest part of booting by a large margin, taking on the order of 20 to 30 seconds (and stalling a number of downstream startup jobs). I had no idea just what was taking so long, but I hoped that by switching to something else I could improve the situation.

The deeper cause is that Fedora's old network init script system is a serious mess. All of the work is done by a massive set of intricate shell scripts that use relatively undocumented environment variables set in ifcfg-* files (and the naming of the files themselves). Given the pile of scripts involved, it's absolutely no surprise to me that it takes forever to grind through processing all of my setup. In general the whole thing has all of the baroque charm of the evolved forms of System V init; the best thing I can say about it is that it generally works and you can build relatively sophisticated static setups with it.

(While there is some documentation for what variables can be set hiding in /usr/share/doc/initscripts/sysconfig.txt, it's not complete and for some things you get to decode the shell scripts yourself.)

What systemd's networkd stuff brings to the table for this is the same thing that systemd brings to the table relative to SysV init scripts: you have a well documented way of specifying what you want, which is then directly handled instead of being run through many, many layers of shell scripts. As an additional benefit it gets handled faster and perhaps better.

(I firmly believe that a mess of fragile shell scripts that source your ifcfg-* files and do magic things is not the right architecture. Robust handling of configuration files requires real parsing and so on, not shell script hackery. I don't really care who takes care of this (I would be just as happy with a completely separate system) and I will say straight up that systemd-networkd is not my favorite implementation of this idea and suffers from various flaws. But I like it more than the other options.)

In theory NetworkManager might fill this ecological niche already. In practice NetworkManager has never felt like something that was oriented towards my environment, instead feeling like it targeted machines and people who were going to do all of this through GUIs, and I've run into some issues with it. In particular I'm pretty sure that I'd struggle quite a bit to find documentation on how to set up a NM configuration (from the command line or in files) that duplicates my current network setup; with systemd, it was all in the manual pages. There is a serious (re)assurance value from seeing what you want to configure be clearly documented.

(My longer range reason for liking systemd's move here is that it may bring more uniformity to how you configure networking setups across various Linux flavours.)

linux/SystemdNetworkdWhy written at 02:08:42


A gotcha with Python tuples

Here's a little somewhat subtle Python syntax issue that I recently got to relearn (or be reminded of) by stubbing my toe on it. Let's start with an example, taken from our Django configuration:

# Tuple of directories to find templates

This looks good (and used to be accepted by Django), but it's wrong. I'm being tripped up by the critical difference in Python between '(A)' and '(A,)'. While I intended to define a one-element tuple, what I've actually done is set TEMPLATE_DIRS to a single string, which I happened to write in parentheses for no good reason (as far as the Python language is concerned, at least). This is still the case even though I've split the parenthesized expression over three lines; Python doesn't care about how many lines I use (or even how I indent them).
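The difference is easy to see directly in Python. The following is a small sketch; the directory path is a made-up placeholder (the real setting's value is not shown here), and the two variable names are invented for the comparison.

```python
# Hypothetical path; only the presence or absence of the trailing
# comma matters here.
TEMPLATE_DIRS_WRONG = (
    "/hypothetical/templates"
)

TEMPLATE_DIRS_RIGHT = (
    "/hypothetical/templates",
)

# Without the comma the parentheses are just grouping, so this is a str.
print(type(TEMPLATE_DIRS_WRONG).__name__)

# With the comma it is a one-element tuple.
print(type(TEMPLATE_DIRS_RIGHT).__name__)

# Code that iterates over the setting sees single characters in the
# first case and one whole directory in the second.
first_wrong = next(iter(TEMPLATE_DIRS_WRONG))
first_right = next(iter(TEMPLATE_DIRS_RIGHT))
```

Since a string is itself iterable, code that loops over the mistaken version does not fail outright; it just walks through the path one character at a time.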

(Although it is not defined explicitly in the 'not a specification', this behavior is embedded in CPython; CPython silently ignores almost all newlines and whitespace inside ('s, ['s, and {'s.)

I used to be very conscious of this difference and very careful about putting a , at the end of my single-element tuples. I think I got into the habit of doing so when I at least thought that the % string formatting operation only took a tuple and would die if given a single element. At some point % started accepting bare single elements (or at least I noticed it did) and after that I got increasingly casual about "..." % (a,) versus "..." % (a) (which I soon changed to "..." % a, of course). Somewhere along this the reflexive add-a-comma behavior fell out of my habits and, well, I wound up writing the example above.
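One case where the comma still matters for % formatting is when the single value being formatted is itself a tuple. The sketch below is an added illustration of that case, not something from the Django configuration; the helper name fmt is invented for the example.

```python
def fmt(value):
    """Format one value, always wrapping it in a one-element tuple."""
    return "value: %s" % (value,)


a = "spam"
pair = (1, 2)

# For a plain string, the bare form and the one-tuple form agree.
same_for_plain_value = ("value: %s" % a) == fmt(a)

# For a tuple, the bare form treats it as the argument list and fails,
# because there are two arguments for one %s.
try:
    "value: %s" % pair
    bare_tuple_failed = False
except TypeError:
    bare_tuple_failed = True

# The one-tuple form formats the tuple as a single value.
print(fmt(pair))
```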

(And Django accepted it for years, probably because any number of people wrote it like I did so why not be a bit friendly and magically assume things. Note that I don't blame Django for tightening up their rules here; it's probably a good idea as well as being clearly correct. Django already has enough intrinsic magic without adding more.)

As a side note, I think Python really has to do things this way. Given that () is used for two purposes, '(A)' for a plain A value is at least ambiguous. Adopting a heuristic that people really wanted a single element tuple instead of a uselessly parenthesized expression strikes me as too much magic for a predictable language, especially when you can force the tuple behavior with a ','.

python/TupleSingleElementGotcha written at 23:19:52

Why user-hostile policies are a bad thing and a mistake

One reasonable reaction to limited email retention policies being user-hostile is to say basically 'so what'. It's not really nice that policies make work for users, but sometimes that's just life; people will cope. I feel that this view is a mistake.

The problem with user-hostile policies is that users will circumvent them. Generously, let's assume that you enacted this policy to achieve some goal (not just to say that you have a policy and perhaps point to a technical implementation as proof of it). What you really want is not for the policy to be adhered to but to achieve your goal; the policy is just a tool in getting to the goal. If you enact a policy and then your users do things that defeat the goals of the policy, you have not actually achieved your overall goal. Instead you've made work, created resentment, and may have deluded yourself into thinking that your goal has actually been achieved because after all the policy has been applied.

(Clearly you won't have inconvenient old emails turn up because you're deleting all email after sixty days, right?)

In extreme cases, a user-hostile policy can actually move you further away from your goal. If your goal is 'minimal email retention', a policy that winds up causing users to automatically archive all emails locally because that's the most convenient way to handle things is actually moving you backwards. You were probably better off letting people keep as much email on the server as they wanted, because at least they were likely to delete some of it.

By the way, I happen to think that threatening punishment to people who take actions that go against the spirit or even the letter of your policy is generally not an effective thing from a business perspective in most environments, but that's another entry.

(As for policies for the sake of having policies, well, I would be really dubious of the idea that saying 'we have an email deletion policy so there's only X days of email on the mail server' will do you much good against either attackers or legal requests. To put it one way, do you think the police would accept that answer if they thought you had incriminating email and might have saved it somewhere?)

tech/UserHostilePolicyWhyBad written at 00:22:52


Limited retention policies for email are user-hostile

I periodically see security people argue for policies and technology to limit the retention of email and other files, ie to enact policies like 'all email older than X days is automatically deleted for you'. Usually the reasons given are that this limits the damage done in a compromise (for example), as attackers can't copy things that have already been deleted. The problem with this is that limited retention periods are clearly user hostile.

The theory of limited retention policies is that people will manually save the small amount of email that they really need past the retention period. The reality is that many people can't pick out in advance all of the email that will be needed later or that will turn out to be important. This is a lesson I've learned over and over myself; many times I've fished email out of my brute force archive that I'd otherwise deleted because I had no idea I'd want it later. The inevitable result is that either people don't save email and then wind up wanting it or they over-save (up to and including 'everything') just in case.

Beyond that, such policies clearly force make-work on people in order to deal with them. Unless you adopt an 'archive everything' policy that you can automate, you're going to spend some amount of your time trying to sort out which email you need to save and then saving it off somewhere before it expires. This is time that you're not doing your actual job and taking care of your actual work. It would clearly be much less work to keep everything sitting around and not have to worry that some of your email will be vanishing out from underneath you.

The result is that a limited retention policy is a classical 'bad' security policy in most environments. It's a policy that wouldn't exist without security (or legal) concerns, it makes people's lives harder, and it actively invites people to get around it (in fact you're supposed to get around it to some extent, just not too much).

(I can think of less user hostile ways to deal with the underlying problem, but what you should do depends on what you think the problem is.)

sysadmin/LimitedRetentionUserHostile written at 03:15:14


Node.js is not for me (and why)

I've been aware of and occasionally poking at node.js for a fairly long time now, and periodically I've considered writing something in it; I also follow a number of people on Twitter who are deeply involved with and passionate about node.js and the whole non-browser Javascript community. But I've never actually done anything with node.js and more or less ever since I got on Twitter and started following those node enthusiasts I've been feeling increasingly like I never would. Recently all of this has coalesced and now I think I can write down why node is not for me.

(These days there is also io.js, which is a compatible fork split off from node.js for reasons both technical and political.)

Node is fast server-side JavaScript in an asynchronous event based environment that uses callbacks for most event handling; a highly vibrant community and package ecosystem has coalesced around it. It's probably the fastest dynamic language you can run on servers.

My disengagement with node is because none of those appeal to me at all. While I accept that JavaScript is an okay language it doesn't appeal to me and I have no urge to write code in it, however fast it might be on the server once everything has started. As for the rest, I think that asynchronous event-based programming that requires widespread use of callbacks is actively the wrong programming model for dealing with concurrency, as it forces more or less explicit complexity on the programmer instead of handling it for you. A model of concurrency like Go's channels and coroutines is much easier to write code for, at least for me, and is certainly less irritating (even though the channel model has limits).

(I also think that a model with explicit concurrency is going to scale to a multi-core environment much better. If you promise 'this is pure async, two things never happen at once' you're now committed to a single thread of control model, and that means only using a single core unless your language environment can determine that two chunks of code don't interact with each other and so can't tell if they're running at the same time.)

As for the package availability, well, it's basically irrelevant given the lack of the appeal of the core. You'd need a really amazingly compelling package to get me to use a programming environment that doesn't appeal to me.

Now that I've realized all of this I'm going to do my best to let go of any lingering semi-guilty feelings that I should pay attention to node and maybe play around with it and so on, just because it's such a big presence in the language ecosystem at the moment (and because people whose opinions I respect love it). The world is a big place and we don't have to all agree with each other, even about programming things.

PS: None of this means that node.js is bad. Lots of people like JavaScript (or at least have a neutral 'just another language' attitude) and I understand that there are programming models for node.js that somewhat tame the tangle of event callbacks and so on. As mentioned, it's just not for me.

programming/NodeNotForMe written at 23:06:08

Using systemd-run to limit something's RAM consumption on the fly

A year ago I wrote about using cgroups to limit something's RAM consumption, for limiting the resources that make'ing Firefox could use when I did it. At the time my approach with an explicitly configured cgroup and the direct use of cgexec was the only way to do it on my machines; although systemd has facilities to do this in general, my version could not do this for ad hoc user-run programs. Well, I've upgraded to Fedora 21 and that's now changed, so here's a quick guide to doing it the systemd way.

The core command is systemd-run, which we use to start a command with various limits set. The basic command is:

systemd-run --user --scope -p LIM1=VAL1 -p LIM2=VAL2 [...] CMD ARG [...]

The --user makes things run as ourselves with no special privileges, and is necessary to get things to run. The --scope basically means 'run this as a subcommand', although systemd considers it a named object while it's running. Systemd-run will make up a name for it (and report the name when it starts your command), or you can use --unit NAME to give it your own name.

The limits you can set are covered in systemd.resource-control. Since systemd is just using cgroups, the limits you can set up are just the cgroup limits (and the documentation will tell you exactly what the mapping is, if you need it). Conveniently, systemd-run allows you to specify memory limits in gigabytes or megabytes (with a G or M suffix), not just bytes. The specific limits I set up in the original entry give us a final command of:

systemd-run --user --scope -p MemoryLimit=3G -p CPUShares=512 -p BlockIOWeight=500 make

(Here I'm once again running make as my example command.)
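If you want to wrap this up in a script, the command line above can be assembled programmatically. The following is a minimal Python sketch that only builds the argument list; the function name systemd_run_argv is hypothetical, and it uses only the systemd-run options shown above.

```python
import shlex


def systemd_run_argv(command, limits):
    """Build a systemd-run command line that applies the given limits.

    command is a list such as ["make"]; limits maps property names
    (for example "MemoryLimit") to their values as strings.
    """
    argv = ["systemd-run", "--user", "--scope"]
    for name, value in limits.items():
        argv += ["-p", "%s=%s" % (name, value)]
    return argv + list(command)


argv = systemd_run_argv(
    ["make"],
    {"MemoryLimit": "3G", "CPUShares": "512", "BlockIOWeight": "500"},
)

# Show the assembled command for inspection instead of running it.
print(shlex.join(argv))
```

A real wrapper would hand the resulting list to something like os.execvp() instead of printing it.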

You can inspect the parameters of your new scope with 'systemctl show --user <scope>', and change them on the fly with 'systemctl set-property --user <scope> LIM=VAL'. I'll leave potential uses of this up to your imagination. systemd-cgls can be used to show all of the scopes and find any particular one that's running this way (and show its processes).

(It would be nice if systemd-cgtop gave you a nice rundown of what resources were getting used by your confined scope, but as far as I can tell it doesn't. Maybe I'm missing a magic trick here.)

Now, there's a subtle semantic difference between what we're doing here and what I did in the original entry. With cgexec, everything that ran in our confine cgroup shared the same limit even if they were started completely separately. With systemd-run, separately started commands have separate limits; if you start two makes in parallel, each of them can use 3 GB of RAM. I'm not sure yet how you fix this in the official systemd way, but I think it involves defining a slice and then attaching our scopes to it.

(On the other hand, this separation of limits for separate commands may be something you consider a feature.)

Sidebar: systemd-run versus cgexec et al

In Fedora 20 and Fedora 21, cgexec works okay for me but I found that systemd would periodically clear out my custom confine cgroup and I'd have to do 'systemctl restart cgconfig' to recreate it (generally anything that caused systemd to reload itself would do this, including yum package updates that poked systemd). Now that the Fedora 21 version of systemd-run supports -p, using it and doing things the systemd way is just easier.

(I wrap the entire invocation up in a script, of course.)

linux/SystemdForMemoryLimiting written at 02:00:50


Link: Against DNSSEC by Thomas Ptacek

Against DNSSEC by Thomas Ptacek (@tqbf) is what it says in the title; lucid and to my mind strong reasons against using or supporting DNSSEC. I've heard some of these from @tqbf before in Tweets (and others are ambient knowledge in the right communities), but now that he's written this I don't have to try to dig those tweets out and make a coherent entry out of them.

For what it's worth, from my less informed perspective I agree with all of this. It would be nice if DNSSEC could bootstrap a system to get us out of the TLS CA racket but I've become persuaded (partly by @tqbf) that this is not viable and the cure is at least as bad as the disease. See eg this Twitter conversation.

(You may know of Thomas Ptacek from the days when he was at Matasano Security, where he was the author of such classics as If You're Typing the Letters A-E-S Into Your Code You're Doing It Wrong. See also eg his Hacker News profile.)

Update: there's a Hacker News discussion of this with additional arguments and more commentary from Thomas Ptacek here.

links/AgainstDNSSECTqbf written at 15:19:05

General ZFS pool shrinking will likely be coming to Illumos

Here is some great news. It started with this tweet from Alex Reece (which I saw via @bdha):

Finally got around to posting the device removal writeup for my first open source talk on #openzfs device removal! <link>

'Device removal' sounded vaguely interesting but I wasn't entirely sure why it called for a talk, since ZFS can already remove devices. Still, I'll read ZFS related things when I see them go by on Twitter, so I did. And my eyes popped right open.

This is really about being able to remove vdevs from a pool. In its current state I think the code requires all vdevs to be bare disks, which is not too useful for real configurations, but now that the big initial work has been done I suspect that there will be a big rush of people to improve it to cover more cases once it goes upstream to mainline Illumos (or before). Even being able to remove bare disks from pools with mirrored vdevs would be a big help for the 'I accidentally added a disk as a new vdev instead of as a mirror' situation that comes up periodically.

(This mistake is the difference between 'zpool add POOL DEV1 DEV2' and 'zpool add POOL mirror DEV1 DEV2'. You spotted the one word added to the second command, right?)

While this is not quite the same thing as an in-place reshape of your pool, a fully general version of this would let you move a pool from, say, mirroring to raidz provided that you had enough scratch disks for the transition (either because you are the kind of place that has them around or because you're moving to new disks anyways and you're just arranging them differently).

(While you can do this kind of 'reshaping' today by making a completely new pool and using zfs send and zfs receive, there are some advantages to being able to do it transparently and without interruptions while people are actively using the pool.)

This feature has been a wishlist item for ZFS for so long that I'd long since given up on ever seeing it. To have even a preliminary version of it materialize out of the blue like this is simply amazing (and I'm a little bit surprised that this is the first I heard of it; I would have expected an explosion of excitement as the news started going around).

(Note that there may be an important fundamental limitation about this that I'm missing in my initial enthusiasm and reading. But still, it's the best news about this I've heard for, well, years.)

solaris/ZFSPoolShrinkingIsComing written at 00:25:11


What /etc/shells is and isn't

In traditional Unix, /etc/shells has only one true purpose: it lists programs that chsh will let you change your shell to (if it lets you do anything). Before people are tempted to make other programs use this file for something else, it is important to understand the limits of /etc/shells. These include but are not limited to:

  • Logins may have /etc/passwd entries that list other shells. For example, back when restricted shells were popular it was extremely common to not list them in /etc/shells so you couldn't accidentally chsh yourself into a restricted shell and then get stuck.

    Some but not all programs have used the absence of a shell from /etc/shells as a sign that it is a restricted shell (or not a real shell at all) and they should restrict a user with that shell in some way. Other programs have used different tests, such as matching against specific shell names or name prefixes.

    (It's traditional for the FTP daemon to refuse access for accounts that do not have a shell that's in /etc/shells and so this is broadly accepted. Other programs are on much thinner ice.)

  • On the other hand, sometimes you can find restricted shells in /etc/shells; a number of systems (Ubuntu and several FreeBSD versions) include rbash, the restricted version of Bash, if it's installed.

  • Not all normal shells used in /etc/passwd or simply installed on the system necessarily appear in /etc/shells for various reasons. In practice there are all sorts of ways for installed shells to fall through the cracks. Of course this makes them hard to use as your login shell (since you can't chsh to them), but this can be worked around in various ways.

    For example, our Ubuntu systems have /bin/csh and /bin/ksh (and some people use them as their login shells) but neither is in /etc/shells.

  • The (normal and unrestricted) shell someone's actually using isn't necessarily in either /etc/shells or their /etc/passwd entry. Unix is flexible and easily lets you use $SHELL and some dotfile hacking to switch basically everything over to running your own personal choice of shell, per my entry on using an alternate shell.

    (Essentially everything on Unix that spawns what is supposed to be your interactive shell has been clubbed into using $SHELL, partly because the code to use $SHELL is easier to write than the code to look up someone's /etc/passwd entry to find their official login shell. This feature probably came into Unix with BSD Unix, which was basically the first Unix to have two shells.)

  • Entries in /etc/shells don't necessarily exist.
  • Entries in /etc/shells are not necessarily shells. Ubuntu 14.04 lists screen.

  • Not all systems even have an /etc/shells. Solaris and derivatives such as Illumos and OmniOS don't.

In the face of all of this, most programs should simply use $SHELL and assume that it is what the user wants and/or what the sysadmin wants the user to get. It's essentially safe to assume that $SHELL always exists, because it is part of the long-standing standard Unix login environment. As a corollary, a program should not change $SHELL unless it has an excellent reason to do so.

Note particularly that a user's $SHELL not being listed in /etc/shells means essentially nothing. As outlined above, there are any number of non-theoretical ways that this can and does happen on real systems that are out there in the field. As a corollary your program should not do anything special in this case unless it has a really strong reason to do so, generally a security-related reason. Really, you don't even want to look at /etc/shells unless you're chsh or ftpd or sudo or the like.
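As a rough sketch of what 'simply use $SHELL' can look like in a program, here is a small Python function. The function name is hypothetical, and the fallbacks (the user's /etc/passwd entry and then /bin/sh) are assumptions added for the example rather than anything this entry prescribes. It deliberately never consults /etc/shells.

```python
import os
import pwd


def interactive_shell(environ=None):
    """Pick the shell to spawn for the current user.

    Prefers $SHELL. If it is unset or empty, falls back to the login
    shell from the passwd database and finally to /bin/sh (both
    fallbacks are assumptions of this sketch).
    """
    if environ is None:
        environ = os.environ
    shell = environ.get("SHELL")
    if shell:
        return shell
    try:
        return pwd.getpwuid(os.getuid()).pw_shell or "/bin/sh"
    except KeyError:
        # No passwd entry for this UID.
        return "/bin/sh"


print(interactive_shell())
```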

(This entry is sadly brought to you by a program getting this wrong.)

unix/EtcShellsUsage written at 01:23:10

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.