Wandering Thoughts archives

2019-01-30

ZFS On Linux's kernel modules issues are not like NVidia's

In the Hacker News discussion of my entry on the general risk to ZFS from the shift in its userbase towards ZFS on Linux, a number of people suggested that the risks to ZFS on Linux were actually low, since proprietary kernel modules such as NVidia's GPU drivers don't seem to have any problems dealing with things like the kernel making various functions inaccessible or GPL-only. I have two views on this, because I think that the risks are a bit different and less obvious than they initially appear.

On a purely technical level, ZFS on Linux probably has it easier and is at less risk than NVidia's GPU drivers. Advanced GPU drivers deal with hardware that only they can work with and may need to do various weird sorts of operations for memory mapping, DMA control, and so on that aren't needed or used by existing kernel modules. It's at least possible that the Linux kernel could someday stop supporting access to this sort of thing from any kernel module (in tree or out, regardless of license), leaving a GPU driver up the creek.

By contrast, most in-tree filesystems are built as loadable modules (for good reasons) so the kernel already provides to modules everything necessary to support a filesystem, and it's likely that ZFS on Linux could survive with just this. What ZFS on Linux needs from the kernel is likely to be especially close to what BTRFS needs, since both are dealing with very similar core issues like talking to multiple disks at once, checksum computation, and so on, and there is very little prospect that BTRFS will either be removed from the kernel tree or only be supported if built into the kernel itself.

But on the political and social level it's another thing entirely. NVidia and other vendors of proprietary kernel modules have already decided that they basically don't care about anything except what they can implement. Licenses and people's views of their actions are irrelevant; if they can do it technically and they need to in order to make their driver work, they will. GPL shim modules to get access to GPL-only kernel symbols are just the starting point.

Most of the people involved in ZFS on Linux are probably not going to feel this way. Sure, ZFS on Linux could implement shim modules and other workarounds if the kernel cuts off full access to necessary things, but I don't think they're going to. ZFS on Linux developers are open source developers in a way that NVidia's driver programmers are not, and if the Linux kernel people yell at them hard enough they will likely go away, not resort to technical hacks to get around the technical barriers.

In other words, the concern with ZFS on Linux is not that it will become technically unviable, because that's unlikely. The concern is that it will become socially unviable, that to continue on a technical level its developers and users would have to become just as indifferent to the social norms of the kernel license as NVidia is.

(And if that did happen, which it might, I think it would make ZFS on Linux much more precarious than it currently is, because ZoL would be relying on its ability to find and keep both good kernel developers and sources of development funding that are willing to flout social norms in a way that they don't have to today.)

ZFSOnLinuxNotLikeNVidia written at 23:58:15

2019-01-25

The Linux kernel's pstore error log capturing system, and ACPI ERST

In response to my entry yesterday on enabling reboot on panic on your servers, a commentator left the succinct suggestion of 'setup pstore'. I had never heard of pstore before, so this sent me searching and what I found is actually quite interesting and surprising, with direct relevance to quite a few of our servers.

Pstore itself is a kernel feature that dates to 2011. It provides a generic interface to storage that persists across reboots and gets used to save kernel messages during a crash, as covered in LWN's Persistent storage for a kernel's "dying breath" and the kernel documentation. Your kernel very likely has pstore built in and your Linux probably mounts the pstore filesystem at /sys/fs/pstore.

(The Ubuntu 16.04 and 18.04 kernels, the CentOS 7 kernel, and the Fedora kernel all have it built in. If in doubt, check your kernel's configuration, which is often found in /boot/config-*; you're looking for CONFIG_PSTORE and associated things.)
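
For example, a quick way to check on a system where the kernel config is under /boot:

grep PSTORE /boot/config-$(uname -r)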

By itself, pstore does nothing for you because it needs a chunk of storage that persists across reboots, and that's up to your system to provide in some way. One such source of this storage is an optional part of ACPI called the Error Record Serialization Table (ERST). Not all machines have an ERST (it's apparently most common in servers), but if you do have one, pstore will probably automatically use it. If you have ERST at all, it will normally show up in the kernel's boot time messages about ACPI:

ACPI: ERST 0x00000000BF7D6000 000230 (v01 DELL   PE_SC3   00000000 DELL 00040000)

If pstore is using ERST, you will get some additional kernel messages:

ERST: Error Record Serialization Table (ERST) support is initialized.
pstore: using zlib compression
pstore: Registered erst as persistent store backend
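
If you no longer have the boot messages handy, one quick way to check on a running system is to ask the kernel log (a sketch, assuming journald):

journalctl -k | grep -iE 'pstore|erst'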

Some of our servers have ACPI ERST and some of them have crashed, so out of idle curiosity I went and looked at /sys/fs/pstore on all of them. This led to a big surprise, which is that there may be nothing in your Linux distribution that checks /sys/fs/pstore to see if there are captured kernel crash logs. Pstore is persistent storage, and so it does what it says on the can; if you don't move things out of /sys/fs/pstore, they stay there, possibly for a very long time (one of our servers turned out to have pstore ERST captures from a year ago). This is especially important because things like ERST only have so much space, so lingering old crash logs may keep you from saving new ones, ones that you may discover you very much would like records of.

(The year-old pstore ERST captures are especially ironic because the machine's current incarnation was reinstalled this September, so they are from its previous life as something else entirely, making them completely useless to us.)

Another pstore backend that you may have on some machines is one that uses UEFI variables. Unfortunately, you need to have booted your system using UEFI in order to have access to UEFI services, including UEFI variables (as I found out the hard way once), so even on a UEFI-capable system you may not be able to use this backend because you're still using MBR booting. It's possible that using UEFI variables for pstore is disabled by some Linux distributions, since actually using UEFI variables has caused UEFI BIOS problems in the past.

(This makes it somewhat more of a pity that I failed to migrate to UEFI booting, since I would actually potentially get something out of it on my workstations. Also, although many of our servers are probably UEFI capable, they all use MBR booting today.)

Given that nothing in our Ubuntu 18.04 server installs seems to notice /sys/fs/pstore and we have some machines with things in it, we're probably going to put together some shell scripting of our own to at least email us if something shows up.
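
A minimal sketch of what such a check might look like (the script name and the use of 'mail' here are illustrative assumptions; run it from cron however often you like):

#!/bin/sh
# check-pstore: hypothetical cron script to report pstore captures.
# If /sys/fs/pstore has anything in it, mail a listing to root.
if [ -n "$(ls -A /sys/fs/pstore 2>/dev/null)" ]; then
    ls -l /sys/fs/pstore | mail -s "pstore captures on $(hostname)" root
fi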

(Additional references: Matthew Garrett's A use for EFI, CoreOS's Collecting crash logs, which mentions the need to clear out /sys/fs/pstore, and abrt's pstore oops wiki page, which includes a list of pstore backends.)

PS: The awkward, brute force way to get pstore space is with the ramoops backend, which requires fencing off some section of your RAM from your kernel (it should be RAM that your BIOS won't clear on reboot for whatever reason). This is beyond my enthusiasm level on my machines, despite some recent problems, and I have the impression that ramoops is usually used on embedded ARM hardware where you have few or no other options.

PstoreAndACPIERST written at 00:27:25

2019-01-23

Consider setting your Linux servers to reboot on kernel problems

As I sort of mentioned when I wrote about things you can do to make your Linux servers reboot on kernel problems, the Linux kernel normally doesn't reboot if it hits kernel problems. Problems like OOPSes and RCU stalls generally kill some processes and try to continue on; more serious issues cause panics, which freeze the machine entirely.

If your goal is to debug kernel problems, this is great because it preserves as much of the evidence as possible (although you probably also want things like a serial console or at least netconsole, to capture those kernel crash messages). If your goal is to have your servers running, it is perhaps not as attractive; you may quite reasonably care more about returning them to service as soon as possible than trying to collect evidence for a bug report to your distribution.

(Even if you do care about collecting information for a bug report, there are probably better ways than letting the machine sit there. Future kernels will have a kernel sysctl called panic_print to let you dump out as much information in the initial report as possible, which you can preserve through your console server system, and in general there is Kdump (also). In theory netconsole might also let you capture the initial messages, but I don't trust it half as much as I do a serial console.)

My view is that most people today are in the second situation, where there's very little you're going to do with a crashed server except reboot or power cycle it to get it back into service. If this is so, you might as well cut out the manual work by configuring your servers to reboot on kernel problems, at least as their initial default settings. You do want to wait just a little bit after an OOPS to reboot, in the hopes that maybe the kernel OOPS message will be successfully written to disk or transmitted off to your central syslog server, but that's it; after at most 60 seconds or so, you should reboot.

(If you find that you have a machine that is regularly OOPSing and you want to diagnose it in a more hands-on way, you can change the settings on it as needed.)

We have traditionally not thought about this and so left our servers in the standard default 'lock up on kernel problems' configuration, which has gone okay because kernel problems are very rare in the first place. Leaving things as they are would still be the least effort approach, but changing our standard system setup to enable reboots on panics would not be much effort (it's three sysctls in one /etc/sysctl.d file), and it's probably worth it, just in case.
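
For reference, a minimal sketch of what that sysctl.d file might look like (the file name is arbitrary; the settings themselves are the ones covered in my entry on making your Linux servers reboot on kernel problems):

# /etc/sysctl.d/90-panic-reboot.conf (a hypothetical name)
# Reboot ten seconds after a panic, and turn OOPSes and RCU stalls into panics.
kernel.panic = 10
kernel.panic_on_oops = 1
kernel.panic_on_rcu_stall = 1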

(This is the kind of change that you hope not to need, but if you do wind up needing it, you may be extremely thankful that you put it into place.)

PS: Not automatically rebooting on kernel panics is pretty harmless for Linux machines that are used interactively, because if the machine has problems there's a person right there to immediately force a reboot. It's only unattended machines such as servers where this really comes up. For desktop and laptop focused distributions it probably makes troubleshooting somewhat easier, because at least you can ask someone who's having crash problems to take a picture of the kernel errors with their phone.

ServerRebootOnPanics written at 23:24:23

2019-01-22

Things you can do to make your Linux servers reboot on kernel problems

One of the Linux kernel's unusual behaviors is that it often doesn't reboot after it hits an internal problem, what is normally called a kernel panic. Sometimes this is a reasonable thing and sometimes this is not what you want and you'd like to change it. Fortunately Linux lets you more or less control this through kernel sysctl settings.

(The Linux kernel differentiates between things like OOPSes and RCU stalls, which it thinks it can maybe continue on from, and kernel panics, which immediately freeze the machine.)

What you need to do is twofold. First, you need to make it so that the kernel reboots when it considers itself to have panicked. This is set through the kernel.panic sysctl, which is the number of seconds the kernel waits after a panic before it reboots. Some sources recommend setting this to 60 seconds under various circumstances, but in limited experience we haven't found that to do anything for us except delay reboots, so we now use 10 seconds. Setting kernel.panic to 0 restores the default state, where panics simply hang the machine.

Second, you need to arrange for various kernel problems to trigger panics. The most important thing here is usually for kernel OOPS messages or BUG messages to trigger panics; the kernel considers these nominally recoverable, except that they mostly aren't and will often leave your machine effectively hung. Panicking on OOPS is turned on by setting kernel.panic_on_oops to 1.

Another likely important sign of trouble is RCU stalls; you can panic on these with kernel.panic_on_rcu_stall. Note that I'm biased about RCU stalls. The kernel documentation in sysctl/kernel.txt mentions some other ones as well, currently panic_on_io_nmi, panic_on_stackoverflow, panic_on_unrecovered_nmi, and panic_on_warn. Of these, I would definitely be wary about turning on panic_on_warn; our systems appear to see a certain number of them in reasonably routine operation.

(You can detect these warnings by searching your kernel logs for the text 'WARNING: CPU: <..> PID: <...>'. One of our WARNs was for a network device transmit queue timeout, which recovered almost immediately. Rebooting the server due to this would have been entirely the wrong reaction in practice.)

Note that you can turn on any or all of the various panic_on_* settings while still having kernel.panic set to 0. If you do this, you convert OOPSes, RCU stalls, or whatever into things that are guaranteed to hang the whole machine when they happen, instead of perhaps having it continue on in partial operating order. There are systems where this may be desirable behavior.
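
If you want to try these settings out before making them permanent in an /etc/sysctl.d file, you can check and change them at runtime (a sketch, run as root):

sysctl kernel.panic kernel.panic_on_oops kernel.panic_on_rcu_stall
sysctl -w kernel.panic_on_oops=1 kernel.panic_on_rcu_stall=1 kernel.panic=10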

PS: If you want to be as sure as possible that the machine reboots after hitting problems, you probably want to enable a hardware watchdog as well if you can. The kernel panic() function tries hard to reboot the machine, but things can probably go wrong. Unfortunately not all machines have hardware watchdogs available, although many Intel ones do.

Sidebar: The problem with kernel OOPSes

When a kernel oops happens, the kernel kills one or more processes. These processes were generally in kernel code at the time (that's usually what generated the oops), and they may have been holding locks or have been in the middle of modifying data structures, submitting IO operations, or doing other kernel things. However, the kernel has no idea what exactly needs to be done to safely release these locks, revert the data structure modifications, and so on; instead it just drops everything on the floor and hopes for the best.

Sometimes this works out, or at least the damage done is relatively contained (perhaps only access to one mounted filesystem starts hanging because of a lock held by the now-dead process that will never be unlocked). Often the damage is not contained, and more or less everything grinds to an immediate halt. If you're lucky, enough of the system survives long enough for the kernel oops message to be written to disk or sent out to your central syslog server.

RebootOnPanicSettings written at 00:44:25

2019-01-19

A surprise potential gotcha with sharenfs in ZFS on Linux

In Solaris and Illumos, the standard and well supported way to set and update NFS sharing options for ZFS filesystems is through the sharenfs ZFS filesystem property. ZFS on Linux sort of supports sharenfs, but it attempts to be compatible with Solaris and in practice that doesn't work well, partly because there are Solaris options that cannot be easily translated to Linux. When we faced this issue for our Linux ZFS fileservers, we decided that we would build an entirely separate system to handle NFS exports that directly invokes exportfs, which has worked well. This turns out to have been lucky, because there is an additional and somewhat subtle problem with how sharenfs is currently implemented in ZFS on Linux.

On both Illumos and Linux, ZFS actually implements sharenfs by calling the existing normal command to manipulate NFS exports; on Illumos this uses share_nfs and on Linux, exportfs. By itself this is not a problem and actually makes a lot of sense (especially since there's no official public API for this on either Linux or Illumos). On Linux, the specific functions involved are found in lib/libshare/nfs.c. When you initially share a NFS filesystem, ZFS will wind up running the following command for each client:

exportfs -i -o <options> <client>:<path>

When you entirely unshare a NFS filesystem, ZFS will wind up running:

exportfs -u <client>:<path>

The potential problem comes in when you change an existing sharenfs setting, either to modify what clients the filesystem is exported to or to alter what options you're exporting it with. ZFS on Linux implements this by entirely unexporting the filesystem to all clients, then re-exporting it with whatever options and to whatever clients your new sharenfs settings call for.

(The code for this is in nfs_update_shareopts() in lib/libshare/nfs.c.)
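
Concretely, if a filesystem is currently shared to two clients, changing its sharenfs setting winds up running something like the following (using the same sort of placeholders as above):

exportfs -u <client A>:<path>
exportfs -u <client B>:<path>
exportfs -i -o <new options> <client A>:<path>
exportfs -i -o <new options> <client B>:<path>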

On the one hand this is a sensible if brute force implementation, and computing the difference in sharing (for both clients and options) and how to transform one to the other is not an easy problem. On the other hand, this means that clients that are actually doing NFS traffic during the time when you change sharenfs may be unlucky enough to try a NFS operation in the window of time between when the filesystem was unshared (to them) and when it was reshared (to them). If they hit this window, they'll get various forms of NFS permission denied messages, and with some clients this may produce highly undesirable consequences, such as libvirt guests having their root filesystems go read-only.

(The zfs-discuss re-query from Todd Pfaff today is what got several people to go digging and figure out this issue. I was one of them, but only because I rushed into exploring the code before reading the entire email thread.)

I would like to say that our system for ZFS NFS export permissions avoids this issue, but it has exactly the same problem. Rather than try to reconcile the current NFS export settings and the desired new ones, it just does a brute force 'exportfs -u' for all current clients and then reshares things. Fortunately we only very rarely change the NFS exports for a filesystem because we export to netgroups instead of individual clients, so adding and removing individual clients is almost entirely done by changing netgroup membership. The actual exportfs setting only has to change if we add or remove entire netgroups.

(Exportfs has a tempting '-r' option to just resynchronize everything, but our current system doesn't use it and I don't know why. I know that I poked around with exportfs when I was developing it but I don't seem to have written down notes about my exploration, so I don't know if I ran into problems with -r, didn't notice it, or had some other reason I rejected it. If I didn't overlook it, this is definitely a case where I should have documented why I wasn't doing an attractive thing.)

ZFSOnLinuxSharenfsGotcha written at 00:08:23

2019-01-18

Linux CPU numbers are not necessarily contiguous

In Linux, the kernel gives all CPUs a number; you can see this number in, for example, /proc/stat:

cpu0 [...]
cpu1 [...]
cpu2 [...]
cpu3 [...]

Under normal circumstances, Linux has contiguous CPU numbers that start at 0 and go up to however many CPUs the system has. However, this is not guaranteed, and it is not the case on certain live configurations. It's perfectly possible to have a configuration where, for example, you have sixteen CPUs that are numbered 0 to 7 and 16 to 23, with 8 to 15 missing. In this situation, /proc/stat will match the kernel's numbering, with lines for cpu0 through cpu7 and cpu16 through cpu23. If your code sees this and decides to fill in the missing CPUs 8 through 15, it will be wrong.

You might think that no code could possibly make this mistake, but it's not quite that simple. If, for example, you make a straightforward array to hold CPU status, read in information from various sources, and then print out your accumulated data for CPUs 0 through the highest CPU you saw, you will invent those missing CPUs 8 through 15 (possibly with random unset data for them). In situations like this, you need to actively keep track of what CPUs in your array are valid and what ones aren't, or you need a more sophisticated data structure.

(If you've created an API that says 'I return an array of CPU information for CPUs 0 through N', well, you have a problem. You're probably going to need an API change; if this is in a structure, at least an API addition of a new field to tell people which CPUs are valid.)

I can see why people make this mistake. It's tempting to have simple code, displays, and so on, and almost all Linux machines have contiguous CPU numbering, so your code will work almost everywhere (we only wound up with non-contiguous numbering through bad luck). But, sadly, it is a mistake and sooner or later it will bite either you or someone who uses your code.

(It's unfortunate that doing this right is more complicated. Life certainly would be simpler if Linux guaranteed that CPU numbers were always contiguous, but given that CPUs can come and go, that could cause CPU numbers to not always refer to the same actual CPU over time, which is worse.)
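
If you want to see the kernel's actual view of a particular machine's CPU numbering, sysfs will tell you directly (these files are present on any reasonably modern kernel):

cat /sys/devices/system/cpu/present
cat /sys/devices/system/cpu/online

On a machine like the one described in the sidebar below, 'online' would report something like '0-7,16-23' while 'present' would still list all of the CPUs.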

Sidebar: How we have non-contiguous CPU numbers

We have one dual-socket machine with hyperthreading where one socket has cooling problems and we've shut it down by offlining the CPUs. Each socket has eight cores, and Linux enumerated one side of the HT pairs for both sockets before starting on the other side of the HT pairs. CPUs 0 through 7 and 16 through 23 are the two HTs for the eight cores on the first socket; CPUs 8-15 would be the first set of CPUs for the second socket, if they were online, and then CPUs 24-31 would be the other side of its HT pairs.

In general, HT pairing is unpredictable. Some machines will pair adjacent CPU numbers (so CPU 0 and CPU 1 are a HT pair) and some machines will enumerate all of one side before they enumerate all of the other. My Ryzen-based office workstation enumerates HT pairs as adjacent CPU numbers, so CPU 0 and 1 are a pair, while my Intel-based home machine enumerates all of one HT side before flipping over to enumerate all of the other, so CPU 0 and CPU 6 are a pair.

(I prefer the Ryzen ordering because it makes life simpler.)
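
If you want to check how a particular machine pairs things up, the CPU topology information in sysfs will tell you (again, on any reasonably modern kernel):

cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list

This reports the HT siblings of CPU 0; with the Ryzen ordering that is CPUs 0 and 1, while on my Intel home machine it is CPUs 0 and 6.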

It's possible that we should be doing something less or other than offlining all of the CPUs for the socket with the cooling problem (perhaps the BIOS has an option to disable one socket entirely). But offlining them all seemed like the most thorough and sure option, and it certainly was simple.

CPUNumbersNotContiguous written at 00:50:14

2019-01-13

Two views of ZFS's GPL-incompatibility and the Linux kernel

As part of a thread on linux-kernel where ZFS on Linux's problem with a recent Linux kernel change in exported symbols was brought up, Greg Kroah-Hartman wrote in part in this message:

My tolerance for ZFS is pretty non-existant. Sun explicitly did not want their code to work on Linux, so why would we do extra work to get their code to work properly?

If one frames the issue this way, my answer would be that in today's world, Sun (now Oracle) is no longer at all involved in what is affected here. It stopped being 'Sun's code' years ago, when Oracle Solaris and OpenSolaris split apart, and it's now in practice the code of the people who use ZFS on Linux, with a side digression into FreeBSD and Illumos. The people affected by ZoL not working are completely disconnected from Oracle, and anything the Linux kernel does to make ZoL work will not help Oracle more than a tiny fraction.

In short, the reason to do extra work here is that the people affected are Linux users who are using their best option for a good modern filesystem, not giant corporations taking advantage of Linux.

(I suspect that the kernel developers are not happy that people would much rather use ZFS on Linux than Btrfs, but I assure them that it is still true. I am not at all interested in participating in a great experiment to make Btrfs sufficiently stable, reliable, and featureful, and I am especially not interested in having work participate in this for our new fileservers.)

However, there is a different way to frame this issue. If you take it as given that Sun did not want their code to be used with Linux (and Oracle has given no sign of feeling otherwise), then fundamental social respect for the original copyright holder and license means respecting their choice. If Sun didn't want ZFS to work on Linux, it's hostile to them for the kernel community to go to extra work to enable it to work on Linux. If people outside the kernel community hack it up so that it works anyway, that's one thing. But if the kernel community goes out of its way to enable these hacks, well, then the kernel community becomes involved and is violating the golden rule as applied to software licenses.

As a result, I can reluctantly and unhappily support or at least accept 'no extra work for ZFS' as a matter of principle for Linux kernel development. But if your concern is not principle but practical effects, then I think you are mistaken.

(And if Oracle actually wanted to take advantage of the Linux kernel for ZFS, they could easily do so. Whether they ever will or not is something I have no idea about, although I can speculate wildly and their relicensing of DTrace is potentially suggestive.)

ZFSLicenseTwoViews written at 23:50:46

The risk that comes from ZFS on Linux not being GPL-compatible

A couple of years ago I wrote about the harm of ZFS not being GPL-compatible, which was that this kept ZFS from being bundled into most Linux distributions. License compatibility is both a legal and a social thing, and the social side is quite clear; most people who matter consider ZFS's CDDL license to be incompatible with the kernel. However, it turns out that there is another issue and another side of this that I didn't realize back at the time. This issue surfaced recently with the 5.0 kernel release candidates, as I first saw in Phoronix's ZFS On Linux Runs Into A Snag With Linux 5.0.

The Linux kernel doesn't allow kernel modules to use just any internal kernel symbols; instead they must be officially exported symbols. Some symbols (often although not entirely old ones) are exported to all kernel modules, regardless of the module's license, while others are exported in a way that marks them as restricted to GPL'd kernel modules. At the same time, the kernel does not have a stable API of these exported symbols, and previously exported ones can be removed as code is revised. Removed symbols may have no replacement at all, or the replacement may be a GPL-only one when the previous symbol was generally available.

Modules that are part of the Linux kernel source are always going to work, so the kernel always exports enough symbols for them (although possibly as GPL-only symbols, since in-tree kernel modules are all GPL'd). Out of kernel modules that do the same sort of thing as in-kernel ones are also always going to work, at least if they're GPL'd; you're always going to be able to have out of kernel modules for device drivers in general, for example. But out of kernel modules for less common things are more or less at the mercy of what symbols the kernel exports, especially if they're not GPL'd modules. If you're an out of kernel module with a GPL-compatible license, you might get the kernel developers to export some symbols you needed. If your module has a license that is seen as not GPL-compatible, well, the kernel developers may not be very sympathetic.

This is what has happened with ZFS on Linux as of the 5.0 pre-release, as covered in the Phoronix story and ZoL issue #8259. This specific problem will probably be worked around, but it shows a systemic risk for ZFS on Linux (and for any unusual non-GPL'd module), which is that you are at the mercy of the Linux kernel people to keep working in some vaguely legal way. If the Linux kernel people ever decide to be hostile they can systematically start making your life hard, and they may well make your life hard just as a side effect.

Is it likely that ZFS on Linux will someday be unable to work at all with new kernels, because crucial symbols it needs are not available at all? I think it's unlikely, but it's certainly possible and that makes it a risk for long term usage of ZFS on Linux. If it happened (hopefully far in the future), at work our answer would be to replace our current Linux-based ZFS fileservers with FreeBSD ones. On my own machines, well, I'd have to figure out some way of migrating all of my data around and what I'd put it on, and it would definitely be a pain and make me unhappy.

(It wouldn't be BTRFS, unless things change a lot by that point.)

ZFSNonGPLRisk written at 23:17:29

2019-01-06

Linux network-scripts being deprecated is a problem for my home PPPoE link

The other day, I ran ifdown on my home machine for the first time since I upgraded it to Fedora 29 and got an unpleasant surprise:

WARN : [ifdown] You are using 'ifdown' script provided by 'network-scripts', which are now deprecated.
WARN : [ifdown] 'network-scripts' will be removed from distribution in near future.
WARN : [ifdown] It is advised to switch to 'NetworkManager' instead - it provides 'ifup/ifdown' scripts as well.

As they say, this is my unhappy face.

On both my work and my home machines, most of my network configuration is done through systemd's networkd. However, at home I also have a PPPoE DSL link. Systemd (still) doesn't handle PPPoE and I have no interest in using NetworkManager on my desktop machines, which means that currently my PPPoE link setup is still done through the good old fashioned Fedora /etc/sysconfig/network-scripts system. Since this now seems to be on a deprecation schedule of some sort (although who knows what 'near future' is here, for Fedora or in general), I'm going to need to find some sort of a replacement for my use of it.

In theory this shouldn't be too hard, because after all ifup and ifdown are just shell scripts, and for a DSL link it appears that most of what they do is delegate things to rp-pppoe's adsl-start script. In practice, these are gnarled and tangled shell scripts, with who knows what side effects and environment variable settings that adsl-start and things downstream of it are counting on, and I'm not looking forward to first reverse engineering all of the setup and then building an equivalent replacement system, just because people want to remove network-scripts.

For even more potential fun for me in the future, ifup and ifdown are provided both by the network-scripts package and by NetworkManager, with this managed by Fedora's alternatives system. I suspect that this means I won't even notice that network-scripts has been removed until my system's ifup and ifdown invocations start quietly running NetworkManager and things explode for reasons that I expect to boil down to 'because NetworkManager'.

(I don't have much optimism about NetworkManager's ability to cooperate with other parties or be modest about what it will do with your network setup; instead my impression is that NetworkManager expects to run all of your networking however it sees fit. So I expect it to try to read random bits of my very historical network-scripts configuration files, interpret them in various ways, and then probably cause my networking to explode. NetworkManager has an ifcfg-rh plugin for this, but I have no idea how well it works and it doesn't seem to support DSL PPPoE at all based on the documentation.)

PS: For what it's worth, removing the network-scripts package is not currently listed in the Fedora 30 accepted changes as far as I can see (see also).

Sidebar: How I currently have my PPPoE networking wired up

I have a system cron.d file that runs 'ifup ppp0' on boot (via a @reboot action), and then re-runs it every fifteen minutes if there's no default route, because sometimes it falls over. In a more proper systemd world, I guess I should write a service unit that runs it after my home machine's Ethernet is up and then perhaps try out a timer unit to handle the 'try again every fifteen minutes' thing.

(I normally strongly prefer crontab entries over systemd timer units, but I would be interacting with other systemd units and with the overall systemd state here, so timer units are probably better.)
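
A minimal sketch of what such units might look like (the unit names and the default-route check are made up, I haven't tested this, and it folds the boot-time 'ifup' and the periodic recheck into one conditional service):

# /etc/systemd/system/ppp0-link.service (hypothetical)
[Unit]
Description=Bring up the ppp0 PPPoE link if there is no default route
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'ip route show default | grep -q . || ifup ppp0'

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/ppp0-link.timer (hypothetical)
[Unit]
Description=Re-check the ppp0 link every fifteen minutes

[Timer]
OnBootSec=15min
OnUnitActiveSec=15min
Unit=ppp0-link.service

[Install]
WantedBy=timers.target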

NetworkScriptsAndPPPoE written at 01:25:09

2019-01-02

How I get a copy of the Ubuntu kernel source code (as of Ubuntu 18.04)

For unfortunate reasons beyond the scope of this entry, I've recently needed to once again take a look at the Ubuntu kernel source code, as they're patched and modified and so on from the upstream versions. There are some things where I'll just look at the official kernel source, but for certain sorts of issues, nothing short of the real kernel we're using (or something close to it) will do. Every time I do this I have to re-discover or re-work out how to get a copy of this source code, so this time around I'm going to write it down. The necessary disclaimer is that Ubuntu may change all of this in the future.

There are two approaches you can take here, depending on what you want. I'll start with the more obvious one, using apt-get. In theory, what you should be able to do is to just 'apt-get source' on the kernel that you're running, in other words generally:

apt-get source linux-image-$(uname -r)

If you try this, you will get a 'linux-signed' source package, which does not actually contain any kernel source. It turns out that what you really want is the source package for the linux-image-unsigned package. This is the 'linux' source package, so you can do either of the following commands:

apt-get source linux-image-unsigned-$(uname -r)
apt-get source linux

In either case, you end up with the most recent kernel source package, which is not necessarily the source code to the kernel that you're actually running. Unfortunately there's no guarantee that Ubuntu still has the linux source package for your specific kernel available as a source package; they appear to purge old ones from mirrors, just as they purge the old binary packages.

The other approach is the one recommended by Ubuntu and which a successful 'apt-get source' will nag you about, which is to clone it from Ubuntu's kernel git tree. At the moment (here at the start of 2019), you can find information on this in Ubuntu's KernelGitGuide wiki page or their page on kernel source (note that this page is incorrect about apt-get). The Ubuntu 18.04 LTS git tree is here, although per Ubuntu's wiki page, you should clone it with the git: protocol URLs. As you can see, there are a variety of tags and branches in the repo, and I think they're all reasonably obvious (and there's some explanation of them in the Ubuntu wiki pages).

In the 'apt-get source' version, the Ubuntu package changelog is in debian/changelog; this is what you want to consult if you're looking for relevant bug fixes and so on. In the git version, the changelog is debian.master/changelog and the 'debian' directory has other things. In the git version, Ubuntu generally commits individual Ubuntu changes as individual git commits, which means that you can use 'git log' to scan for Ubuntu changes to particular files or directories of interest. Because the git version has tags for specific releases, it's also the easiest way to see the Ubuntu kernel tree as of a specific linux-image version (or to see the differences between two of them, perhaps in a sub-tree).

For instance, suppose you want to see all changes since Ubuntu's 4.15.0-30 in some areas of the tree. You could do:

git log Ubuntu-4.15.0-30.32.. -- fs/sysfs fs/namei.c fs/inode.c fs/dcache.c

A specific commit can then be shown with 'git show <id>' as usual, which will show you the diff as well as its commit message.
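
Similarly, once you have a clone you can look at the tree exactly as it was for a specific kernel version, and read that version's changelog, with something like:

git checkout Ubuntu-4.15.0-30.32
less debian.master/changelog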

Ubuntu's kernels have the ZFS on Linux kernel module code in top-level 'spl' and 'zfs' directories. The user level tools are in a separate set of packages with their own source package, 'zfs-linux'. Apt-getting this source package will produce a claim that it is really maintained in Debian GIT, but I'm not sure that's actually true for the Ubuntu version. While this source package still includes the kernel module source code, I believe only the user level stuff is used. I'm not sure how Ubuntu manages changes to all of this, but they appear to keep everything in sync somehow.

(In Ubuntu 20.04, there will likely only be a single 'zfs' directory, since ZFS on Linux has merged the 'spl' package into the 'zfs' one in its development version.)

In theory Ubuntu has instructions on building packages from the git version here, and they were even updated within the past year. In practice I have no idea, since I haven't ever built a new Ubuntu kernel package, but I would probably be inclined to start from 'apt-get source linux', because that's at least in a format that I sort of understand. Of course, if I had to build a modified version of a specific, no longer current linux-image version, I might have to use the git version because that's the only way I can get the exact source code.

(Perhaps this means we should be routinely saving the kernel source packages for important kernels that we use. Sadly it's now too late to do this for the kernel that our Linux fileservers use; we froze their kernel version some time ago and the 'linux' source package for it is long gone now.)

UbuntuKernelSource written at 02:55:14

