Wandering Thoughts archives

2019-12-18

Linux kernel Security Modules (LSMs) need their own errno value

Over on Twitter, I said something I've said before:

Once again, here I am hating how Linux introduced additional kernel security modules without also adding an errno for 'the loadable security module denied permissions'.

Lack of a LSM errno significantly complicates debugging problems, especially if you don't normally use LSMs.

Naturally there's a sysadmin story here, but let's start with the background (even if you probably know it).

SELinux and Ubuntu's AppArmor are examples of Linux Security Modules; each of them adds additional permission checks that you must pass over and above the normal Unix permissions. However, when they reject your access, they don't actually tell you this specifically; instead you get a generic Unix error, typically EACCES ('permission denied'), which is normally what you get if, say, the file is unreadable to your UID for some reason.

We have an internal primary master DNS server for our DNS zones (a so-called 'stealth master'), which runs Ubuntu instead of OpenBSD for various reasons. We have the winter holiday break coming up, and since we've had problems with this server coming up cleanly in the past, last week seemed like a good time to reboot it under controlled circumstances to make sure that at least that worked. When I did that, named (aka Bind) refused to start with a 'permission denied' error (aka EACCES) when it tried to read its named.conf configuration file. For reasons beyond the scope of this entry, this file lives on our central administrative NFS filesystem, and when you throw NFS into the picture various things can go wrong with access permissions. So I spent some time looking at file and directory permissions, NFS mount state, and so on, until I remembered something my co-worker had mentioned in passing.

Ubuntu defaults to installing and using AppArmor, but we don't like it and we turn it off almost everywhere (we can't avoid it for MySQL, although we can make it harmless). That morning we had applied the pending Ubuntu package updates, as one does, and one of the packages that got updated was the AppArmor package. It turns out that in our environment, when an AppArmor package update is applied, AppArmor gets re-enabled (but I think not started immediately); when I rebooted our primary DNS master, it now started AppArmor. AppArmor has a profile for Bind that only allows for a configuration file in the standard place, not where we put our completely different and customized one, and so when Bind tried to read our named.conf, the AppArmor LSM said 'no'. But that 'no' was surfaced only as an ordinary 'permission denied' (EACCES) error and so I went chasing down the rabbit hole of all of the normal causes for permission errors.
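One way to notice this sort of silent re-enabling is to ask the kernel directly. What follows is only a rough sketch, not something we actually run: it assumes that the kernel exposes AppArmor's state in /sys/module/apparmor/parameters/enabled and the loaded profiles in /sys/kernel/security/apparmor/profiles (one 'name (mode)' entry per line, usually readable only by root). The function names and the 'named' search string are made up for illustration.

```python
from pathlib import Path

# Assumed locations; they need securityfs mounted and usually root to read.
APPARMOR_ENABLED = Path("/sys/module/apparmor/parameters/enabled")
APPARMOR_PROFILES = Path("/sys/kernel/security/apparmor/profiles")


def apparmor_enabled(path=APPARMOR_ENABLED):
    """Return True if the AppArmor module parameter reads 'Y'."""
    try:
        return Path(path).read_text().strip() == "Y"
    except OSError:
        # No such file (or unreadable): treat AppArmor as not active.
        return False


def parse_profiles(text):
    """Turn lines like '/usr/sbin/named (enforce)' into {name: mode}."""
    profiles = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.endswith(")") or " (" not in line:
            continue  # not in the assumed 'name (mode)' shape
        name, _, mode = line.rpartition(" (")
        profiles[name] = mode[:-1]  # drop the trailing ')'
    return profiles


def matching_profiles(text, needle):
    """Return the loaded profiles whose name contains `needle`."""
    return {n: m for n, m in parse_profiles(text).items() if needle in n}


if __name__ == "__main__":
    if not apparmor_enabled():
        print("AppArmor does not appear to be enabled")
    else:
        try:
            loaded = APPARMOR_PROFILES.read_text()
        except OSError as e:
            print(f"AppArmor is enabled but profiles are unreadable: {e}")
        else:
            for name, mode in matching_profiles(loaded, "named").items():
                print(f"{name} is confined in {mode} mode")
```

Running something like this from a post-update hook or a monitoring check would at least tell you that AppArmor has come back before the next reboot does.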

People who deal with LSMs all of the time will probably be familiar with this issue and will immediately move to the theory that any unfamiliar and mysterious permission denials are potentially the LSM in action. But we don't use LSMs normally, so every time one enables itself and gets in our way, we have to learn all about this all over again. The process of troubleshooting would be much easier if the LSM actually told us that it was doing things by having a new errno value for 'LSM permission denied', because then we'd know right away what was going on.
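In the absence of a dedicated errno, about the best you can do from user space is to turn 'permission error plus an active LSM' into a reminder. Here is a minimal sketch of that idea. It assumes that /sys/kernel/security/lsm holds a comma-separated list of the active LSMs (which needs securityfs to be mounted); the set of 'suspect' LSM names and all of the function names are my own invention, and a hint is only a hint, since the denial may still be ordinary Unix permissions.

```python
import errno

# Assumed location of the kernel's list of active LSMs.
LSM_LIST = "/sys/kernel/security/lsm"

# LSMs that add their own access checks on top of Unix permissions.
MAC_LSMS = {"apparmor", "selinux", "smack", "tomoyo"}


def active_lsms(path=LSM_LIST):
    """Return the active LSM names, or [] if the list can't be read."""
    try:
        with open(path) as f:
            return [name for name in f.read().strip().split(",") if name]
    except OSError:
        return []


def permission_hint(err, lsms):
    """Suggest where to look for a given errno and set of active LSMs."""
    if err not in (errno.EACCES, errno.EPERM):
        return None  # not a permission error at all
    suspects = sorted(MAC_LSMS.intersection(lsms))
    if suspects:
        return "also check LSM denials (" + ", ".join(suspects) + ")"
    return "no MAC LSM active; check ordinary Unix permissions"


def try_open(filename):
    """Open a file for reading; return a hint string if that fails."""
    try:
        with open(filename, "rb"):
            return None
    except OSError as e:
        return permission_hint(e.errno, active_lsms())
```

For AppArmor in particular, the actual denial usually shows up in the kernel log or the audit log, so the hint is mostly a prompt to go and look there.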

(If Linux kernel people are worried about some combination of security concerns and backward compatibility, I would be happy if they made this extra errno value an opt-in thing that you had to turn on with a sysctl. We would promptly enable it for all of our servers.)

PS: Even if we didn't have our named.conf on an NFS filesystem, we probably wouldn't want to overwrite the standard version with our own. It's usually cleaner to build your own completely separate configuration file and configuration area, so that you don't have to worry about package updates doing anything to your setup.

linux/ErrnoForLSMs written at 23:59:19

PCIe slot bandwidth can change dynamically (and very rapidly)

When I added some NVMe drives to my office machine and started looking into its PCIe setup, I discovered that its Radeon graphics card seemed to be operating at 2.5 GT/s (PCIe 1.0) instead of 8 GT/s (PCIe 3.0). The last time around, I thought I had fixed this just by poking into the BIOS, but in a comment, Alex suggested that this was actually a power-saving measure and not necessarily done by the BIOS. I'll quote the comment in full because it summarizes things better than I can:

Your GPU was probably running at lower speeds as a power-saving measure. Lanes consume power, and higher speeds consume more power. The GPU driver is generally responsible for telling the card what speed (and lane width) to run at, but whether that works (or works well) with the Linux drivers is another question.

It turns out that Alex is right, and what I saw after going through the BIOS didn't quite mean what I thought it did.

To start with the summary, the PCIe bandwidth being used by my graphics card can vary very rapidly from 2.5 GT/s up to 8 GT/s and then back down again based on whether or not the graphics driver needs the card to do anything (or the aggregate Linux and X software stack as a whole, since I don't know where these decisions are being made). The most dramatic and interesting difference is between two apparently very similar ways of seeing if the Radeon's bandwidth is currently downgraded, either automatically scanning through lspci's output with 'lspci -vv | fgrep downgrade' or manually looking through it with 'lspci -vv | less'. When I used less, the Radeon normally showed up downgraded to 2.5 GT/s. When I used fgrep, other things before the Radeon showed up as downgraded but the Radeon never did; it was always at 8 GT/s.

(Some of those other things have been downgraded to 'x0' lanes, which I suspect means that they've been disabled as unused.)
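If you want the 'fgrep downgrade' view without writing anything to the screen while lspci is still running, you can capture all of the output first and only then pick through it. This is a rough sketch that assumes lspci prints link status lines shaped like 'LnkSta: Speed 2.5GT/s (downgraded), Width x16 (ok)' under an unindented device line; older versions of lspci omit the parenthesized notes, in which case this finds nothing. The function name is hypothetical.

```python
import re
import subprocess

# Matches the assumed shape of an lspci -vv link status line.
LNKSTA = re.compile(
    r"LnkSta:\s+Speed\s+([\d.]+)\s*GT/s(?:\s+\((\w+)\))?,"
    r"\s+Width\s+x(\d+)(?:\s+\((\w+)\))?"
)


def downgraded_devices(lspci_text):
    """Return (address, speed in GT/s, lane width) for downgraded links."""
    results = []
    device = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented lines start a new device, eg '01:00.0 VGA ...'.
            device = line.split(" ", 1)[0]
            continue
        m = LNKSTA.search(line)
        if m and device and "downgraded" in (m.group(2), m.group(4)):
            results.append((device, float(m.group(1)), int(m.group(3))))
    return results


if __name__ == "__main__":
    try:
        # Collect everything before printing, much as less does.
        out = subprocess.run(["lspci", "-vv"], capture_output=True,
                             text=True).stdout
    except FileNotFoundError:
        out = ""
    for address, speed, width in downgraded_devices(out):
        print(f"{address}: {speed} GT/s x{width}")
```

Because nothing is printed until lspci has finished, this should behave like the less case rather than the fgrep case, at least as far as the output itself waking up the graphics card goes.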

What I think is happening here is that when I pipe lspci to less, lspci gets the Radeon's bandwidth before any output is written to the screen (less reads it all in a big gulp and then displays it), so at the time the graphics chain is inactive. When I use the fgrep pipe, some output is written to the screen before lspci gets to the Radeon and so the graphics chain lights up the Radeon's bandwidth to display things. What this suggests is that the graphics chain can and does vary the Radeon's PCIe bandwidth quite rapidly. Another interesting case is that running the venerable glxgears doesn't bring the PCIe bandwidth up from 2.5 GT/s, but running GpuTest's 'fur' test does (it goes to 8 GT/s as you might expect).

(It turns out that nVidia's Linux drivers also do this.)

Of course all of this may make seeing whether you're getting full PCIe bandwidth a little bit interesting. It's clearly not enough to just look at your system, even when it's moderately active (I have several X programs that update once a second); you really need to put it under some approximation of full load and then check. So far I've only seen this happen with graphics cards, but who knows what's next (NVMe drives could be one candidate to drop their bandwidth to save power and thus reduce heat).
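Since a single look can catch the link at either speed, one option is to sample it repeatedly while you run your load and then report the range you saw. The sketch below assumes that the kernel exposes current_link_speed and max_link_speed files under /sys/bus/pci/devices/<address>/ with contents like '8.0 GT/s PCIe' or '2.5 GT/s'; the PCI address in the usage note and the function names are hypothetical, and I haven't checked whether reading these files can itself affect the link state.

```python
import time
from pathlib import Path

# Assumed sysfs location for PCI devices.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def parse_speed(text):
    """Turn '8.0 GT/s PCIe' or '2.5 GT/s' into a float; None otherwise."""
    try:
        return float(text.split()[0])
    except (IndexError, ValueError):
        return None


def read_link(address, root=SYSFS_PCI):
    """Return (current, maximum) link speed in GT/s, or None if unreadable."""
    dev = Path(root) / address
    try:
        current = parse_speed((dev / "current_link_speed").read_text())
        maximum = parse_speed((dev / "max_link_speed").read_text())
    except OSError:
        return None
    return current, maximum


def sample_speeds(address, seconds=5.0, interval=0.1, root=SYSFS_PCI):
    """Collect current link speeds for `seconds` without printing anything."""
    seen = []
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        link = read_link(address, root)
        if link is not None and link[0] is not None:
            seen.append(link[0])
        time.sleep(interval)
    return seen


def summarize(seen):
    """Return (lowest, highest) speed seen, or None for no samples."""
    return (min(seen), max(seen)) if seen else None
```

You would start your load generator (GpuTest or whatever), call something like summarize(sample_speeds("0000:01:00.0")) in another window, and see whether the highest speed ever reaches what the device's max_link_speed says it should.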

tech/PCIeVaryingBandwidth written at 00:38:31

