Wandering Thoughts archives

2015-09-18

Modern Linux laptop suspending and screenlocking makes me twitch

Modern Linux by and large works quite well on my (work) laptop; I use Fedora, but I have no reason to believe that things on Ubuntu or Debian would be any different. However, every so often there are parts of the whole system that disturb me, and one of them is definitely how suspending and screenlocking interact in, say, Cinnamon.

The brief description is that if I suspend the laptop and then resume it (such as by closing the lid and then later opening it again), the laptop comes back with the screen unlocked and only locks it after the resume has finished. This is especially disturbing because at least in Cinnamon there is a clear post-resume period where the unlocked screen contents are visible. As they say, I sure hope you didn't have anything sensitive on the screen before you suspended.

On the one hand, I can half see the logic for doing it this way; not waiting to start a screenlocker allows the system to suspend faster, and may be required if the first sign anything gets of an impending suspend is a magic signal that says 'we are about to suspend right now'. On the other hand, this makes me nervous about the security issues involved. Are the people involved absolutely sure that there is no way to break into the system in the time between the resume and the screenlocker activating? People have certainly screwed this one up before on other systems. What happens if you want things like a fully encrypted disk with the encryption keys wiped from memory during suspends? Or, for that matter, something as simple as removing keys from ssh-agent when your screen locker activates?

(I suspect the answer for many desktop environments is 'ha ha not a supported configuration'.)

In the grand tradition of modern Linux desktops, I suspect that there is very little I can do to reliably fix this and very little that the desktop environments are doing to change the situation. Most people don't care about security at this level (and most people probably have faster laptops than mine, so their screens lock much sooner after a resume).
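In theory, on a systemd based machine you can try to hook the suspend process yourself so that a locker is started before the suspend instead of after the resume. Here's a minimal sketch of such a unit file (say /etc/systemd/system/lock-before-suspend.service; the name and everything in it are my invention, Cinnamon users would presumably want cinnamon-screensaver-command --lock, and getting a system unit to talk to your X session is the fiddly part):

[Unit]
Description=Lock the screen before suspending
Before=sleep.target

[Service]
# Adjust all of this for your actual session: the user, DISPLAY,
# possibly XAUTHORITY, and your desktop's own locker command.
Type=oneshot
User=cks
Environment=DISPLAY=:0
ExecStart=/usr/bin/xscreensaver-command -lock

[Install]
WantedBy=sleep.target

Once enabled, this gets pulled in and ordered before sleep.target whenever the system suspends. I make no promises that your desktop environment won't fight with it.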

(At one point several versions ago, Cinnamon actually broke locking the screen after resuming from suspend. It was apparently not an urgent enough bug for either Cinnamon or Fedora to push out a fix, which I suspect says it all here.)

SuspendScreenlockTwitch written at 02:19:13

2015-08-19

Linux's abstract namespace for Unix domain sockets

The address of an ordinary Unix domain socket server is the name of a socket file that actually appears in the filesystem. This is pleasantly Unix-y on the surface, but it winds up requiring you to do a bunch of bureaucracy to manage these socket files, and the socket files by themselves don't actually do anything that would make it useful for them to be in the filesystem; you can't interact with them and the server behind them with normal Unix file tools, for example.

Linux offers you a second choice. Rather than dealing with socket files in the filesystem, you can use names in an abstract (socket) namespace. Each name must be unique, but the namespace is otherwise flat and unstructured, and you can call your server socket whatever you want. Conveniently and unlike socket files, abstract names vanish when the socket is closed (either by you or because your program exited).

Apart from being Linux-only, the abstract socket namespace suffers from two limitations: you have to find a way to get a unique name, and it has no permissions. With regular socket files you can use regular Unix file and directory permissions to ensure that only you can talk to your server socket. With abstract socket names, anyone who knows or can find the name can connect to your server. If this matters, you will have to do access control yourself.

(One approach is to use getsockopt() with SO_PEERCRED to get the UID and so on of the client connecting to you. SO_PEERCRED is Linux specific as far as I know, but then so is the abstract socket namespace.)
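To make this concrete, here's a minimal Python sketch of both halves of this; the socket name is one I made up, and at the API level it's actually a leading zero byte on the name that puts it in the abstract namespace:

import socket
import struct

# A name that starts with a NUL byte lives in the abstract
# namespace; no socket file is created anywhere.
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind("\0demo-server")
srv.listen(5)

conn, _ = srv.accept()
# SO_PEERCRED returns the client's struct ucred, ie its pid,
# uid, and gid as three C ints; check the uid (or gid) to do
# your own access control.
ucred = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                        struct.calcsize("3i"))
pid, uid, gid = struct.unpack("3i", ucred)

A client connects with the same '\0demo-server' name, and the name vanishes the moment the last socket using it is closed.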

Lsof and other tools conventionally represent socket names in the abstract socket namespace by putting an @ in front of them. This is not actually how they're specified at the C API level, but it's a distinct marker and some higher level tools follow it for, eg, specifying socket names.

(The Go net package is one such piece of software.)

As far as picking unique names goes, one trick many programs seem to use is to take whatever filename they would be using if they didn't have the abstract socket namespace available. This gives you a convenient way of expressing, eg, per-user sockets; you can just give the socket a name based on the user's home directory. Other programs use a hierarchical namespace of their own; Ubuntu's upstart listens on the abstract socket name '/com/ubuntu/upstart', for example.

(For personal hacks, you can of course just make up your own little short names. Little hacks don't need a big process; that's the whole attraction of the abstract namespace.)

Now that I've poked around this, I'm going to use it for future little Linux-only hacks because checking permissions (if it's even necessary) is a lot more convenient than the whole hassle of dealing with socket files. For things I write that are intended to be portable, I don't see much point; portable code has to deal with socket files so I might as well use regular Unix domain socket names and socket files all the time.

(A bunch of my personal hacks are de facto Linux only because my desktop machines are Linux. I'll regret that laziness if I ever try to move my desktop environment to FreeBSD or the like, but that seems highly unlikely at the moment.)

SocketAbstractNamespace written at 02:17:35

2015-08-13

Enabling services on package updates is a terrible mistake

Let's start with my tweet:

The #CentOS 6 iptables package update unconditionally (re-)enables the service and thus turns on firewalls. BRB, setting things on fire now.

Really, it does. It's right there in the RPM postinstall script:

; rpm -q --scripts iptables
[...]
postinstall scriptlet (using /bin/sh):
/sbin/ldconfig
/sbin/chkconfig --add iptables
[...]

Of course iptables by itself does nothing, or rather it just applies whatever rules you already have; but your CentOS 6 machine almost certainly has a restrictive set of rules in /etc/sysconfig/iptables that were written there by system-config-firewall during the system install. Turning on the iptables service will cause them to be applied at the next reboot, and in our case this took out incoming external email for more than twelve hours because of course those rules blocked incoming connections to the necessary TCP ports.

(Yes, there were two problems there. We know.)

At one level this is a straightforward total failure of good packaging; a package update should never enable a currently disabled service. Automatically enabling a service on the initial package install may or may not be a good idea, but changing the system state on a mere package upgrade is clearly utterly wrong. A deliberately disabled service suddenly turning on is generally going to do something bad to the overall state of the system; the only question is just how bad. Iptables is well placed to make this really bad.

(In turn this means that there was a major process failure here. This issue is almost certainly present in the original RHEL 6 update that CentOS 6 built their package from, and Red Hat Enterprise of all distributions should have better update validation than that. This should not have gotten through code review.)

At another level this is also a joint failure between RPM and chkconfig because both make it extra hard to do the right thing. RPM has only a single 'postinstall' script which is run after both installs and upgrades, which means that you have to remember to have your shell code explicitly check for which case you're in (and it's not at all easy to test the upgrade case). Chkconfig and in general the whole Red Hat init.d symlink system don't draw any distinction between 'what the package wants to do by default' and 'what the local sysadmin has specifically set up', which leaves packages easily able to make mistakes that override sysadmin decisions like this. Put the two together and you have an explosive mixture where any failure can blow your foot off. This is not a resilient system.
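For what it's worth, RPM does give scriptlets enough information to get this right: $1 in a postinstall script is the number of instances of the package that will be present once the operation finishes, so it's 1 on a fresh install and 2 or more on an upgrade. Here's a minimal sketch of a postinstall that leaves upgrades alone (my illustration of what the iptables scriptlet could look like, not what it actually does):

/sbin/ldconfig
# $1 is 1 only on initial install; on an upgrade it's 2 or more
# and we leave the sysadmin's chkconfig settings alone.
if [ "$1" -eq 1 ]; then
    /sbin/chkconfig --add iptables
fi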

(Systemd does much better than the init.d stuff here precisely because it has a clear distinction between these two things.)

NoEnableOnUpgrade written at 00:02:15

2015-08-12

What common older versions of free are telling you

Yesterday I wrote about what free was really telling you, using the output (and manpage) from the version of free on Fedora 22. As Alan noted in a comment on my entry, there are actually two versions of free, with somewhat different output. The older version reports:

; free -m
             total   used   free  shared  buffers  cached
Mem:         16018  15760    258       3       72   12924
-/+ buffers/cache:   2763  13254
Swap:         3811     19   3792

This version is apparently on Debian machines and is definitely on Ubuntu up to 14.04 LTS and RHEL/CentOS up to RHEL 7, which covers most of the machines out there. The change in output turns out to have been introduced in procps-ng 3.3.10, which was released around September of 2014. Very few distributions seem to have picked up the new version, although it is in Debian testing (and Fedora, obviously).

(Although Debian and Ubuntu call the package 'procps', they're actually using procps-ng. The history here is tangled.)

The useful thing to know here is that in this older version of free, what is reported in cached is only the /proc/meminfo Cached field. The Slab memory usage is not reported anywhere, which matters because a certain amount of kernel cache data actually lives in slabs, not in the page cache.
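If you're curious how much memory a machine has in slabs, you can read it straight out of /proc/meminfo and compare it with the Cached figure that this version of free is reporting:

; grep -E '^(Cached|Slab):' /proc/meminfo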

So, you might ask, how much kernel memory is typically used in slabs? The answer appears to be 'generally not much'. On our fleet of machines slab usage seems to typically run in the 150 MB to 300 MB range; several machines have several hundred MB, and a few exceptions have around 1 GB to 1.5 GB of slab usage. My office workstation is a drastic outlier, but that's because ZFS on Linux currently puts much of the ZFS ARC into the slab allocation system instead of having it as part of the page cache.

(On our big-slab machines, 'slabtop -s c' says that most of the slab usage seems to be eaten up by dentry and inode slabs, either for local filesystems or for NFS depending on the machine.)

(This is yet another case where writing entries here has turned out to be educational.)

FreeAndMeminfoII written at 01:38:49

2015-08-11

What free is really telling you, and some bits from /proc/meminfo

The free command is a pretty venerable Linux sysadmin tool; often it's one of the first things I reach for when I suspect that a system might be having RAM problems. Its normal output certainly looks straightforward enough, but as I discovered recently it's always good to actually look up the details of this stuff.

So let's start with the normal output:

; free -m
      total   used   free  shared  buff/cache  available
Mem:  32158  13111    683      33       18363       9727
Swap:  4094      0   4094

Total is the total physical memory. 'Used' is a calculated figure; according to the manpage, it's total minus free and buff/cache (and does not include shared). buff/cache is the sum of what 'free -w' will show separately as buffers and cache, and this is where life gets interesting.

Free's buffers is /proc/meminfo's Buffers field, which according to the fine documentation is:

Relatively temporary storage for raw disk blocks
shouldn't get tremendously large (20MB or so)

That size info is clearly what they call 'a little bit out of date', considering that our normal machines run around 400 MB of Buffers. There can be variations from this, and in particular my machine running ZFS on Linux usually has substantially more space in Buffers; right now it's almost 4 GB.

Free's cache is the sum of meminfo's Cached and Slab fields. Cached is the page cache, which is used by normal filesystems to hold pages read off disk (or in the case of NFS, off the network). Slab reports the amount of memory used by the kernel's slab allocator. Calling this cache is potentially misleading, since the slab allocator is widely used all through the kernel for many, many things as you can see by looking at /proc/slabinfo or running slabtop, in both cases as root (note that slab merging can make the slab names somewhat misleading). However it's true that some of the kernel's slab memory usage is for 'caches' (broadly construed).

Free's available is meminfo's MemAvailable, which is a kernel estimate based on a bunch of internal heuristics that I'm not going to try to write up. These heuristics can be wrong for various reasons, including kernel subsystems that can shrink their slab usage under memory pressure but don't put the right magic markers on their slabs, which means that part of their slab usage doesn't get counted in MemAvailable.

Nothing in free's output tells you how much memory the kernel is using in total. The kernel uses more memory than appears in buffers and cache, and free does not attempt to reverse engineer a kernel usage figure based on finding out how much memory appears to be in use by user programs. This is fair enough, given that free is primarily concerned with telling you how much memory is or might be freeable (it's in its name, after all). Still, it's something to bear in mind; if you really want to get a picture of kernel memory usage, you want another tool.

(I don't know what tool you want; I haven't looked yet. I suspect that sufficient information is exposed in /proc/meminfo to work this out with reasonable accuracy.)

PS: If it was up to me, I would redefine free's cached to be meminfo's Cached plus SReclaimable. The latter is the amount of space used by slabs that have explicitly marked themselves as reclaimable under memory pressure, which is hopefully more or less the set of slabs that are actually being used as caches instead of for more sticky things.
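Both fields are in /proc/meminfo in kB, so computing my preferred version of cached by hand is a one-liner:

; awk '/^(Cached|SReclaimable):/ {total += $2} END {print total, "kB"}' /proc/meminfo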

FreeAndMeminfo written at 01:56:23

2015-08-10

My irritation with 'systemctl status'

Let me say up front that 'systemctl status' is on the whole a really great command. The basic information it reports about a service is very useful, and it's extra-useful that you can give it a random PID and it will tell you what it belongs to (and what all of the other processes are). There is just one little drawback to it, namely that it also helpfully defaults to showing you recent log messages for the service.

In theory this is a great idea, because in theory things like looking up log messages in the systemd journal take irrelevant amounts of time. In the real world that I inhabit, systemctl looking up this information causes lurching multi-second pauses (with disk IO storms). Then to add insult to injury, systemctl defaults to showing these log messages in a relatively useless way.

If you remember, you can force systemctl to not show you any log lines with 'systemctl status -n 0 ...'. Then if the service is in an error state, where you actually do care about the log lines, you get to re-run the command without that so you can see them. Remember to use '--full' so that you see them in a useful form.
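Concretely, the dance runs something like this, with crond standing in for whatever service you're looking at:

; systemctl status -n 0 crond.service
; systemctl status --full crond.service

Alternately you can get the log lines on your own terms with, eg, 'journalctl -u crond.service -n 10', instead of having systemctl fetch them at a time of its choosing.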

In practice, the odds that I will remember '-n 0' approach zero and so the odds that 'systemctl status' will continue to irritate me on the moderately infrequent times that I run it approach certainty.

Sidebar: How systemctl status should behave

By default, 'systemctl status' should not try to report any log lines if the service's status is normal. If the service's status is abnormal, for example if it errored out on startup, the last N log lines should be shown in full, ie with --full aka -l in effect. Appropriate command line switches would change this behavior around in various ways (always show log lines, never show log lines, etc).

I don't expect the systemd people to ever change things here, so I'm not going to bother filing this as an RFE or a bug.

(And yes, I expect that this runs rather faster on machines that use SSDs instead of spinning rust. That's one reason I suspect that an RFE would not be well received.)

SystemctlStatusLogLookupIssue written at 01:41:16

2015-08-08

The ARC now seems to work right in ZFS on Linux

One of the long standing issues with ZFS on Linux has been its integration with the general Linux kernel memory allocation system. In particular, back in December I wrote about my problem of memory competition between the ZFS ARC and the general Linux page cache, where my much smaller ext4 root filesystem wound up with a lot of data cached and the ARC was much too small. The good news is that a lot can change in seven months.

As of some recent updates to the development version of ZFS on Linux, this problem has vanished on my machine. I now consistently see the ARC being a decent size and holding a significant amount of data, and the ext4 page cache seems to be modest (you have to sort of observe the ext4 page cache by implication, based on the difference between ARC memory usage and what 'free' reports about overall kernel buffer and cache usage). In short, the ZFS on Linux ARC is now behaving like I expect a ZFS ARC to behave in general.

(Watching 'arcstat.py 1' also suggests that the ARC is being pretty effective when I do things like compile Firefox from source.)

I don't know if all of ZFS on Linux's kernel memory issues are gone. I haven't run into any, but that isn't exactly conclusive (and I do look on my machine for signs of things like memory fragmentation). But I can say that ZoL is now handling the ARC much better than it used to and seems to be much more effective at stealing memory back from ext4. In general this strikes me as a good omen for ZoL really solving its long standing memory usage issues, which will make it a significantly safer thing to use.

Sidebar: Some numbers

Although my machine's current state isn't quite comparable to what it was when I wrote my previous entry, I'll give comparative numbers anyways. On the same 32 GB machine, the ZFS ARC is currently using 16.6 GB with 12 GB of that being file data. Assuming that all of the ZFS ARC size counts as general kernel buffer/cache in 'free' output, the ext4 page cache is using under a GB. This is quite different from the previous numbers (although looking back I'm not entirely sure I was getting the size of the ext4 page cache right).

(All ZFS memory in total, ARC and otherwise, seems to be around 17.75 GB right now. This is kernel memory allocated, not memory in use, although overall utilization is around 95% to 98%.)
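If you want to pull these figures on your own machine, ZFS on Linux exposes the raw ARC statistics in /proc; 'size' is the total ARC size in bytes and 'data_size' is roughly the file data portion, although the exact field names may shift between ZoL versions:

; grep -E '^(size|data_size) ' /proc/spl/kstat/zfs/arcstats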

(Writing this entry has made me painfully aware that I was and am somewhat fuzzy on what free's output really means and that I need to research what /proc/meminfo fields themselves mean, since free is both summarizing things and being selective.)

ZFSOnLinuxARCNowWorksRight written at 02:45:16

2015-08-03

Running Fedora 22's dnf as a normal user versus as root

If you're me, you've probably done this at some point:

# dnf clean metadata
[...]
# dnf check-update
[... lists some available updates ...]

Then in another window, running as myself:

; dnf updateinfo info
[... doesn't report anything ...]

Wait, what?

What's going on here is that dnf maintains separate package databases for root and for individual users. When you run dnf as root, it consults (and updates) the 'official' DNF package databases and metadata under /var/cache/dnf. When you run dnf as any non-root user, it uses a /var/tmp/dnf-<user>-<random> directory instead; although everything in /var/cache/dnf is world-readable, dnf makes no attempt to look at it. As a result it's completely possible for root's dnf commands and your dnf commands to give completely different results.
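You can see the split on disk; the per-user directories get a random suffix, hence the glob:

; ls -d /var/cache/dnf /var/tmp/dnf-$USER-*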

You can reduce the odds of this by remembering to do 'dnf clean metadata' or the like as yourself too, but this doesn't absolutely guarantee that root's data and your data are in sync. When your dnf command re-pulls the package information, it may fetch it from a different mirror that is more (or less) up to date than the mirror root's dnf used.

(And of course this wastes bandwidth on re-fetching data you already have.)

I hate the idea of running 'dnf updateinfo info' as root instead of myself, but I'm probably going to have to get used to it. There are just too many annoyances otherwise.

PS: Yes, I know about 'dnf clean expire-cache'; I used to do it all the time with yum. Unlike with yum, with DNF it doesn't work really well to get me the latest update state. Have I mentioned that I'm not fond of DNF?

Sidebar: How dnf should behave

When run as a normal user, DNF should try to use /var/cache/dnf data files if they're suitable, ie if they're sufficiently up to date and DNF is being used in a mode where it only needs read access to them. Only if DNF has been asked to do something that would normally write to /var/cache/dnf would it switch to the /var/tmp/dnf-<user>-... scheme. In other words, if you just used information commands as a normal user you'd get to take advantage of the system DNF database.

DNFUserVersusRoot written at 00:52:39

2015-07-31

Ubuntu once again fails at a good kernel security update announcement

Ubuntu just sent out USN-2700-1, a 14.04 LTS announcement about a kernel update for CVE-2015-3290, CVE-2015-3291, and CVE-2015-5157. People with good memories may at this point remember USN-2688-1, a 14.04 LTS announcement about a kernel update for CVE-2015-3290, CVE-2015-1333, CVE-2015-3291, and CVE-2015-5157. Gosh, that's a familiar list of CVEs, and it sort of looks like the 'repeated CVEs' thing Ubuntu has done before. If you already applied the USN-2688-1 kernel and rebooted everything, it certainly sounds like you can skip USN-2700-1.

That would be a mistake. What Ubuntu is not bothering to mention in USN-2700-1 is that the 64-bit x86 kernels from USN-2688-1 had a bad bug. In those kernels, if a 32-bit program forks and then execs a 64-bit program, the 64-bit program segfaults on startup; for example, a 32-bit shell will be unable to run any 64-bit programs (which will be most of them). This bug is the sole reason USN-2700-1 was issued (literally).

The USN-2700-1 text should come with a prominent notification to the effect of 'the previous update introduced a serious bug on 64-bit systems; we are re-issuing corrected kernels without this problem'. Ubuntu has put such notices on updates in the past so the idea is not foreign to them; they just didn't bother doing it this time around. As a result, people who may be affected by this newly introduced kernel bug do not necessarily know that this is their problem and they should update to the USN-2700-1 kernel to fix it.

(At best they may start doing a launchpad bug search and find the bug report. But I don't think it's necessarily all that likely, because the bug's title is not particularly accurate about what it actually is; 'Segfault in ld-2.19.so while starting Steam after upgrade to 3.13.0-59.98' does not point clearly to a 32-bit on 64-bit issue. It doesn't even mention 'on 64-bit platforms' in the description.)

Kernel update notices matter because people use them to decide whether or not to go through the hassle of a system reboot. If a notice is misleading, this goes wrong; people don't update and reboot when they really should. When there are bugs in a kernel update, as there were here, not telling people about them causes them to try to troubleshoot a buggy system without realizing that there is a simple solution.

(Lucky people noticed failures on the USN-2688-1 kernel right away, and so were able to attribute them to the just-done kernel update. But unlucky people will only run into this once in a while, when they run a rare lingering 32-bit program that does this, and so they may not immediately realize that it was due to a kernel update that might now be a week or two in the past.)

(See also a previous Ubuntu kernel update failure, from 2011.)

UbuntuKernelUpdateNoticeFail written at 00:39:59

2015-07-24

Fedora 22's problem with my scroll wheel

Shortly after I upgraded to Fedora 22, I noticed that my scroll wheel was, for lack of a better description, 'stuttering' in some applications. I'd roll it in one direction and instead of scrolling smoothly, what the application was displaying would jerk around all over, both up and down. It didn't happen all of the time and fortunately it didn't happen in any of my main applications, but it happened often enough to be frustrating. As far as I can tell, this mostly happened in native Fedora GTK3 based applications. I saw it clearly in Evince and the stock Fedora Firefox that I sometimes use, but I think I saw it in a few other applications as well.

I don't know exactly what causes this, but I have managed to find a workaround. Running affected programs with the magic environment variable GDK_CORE_DEVICE_EVENTS set to '1' has made the problem go away (for me, so far). There are some Fedora and other bugs that are suggestive of this, such as Fedora bug #1226465, and that bug leads to an excellent KDE explanation of that specific GTK3 behavior. Since this Fedora bug is about scroll events going missing instead of scrolling things back and forth, it may not be exactly my issue.

(My issue is also definitely not fixed in the GTK3 update that supposedly fixes it for other people. On the other hand, updates for KDE and lightdm now appear to be setting GDK_CORE_DEVICE_EVENTS, so who knows what's going on here.)

Since this environment variable suppresses the bad behavior with no visible side effects I've seen, my current solution is to set it for my entire session. I haven't bothered reporting a Fedora bug for this so far because I use a very variant window manager and that seems likely to be a recipe for more argument than anything else. Perhaps I am too cynical.
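Concretely, that's the difference between per-program testing and session-wide insurance (where 'session startup' lives depends on how you start X):

; GDK_CORE_DEVICE_EVENTS=1 evince something.pdf

and in ~/.xsession or the like:

export GDK_CORE_DEVICE_EVENTS=1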

(The issue is very reproducible for me; all I have to do is start Evince with that environment variable scrubbed out and my scroll wheel makes things jump around nicely again.)

Sidebar: Additional links

There is this Firefox bug, especially comment 9, and this X server patch from 2013. You'd think a patch from 2013 would be incorporated by now, but who knows.

Fedora22ScrollWheelProblem written at 00:53:59

