Wandering Thoughts archives

2017-02-24

What an actual assessment of Ubuntu kernel security updates looks like

Ubuntu recently released some of their usual not particularly helpful kernel security update announcements, and I tweeted:

Another day, another tedious grind through Ubuntu kernel security announcements to do the assessment that Ubuntu should be doing already.

I have written about the general sorts of things we want to know about kernel security updates, but there's nothing like a specific example (and @YoloPerdiem asked). So here is essentially the assessment email that I sent to my co-workers.

First, the background. We currently have Ubuntu 16.04 LTS, 14.04 LTS, and 12.04 LTS systems, so we care about security updates for the mainline kernels for all of those (we aren't using any of the special ones). The specific security notices I was assessing are USN-3206-1 (12.04), USN-3207-1 (14.04), and USN-3208-1 (16.04). I didn't bother looking at CVEs that require hardware or subsystems that we don't have or use, such as serial-to-USB hardware (CVE-2017-5549) or KVM (several CVEs here). We also don't update kernels just for pure denial of service issues (eg CVE-2016-9191, which turns out to require containers anyway), because our users already have plenty of ways to make our systems crash if they want to.

So here is a slightly edited and cleaned up version of my assessment email:


Subject: Linux kernel CVEs and my assessment of them

16.04 is only affected by CVE-2017-6074, which we've mitigated, and CVE-2016-10088, which doesn't apply to us because we don't have people who can access /dev/sg* devices.

12.04 and 14.04 are both affected by additional CVEs that are use-after-frees. No exploits for them have been published so far, but CVE-2017-6074 is also a use-after-free and is said to be exploitable, with an exploit to be released soon, so I think they are probably equally dangerous.

[Local what-to-do discussion elided.]

Details:

CVE-2017-6074:

Andrey Konovalov discovered a use-after-free vulnerability in the DCCP implementation in the Linux kernel. A local attacker could use this to cause a denial of service (system crash) or possibly gain administrative privileges.

This is bad if not mitigated, with an exploit to be released soon (per here), but we should have totally mitigated it by blocking the DCCP modules. See my worklog on that.

CVE-2016-7911:

Dmitry Vyukov discovered a use-after-free vulnerability in the sys_ioprio_get() function in the Linux kernel. A local attacker could use this to cause a denial of service (system crash) or possibly gain administrative privileges.

Links: 1, 2, 3.

The last URL has a program that reproduces the issue, but it's not clear if it can be exploited to do more than crash the system. However, CVE-2017-6074's use-after-free is apparently exploitable, so...

CVE-2016-7910:

It was discovered that a use-after-free vulnerability existed in the block device layer of the Linux kernel. A local attacker could use this to cause a denial of service (system crash) or possibly gain administrative privileges.

Link: 1

Oh look, another use-after-free issue. Ubuntu's own link for the issue says 'allows local users to gain privileges by leveraging the execution of [...]' although their official release text is less alarming.

CVE-2016-10088:

It was discovered that the generic SCSI block layer in the Linux kernel did not properly restrict write operations in certain situations. A local attacker could use this to cause a denial of service (system crash) or possibly gain administrative privileges.

Finally some good news! As far as I can tell from Ubuntu's actual CVE-2016-10088 page, this is only exploitable if you have access to a /dev/sg* device, and on our machines people don't.


(The actual email was plain text, so the various links were just URLs dumped into the text.)

As you can maybe see from this, doing a proper assessment requires reading at least the detailed Ubuntu CVE information in order to work out under what circumstances the issue can be triggered, for instance to know that CVE-2016-10088 requires access to a /dev/sg* device. Not infrequently you have to go chasing further; for example, only Andrey Konovalov's initial notice mentions that he will release an exploit in a few days. In this case we could mitigate the issue anyway by blacklisting the DCCP modules, but in other cases 'an exploit will soon be released' drastically raises the importance of a security exposure (at least for us).
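As a hedged illustration of the /dev/sg* sort of check (the device name and output here are invented for the example, not taken from our machines), you want to look at the permissions on the sg devices and at who is in the owning group:

    $ ls -l /dev/sg0
    crw-rw---- 1 root disk 21, 0 Feb 24 10:00 /dev/sg0
    $ getent group disk
    disk:x:6:

If the devices have no world permissions and ordinary users aren't in the owning group, ordinary users can't open them and the issue doesn't apply to your machines.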

The online USN pages usually link to Ubuntu's pages on the CVEs they include, but the email announcements that Ubuntu sends out don't. Ubuntu's CVE pages usually have additional links, but not a full set; often I wind up finding Debian's page on a CVE because they generally have a full set of search links for elsewhere (eg Debian's CVE-2016-9191 page). I find that sometimes the Red Hat or SuSE bug pages will have the most technical detail and thus help me most in understanding the impact of a bug and how exposed we are.

The amount of text that I wind up writing in these emails is generally way out of proportion to the amount of reading and searching I have to do to figure out what to write. Everything here is a sentence or two, but getting to the point where I could write those sentences is the slog. And with CVE-2017-6074, I had to jump in to set up and test an entire mitigation: blacklisting all the DCCP modules via a new /etc/modprobe.d file and then propagating that file to all of our Ubuntu machines.
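For illustration, the core of such an /etc/modprobe.d file is small. Here is a minimal sketch; the filename and the exact module list are my assumptions for the example, not a copy of our actual file:

    # /etc/modprobe.d/blacklist-dccp.conf (illustrative name)
    # Make any attempt to load the DCCP modules run /bin/true instead.
    install dccp /bin/true
    install dccp_ipv4 /bin/true
    install dccp_ipv6 /bin/true

(The 'install ... /bin/true' form matters here; as far as I know a plain 'blacklist' line doesn't stop a module from being loaded explicitly or as a dependency, while overriding its install command does.)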

linux/UbuntuKernelUpdateAssessment written at 23:26:07

How ZFS bookmarks can work their magic with reasonable efficiency

My description of ZFS bookmarks covered what they're good for, but it didn't talk about what they are at a mechanical level. It's all very well to say 'bookmarks mark the point in time when [a] snapshot was created', but how does that actually work, and how does it allow you to use them for incremental ZFS send streams?

The succinct version is that a bookmark is basically a transaction group (txg) number. In ZFS, everything is created as part of a transaction group and gets tagged with the TXG of when it was created. Since things in ZFS are also immutable once written, we know that an object created in a given TXG can't have anything under it that was created in a more recent TXG (although it may well point to things created in older transaction groups). If you have an old directory with an old file and you change a block in the old file, the immutability of ZFS means that you need to write a new version of the data block, a new version of the file metadata that points to the new data block, a new version of the directory metadata that points to the new file metadata, and so on all the way up the tree, and all of those new versions will get a new birth TXG.

This means that given a TXG, it's reasonably efficient to walk down an entire ZFS filesystem (or snapshot) to find everything that was changed since that TXG. When you hit an object with a birth TXG before (or at) your target TXG, you know that you don't have to visit the object's children because they can't have been changed more recently than the object itself. If you bundle up all of the changed objects that you find in a suitable order, you have an incremental send stream. Many of the changed objects you're sending will contain references to older unchanged objects that you're not sending, but if your target has your starting TXG, you know it has all of those unchanged objects already.
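To make the pruning concrete, here's a minimal Python sketch of this sort of walk. The Node structure and its field names are invented for illustration; they are not the real ZFS on-disk structures:

    class Node:
        def __init__(self, name, birth_txg, children=()):
            self.name = name
            self.birth_txg = birth_txg
            self.children = list(children)

    def changed_since(node, from_txg):
        # Immutability means that nothing under an object can be newer
        # than the object itself, so an old enough object ends the descent.
        if node.birth_txg <= from_txg:
            return
        yield node
        for child in node.children:
            yield from changed_since(child, from_txg)

Handed the root of a filesystem or snapshot plus the bookmark's TXG, this yields exactly the changed objects that an incremental send stream needs to cover.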

To put it concretely, I'll quote a code comment from libzfs_core.c (via):

If "from" is a bookmark, the indirect blocks in the destination snapshot are traversed, looking for blocks with a birth time since the creation TXG of the snapshot this bookmark was created from. This will result in significantly more I/O and be less efficient than a send space estimation on an equivalent snapshot.

(This is a comment about getting a space estimate for incremental sends, not about doing the send itself, but it's a good summary and it describes the actual process of generating the send as far as I can see.)

Yesterday I said that ZFS bookmarks could in theory be used for an imprecise version of 'zfs diff'. What makes this necessarily imprecise is that while scanning forward from a TXG this way can tell you all of the new objects and it can tell you what is the same, it can't explicitly tell you what has disappeared. Suppose we delete a file. This will necessarily create a new version of the directory the file was in and this new version will have a recent TXG, so we'll find the new version of the directory in our tree scan. But without the original version of the directory to compare against we can't tell what changed, just that something did.

(Similarly, we can't entirely tell the difference between 'a new file was added to this directory' and 'an existing file had all its contents changed or rewritten'. Both will create new file metadata that will have a new TXG. We can tell the case of a file being partially updated, because then some of the file's data blocks will have old TXGs.)
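To make the delete case concrete with the toy walk from above (the names and TXGs are invented):

    # Before: directory 'd' (born in TXG 90) held files 'a' (TXG 50)
    # and 'b' (TXG 60). Deleting 'b' writes a new 'd' in TXG 120.
    d_new = Node("d", 120, children=[Node("a", 50)])

    # A scan from a bookmark at TXG 100 sees only the new directory:
    print([n.name for n in changed_since(d_new, 100)])   # prints ['d']

Nothing in the new tree records that 'b' ever existed, so the scan can tell that 'd' changed but not what disappeared from it.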

Bookmarks specifically don't preserve the original versions of things; that's why they take no space. Snapshots do preserve the original versions, but they take up space to do that. We can't get something for nothing here.

(More useful sources on the details of bookmarks are this reddit ZFS entry and a slide deck by Matthew Ahrens. Illumos issue 4369 is the original ZFS bookmarks issue.)

Sidebar: Space estimates versus actually creating the incremental send

Creating the actual incremental send stream works exactly the same for sends based on snapshots and sends based on bookmarks. If you look at dmu_send in dmu_send.c, you can see that in the case of a snapshot it basically creates a synthetic bookmark from the snapshot's creation information; with a real bookmark, it retrieves the data through dsl_bookmark_lookup. In both cases, the important piece of data is zmb_creation_txg, the TXG to start from.

This means that contrary to what I said yesterday, using bookmarks as the origin for an incremental send stream is just as fast as using snapshots.

What is different is if you ask for something that requires estimating the size of the incremental send. Space estimates for snapshots are pretty efficient because they can be made using information about space usage in each snapshot. For details, see the comment before dsl_dataset_space_written in dsl_dataset.c. Estimating the space of a bookmark-based incremental send requires basically doing the same walk over the ZFS object tree that will be done to generate the send data.

(The walk over the tree will be somewhat faster than the actual send, because in the actual send you have to read the data blocks too; in the tree walk, you only need to read metadata.)

So, you might wonder how you ask for something that requires a space estimate. If you're sending from a snapshot, you use 'zfs send -v ...'. If you're sending from a bookmark or a resume token, well, apparently you just don't; sending from a bookmark doesn't accept -v and -v on resume tokens means something different from what it does on snapshots. So this performance difference is kind of a shaggy dog story right now, since it seems that you can never actually use the slow path of space estimates on bookmarks.

solaris/ZFSBookmarksMechanism written at 00:26:44

