Wandering Thoughts archives

2015-04-16

Are Python dictionaries necessarily constant-time data structures?

The general view of all forms of hash tables, Python's dictionaries included, is that they are essentially constant-time data structures under normal circumstances. This is not quite true under sufficiently perverse conditions where you have a high degree of collisions in the hashes of the keys, but let's assume that you don't have that for the moment. Ignoring hash collisions, can you treat dictionaries as fast constant-time data structures?

The answer is 'not always', and the path to it has some interesting and perhaps surprising consequences. The potential problem is custom hash functions. If the objects you're using as dictionary keys have a __hash__ method, this method must be called to return the object hash. This is Python code, so it's not necessarily going to be fast by comparison with regular dictionary operations. It may also take a visibly non-constant amount of time, depending on just how it's computing the hash.

(For instance, hashing even Python strings is actually not a constant time operation; it's linear with the length of the string. It's just that all of the code is in C, so by normal standards you're never going to notice the time differences.)

One of the things that you may want to consider as a result of this is memoizing the results of any expensive __hash__ method. This is what Python strings do; if you call hash() on a string, the actual hash computation is done only once and afterwards the already computed (and saved) hash value is just repeated back to you. This only works for things with immutable hash values, but then if your objects have hash values at all they should be immutable ones.
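
As an illustration, here's a minimal sketch of that memoization pattern for a hypothetical value object (the class and its deliberately slow hash are made up for the example; they're not from any real code):

class Fingerprint:
    # An immutable value object with a deliberately slow, pure Python hash.
    def __init__(self, data):
        self._data = tuple(data)   # keep the contents immutable
        self._hash = None          # computed lazily, then cached

    def __eq__(self, other):
        return isinstance(other, Fingerprint) and self._data == other._data

    def __hash__(self):
        # Pay the expensive pure Python cost only once; every later
        # dictionary or set operation reuses the saved value.
        if self._hash is None:
            h = 0
            for item in self._data:
                h = (h * 31 + hash(item)) & 0xFFFFFFFF
            self._hash = h
        return self._hash

One subtlety: the cached hash must never change, which is another reason the object's contents have to stay immutable.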

The real answer is that all of this is relatively theoretical. I'm pretty sure that almost no one uses complex custom __hash__ functions for objects defined in Python, although it seems relatively common to define simple ones that just delegate to the hash of another object (probably mostly or always a primitive object with a fast C level hash function). And if you do have objects with complex __hash__ functions that take noticeable amounts of time, you're probably not going to be using them as dictionary keys or set members very often because if you do, well, you'll notice.

On the other hand, the amount of work that the standard library's decimal.Decimal does in its __hash__ function is a little alarming (especially in Python 2.7). Having looked, I wouldn't encourage using them as dictionary keys or set members any time soon, at least not in high-volume dictionaries or sets. The Python 3 version of datetime is another potentially interesting case, since it does a certain amount of grinding away in Python __hash__ functions.

(In Python 2.7, datetime is a C-level module so all of its hashing operations presumably go really fast in general.)

Sidebar: Custom hashes and the Global Interpreter Lock

Let's ask another question: is adding a new key and value to a dictionary an atomic operation that's inherently protected by the GIL? After all, the key might have a custom __hash__ function that runs Python code (and thus bytecode) during any dictionary operation. As far as I can tell from peering at the CPython code, the answer is more or less yes. Although dictionary or set operations may require calling Python code for __hash__ (and for that matter for custom __eq__ methods as well), this is all done conceptually 'before' the actual dictionary modification takes place. The actual modification happens all at once, so you'll never see a dictionary with eg a key set but not its corresponding value.

This does mean that writing 'dct[ky] = val' may involve much more Python bytecode running than you expect (and thus a much higher chance that Python switches to another thread before the new key and value are added to the dictionary). But it's always been the case that Python might switch to another thread at almost any bytecode, so this hasn't created a new race, just widened the time window of an existing one you already had.
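
You can see the extra Python code running for yourself with a toy key class (the prints are just for tracing; this is a sketch of the call ordering, not a demonstration of thread safety):

class NoisyKey:
    def __init__(self, name):
        self.name = name

    def __hash__(self):
        print("hashing", self.name)      # Python bytecode runs here
        return hash(self.name)

    def __eq__(self, other):
        print("comparing", self.name)    # and possibly here too, on collisions
        return isinstance(other, NoisyKey) and self.name == other.name

dct = {}
dct[NoisyKey("a")] = 1    # prints 'hashing a' before the key and value land in dct
print(dct)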

python/DictHashingComplexity written at 00:28:20; Add Comment

2015-04-15

Illusory security is terrible and is worse than no security

One of the possible responses to my entry on how your entire download infrastructure should be using HTTPS is to say more or less 'well, at least the current insecure approach is trying, surely that's better than ignoring the whole issue'. My answer is simple: no, it's not. The current situation covered in my entry is actually worse than not having any PGP signatures (and perhaps SHA1 hashes) at all.

In general, illusory security is worse than no security because in practice, illusory security fools people and so lulls them into a false sense of security. I'm pretty sure that almost everyone who does anything at all is going to read the Joyent page, faithfully follow the directions, and conclude that they're secure. As we know, all of their checking actually means almost nothing. In fact I'm pretty sure that the Joyent people who set up that page felt that it creates security.

What makes no security better than illusory security is that it's honest. If Joyent just said 'download this tarball from this HTTP URL', everyone would have the same effective security but anyone who was worried about it would know immediately that they have a problem. No one would be getting a false sense of security; instead they would have an honest sense of a lack of security.

It follows that if you're setting up security, it's very important to get it right. If you're not confident that you've got it right, the best thing you can do is shut up about it and not say anything. Do as much as you can to not lead people into a false sense of security, because almost all of them will follow you if you do.

(Of course this is easier said than done. Most people set out to create good security instead of illusory security, so there's a natural tendency to believe that you've succeeded.)

PS: Let me beat the really security-aware people to the punch by noting that an attacker can always insert false claims of security even if you leave them out yourself; since you don't have security, your lack of claims of it is delivered insecurely and so is subject to alteration. It's my view that such alterations are likely to be more dangerous for the attacker over the long term for various reasons. (If all they need is a short-term win, well, you're up the creek. Welcome to security, land of justified paranoia.)

tech/IllusorySecurityTerrible written at 00:34:24; Add Comment

2015-04-14

Allowing people to be in more than 16 groups with an OmniOS NFS server

One of the long standing problems with traditional NFS is that the protocol only uses 16 groups; although you can be in lots of groups on the client (and on the server), the protocol itself only allows the client to tell the server about 16 of them. Recent versions of Illumos added a workaround (based on the Solaris one) where the server will ignore the list of groups the client sent it and look up the UID's full local group membership. Well, sometimes it will do this, if you get all of the conditions right.

There are two conditions. First, the request from the client must have a full 16 groups in it. This is normally what happens if GIDs are synchronized between the server and the clients, but watch out for the exceptional cases: if the client sends only 15 groups, the server won't do any local lookups and so can deny you access to a file that your server-side group list would actually give you. (There's a quick client-side check sketched below.)

Second and less obviously, the server itself must be explicitly configured to allow more than 16 groups. This is the kernel tunable ngroups_max, set in /etc/system:

set ngroups_max = 64

Any number larger than 16 will do, although you want it to cover the maximum number of groups you expect people to be in. I don't know if you can set it dynamically with mdb, so you probably really want to plan ahead on this one. On the positive side, this is the only server side change you need to make; no NFS service parameters need to be altered.

(This ngroups_max need is a little bit surprising if you're mostly familiar with other Unixes, which generally have much larger out of the box settings for this.)
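
As a quick sanity check on the first condition, here's a small Python sketch you can run on a client to see how many supplementary groups a login actually has there (the 16-group limit is in the NFS request itself, not in this list):

import os

groups = os.getgroups()
print("this login has", len(groups), "supplementary groups on this client")
if len(groups) >= 16:
    print("only 16 of them fit in the request, so a server with the")
    print("workaround will ignore the sent list and look groups up itself")
else:
    print("fewer than 16 groups are sent, so the server will use the")
    print("client's list as-is and won't look anything up locally")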

This Illumos change made it into the just-released OmniOS r151014 but is not in any earlier version as far as I know. Anyways, r151014 is an LTS release so you probably want to be using it. I don't know enough about other Illumos distributions like SmartOS and Nexenta's offering to know when (or if) this change made it into them.

(The actual change is Illumos issue 5296 and was committed to the Illumos master in November 2014. The issue has a brief discussion of the implementation et al.)

Note that as far as I know the server and the client do not need to agree on the group list, provided that the client sends 16 groups. My test setup for this actually had me in exactly 16 groups on the client and some additional groups on the server, and it worked. This is a potential gotcha if you do not have perfect GID synchronization between server and client. You should, of course, but every so often things happen and things go wrong.

solaris/OmniOSNFSManyGroups written at 00:35:51; Add Comment

2015-04-12

One speed limit on your ability to upgrade your systems

One of the responses on Twitter to Ted Unangst's long term support considered harmful was this very interesting tweet:

[...] it's not "pain" - it just doesn't happen. At 2 weeks of planning + testing = 26 systems per year

This was eye-opening in a 'I hadn't thought about it that way before now' way. Like many insights, it's blindingly obvious in retrospect; of course how fast you can actually do an upgrade/update cycle determines how many of them you can do in a year (given various assumptions about manpower, parallelism, testing, and so on). And of course this limit applies across all of your systems. It's not just that you can only upgrade a given system so many times a year; it's that you get only so many upgrades in a year, period, across all of your systems.

(What the limit is depends very much on what systems you're trying to upgrade, since the planning, setup, and testing process will take different amounts of time for different systems.)

To upgrade systems more frequently, you have two options. First, you can reduce the time an upgrade cycle takes by speeding up or doing less planning, building, testing, and/or the actual deployment. Second, you can reduce the number of upgrades you need to do by creating more uniform systems, so you amortize the time a cycle takes across more systems. If you have six special snowflakes running completely different OSes and upgrading each OS takes a month, you get twelve snowflake upgrades in a year (assuming you do nothing else). But if all six run the same OS in the same setup, you now get to upgrade all six of them more or less once a month (let's optimistically assume that deployment is a snap).
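
To make the arithmetic concrete, here's a trivial back-of-the-envelope sketch using the numbers from the example above (a month is treated as roughly four weeks, and cycles are assumed to run one after another with no parallelism):

def upgrades_per_year(cycle_weeks, distinct_setups):
    # How many upgrade cycles each distinct setup gets in a year if the
    # cycles are done sequentially by the same people.
    return 52 / (cycle_weeks * distinct_setups)

print(upgrades_per_year(4, 6))   # six snowflakes: about 2 upgrades each per year
print(upgrades_per_year(4, 1))   # one uniform setup: roughly monthly upgrades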

I see this as an interesting driver of uniformity (and at all levels, not just at the system level). Depending on how much pre-production testing you need and use, it's also an obvious driver of faster, better, and often more automated tests.

(Looking back I can certainly see cases where this 'we can only work so fast' stuff has been a limiting factor in our own work.)

sysadmin/UpgradeSpeedLimiter written at 22:09:03; Add Comment

Spam victims don't care what business unit is responsible for the spam

So what happened is that the other day I got some spam that our MTA received from one of the outbound.protection.outlook.com machines. Since sometimes I'm stubborn, I actually tried reporting this to abuse@outlook.com. After some go-arounds (apparently the Outlook abuse staff don't notice email messages if they're MIME attachments), I got the following reply:

Thank you for your report. Based on the message header information you have provided, this email appears to have originated from an Office 365 or Exchange Online tenant account. To report junk mail from Office 365 tenants, please send an email to junk@office365.microsoft.com and include the junk mail as an attachment.

Ha ha, no. As I put it on Twitter, your spam victims don't care about what exact business unit is responsible for the specific systems or customers or whatever that sent spam. Sorting that out is your business, not theirs. Telling people complaining about spam to report it to someone else is a classic 'see figure one' response. What it actually means, as everyone who gets this understands, is that Microsoft doesn't actually want to get spam reports and doesn't actually want to stop spam.

Oh, sure, there's probably some internal bureaucratic excuse here. Maybe the abuse@outlook.com team is being scored on metrics like 'spam incidents processed per unit time' and 'amount of spam per unit time', and not having to count this as 'their' spam or spend time forwarding the message to other business units helps the numbers out. But this doesn't let Microsoft off the hook, because Microsoft set these metrics and allows them to stand despite predictable crappy results. If Microsoft really cared, outlook.com would not be the massive spam emitter that it is. Instead Microsoft is thoroughly in the 'see figure one' and 'we're too big for you to block' business, just like a lot of other big email providers.

(For people who do not already know this, 'see figure one' refers to a certain sort of grim humour from the early days of Usenet and possibly before then, as covered here and here. The first one may be more original, but the 'we don't care, we don't have to, we're the phone company' attitude is also authentic for how people read this sort of situation. Application to various modern organizations in your life is left as an exercise to the reader.)

spam/BusinessUnitIndifference written at 02:03:08; Add Comment

2015-04-10

I wish systemd would get over its thing about syslog

Anyone who works with systemd soon comes to realize that systemd just doesn't like syslog very much. In fact systemd is so unhappy with syslog that it invented its own logging mechanism (in the form of journald). This is not news. What people who don't have to look deeply into the situation often don't realize is that systemd's dislike is sufficiently deep that systemd just doesn't interact very well with syslog.

I won't say that bugs and glitches 'abound', because I've only run into two issues so far (although both issues are relatively severe). One was that systemd mis-filed kernel messages under the syslog 'user' facility instead of the 'kernel' one; this bug made it past testing and into RHEL 7 / CentOS 7. The other is that sometimes on boot, randomly, systemd will barf up a significant chunk of old journal messages (sometimes very old) and re-send them to syslog. If you don't scroll back far enough while watching syslog logs, this can lead you to believe that something really bad and weird has happened.

(This has actually happened to me several times.)

This is stupid and wrongheaded on systemd's part. Yes, systemd doesn't like syslog. But syslog is extremely well established and extremely useful, especially in the server space. Part of that is historical practice, part of that is that syslog is basically the only cross-platform logging technology we have, and partly it's because you can do things like forward syslog to other machines, aggregate logs from multiple machines on one, and so on (and do so in a cross-platform way). And a good part of it is because syslog is simple text and it's always been easy to do a lot of powerful ad-hoc stuff with text. That systemd continually allows itself to ignore and interact badly with syslog makes everyone's life worse (except perhaps the systemd authors). Syslog is not going away just because the systemd authors would like it to and it is high time that systemd actually accepted that and started not just sort of working with syslog but working well with it.

One of systemd's strengths until now has been that it played relatively well (sometimes extremely well) with existing systems, warts and all. It saddens me to see systemd increasingly throw that away here.

(And I'll be frank, it genuinely angers me that systemd may feel that it can get away with this, that systemd is now so powerful that it doesn't have to play well with other systems and with existing practices. This sort of arrogance steps on real people; it's the same arrogance that leads people to break ABIs and APIs and then tell others 'well, that's your problem, keep up'.)

PS: If systemd people feel that systemd really does care about syslog and does its best to work well with it, well, you have two problems. The first is that your development process isn't managing to actually achieve this, and the second is that you have a perception problem among systemd users.

linux/SystemdAndSyslog written at 23:42:47; Add Comment

My Firefox 37 extensions and addons (sort of)

A lot has changed in less than a year since I last tried to do a comprehensive inventory of my extensions, so I've decided it's time for an update since things seem to have stabilized for the moment. I'm labeling this as for Firefox 37 since that's the just-released latest version, but I'm actually running Firefox Nightly (although for me it's more like 'Firefox Weekly', since I only bother quitting Firefox to switch to the very latest build once in a while). I don't think any of these extensions work better in Nightly than in Firefox 37 (if anything, some of them may work better in F37).

Personally I hope I'm still using this set of extensions a year from now, but with Firefox (and its addons) you never know.

Safe browsing:

  • NoScript to disable JavaScript for almost everything. In a lot of cases I don't even bother with temporary whitelisting; if a site looks like it's going to want lots of JavaScript, I just fire it up in my Chrome Incognito environment.

    NoScript is about half of my Flash blocking, but is not the only thing I have to rely on these days.

  • FlashStopper is the other half of my Flash blocking and my current solution to my Flash video hassles on YouTube, after FlashBlock ended up falling over. Note that contrary to what its name might lead you to expect, FlashStopper blocks HTML5 video too, with no additional extension needed.

    (In theory I should be able to deal with YouTube with NoScript alone, and this even works in my testing Firefox. Just not in my main one for some reason. FlashStopper is in some ways nicer than using NoScript for this; for instance, you see preview pictures for YouTube videos instead of a big 'this is blocked' marker.)

  • µBlock has replaced the AdBlock family as my ad blocker. As mentioned I mostly have this because throwing out YouTube ads makes YouTube massively nicer to use. Just as other people have found, µBlock clearly takes up the least memory out of all of the options I've tried.

    (While I'm probably not all that vulnerable to ad security issues, it doesn't hurt my mood that µBlock deals with these too.)

  • CS Lite Mod is my current 'works on modern Firefox versions' replacement for CookieSafe after CookieSafe's UI broke for me recently (I needed to whitelist a domain and discovered I couldn't any more). It appears to basically work just like CookieSafe did, so I'm happy.

I've considered switching to Self-Destructing Cookies, but how SDC mostly works is not how I want to deal with cookies. It would be a good option if I had to use a lot of cookie-requiring sites that I didn't trust for long, but I don't; instead I either trust sites completely or don't want to accept cookies from them at all. Maybe I'm missing out on some conveniences that SDC would give me by (temporarily) accepting more cookies, but so far I'm not seeing it.

My views on Ghostery haven't changed since last time. It seems especially pointless now that I'm using µBlock, although I may be jumping to assumptions here.

User interface (in a broad sense):

  • FireGestures. I remain absolutely addicted to controlling my browser with gestures and this works great.

    (Lack of good gestures support is the single largest reason I won't be using Chrome regularly any time soon (cf).)

  • It's All Text! handily deals with how browsers make bad editors. I use it a bunch these days, and in particular almost all of my comments here on Wandering Thoughts are now written with it, even relatively short ones.

  • Open in Browser because most of the time I do not want to download a PDF or a text file or a whatever, I want to view it right then and there in the browser and then close the window to go on with something else. Downloading things is a pain in the rear, at least on Linux.

(I wrote more extensive commentary on these addons last time. I don't feel like copying it all from there and I have nothing much new to say.)

Miscellaneous:

  • HTTPS Everywhere basically because I feel like using HTTPS more. This sometimes degrades or breaks sites that I try to browse, but most of my browsing is not particularly important so I just close the window and go do something else (often something more productive).

  • CipherFox gives me access to some more information about TLS connections, although I'd like a little bit more (like whether or not a connection has perfect forward secrecy). Chrome gets this right even in the base browser, so I wish Firefox could copy them and basically be done.

Many of these addons like to plant buttons somewhere in your browser window. The only one of these that I tolerate is NoScript's, because I use that one reasonably often. Everyone else's button gets exiled to the additional dropdown menu, where they work just fine on the rare occasions when I need them.

(I would put more addon buttons in the tab bar area if they weren't colourful. As it is, I find the bright buttons too distracting next to the native Firefox menu icons I put there.)

I've been running this combination of addons in Firefox Nightly sessions that are now old enough that I feel pretty confident that they don't leak memory. This is unlike any number of other addons and combinations that I've tried; something in my usage patterns seems to be really good at making Firefox extensions leak memory. This is one reason I'm so stuck on many of my choices and so reluctant to experiment with new addons.

(I would like to be able to use Greasemonkey and Stylish but both of them leak memory for me, or at least did the last time I bothered to test them.)

PS: Firefox Nightly has for some time been trying to get people to try out Electrolysis, their multi-process architecture. I don't use it, partly because any number of these extensions don't work with it and probably never will. You can apparently check the 'e10s' status of addons here; I see that NoScript is not e10s ready, for example, which completely rules out e10s for me. Hopefully Mozilla won't be stupid enough to eventually force e10s and thus break a bunch of these addons.

web/Firefox37Extensions written at 02:14:56; Add Comment

2015-04-09

Probably why Fedora puts their release version in package release numbers

Packaging schemes like RPM and Debian debs split full package names up into three components: the name, the (upstream) version, and the (distribution) release of the package. Back when people started making RPM packages, the release component tended to be just a number, giving you full names like liferea-1.0.9-1 (this is release 1 of Liferea 1.0.9). As I mentioned recently, the modern practice of Fedora release numbers has changed to include the distribution version. Today we have liferea-1.10.13-1.fc21 instead (on Fedora 21, as you can see). Looking at my Fedora systems, this appears to be basically universal.

Before I started writing this entry and really thinking about the problem, I thought there was a really good deep reason for this. However, now I think it's so that if you're maintaining the same version of a package on both Fedora 20 and Fedora 21, you can use the exact same .spec file. As an additional reason, it makes automated rebuilds of packages for (and in) new Fedora versions easier and work better for upgrades (in that someone upgrading Fedora versions will wind up with the new version's packages).

The simple magic is in the .spec file:

Release: 1%{?dist}

The RPM build process will substitute this in at build time with the Fedora version you're building on (or for), giving you release numbers like 1.fc20 and 1.fc21. Due to this substitution, any RPM .spec file that does releases this way can be automatically rebuilt on a new Fedora version without needing any .spec file changes (and you'll still get a new RPM version that will upgrade right, since RPM sees 1.fc21 as being more recent than 1.fc20).

The problem that this doesn't really deal with (and I initially thought it did) is wanting to build an update to the Fedora 20 version of a RPM without updating the Fedora 21 version. If you just increment the release number of the Fedora 20 version, you get 2.fc20 and the old 1.fc21 and then upgrades won't work right (you'll keep the 2.fc20 version of the RPM). You'd have to change the F20 version to a release number of, say, '1.fc20.1'; RPM will consider this bigger than 1.fc20 but smaller than 1.fc21, so everything works out.
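
If you want to check how RPM will order these release numbers, the RPM Python bindings can tell you. This is a sketch that assumes the rpm module is installed (as it normally is on Fedora):

import rpm

def compare(vr1, vr2):
    # labelCompare() works on (epoch, version, release) tuples and returns
    # -1, 0, or 1; the epoch is left as None here.
    return rpm.labelCompare((None,) + vr1, (None,) + vr2)

print(compare(("1.10.13", "1.fc21"), ("1.10.13", "1.fc20")))    #  1: 1.fc21 is newer
print(compare(("1.10.13", "1.fc20.1"), ("1.10.13", "1.fc20")))  #  1: the respin beats 1.fc20
print(compare(("1.10.13", "1.fc20.1"), ("1.10.13", "1.fc21")))  # -1: but still loses to 1.fc21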

(I suspect that the current Fedora answer here is 'don't try to do just a F20 rebuild; do a pointless F21 rebuild too, just don't push it as an update'. Really there aren't many situations where you'd need to do a rebuild without any changes in the source package, and if you change the source package, eg to add a new patch, you probably want to do a F21 update too. I wave my hands.)

PS: I also originally thought that Ubuntu does this too, but no; while Ubuntu embeds 'ubuntu' in a lot of their package release numbers, it's not specific to the Ubuntu version involved and any number of packages don't have it. I assume it marks packages where Ubuntu deviates from the upstream Debian package in some way, eg included patches and so on.

linux/FedoraRPMReleaseNumberIssue written at 00:50:23; Add Comment

2015-04-08

Your entire download infrastructure needs to use HTTPS

Let's start with something that I tweeted:

Today's security sadface: joyent's Illumos pkgsrc download page is not available over https, so all those checksums/etc could be MITMd.

Perhaps it is not obvious what's wrong here. Well, let's work backwards. The Joyent pkgsrc bootstrap tar archive is served over plain HTTP, so a man in the middle attacker can serve us a compromised tarball when we use curl to fetch it. That's obvious, and the page gives us a SHA1 checksum and a PGP key to verify the tarball. But the page itself is served over plain HTTP, so the man in the middle attacker could alter it too so that it has the SHA1 checksum of their compromised tarball. So surely the PGP verification will save us? No, once again we are undone by HTTP; both the PGP key ID and the detached PGP ASCII signature are served over HTTP, so our MITM attacker can alter the page to have a different PGP key ID and then serve us a detached PGP ASCII signature made with it for their compromised tarball.

(Even if retrieving the PGP key itself from the keyserver is secure, the attacker can easily insert their own key with a sufficiently good looking email address and so on. Or maybe even a fully duplicated email address and other details.)

There's a very simple rule that everyone should follow here: every step of a download process needs to be served over HTTPS. For instance, even without PGP keys et al in the picture it isn't sufficient to serve just the tarball over HTTPS, because a MITM attacker can rewrite the plaintext 'download things here' page to tell you to download the package over HTTP and then they have you. The entire chain needs to be secure (and forced that way) and from as far upstream in the process as you can manage (eg from the introductory pkgsrc landing page on down, because otherwise the attacker changes the landing page to point to a HTTP download page that they supply and so on).

Of course, having some HTTPS is better than none; it at least makes attackers work harder if they have to not just give you a different tarball than you asked for but also alter a web page in flight (but don't fool yourself that this is much more work, not with modern tools). And it's good to not rely purely on HTTPS by itself; SHA1 checksums and PGP signatures are at least cross-verification and can detect certain sorts of problems.
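
For illustration, here's a minimal sketch of what the client side of getting this right looks like: both the checksum and the tarball come from https:// URLs (so certificate verification covers every step) and the digest is checked before anything is used. The URLs are made up, and I'm using SHA256 here rather than SHA1.

import hashlib
import urllib.request

# Hypothetical URLs; the important thing is that both are https://.
TARBALL_URL = "https://example.com/downloads/bootstrap.tar.gz"
CHECKSUM_URL = "https://example.com/downloads/bootstrap.tar.gz.sha256"

# urllib verifies server certificates by default on any recent Python 3.
with urllib.request.urlopen(CHECKSUM_URL) as resp:
    expected = resp.read().decode().split()[0].strip().lower()

digest = hashlib.sha256()
with urllib.request.urlopen(TARBALL_URL) as resp:
    for chunk in iter(lambda: resp.read(64 * 1024), b""):
        digest.update(chunk)

if digest.hexdigest() != expected:
    raise SystemExit("checksum mismatch; refusing to use the download")
print("checksum verified")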

By the way, in case you think that this is purely theoretical, see the case of some Tor exit nodes silently patching nasty stuff into binaries fetched through them with HTTP. And I believe that there are freely available tools that will do on the fly alterations to web pages they detect you fetching over insecure wireless networks.

(I don't feel I'm unfairly picking on Joyent here because clearly they care not just about the integrity of the tarball but also its security, since they give not just a SHA1 (which might just be for an integrity check) but also a PGP key ID and a signature checking procedure.)

web/HttpsAndDownloads written at 01:04:58; Add Comment

2015-04-07

How Ubuntu and Fedora each do kernel packages

I feel the need to say things about the Ubuntu (and I believe Debian) kernel update process, but before I do that I want to write down how kernel packages look on Ubuntu and Fedora from a sysadmin's perspective because I think a number of people have only been exposed to one or the other. The Fedora approach to kernel packages is also used by Red Hat Enterprise Linux (and CentOS) and probably other Linux distributions that use yum and RPMs. I believe that the Ubuntu approach is also used by Debian, but maybe Debian does it a bit differently; I haven't run a real Debian system.

Both debs and RPMs have the core concepts of a package having a name, an upstream version number, and a distribution release number. For instance, Firefox on my Fedora 21 machine is currently firefox, upstream version 37.0, and release 2.fc21 (increasingly people embed the distribution version in the release number for reasons beyond the scope of this entry).

On Fedora you have some number of kernel-... RPMs installed at once. These are generally all instances of the kernel package (the package name); they differ only in their upstream version number and their release number. Yum normally keeps the most recent five of them for you, deleting the oldest one when a 'yum upgrade' pulls in a new version of the kernel package. This gives you a list of main kernel packages that looks like this:

kernel-3.18.8-201.fc21.x86_64
kernel-3.18.9-200.fc21.x86_64
kernel-3.19.1-201.fc21.x86_64
kernel-3.19.2-201.fc21.x86_64
kernel-3.19.3-200.fc21.x86_64

Here the kernel RPM with upstream version 3.19.3 and Fedora release version 200.fc21 is the most recent kernel I have installed (and this is a 64-bit machine as shown by the x86_64 architecture).

(This is a slight simplification. On Fedora 21, the kernel is actually split into three kernel packages: kernel, kernel-core, and kernel-modules. The kernel package for a specific version is just a meta-package that depends (through a bit of magic) on its associated kernel-core and kernel-modules packages. Yum knows how to manage all of this so you keep five copies not only of the kernel meta-package but also of the kernel-core and kernel-modules packages and so on. Mostly you can ignore the sub-packages in Fedora; I often forget about them. In RHEL up through RHEL 7, they don't exist and their contents are just part of the kernel package; the same was true of older Fedora versions.)

Ubuntu is more complicated. There is a single linux-image-generic (meta-)package installed on your system and then some number of packages with the package name of linux-image-<version>-<release>-generic for various <version> and <release> values. Each of these packages has a deb upstream version of <version> and a release version of <release>.<number>, where the number varies depending on how Ubuntu built things. Each specific linux-image-generic package version depends on a particular linux-image-<v>-<r>-generic package, so when you update to it, it pulls in that specific kernel (at whatever the latest package release of it is).

Because of all of this, Ubuntu systems wind up with multiple kernels installed at once by the side effects of updating linux-image-generic. A new package version of l-i-g will depend on and pull in an entirely new linux-image-<v>-<r>-generic package, leaving the old linux-image-*-generic packages just sitting there. Unlike with yum, nothing in plain apt-get limits how many old kernels you have sitting around; if you leave your server alone, you'll wind up with copies of all kernel packages you've ever used. As far as the Ubuntu package system sees it, these are not multiple versions of the same thing but entirely separate packages, each of which you have only one version of.

This gives you a list of packages that looks like this (splitting apart the package name and the version plus Ubuntu release, what 'dpkg -l' calls Name and Version):

linux-image-3.13.0-24-generic   3.13.0-24.47
linux-image-3.13.0-45-generic   3.13.0-45.74
linux-image-3.13.0-46-generic   3.13.0-46.79
linux-image-3.13.0-48-generic   3.13.0-48.80

linux-image-generic             3.13.0.48.55

(I'm simplifying again; on Ubuntu 14.04 there are also linux-image-extra-<v>-<r>-generic packages.)

On this system, the current 3.13.0.48.55 version of linux-image-generic depends on and thus requires the linux-image-3.13.0-48-generic package, which is currently 'at' the nominal upstream version 3.13.0 and Ubuntu release 48.80. Past Ubuntu versions of linux-image-generic depended on the other linux-image-*-generic packages and caused them to be installed at the time.

I find the Fedora/RHEL approach to be much more straightforward than the Ubuntu approach. With Fedora, you just have N versions of the kernel package installed at once; done. With Ubuntu, you don't really have multiple versions of any given package installed; you just have a lot of confusingly named packages, each of which has one version installed, and these packages get installed on your system as a side effect of upgrading another package (linux-image-generic). As far as I know the Ubuntu package system doesn't know that all of these different named packages are variants of the same thing.
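
You can see the accumulation for yourself; here's a small sketch that lists the installed kernel image packages through dpkg-query (assuming a stock Ubuntu or Debian system):

import subprocess

# dpkg-query accepts glob patterns, so this lists every installed
# linux-image-* package along with its version.
out = subprocess.run(
    ["dpkg-query", "-W", "-f", "${Package} ${Version}\\n", "linux-image-*"],
    capture_output=True, text=True, check=False,
).stdout

kernels = [line for line in out.splitlines() if line]
print(len(kernels), "kernel image packages installed:")
for line in kernels:
    print(" ", line)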

(A discussion of some unfortunate consequences of this Ubuntu decision is beyond the scope of this entry. See also.)

Sidebar: kernel variants

Both Ubuntu and Fedora have some variants of the kernel; for instance, Fedora has a PAE variant of their 32-bit x86 kernel. On Fedora, these get a different package name, kernel-pae, and everything else works in the same way as for normal kernels (and you have both PAE and regular kernels installed at the same time; yum will keep the most recent five of each).

On Ubuntu I believe these get a different meta-package that replaces linux-image-generic, for example linux-image-lowlatency, and versions of this package depend on specific kernel packages with different names, like linux-image-<v>-<r>-lowlatency. You can see the collection with 'apt-cache search linux-image'.

Both Fedora and Ubuntu have changed how they handled kernel variants over time; my memory is that Ubuntu had to change more in order to become more sane. Today their handling of variants strikes me as reasonably close to each other.

linux/UbuntuVsFedoraKernelPackages written at 01:24:19; Add Comment


