Wandering Thoughts archives

2016-08-10

Systemd has a problem with SATA disks behind port multipliers

The servers for our disk-based backup system are all running Ubuntu 12.04, because that's what was current when we last had to touch them. Ubuntu 12.04 is on its way out, so we're starting to rebuild all of our 12.04 machines on 16.04, and today we did this to the first backup server. Fortunately it was a server used for our long-term backups and thus something we don't need right away, because 16.04 turns out to have a problem here.

All of our backup servers use external SATA disk enclosures and SATA port multipliers, partly because that's what we had available and partly because that was the inexpensive option at the time (and maybe still today). When we first booted the machine in 16.04, it looked like it was only detecting one of the disks on each port multiplier channel instead of all four. Further investigation showed that all the disks were being detected, but only one out of every four was showing up in /dev/disk/by-path, which we rely on to give us stable identifiers for each disk slot in the external enclosure. More than that, the path identifiers are different. On 12.04, we got /dev/disk/by-path identifiers like pci-0000:02:00.0-scsi-2:0:0:0 while on 16.04, they're like pci-0000:02:00.0-ata-1.

One obvious difference is that 16.04 uses systemd and systemd has swallowed udev and in the process likely made a number of changes to it (in the grand systemd tradition). Certainly some Internet searches found suggestive bits. Unfortunately this turns out to be somewhat of a red herring; the real cause is less active damage (by systemd and udev) and more non-benign neglect and ignorance that has been exposed by the kernel changing the underlying sysfs topology to be more honest.

In 12.04 (with what is now an ancient kernel), a typical port multiplier disk shows up in sysfs as (take a deep breath):

/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/host8/target8:0:0/8:0:0:0/block/sdk

This is essentially claiming itself to be a generic SCSI disk. You have to look hard at attributes in various spots in sysfs to find out the truth, and the 12.04 udev does not; it considers this disk to be a generic SCSI disk and names it the way it would name any other SCSI disk (we can see that from the '-scsi-' bit that is in the by-path name).

In 16.04, this same disk slot is:

/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:0:0/0:0:0:0/block/sda

(Yes, the sdX names keep changing. That's why we need /dev/disk/by-path.)
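The structural difference between the two paths is easy to check mechanically. As a minimal sketch in Python (the helper name ata_port is made up for illustration, and this is not how udev itself does it), you can look for an ataN component in the sysfs path:

```python
import re

# The two sysfs paths quoted above, for the same physical disk slot.
PATH_1204 = ("/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0"
             "/host8/target8:0:0/8:0:0:0/block/sdk")
PATH_1604 = ("/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0"
             "/ata1/host0/target0:0:0/0:0:0:0/block/sda")


def ata_port(sysfs_path):
    """Return the N of an 'ataN' path component, or None if there isn't one."""
    for part in sysfs_path.split("/"):
        m = re.fullmatch(r"ata(\d+)", part)
        if m:
            return int(m.group(1))
    return None


print(ata_port(PATH_1204))  # None: the old kernel shows only a SCSI host
print(ata_port(PATH_1604))  # 1: the new kernel exposes the ATA port
```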

This addition of ata1 to the full path appears to trigger a different code path in systemd's version of udev, one that specifically deals with (S)ATA disks. This code path does not believe that there can be more than one disk per (S)ATA host node (the host0 here), and so it gives all four disks on this port the same ID_PATH value, one based purely on the ATA port number of the port they're all attached through; they are all pci-0000:02:00.0-ata-1. Naturally there can be only one /dev/disk/by-path/pci-0000:02:00.0-ata-1 directory entry, so three out of the four disks lose out.

(You can see this code in src/udev/udev-builtin-path_id.c in handle_scsi_ata(); it's present in both Ubuntu 16.04's systemd-229 and the current git tip here. The corresponding code for the generic case of SCSI-like devices is much more complicated.)
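To make the collision concrete, here is a rough Python imitation of the behaviour described above. It is not a translation of the real C code in handle_scsi_ata(); the function name naive_ata_id_path is mine, and three of the four sysfs paths are invented. In particular, I'm assuming that disks behind one port multiplier differ in the SCSI channel number, which I haven't shown anywhere in this entry.

```python
import re
from collections import defaultdict

# PCI part of the 16.04 sysfs path quoted earlier in this entry.
PCI_PREFIX = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0"

# ASSUMPTION: four disks behind one port multiplier, distinguished by the
# SCSI channel number. Only the first path (sda) appears in this entry;
# the other three are hypothetical.
DISKS = [
    f"{PCI_PREFIX}/ata1/host0/target0:{ch}:0/0:{ch}:0:0/block/sd{name}"
    for ch, name in enumerate("abcd")
]


def naive_ata_id_path(sysfs_path):
    """Build a name from only the PCI address and the ATA port number."""
    parts = sysfs_path.split("/")
    for i, part in enumerate(parts):
        m = re.fullmatch(r"ata(\d+)", part)
        if m:
            # parts[i - 1] is the PCI device the ATA port hangs off.
            return f"pci-{parts[i - 1]}-ata-{m.group(1)}"
    return None


# Group disks by the name they would get.
by_path = defaultdict(list)
for disk in DISKS:
    by_path[naive_ata_id_path(disk)].append(disk.rsplit("/", 1)[-1])

for id_path, disks in by_path.items():
    print(id_path, disks)
# pci-0000:02:00.0-ata-1 ['sda', 'sdb', 'sdc', 'sdd']
```

All four disks map to one name, and only one of them can own the resulting /dev/disk/by-path entry.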

This is what I could politely call an oversight on the part of the systemd/udev conglomerate. The code for giving ATA devices names has stayed unchanged since it was introduced in 2015 (where it replaced 2012 code that skipped them entirely), so it's always had this issue. Had the kernel not switched to honestly reporting these ports as ATA ports instead of generic SCSI host ports, we could have missed seeing this, as the naive ATA-handling code would never have been exercised. Now, though, we're left with the mess. I've filed Ubuntu bug 1611945, although I don't know if it'll do any good.

(Now that writing this entry has caused me to discover the exact problem, I'm going to be able to refine the Ubuntu bug report. Unfortunately I can't report a bug directly to the upstream systemd, although I'm convinced it's still there in their code, because Ubuntu 16.04 doesn't have a systemd version that's within their 'you can report' window.)

What I don't have any answers for is the best way to deal with this issue. We could try 14.04 (although it'd have similar problems if its kernel has the sysfs topology change), or perhaps we could write a bunch of additional udev rules to create our own hard-coded version of /dev/disk/by-path using PCI identifiers and so on. I admit that the idea of writing udev rules is somewhat scary, as the whole area has never struck me as either easy or well documented.

(Probably the udev rule approach is the best solution.)
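Whatever ends up creating the links, the names need something that differs between disks on the same port. Here is a minimal Python sketch of one possible scheme, which appends the channel, target, and LUN from the SCSI address to the ATA-style name. The function name pmp_aware_id_path and the '-scsi-C:T:L' suffix are my own invention, not anything udev produces, and I'm assuming (without having verified it) that these fields are stable and differ per disk behind a port multiplier. The three extra disk paths are hypothetical, as before.

```python
import re

PCI_PREFIX = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0"

# ASSUMPTION: hypothetical paths for four disks on one port multiplier,
# differing in the SCSI channel number. Only the first one is real.
DISKS = [
    f"{PCI_PREFIX}/ata1/host0/target0:{ch}:0/0:{ch}:0:0/block/sd{name}"
    for ch, name in enumerate("abcd")
]


def pmp_aware_id_path(sysfs_path):
    """Return a per-disk name, or None if the path has no ataN component."""
    parts = sysfs_path.split("/")
    for i, part in enumerate(parts):
        m = re.fullmatch(r"ata(\d+)", part)
        if not m:
            continue
        # Look for the host:channel:target:lun component after the ATA port.
        for later in parts[i + 1:]:
            if re.fullmatch(r"\d+:\d+:\d+:\d+", later):
                # Drop the host number; it changes between kernels and boots.
                _host, channel, target, lun = later.split(":")
                return (f"pci-{parts[i - 1]}-ata-{m.group(1)}"
                        f"-scsi-{channel}:{target}:{lun}")
    return None


names = [pmp_aware_id_path(d) for d in DISKS]
for name in names:
    print(name)
```

With the made-up paths above, this gives four distinct names, one per disk. Getting udev to actually use names like these is a separate problem that this sketch does not touch.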

linux/SystemdSATAPortMultiplierProblem written at 23:15:33

A look into a future where things assume you have a smartphone

My local bike club has increasingly been going paperless for bike ride routes. I've been a paper holdout, but recently I pushed myself into deciding to finally get a bike GPS unit for ride navigation. Naturally this led to a frenzy of Internet research on the whole area, the summary of which is that there are only two serious competitors in this area right now (which is much less than I was expecting).

At this point, Garmin has probably been making the Garmin Edge series of GPS bike computers for a decade or more, and my impression is that they haven't fundamentally changed how they work over that time. As you'd expect for something that originated so long ago, they're pretty much self-contained boxes that believe in doing most everything themselves (although modern models can use your smartphone as a peripheral). On the one hand, this makes them pleasantly easy to deal with even from things like Linux machines, because they require so little from your computer (an Edge unit presents itself as a USB disk, and you do things like transfer ride routes to it by putting files in a magic directory). On the other hand, a Garmin Edge is a slow, pokey device with a relatively small screen by modern standards. Because the Edge is determined to do everything itself, your experience is very dependent on how well it implements things like turn by turn directions. Sometimes this does not go well, and in general the experience of working with an Edge is not exactly fluid and fast.

The Wahoo Elemnt is Garmin's only serious competition right now, and it has taken a very different approach. Rather than try to be self-contained, it fundamentally assumes that you have a smartphone, and so your Elemnt offloads a bunch of preparation work on to it and the outside world. Configuring the Elemnt for things like what data fields get displayed? That's done through the smartphone app, using the smartphone's much bigger and more responsive display. And for what I care about, route planning and preparing turn by turn directions are outsourced to various existing websites that specialize in them (and that do a good job, giving me a lot of control and the ability to fiddle with things in advance). The Elemnt can still be a standalone device when you're riding, but its clear philosophy is that it is one component in a larger environment and it doesn't have to try to do everything itself.

(Like modern Garmin Edges, the Elemnt can use your smartphone as a peripheral in various ways during rides.)

The Garmin philosophy of being self contained made a great deal of sense ten years ago, and probably was pretty much necessary at the time. Ten years ago Google Maps had barely been launched, high speed Internet was a lot less high speed, computers were slower (and screens smaller) and probably less prevalent among cyclists, smartphones were barely there, and so on. Really, there was not much of a greater environment to be part of. But I think the Wahoo philosophy clearly makes a lot of sense in today's world (even if it's inconvenient for me, since I don't have a smartphone yet). A GPS bike computer that can run for ten hours or more under the hot sun with the screen on and GPS going will never be able to compete in screen resolution, processing power, or responsiveness with a decent smartphone, so why even try? Build something that's relentlessly specialized in what it can do well, and let smartphones handle what they're good at.

I didn't expect to get a look into the future when I started researching GPS bike computers, but I think I have. I can't see any reason for the Wahoo Elemnt approach not to become more and more common over time; more devices will just assume that you can interact with them through your smartphone, and in fact that you'll want to because it gives you a better experience.

(Yes, there are some things you give up when devices stop being self-contained. It is potentially useful that many Garmin Edge models can work out routes all on their own, for example if the skies open up and I want to bail out of a bike ride in order to get home as fast as possible. You can theoretically do this in the field with an Elemnt and your smartphone, but it's probably not going to be as easy.)

PS: This may be less a look into the future and more a look into an increasingly prevalent present. I don't look into modern electronic stuff very often, so I wouldn't be surprised if there's already a lot of things that require you to have a smartphone and I just haven't noticed.

PPS: The Magellan Cyclo series appears to be strictly inferior to the Garmin Edge if you primarily care about navigation. At least as far as turn directions go, it definitely takes the self-contained 'I know better than you' approach (and it doesn't necessarily).

(I have acquired some opinions here, as you can tell.)

tech/AssumingSmartphoneGadgetFuture written at 01:04:43

