Wandering Thoughts

2017-05-15

Thinking about how much asynchronous disk write buffering you want

Pretty much every modern system defaults to having data you write to filesystems be buffered by the operating system and only written out asynchronously; you have to take special steps either to make your write IO synchronous or to force it to disk (which can lead to design challenges). When the operating system is buffering data like this, one obvious issue is the maximum amount of data it should let you buffer up before you have to slow down or stop.

Let's start with two obvious observations. First, if you write enough data, you will always have to eventually slow down to the sustained write speed of your disk system. The system only has so much RAM; even if the OS lets you use it all, eventually it will all be filled up with your pending data and at that point you can only put more data into the buffer when some earlier data has drained out. That data drains out at the sustained write speed of your disk system. The corollary to this is that if you're going to write enough data, there is very little benefit to letting you fill up a lot of write buffer; the operating system might as well limit you to the sustained disk write speed relatively early.

Second, RAM being used for write buffers is often space taken away from other productive uses for that RAM. Sometimes you will read back some of the written data and be able to get it from RAM, but if it is purely written (for now) then the RAM is otherwise wasted, apart from any benefits that write buffering may get you. As a corollary of our first observation, buffering huge amounts of write-only data for a program that is going to be limited by disk write speed is not productive (because it can't even speed the program up).

So what are the advantages of having some amount of write buffering, and how much do we need to get them?

  • It speeds up programs that write occasionally or only once and don't force their data to be flushed to the physical disk. If their data fits into the write buffer, these programs can continue immediately (or exit immediately), possibly giving them a drastic performance boost. The OS can then write the data out in the background as other things happen; there's a small timing sketch of this effect after this list.

    (Per our first observation, this doesn't help if the collection of programs involved write too much data too fast and overwhelm the disks and possibly your RAM with the speed and volume.)

  • It speeds up programs that write in bursts of high bandwidth. If your program writes a 1 GB burst every minute, a 1 GB or more write buffer means that it can push that GB into the OS very fast, instead of being limited to the (say) 100 MB/s of actual disk write bandwidth and taking ten seconds or more to push out its data burst. The OS can then write the data out in the background and clear the write buffer in time for your next burst.

  • It can eliminate writes entirely for temporary data. If you write data, possibly read it back, and then delete the data fast enough, the data need never be written to disk if it can all be kept in the write buffer. Explicitly forcing data to disk obviously defeats this, which leads to some tradeoffs in programs that create temporary files.

  • It allows the OS to aggregate writes together for better performance and improved data layout on disk. This is most useful when your program issues comparatively small writes to the OS, because otherwise there may not be much improvement to be had from aggregating big writes into really big writes. OSes generally have their own limits on how much they will aggregate together and how large a single IO they'll issue to disks, which clamps the benefit here.

    (Some of the aggregation benefit comes from the OS being able to do a bunch of metadata updates at once, for example to mark a whole bunch of disk blocks as now used.)

    More write buffer here may help if you're writing to multiple different files, because it allows the OS to hold back writes to some of those files to see if you'll write more data to them soon enough. The more files you write to, the more streams of write aggregation the OS may want to keep active and the more memory it may need for this.

    (With some filesystems, write aggregation will also lead to less disk space being used. Filesystems that compress data are one example, and ZFS in general can be another, especially on RAIDZ vdevs.)

  • If the OS starts writing out data in the background soon enough, a write buffer can reduce the amount of time a program takes to write a bunch of data and then wait for it to be flushed to disk. How much this helps depends partly on how fast the program can generate data to be written; for the best benefit, you want this to be faster than the disk write rate but not so fast that the program is done before much background write IO can be started and completed.

    (Effectively this converts apparently synchronous writes into asynchronous writes, where actual disk IO overlaps with generating more data to be written.)
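
To make the first benefit above concrete, here is a minimal timing sketch in Python; the file path and burst size are made up, and the actual numbers depend entirely on your OS, filesystem, and disks. The only difference between the two cases is whether the program waits for its data to reach the disk before continuing.

import os, time

def timed_write(path, data, force_to_disk):
    # Write 'data' to 'path' and return how long the program was stuck waiting.
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            view = view[os.write(fd, view):]
        if force_to_disk:
            os.fsync(fd)    # wait for the data to actually reach the disk
    finally:
        os.close(fd)
    return time.monotonic() - start

data = b'x' * (256 * 1024 * 1024)    # a 256 MB 'write and go on' burst
print("buffered write:", timed_write("/tmp/burst-test", data, False))
print("fsync'd write: ", timed_write("/tmp/burst-test", data, True))

On a machine with plenty of free RAM, the buffered version usually returns almost immediately, while the fsync'd version runs at something like your disks' write speed.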

Some of these benefits require the OS to make choices that push against each other. For example, the faster the OS starts writing out buffered data in the background, the more it speeds up the overlapping write and compute case but the less chance it has to avoid flushing data to disk that's written but then rapidly deleted (or otherwise discarded).

How much write buffering you want for some of these benefits depends very much on what your programs do (individually and perhaps in the aggregate). If your programs write only in bursts or fall into the 'write and go on' pattern, you only need enough write buffer to soak up however much data they're going to write in a burst so you can smooth it out again. Buffering up huge amounts of data for them beyond that point doesn't help (and may hurt, both by stealing RAM from more productive uses and by leaving more data exposed in case of a system crash or power loss).

There is also somewhat of an inverse relationship between the useful size of your write buffer and the speed of your disk system. The faster your disk system can write data, the less write buffer you need in order to soak up medium sized bursts of writes because that write buffer clears faster. Under many circumstances you don't need the write buffer to store all of the data; you just need it to store the difference between what the disks can write over a given time and what your applications are going to produce over that time.

(Conversely, very slow disks may in theory call for very big OS write buffers, but there are often practical downsides to that.)
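
As a rough worked example of that sizing, with all of the numbers invented for illustration:

# How much write buffer does it take to absorb one burst, given that
# the disks are draining it at the same time?
burst_bytes   = 1 * 1024**3       # the application writes 1 GiB ...
burst_seconds = 2                 # ... over roughly two seconds
disk_rate     = 100 * 1024**2     # the disks sustain roughly 100 MiB/s

drained = disk_rate * burst_seconds
needed  = max(0, burst_bytes - drained)
print("buffer needed: about %d MiB" % (needed // 1024**2))
# about 824 MiB here; disks twice as fast drop it to about 624 MiB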

WriteBufferingHowMuch written at 23:55:52

2017-05-14

People don't like changes (in computer stuff)

There are always some people who like to fiddle around with things. Some number of photographers are always shuffling camera settings or experimenting with different post-processing; some number of cyclists are always changing bits of their bikes; some car enthusiasts like fiddling with engines and so on. But most people are not really interested in this; they want to get something that works and then they want it to keep on just like that, because it works and it's what they know.

Computers are not an exception to this. For most people, a computer is merely a tool, like their car. What this means is that people don't like their computers to change, any more than they want other things in their life to change. Imagine how it would be if every time you took your car in for service, the mechanics changed something about how the dashboard and controls worked, and every few years during a big service call they would replace the dashboard entirely with a new one that maybe mostly looked and worked the same. Or not. You and many other people would find it infuriating, and pretty soon people would stop bringing their cars in for anything except essential service.

Unfortunately us computer people really love to change things in updates, and of course 'upgrade' is generally synonymous with 'changes' in practice. Against all available evidence we are convinced that people want the latest shiny things we come up with, so we have a terrible track record of forcing them down people's throats. This is not what people want. People want stuff to work, and once it works they want us to stop screwing with it because it works, thanks. People are well aware that us screwing with stuff could perhaps improve it, but that's what everyone claims about all changes; rarely do people push out a change that says 'we're making your life worse' and most changes are created with the belief that they're either necessary or an improvement. However, much of the time the changes don't particularly make people's lives clearly better, and when they do make people's lives better in the long run there is often a significant payoff period that makes the disruption not worth it in the short run.

(Rare and precious is a non-bugfix update that immediately makes people's lives better. And bugfix updates are just making things work the way they should have in the first place.)

In my opinion, this is a fundamental reason why forcing updates on people is not a particularly good answer to people not patching. Unless upgrades and updates magically stop changing things, forcing updates means forcing changes, which makes people unhappy because they generally very much do not want that.

(There is also the chance that an update will do harm. Every time that happens, people's trust in updates decays along with their willingness to take the risk. If your system works now, applying an update might keep it working or it might blow things up, so applying an update is always a risk.)

PeopleDislikeChanges written at 00:51:20

2017-05-13

People don't patch systems and that's all there is to it

Recently (ie, today) there has been all sorts of commotion in the news about various organizations getting badly hit by malware that exploits a vulnerability that was patched by Microsoft in MS17-010, a patch that was released March 14th. I'm sure that the usual suspects are out in force pointing their fingers at organizations for not patching. In response to this you might want to read, say, Steve Bellovin on the practical difficulties of patching. I agree with all of this, of course, but I have an additional perspective.

Although one may dress it up in various ways, real computer security ultimately requires understanding what people actually do and don't do. By now we have a huge amount of experience in this area about what happens when updates are released, and so we know absolutely for sure that people often don't apply updates, and the extended version of this, which is that people often still stick with things that aren't getting security updates. You can research why this happens and argue about how sensible they are in doing so and what the balance of risks is, but the ground truth is that this is what happens. Much as yelling at people has not magically managed to stop them from falling for phish and malware links in email (for all sorts of good reasons), yelling at people has not persuaded them to universally apply patches (and to update no longer supported systems) and it is not somehow magically going to do so in the future. If your strategy to deal with this is 'yell harder' (or 'threaten people more'), then it is a more or less guaranteed failure on day one.

(If we're lucky, people apply patches and updates sometime, just not right away.)

Since I don't know what the answers are, I will leave corollaries to this blunt fact as an exercise for the reader.

(I'm not throwing stones here, either. I have systems of my own that are out of date or even obsolete (my Linux laptop is 32-bit, and 32-bit Linux Chrome hasn't gotten updates for some years now). Some of the time I don't have any particularly good reason why I haven't updated; it's just that it's too much of a pain and disruption because it requires a reboot.)

PS: I'm pretty sure that forcing updates down people's throats is not the answer, at least not with the disruptive updates that are increasingly the rule. See, for example, people's anger at Microsoft forcing Windows reboots on them due to updates.

PeopleDontPatch written at 00:20:10

2017-05-11

The challenges of recovering when unpacking archives with damage

I wrote recently about how 'zfs receive' makes no attempt to recover from damaged input, which means that if you save 'zfs send' output somewhere and your saved file gets damaged, you are up the proverbial creek. It is worth mentioning that this is not an easy or simple problem to solve in general, and that doing a good job of this is likely going to affect a number of aspects of your archive file format and how it's processed. So let's talk a bit about what's needed here.

The first and most obvious thing you need is an archive format that makes it possible to detect and then recover from damage. Detection is in some sense easy; you checksum everything and then when a checksum fails, you know damage has started. More broadly, there are several sorts of damage you need to worry about: data that is corrupt in place, data that has been removed, and data that has been inserted. It would be nice if we could assume that data will only get corrupted in place, but my feelings are that this assumption is unwise.

(For instance, you may get 'removed data' if something reading a file off disk hits a corrupt spot and spits out only partial or no data for it when it continues on to read the rest of the file.)

In-place corruption can be detected and then skipped with checksums; you skip any block that fails its checksum, and you resume processing when the checksums start verifying again. Once data can be added or removed, you also need to be able to re-synchronize the data stream to do things like find the next start of a block; this implies that your data format should have markers, and perhaps some sort of escape or encoding scheme so that the markers can never appear in actual data. You want re-synchronization in your format in general anyway, because one of the things that can get corrupt is the 'start of file' marker; if it gets corrupted, you obviously need to be able to unambiguously find the start of the next file.

(If you prefer, call this a more general 'start of object' marker, or just metadata in general.)
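
As a toy illustration of this resynchronization idea, here is a sketch of reading a hypothetical record-based format where each record is a marker, a 4-byte length, the payload, and a CRC32 of the payload. The format, the marker value, and the lack of any escaping are all invented for the example; a real format has to guarantee that the marker can never appear inside data.

import struct, zlib

MARKER = b"\xfeRECORD\xfe"    # hypothetical record marker

def recoverable_records(data):
    # Yield every record whose checksum verifies, skipping damaged spans
    # by scanning forward for the next marker.
    pos = 0
    while True:
        pos = data.find(MARKER, pos)
        if pos < 0:
            return
        body = pos + len(MARKER)
        try:
            (length,) = struct.unpack_from(">I", data, body)
            payload = data[body + 4 : body + 4 + length]
            (crc,) = struct.unpack_from(">I", data, body + 4 + length)
        except struct.error:
            return    # truncated at the very end; nothing more to recover
        if len(payload) == length and zlib.crc32(payload) == crc:
            yield payload
            pos = body + 4 + length + 4
        else:
            pos += 1  # damage; hunt for the next marker and resynchronize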

So you have an archive file format that has internal markers for redundancy and where you can damage it and resynchronize with as little data lost and unusable as possible. But this is just the start. Now you need to look at the overall structure of your archive and ask what happens if you lose some chunk of metadata: how much of the archive can no longer be usefully processed? For example, suppose that data in the archive is identified by inode number, you have a table mapping inode numbers to filenames, and this table can only be understood with the aid of a header block. Then if you lose the header block to corruption, you lose all of the filenames for everything in the archive. The data in the archive may be readable in theory, but it's not useful in practice unless you're desperate (since you'd have to go through a sea of files identified only by inode number to figure out what they are and what directory structure they might go into).

Designing a resilient archive format, one that recovers as much as possible in the face of corruption, often means designing an inconvenient or at least inefficient one. If you want to avoid loss from corruption, both redundancy and distributing crucial information around the archive are your friends. Conversely, clever efficient formats full of highly compressed and optimized things are generally not good.

You can certainly create archive formats that are resilient this way. But it's unlikely to happen by accident or happenstance, which means that an archive format created without resilience in mind probably won't be all that resilient even if you try to make the software that processes it do its best to recover and continue in the face of damaged input.

ResilientArchivesChallenges written at 02:07:30

2017-05-05

The temptation of a Ryzen-based machine for my next office workstation

My office workstation is the same hardware build as my current home machine, which means that it's also more than five years old by now. I was not necessarily expecting to replace it soon, but this week things have started to happen to it. First there was an unexpected system lockup and reboot, and then today my CPU started reporting thermal throttling, with the always fun kernel messages of:

CPU1: Core temperature above threshold, cpu clock throttled
CPU3: Package temperature above threshold, cpu clock throttled
CPU2: Package temperature above threshold, cpu clock throttled
CPU0: Package temperature above threshold, cpu clock throttled
CPU1: Package temperature above threshold, cpu clock throttled

(All of my system's fans are working; I checked. Some sources suggest that the first step here is to take the CPU and heatsink apart, clean off the current thermal paste, and re-paste it. I may try to do this on Monday if we have thermal paste around at work, but the timing is terrible as I'm about to go on vacation.)

For obvious reasons this has pushed me into thinking more actively about replacement hardware for my office machine, and when I start thinking about that, I have to admit that building an AMD Ryzen based machine feels like an attractive idea despite what I said about Ryzen for my likely next home machine. There are two or perhaps three sensible reasons for this and one emotional one.

The first reason is that I generally do more multi-core things on my office machine than on my home machine; I run VMs (and might run more if it had less impact on things) and I compile software more often for various fuzzy reasons (and some of the time I care more about how fast this happens, for example if I'm bisecting some problem in an open-source project like Firefox). In theory this makes single core CPU performance less important and many well-performing cores more useful, especially in the future as more things become multi-core enabled (for example, the Go developers are working on concurrent compilation of single files).

The complicated and only potential reason is that work is more price sensitive about things like CPU costs than I am for my home machine. I started out thinking that the Intel i7-7700K was more expensive than the Ryzen 7 models, but this turns out to be wrong; at current Canadian prices, the i7-7700K and the Ryzen 1700X are about the same price (the Ryzen 1800X is clearly more and the Ryzen 1700 is only a bit cheaper). However these are still relatively expensive CPUs, so I might well get forced down to, say, something in the range of the i5-7600K and the Ryzen 5 1600X. At this level people seem to think that the Ryzen 5 is the relatively clear winner; you don't lose as much on single-core performance and you pick up a significant edge in multi-core work.

The third reason is the possibility of ECC support. At least some AMD Ryzen motherboards do seem to actually support this in practice and if I more or less get it for free, I'll definitely take it. It's only a 'nice to have' thing, though; I wouldn't give up anything substantive to get it, even (or especially) on my office machine.

The emotional reason is that I want the plucky underdog AMD to make good, and I want to support them. I don't particularly like Intel's domination and various things it leads to (such as their ECC non-support) and I would be perfectly happy to be part of giving them a real challenge for once. If a Ryzen based system is competitive with an Intel one, I'm somewhat irrationally biased in favour of the AMD option.

(For example, going with the AMD option would require a graphics card and I haven't looked at the relative level of motherboard features that I'd probably wind up with. My emotional 'I would like AMD' reaction has pushed those pragmatic issues out to the periphery. For that matter, apparently there are memory speed issues with AMD Ryzens and 32 GB of RAM, and memory bandwidth may matter to at least some of what I do.)

AMDRyzenOfficeTemptation written at 23:50:00

2017-05-01

When a TLS client's certificate is offered to the TLS server

Everyone knows that TLS servers have certificates and TLS clients check those certificates when they connect to the server. What is probably less well known is that TLS clients can have certificates as well, and the server can ask a client for a certificate as part of the TLS handshake. As the result of a Twitter conversation today, I wound up wondering and guessing if TLS client certificates get sent to the server after the connection becomes encrypted.

The short answer: they don't. The TLS 1.2 RFC's handshake protocol overview describes it in relatively short form and is explicit that the client certificate is sent to the server before the connection becomes encrypted, not afterward.
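
As a concrete aside, asking for a client certificate at all is something the server opts into; a minimal sketch of that with Python's standard ssl module looks like the following (the certificate file names are made-up placeholders).

import socket, ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain("server-cert.pem", "server-key.pem")   # the server's own certificate
ctx.verify_mode = ssl.CERT_REQUIRED                        # ask the client for a certificate
ctx.load_verify_locations("client-ca.pem")                 # the CA we accept client certs from

with socket.create_server(("", 8443)) as listener:
    conn, addr = listener.accept()
    with ctx.wrap_socket(conn, server_side=True) as tls:
        # By the time wrap_socket() returns, the handshake (including the
        # client's certificate) has already happened.
        print("client certificate subject:", tls.getpeercert().get("subject"))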

Does this matter? Yes, sort of. Because the TLS handshake protocol is not encrypted, a passive observer can eavesdrop on it and fully decode it. This is handy if you want to use wireshark to see what cryptography people actually get with your server, but it also means that a passive observer (such as an ISP) can see what client certificate is in use if you use client certificates. Since client certificates are hopefully unique, this allows such an eavesdropper to identify such clients. They may not know who you are, but they can correlate different clients by their certificates.

(If client certificates carry real identification of the user in the Subject Name field, an eavesdropper can read all of these out in plain text. Sadly I suspect that there are client certificates that carry such data, because filling in all of the Distinguished Name details certainly sort of looks like the proper thing to do. This might be okay for client certificates that are intended to be used only inside an organization and never exposed over the Internet, but I would avoid it in anything intended for use across the Internet.)

The good news is that client certificates aren't used much from what I've seen. They may be more used inside organizations, but there the potential threat from eavesdroppers is hopefully much lower. And since client certificates are so little used, probably no one bothers implementing this eavesdropping attack in practice because the payoff is too low.

(The only time I've personally used TLS client certificates is for StartSSL. I assume StartSSL opted for TLS client certificates because they're probably harder for users to compromise than passwords, since you can't exactly write them down (although you can send them around in email and so on).)

TLSClientCertificateWhen written at 22:19:36

2017-04-28

Hardware RAID and the problem of (not) observing disk IO

We normally don't use machines with hardware RAID controllers; we far prefer software RAID. But in the grand university tradition of not having any money, sometimes we inherit machines with them and have to make them work. Which brings me to my tweet:

This hardware RAID controller's initialization of an idle RAID-6 array of 2 TB disks has been running for more than 24 hours now.

At one level this is probably typical; it likely took about as long for Linux's software RAID to initialize a similar RAID-6 array on another similar machine recently. I suspect that both of them are (or were) doing it in one of the slow ways.

But on another level, the problem is that I have no idea if anything is wrong here. One reason for that is that as far as I know, there's no way on this hardware RAID controller for me to see IO performance information for the individual disks. All I get is aggregate performance (or staring at the disk activity lights, except that on this piece of hardware they're basically invisible). The result is that I have no idea what my disks are actually doing and how fast they're doing that. Is the initialization going slowly because it's seek-heavy, or because the hardware RAID controller is only doing IO slowly (to preserve IO bandwidth for non-existent other requests), or because the disks just aren't going that fast, or some other reason? I can't tell. It's an opaque black box.
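
By contrast, with Linux software RAID the member disks are ordinary block devices, so per-disk IO numbers are sitting right there in /proc/diskstats (which is also what tools like iostat read). A rough sketch, with a deliberately naive filter for whole 'sd' disks:

def sectors_written():
    # /proc/diskstats: major, minor, device name, then per-device stats;
    # the seventh stats field (index 9 of the split line) is sectors written,
    # per Documentation/admin-guide/iostats.rst.
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            stats[fields[2]] = int(fields[9])
    return stats

for dev, sectors in sorted(sectors_written().items()):
    if dev.startswith("sd") and not dev[-1].isdigit():   # crude: whole sd* disks only
        print(dev, "sectors written:", sectors)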

In a world where disks either work perfectly as specified or fail outright, this might be okay. You could measure (or know) the raw disk performance, and then use the observed array performance to derive more or less what load each disk should be seeing and how it must be performing. Obviously there are a lot of uncertainties and assumptions in that; we're assuming that IO is divided evenly over the drives, for example. But this is not that world; in this world, disks can quietly have their performance degrade. In an array with evenly distributed IO, this will have the effect of 'smearing' a single degraded disk's bad performance across the entire array. Instead of seeing that one disk has become a lot slower, it looks like all your disks have slowed down somewhat. And if you suspect that you have such a quietly degraded drive, well, good luck finding it.

I don't know if really good hardware RAID has this sort of observability built into it; I've only had one exposure to theoretically enterprise-level RAID, which had its issues. But I'm pretty sure that garden variety hardware RAID doesn't, and it's become clear to me that that is a pretty big strike against garden variety hardware RAID. Without low level observability it's very close to impossible to diagnose any number of performance problems. Black box hardware RAID that you don't have to think about is all very good until it turns into a badly performing black box, and then all you can do is throw it away and start again.

HardwareRAIDObservability written at 01:32:26

2017-04-19

Some things on how PCs boot the old fashioned BIOS way

A while back I wrote on how PCs boot and hard disk partitioning, which gave a quick description of the original BIOS MBR-based method of booting a PC (in contrast to UEFI booting). It's worth talking about BIOS booting in slightly more detail so that we can understand some constraints on it.

In practice, BIOS MBR booting goes through at least two and often three stages. The first stage loader is started by the BIOS when the BIOS loads the first 512-byte sector from an appropriate hard disk (which gets complicated these days) and then transfers control to it. Generally all that the first stage does is load more code and data from disk, in other words load the second stage. Because the first stage has only a few hundred bytes to work with, most first-stage loaders hard code the disk block locations where they find the second stage, and basically all first stage loaders use BIOS services to read data from disk.
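
You can look at the pieces of this first stage yourself. Here's a small Python sketch that checks a disk's 512-byte MBR for the 0x55 0xAA boot signature (the device name is just an example, and reading it requires root):

import sys

def inspect_mbr(device="/dev/sda"):
    with open(device, "rb") as f:
        sector = f.read(512)
    if sector[510:512] != b"\x55\xaa":
        print(device, "has no MBR boot signature")
        return
    # Classic MBR layout: bytes 0-445 are first stage boot code and its data,
    # bytes 446-509 are the four primary partition table entries.
    print(device, "boot code begins:", sector[:16].hex())

if __name__ == "__main__":
    inspect_mbr(sys.argv[1] if len(sys.argv) > 1 else "/dev/sda")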

The second stage boot code contains at least the core of the bootloader, and is often fairly comprehensive. For GRUB, I believe the second stage image contains the GRUB shell and menu system, various software RAID and filesystem modules (such as the one that understands ext4), and some other things. For GRUB, the second stage can't boot your system by itself; instead it will proceed onwards to read your grub.cfg, possibly load additional GRUB modules that your grub.cfg tells it to, and then for Linux hopefully finish by loading your selected kernel and initramfs into memory and starting them.

I believe that most second stage bootloaders continue to use BIOS services to read any additional data they need from disk (for GRUB, your grub.cfg, any additional GRUB modules, and then finally the kernel and initramfs). This means that BIOS MBR based bootloaders are subject to the whims and limitations of those BIOS services, which in some situations may be significant.

It's common for bootloaders to read data directly out of your filesystem. This obviously means that they need to be able to understand your filesystem and anything it sits on top of (such as software RAID, btrfs, LUKS, and so on). Since bootloader filesystem and so on support can lag behind kernel support, this can leave you wanting (or needing) a separate /boot that your bootloader can understand.

A certain amount of this is a lot simpler with UEFI booting. UEFI has a concept of files in a (simple) filesystem, real NVRAM variables, and a reasonably rich collection of UEFI services. This means that bootloaders don't have to hardcode disk block locations or have complicated ways of hunting for additional files; instead they can do relatively normal IO. Since UEFI will load an entire UEFI executable for you, it also eliminates the first stage bootloading.

Since UEFI booting is better in a number of ways than BIOS booting, you might imagine that an increasing number of machines are set up to boot that way. Perhaps they are for name-brand desktops that are preinstalled with Windows, but at least for Linux servers I don't believe that that's the case. Part of this is sheer pragmatics; the people putting together installers for Linux distributions and so on know that BIOS MBR booting works on basically everything while UEFI may or may not be supported and may or may not work well. Maybe in five years, server installs will default to UEFI unless you tell them to use BIOS MBR.

PCBIOSBootingStages written at 23:19:51

2017-04-10

How TLS certificates specify the hosts they're for

Any TLS-like system of public key cryptography plus Certificate Authorities that vouch for your identity needs a format for certificates. When Netscape came up with the very first version of SSL, they opted not to invent a certificate format to go with their new encryption protocol; instead they reused the certificate format that had already been proposed for email in 1993 in RFC 1422 (as part of PEM). This foundational work didn't define its own certificate format either; instead it opted to politely reuse an existing specification, X.509, which is associated with the X.500 directory specification, which as a broad whole was intended to be part of the OSI network stack before OSI was completely run over by the Internet TCP/IP juggernaut. This reuse in PEM was apparently intended in part to help interoperability with X.500/X.509 based email systems that more or less never materialized.

(The Internet used to worry a great deal about talking to foreign email systems and how to map between SMTP email concepts and those systems.)

A standard part of X.509 certificates is their Subject name, ie what the certificate is about. X.509 subject names are not freeform text; instead they are a set of attributes making up an X.500 Distinguished Name. Since these attributes come from an OSI directory specification, they're focused on identifying people in some role, as you can see from the list of standard DN keys in RFC 1779. This presented Netscape with a little problem, namely how to use a certificate to identify an Internet host instead of a person. So back in 1995 and 1996 when SSL v2 and v3 were specified, people took the obvious way out. X.500 DNs have a field called the CN, or Common Name, which for people is your name. For host certificates, the CN field was reused as the hostname (with the other DN fields being used to identify the organization officially associated with the certificate and so on).

In the beginning this was fine. However, as SSL went on people started wanting to have more than one hostname associated with a single SSL certificate; if nothing else, people liked having both www.<domain> and plain <domain>. This is a problem because the CN only contained a single hostname. Fortunately X.509 certificates can contain extensions, and so by no later than RFC 2459 (which went with TLS 1.0) there was the Subject Alternative Name extension. A SubjectAltName was explicitly a sequence of things (including DNS hostnames), so you could put multiple hostnames in it.

Then things get complicated. As of RFC 3280 in 2002, including SANs in TLS certificates was theoretically mandatory, and in fact using a CN was deprecated in RFC 2818 in 2000. How SANs and the CN interact depends on the software involved; per this Stackoverflow answer, RFC 5280 in 2008 said to look at both, while RFC 6125 in 2011 theoretically mandated checking SANs only if they were present (and they should always be present). See also this SO answer for more information about the CA/Browser Forum Baseline Requirements view on this, and there's also the discussion in this Let's Encrypt issue.

Given that there is plenty of pre-2011 software out there on the Internet, you can in practice find TLS certificates with all variants of this; with a CN only, with SANs only, with SANs and a CN that contains one of the SAN's names, and with SANs that do not contain the certificate's CN. With the widespread prevalence of multiple hostnames in SANs in certificates (sometimes a very long list of them, eg on Cloudflare's web servers), all even vaguely modern TLS software can validate certificates using only SAN names. I don't know if there's any browser or TLS software yet that ignores the CN or actively requires certificates to have a SAN, but it's probably not too far off; at this point, it's likely to be at least a little bit unsafe to use TLS certificates without a SAN.
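
If you want to see what names a particular server's certificate actually carries, a quick sketch with Python's standard library is enough (the host here is just an example):

import socket, ssl

host = "www.example.org"
ctx = ssl.create_default_context()
with socket.create_connection((host, 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname=host) as tls:
        cert = tls.getpeercert()

# The X.509 Subject, which may include a CN:
print("subject:", cert.get("subject"))
# The Subject Alternative Names, which modern validation actually uses:
print("SANs:   ", cert.get("subjectAltName"))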

(You probably can't get CA signed certificates without a SubjectAltName, because the CA will add them for you whether or not you like it. Self-signed certificates are a different matter entirely, but that's another entry.)

TLSCertificatesNamingHosts written at 21:55:09

2017-03-23

ARM servers had better just work if vendors want to sell very many

A few years ago I wrote about the cost challenge facing hypothetical future ARM servers here; to attract our interest, they'd have to be cheaper or better than x86 servers in some way that we cared about. At the time I made what turns out to be a big assumption: I assumed that ARM servers would be like x86 servers in that they would all just work with Linux. Courtesy of Pete Zaitcev's Standards for ARM computers and Linaro and the follow-on Standards for ARM computers in 2017, I've now learned that this was a pretty optimistic assumption. The state of play in 2017 is that LWN can write an article called Making distributions Just Work on ARM servers that describes not current reality but an aspirational future that may perhaps arrive some day.

Well, you know, no wonder no one is trying to actually sell real ARM servers. They're useless to a lot of people right now. Certainly they'd be useless to us, because we don't want to buy servers with a bespoke bootloader that probably only works with one specific Linux distribution (the one the vendor has qualified) and may not work in the future. A basic prerequisite for us being interested in ARM servers is that they be as useful for generic Linux as x86 servers are (modulo which distributions have ARM versions at all). If we have to buy one sort of server to run Ubuntu but another sort to run CentOS, well, no. We'll buy x86 servers instead because they're generic and we can even run OpenBSD on them.

There are undoubtedly people who work at a scale with a server density where things like the power advantages of ARM might be attractive enough to overcome this. These people might even be willing to fund their own bootloader and distribution work. But I think that there are a lot of people who are in our situation; we wouldn't mind extra-cheap servers to run Linux, but we aren't all that interested in buying servers that might as well be emblazoned 'Ubuntu 16.04 only' or 'CentOS only' or the like.

I guess this means I can tune out all talk of ARM servers for Linux for the next few years. If the BIOS-level standards for ARM servers for Linux are only being created now, it'll be at least that long until there's real hardware implementing workable versions of them that isn't on the bleeding edge. I wouldn't be surprised if it takes half a decade before we get ARM servers that are basically plug and play with your choice of a variety of Linux distributions.

(I don't blame ARM or anyone for this situation, even though it sort of boggles me. Sure, it's not a great one, but the mere fact that it exists means that ARM vendors haven't particularly cared about the server market so far (and may still not). It's hard to blame people for not catering to a market that they don't care about, especially when we might not care about it either when the dust settles.)

ArmServersHaveToJustWork written at 23:14:50
