Some things about booting with UEFI that are different from MBR booting
If you don't dig into it, a PC that boots with UEFI seems basically the same as one that uses BIOS MBR booting, even if you have multiple OSes installed (for example, Linux and Windows 10). In either case, with Linux you boot into a GRUB boot menu with entries for Linux kernels and also Windows, and you can go on to boot either. However, under the hood this is an illusion and there are some important differences, as I learned in a recent UEFI adventure.
In BIOS MBR booting, there's a single bootloader per disk (loaded from the MBR). You only ever boot this bootloader; if it goes on to boot an entirely different OS, it's often doing tricky magic to make that OS think it was booted directly from the MBR. If you call up the BIOS boot menu, what it offers you is a choice of which disk to load the MBR bootloader from. When you install a bootloader on a disk, for example when your Linux distribution's installer sets up GRUB, it overwrites any previous bootloader present; in order to keep booting other things, they have to be in the configuration for your new bootloader. Since there's only one bootloader on a disk, loss or corruption of this bootloader is fatal for booting from the disk, even if you have an alternate OS there.
In UEFI booting, there isn't a single bootloader per disk the way
there is with MBR booting. Instead, the UEFI firmware itself may
have multiple boot entries; if you installed multiple OSes, it
almost certainly does (with one entry per OS). The UEFI boot manager
tries these boot entries in whatever order it's been set to, passing
control to the first one that successfully loads. This UEFI bootloader
can then do whatever it wants to; in GRUB's case, it will normally
display its boot menu and then go on to boot the default entry. If
you call up the UEFI firmware boot menu, what you see is these UEFI
boot entries, probably augmented with an entry for any additional
disk that has an EFI system partition with a \EFI\BOOT\BOOTX64.EFI
on it (this is the default UEFI bootloader name for 64-bit x86
systems). This may reveal UEFI boot entries that you didn't realize
were (still) there, such as a UEFI Windows boot entry or a lingering
entry for an OS you thought you'd removed.
(If you have multiple fixed disks with EFI system partitions, I believe that you can have UEFI boot entries that refer to different disks. So in a mirrored system disk setup, in theory you could have a UEFI boot entry for the EFI system partition on each system disk.)
The possibility of multiple UEFI boot entries means that your machine can boot an alternate OS that has a UEFI boot entry even if your normal primary (UEFI) bootloader is damaged, for example if it has a corrupted or missing configuration file. In some situations your machine may transparently fall back to such an additional UEFI boot entry, which can be pretty puzzling if you're used to the normal BIOS MBR situation where either your normal bootloader comes up or the BIOS reports 'cannot boot from this disk'. It's also possible to have two UEFI boot entries for the same OS, one of which works and one of which doesn't (or, for a non-hypothetical example, one which only works when Secure Boot is off because it uses an unsigned UEFI bootloader).
A UEFI bootloader that wants to boot an alternate OS has more options than a BIOS MBR bootloader does. Often the simplest way is to use UEFI firmware services to load the UEFI bootloader for the other OS and transfer control to it. For instance, in GRUB:
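Such a GRUB menu entry looks something like the following sketch (the filesystem UUID is a placeholder for your EFI system partition's UUID; /EFI/Microsoft/Boot/bootmgfw.efi is where Windows normally installs its UEFI bootloader):

```
menuentry 'Windows' {
        insmod part_gpt
        insmod fat
        insmod chain
        search --no-floppy --fs-uuid --set=root XXXX-XXXX
        chainloader /EFI/Microsoft/Boot/bootmgfw.efi
}
```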
This is starting exactly the same Windows UEFI bootloader that my
Windows UEFI boot entry uses. I'm not sure that Windows notices any
difference between being booted directly from its UEFI boot entry
and being chainloaded this way. However, such chainloading doesn't
require that there still be a UEFI boot entry for the alternate OS,
just that the UEFI bootloader
.EFI file still be present and
working. Similarly, you can have UEFI boot entries for alternate
OSes that aren't present in your GRUB menu; the two systems are
normally decoupled from each other.
(You could have a UEFI bootloader that read all the UEFI boot entries
and added menu entries for any additional ones, but I don't believe
that GRUB does this. You could also have a grub.cfg menu builder
that uses efibootmgr to automatically discover such additional
entries and add them to your GRUB menu.)
A UEFI bootloader is not obliged to have a boot menu or support booting alternate OSes (or even alternate installs of its own OS), because in theory that's what additional UEFI boot entries are for. The Windows 10 UEFI bootloader normally boots straight into Windows, for example. Linux UEFI bootloaders will usually have an option for a boot menu, though, because in Linux you typically want to have more than one kernel as an option (if only so you can fall back to the previous kernel if a new one has problems).
(In theory you could probably implement multiple kernels as multiple UEFI boot entries, but this gets complicated; there are only so many of them (I believe five), and apparently UEFI firmware is often happier if you change its NVRAM variables as little as possible.)
Sidebar: UEFI multi-OS food fights
In the BIOS MBR world, installing multiple OSes could result in each new OS overwriting the MBR bootloader with its own bootloader, possibly locking you out of the other OSes. In the UEFI world there's no single bootloader any more, so you can't directly get this sort of food fight; each OS should normally only modify its own UEFI boot entry and not touch other ones (although if you run out of empty ones, who knows what will happen). However, UEFI does have the idea of a user-modifiable order for these boot entries, so an OS (new or existing) can decide that its UEFI boot entry should of course go at the front of that list, so it's the default thing that gets booted by the machine.
I suspect that newly installed OSes will almost always try to put themselves in as the first and thus default UEFI boot entry. Existing OSes may or may not do this routinely, but I wouldn't be surprised if they did it when you tell them to check for boot problems and repair anything they find. Probably this is a feature.
Some notes and considerations on SSH host key verification
Suppose, not entirely hypothetically, that you want to verify the
SSH host keys of a server and that you're doing so with code that's
reasonably under your control (instead of relying on, say, OpenSSH's
ssh program). Then there are a number of things that you're going
to want to think about because of how the SSH protocol works and
how it interacts with security decisions.
The first thing to know is that you can only verify one type of host key in a single connection. As covered in RFC 4253 section 7.1, the client (you) and the server (the remote end) send each other a list of supported host key algorithms, and then the two of you pick one of the supported algorithms and verify the server's key in that algorithm. If you know multiple types of host keys for a server and you want to verify that the server knows all of them, you need to verify each type of key in a separate connection.
In theory, the client controls the preference order of the SSH host key types; you can say that you prefer ed25519 keys to RSA keys, and the server should then send its ed25519 key instead of its RSA key. In practice, a server can get away with sending you any type of host key that you said you'd accept, even if it's not your preference, because a server is allowed to claim that it doesn't have your preferred sort of host key (although well-behaved servers will respect your preference order). As a result, if you're verifying host keys you have a security decision to make: are you willing to accept any type of host key you have on file, or if you have your preferred type of host key on file, do you insist that the server present that type of key?
To be concrete, suppose that you have ed25519 and RSA keys for a server, you prefer ed25519 keys, and when you try to verify the server it offers you its RSA key instead of its ed25519 key. You could reject this on the grounds that either the server doesn't have the ed25519 key it should or it's not following the protocol specification, or you could accept it because the server presented an SSH host key that you have on file for it.
(As far as I can tell, OpenSSH's ssh command behaves the second
way; it'll accept an RSA key even if you also have an ed25519 key
for the server in your known_hosts file.)
If you pick the first approach, you want to configure your SSH connection to the server to only accept a single key type, that being the best key type you have on file for the server. If you pick the second approach, you'll want to list all key types you have, in preference order (I prefer ed25519 to RSA and skip (EC)DSA keys entirely, while the current OpenSSH ssh_config manpage prefers ECDSA to ed25519 to RSA).
Under normal circumstances, the server will present only a single host key to be verified (and it certainly can only present a single type of key). This means that if you reject the initial host key the server presents, you will never be called on to verify another type of host key. If the server presents an ed25519 key and you reject it, you'll never get asked to verify an RSA key; the connection just fails. If you wanted to fall back to checking the RSA key in this case, you would have to make a second connection (during which you would only ask for RSA keys). In other words, if the server presents a key it must be correct. With straightforward code, your condition is not 'the server passes if it can eventually present any key that you know', your condition is 'the server passes if the first and only key it presents is one you know'.
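As a concrete sketch of the two policies, assuming you track known host keys as a mapping from key type to fingerprint (the key type names are the standard SSH algorithm names, but the fingerprints and the code itself are illustrative, not OpenSSH internals):

```python
# Sketch of the two host key verification policies. Known keys are
# tracked as a mapping from key type to fingerprint; the fingerprint
# values here are made up for illustration.
KEY_PREFERENCE = ["ssh-ed25519", "ssh-rsa"]  # our order: ed25519 first

def verify_strict(known, presented_type, presented_fp):
    """First approach: insist on the best key type we have on file."""
    best = next(t for t in KEY_PREFERENCE if t in known)
    return presented_type == best and known[best] == presented_fp

def verify_any_known(known, presented_type, presented_fp):
    """Second approach (apparently what OpenSSH's ssh does): accept any
    key type we have on file, as long as the key itself matches."""
    return known.get(presented_type) == presented_fp

known = {"ssh-ed25519": "fp-ed25519", "ssh-rsa": "fp-rsa"}
# The server offers its (correct) RSA key even though we prefer ed25519:
assert not verify_strict(known, "ssh-rsa", "fp-rsa")
assert verify_any_known(known, "ssh-rsa", "fp-rsa")
```

Note that under the strict policy, the RSA key is rejected even though it matches what's on file, because ed25519 was available and preferred.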
PS: If you want to match the behavior of OpenSSH's ssh command,
I think you're going to need to do some experimentation with how
it actually behaves in various situations. I'm sure that I don't
fully understand it myself. Also, you don't necessarily want to
imitate ssh here; it doesn't necessarily make the most secure
choices. For instance, ssh will happily accept a known_hosts
file where a server has multiple keys of a given type, and pass the
server if it presents a key that matches any one of them.
Sidebar: How a server might not know some of its host keys
The short version is re-provisioning servers. If you generate or record a server's host key of a given type, you need to also make sure that the server is (re-)provisioned with that key when it gets set up. If you miss a key type, you'll wind up with the server generating and presenting a new key of that type. This has happened to us every so often; for example, we missed properly re-provisioning ed25519 keys on Ubuntu 14.04 machines for a while.
The cost of memory access across a NUMA machine can (probably) matter
We recently had an interesting performance issue reported to us by a researcher here. We have a number of compute machines, none of them terribly recent; some of them are general access and some of them can be booked for exclusive usage. The researcher had a single-core job (I believe using R) that used 50 GB or more of RAM. They first did some computing on a general-access compute server with Xeon E5-2680s and 96 GB of RAM, then booked one of our other servers with Xeon X6550s and 256 GB of RAM to do more work on (possibly work that consumed significantly more RAM). Unfortunately they discovered that the server they'd booked was massively slower for their job, despite having much more memory.
We don't know for sure what was going on, but our leading theory is NUMA memory access effects because the two servers have significantly different NUMA memory hierarchies. In fact they are the two example servers from my entry on getting NUMA information from Linux. The general access server had two sockets for 48 GB of RAM per socket, while the bookable compute server with 256 GB of RAM had eight sockets and so only 32 GB of RAM per socket. To add to the pain, the high-memory server also appears to have a higher relative cost for access to the memory of almost all of the other sockets. So on the 256 GB machine, memory access was likely going to other NUMA nodes significantly more frequently and then being slower to boot.
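One way to see the size of the effect is a back-of-the-envelope weighted average using numactl-style relative node distances (the 10 and 21 figures here are typical numactl distance values, not measurements from these machines):

```python
def avg_access_cost(local_fraction, local_cost=10, remote_cost=21):
    """Expected relative memory access cost for a process pinned to one
    NUMA node, given the fraction of its memory that fits locally.
    Costs are numactl-style relative distances (local access is 10)."""
    return local_fraction * local_cost + (1 - local_fraction) * remote_cost

# A 50 GB single-core job: with 48 GB of RAM per node, almost all
# accesses can be local; with 32 GB per node, over a third cannot be.
two_socket = avg_access_cost(48 / 50)
eight_socket = avg_access_cost(32 / 50)
assert eight_socket > two_socket
```

This is obviously a crude model (it ignores what else is using each node's memory, and the eight-socket machine's distances were apparently worse than 21 for most nodes), but it captures the direction of the effect.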
Having said that, I just investigated and there's another difference; the 96 GB machine has DDR3 1600 MHz RAM, while the 256 GB machine has DDR3 RAM at 1333 MHz (yes, they're old machines). This may well have contributed to any RAM-related slowdown and makes me glad that I checked; I don't usually even consider RAM module speeds, but if we think there's a RAM-related performance issue it's another thing to consider.
I found the whole experience to be interesting because it pointed out a blind spot in my usual thinking. Before the issue came up, I just assumed that a machine with more memory and more CPUs would be better, and if it wasn't better it would be because of CPU issues (here they're apparently generally comparable). That NUMA layout (and perhaps RAM speed) made the 'big' machine substantially worse was a surprise. I'm going to have to remember this for the future.
PS: The good news is that we had another two-socket E5-2680 machine with 256 GB that the researcher could use, and I believe they're happy with its performance. And with 128 GB of RAM per socket, they can fit even quite large R processes into a single socket's memory.
AMD Ryzens, their memory speed peculiarities, and ECC
Intel's Core i7 CPUs, such as the one I'm planning to use in my next PC, have a nice simple memory support story; there's no ECC support and regardless of how many DDR4 DIMMs you use, they'll run at an officially supported maximum rate of 2666 MHz. Unfortunately AMD Ryzen memory support is nowhere near that simple and some of its complexities create hassles. Since I've recently been putting together a Ryzen system configuration for reasons beyond the scope of this entry, I want to write down what I've learned.
To start out with, Ryzens (and apparently everything made from AMD's Zen microarchitecture) have restrictions on what memory speeds they can achieve with various types and numbers of DIMMs. Current Ryzens support four DIMM slots and from the charts, if you have two DIMMs, single rank DIMMs give you a maximum memory speed of 2666 MHz and double rank DIMMs a maximum of 2400 MHz, while with four DIMMs the single rank maximum is 2133 MHz and the double rank maximum is 1866 MHz. This is without overclocking; if you overclock, apparently you can significantly improve the four DIMM cases.
(I couldn't dig up a clear answer about maximum memory speeds for single-channel mode, but I believe they're probably the same and it's just that your bandwidth drops. Most people install DIMMs in pairs these days.)
Many DDR4 DIMMs appear to be double rank, although it's often hard to tell; memory vendors generally aren't clear about this. Kingston is a noteworthy exception, as they let you search their DDR4 modules based on rank (among other things, and note that you need unbuffered DDR4 DIMMs for Ryzen). Where single rank DDR4 DIMMs are available, they appear to only go up to 8 GB sizes; in 16 GB, you only have double rank DIMMs. This means that if you want 32 GB of RAM with a Ryzen, your maximum memory speed is 2400 MHz using two 16 GB DDR4-2400 DIMMs. Using single rank 8 GB DIMMs is actually worse, since four DIMMs push you down to a maximum of 2133 MHz.
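To keep the speed chart straight, here it is as a simple lookup, encoding just the figures quoted above (without overclocking):

```python
# Maximum Ryzen memory speed in MHz, keyed on (number of DIMMs,
# ranks per DIMM), per the chart figures quoted above (no overclocking).
MAX_MHZ = {
    (2, "single"): 2666,
    (2, "double"): 2400,
    (4, "single"): 2133,
    (4, "double"): 1866,
}

# 32 GB as two double-rank 16 GB DIMMs beats 32 GB as four
# single-rank 8 GB DIMMs:
assert MAX_MHZ[(2, "double")] > MAX_MHZ[(4, "single")]
```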
Before I started writing this entry, I was going to say that you could only get ECC memory for Ryzens in DIMMs of at most 8 GB. More extensive research shows that this is not the case; both Kingston and Crucial have 16 GB ECC DDR4-2400 UDIMMs that their websites consider compatible with some Ryzen motherboards such as the ASRock X370 Taichi (taken from here, which tested Ryzen ECC with 8 GB Crucial single rank ECC UDIMMs) or the Asus PRIME X370-PRO. However, the available ECC UDIMMs appear to have conservative timings and thus I believe slightly higher latency than other DDR4 UDIMMs (eg, CL 17 versus CL15 for more performance focused non-ECC DDR4-2400 UDIMMs). Whether this will make a practical difference for most uses is an open question, especially for a non-gaming workstation.
(A casual check suggests that ECC DDR4-2400 also appears to be clearly higher priced than non-ECC DDR4-2400, which is already annoyingly expensive.)
PS: I believe that much of this applies to AMD Threadripper and even Epyc CPUs and systems, because they're all built around the same microarchitecture. Threadrippers are quad channel instead of dual channel, so you get twice as many DIMMs before you run into these limitations, but that still means that if you want fast memory you're stuck with at most 64 GB of RAM in a Threadripper system. 64 GB is probably lots for a desktop, but we're looking at Threadripper based compute servers and our users like memory.
We may have reached a point where (new) ARM servers will just work for Linux
Back in March I wrote about how ARM servers had better just work, which pointed to various articles on what seemed to be a fairly problematic situation around basic questions like whether there was a standard way to boot Linux on ARM based servers. Today skeeto asked what I'd make of the news of Red Hat Enterprise Linux introducing ARM server support. So I opened my mouth on Twitter and got some very useful replies from Jon Masters, who is Red Hat's chief ARM architect. The short version of the answer is that 'modern' ARM servers (by which I mostly mean future ARM servers, since I don't think very many are out now) will basically behave like x86 servers with UEFI.
ARM itself has established a ServerReady certification program, and it has an underlying standard for server booting called SBBR that goes with a standard for server architecture as a whole (SBSA). The ServerReady program appears to provide a convenient way to know if your ARM-supporting Linux should boot on a particular ARM server; if the ARM server has the certification, it supports everything that Linux needs to boot. Presumably ARM servers will basically all be ServerReady ones. Given the Red Hat announcement, RHEL is going to support ServerReady ARM servers; CentOS either already does or perhaps will soon. Ubuntu already supports ARM systems with UEFI firmware in 16.04 LTS, at least according to their wiki page.
(Other OSes that can use UEFI and ACPI can presumably take advantage of the ARM standards too. FreeBSD is apparently on course to make 64-bit server ARM a fully supported platform, per here.)
All of this sounds very nice, but then I paid attention to a little sentence from Cloudflare's recent post on ARM servers:
Up until now, a major obstacle to the deployment of ARM servers was lack, or weak, support by the majority of the software vendors. In the past two years, ARM’s enablement efforts have paid off, as most Linux distros, as well as most popular libraries support the 64-bit ARM architecture. Driver availability, however, is unclear at that point.
There was enough driver support that Cloudflare could run Linux on an engineering sample of Qualcomm's Centriq server platform, but that still leaves a lot of question marks. If ARM servers have an entirely new set of hardware for things like Ethernet, USB, and SATA, then we could be back in the early dark days of Linux on x86 servers, where every new server came with big question marks about Linux support (especially in 'long term support' stable releases). An optimistic hope would be that ARM server vendors will be reusing a lot of existing PCIE-based hardware and the Linux drivers for it all are architecture-independent and just work on ARM.
(To a certain extent I expect that this has to be the way things will go. It seems unlikely that ARM server vendors will try to do their own hardware for 10G Ethernet, SAS chipsets, computational GPUs, and various other things (or that it would be very successful if they tried).)
PS: even if all the hardware works, the cost challenge remains important for us. As a result I doubt we'll be using ARM servers any time soon, since I expect the early generations of them to be focusing on people who care about total system performance for the cost, performance per watt, and issues with heat and density.
What it means to support ECC RAM (especially for AMD Ryzen)
Ever since the AMD Ryzen series of CPUs was introduced, there's been a lot of confusion about whether they support ECC RAM and to what degree. One of the sources of confusion and imprecision is that there are a number of different possible meanings of 'supporting ECC RAM'. So let's run down the hierarchy:
1. The system will power up and run with ECC RAM modules installed.
2. Single-bit errors will (always) be corrected.
3. Corrected single-bit errors will be reported and logged, so you can know that you have a problem.
4. Double-bit errors will be detected, reported and logged, so you at least know when they've happened even though ECC can't fix them.
5. Double-bit errors will fault and panic the system, rather than it continuing on with known memory errors.
When server-class systems are said to 'support ECC RAM', people mean that they do everything up to at least #4 and often #5. People who buy servers would be very unhappy if you sold them one that claimed to support ECC but merely meant 'works with ECC RAM' or 'silently corrects single-bit errors'; this is not what they expect and want, even if 'silently corrects single-bit errors' means that ECC is doing something to help system reliability.
(With that said, correcting single-bit errors is not nothing, since single-bit errors are expected to be the majority of RAM errors. And if you believe that your RAM is good in general and it's just being hit by stray cosmic rays and other random things, not having reports is not a big issue because they probably wouldn't be telling you anything actionable. But server people really don't like to make those assumptions; they want reports so that if errors are frequent or not random, they can see.)
I think it's safe to say that people who specifically want ECC on non-server systems consider #2 to be the bare minimum. If the system lacks it and only 'supports' ECC in the sense of running with ECC RAM modules, you're basically paying extra for your RAM for nothing. A fair number of people would probably be reluctantly satisfied with this level, but I believe most people want up to at least #4 (where all errors are logged and correctable errors are fixed). Whether you want your desktop to reboot out from underneath you on an uncorrectable ECC error is likely something that opinions vary on.
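On Linux, #3 and #4 show up through the kernel's EDAC subsystem, which exposes per-memory-controller error counts in sysfs. A small sketch that tallies them (the ce_count/ue_count sysfs layout is real; the base path is parameterized so the function can be pointed at a test tree):

```python
import glob
import os

def edac_counts(base="/sys/devices/system/edac/mc"):
    """Sum corrected (ce_count) and uncorrected (ue_count) error counts
    across all memory controllers under the given sysfs directory."""
    ce = ue = 0
    for mc in glob.glob(os.path.join(base, "mc*")):
        try:
            with open(os.path.join(mc, "ce_count")) as f:
                ce += int(f.read())
            with open(os.path.join(mc, "ue_count")) as f:
                ue += int(f.read())
        except OSError:
            pass
    return ce, ue
```

If no mc* directories exist at all, the kernel may not have an EDAC driver for your memory controller, which is itself a hint about what level of 'support' you're actually getting.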
In the absence of clear statements about what 'supporting ECC RAM' means in a non-server context (and perhaps even in a server one), people who want more than just the first level of (nominal) support are left with a great deal of uncertainties. As far as I know, this has been and continues to be the situation with AMD Ryzens and ECC RAM support; no one is prepared to officially make a clear statement about it, and without official statements we don't know what's guaranteed and what's not. For example, it's possible that there are microcode or chipset issues which mean that ECC error detection and correction isn't reliable.
(Some people have done testing with Ryzens, but that just shows what happens some of the time, under some test situations. For example, that some single-bit errors are detected, corrected, and logged doesn't mean that all of them are.)
There are two sorts of TLS certificate mis-issuing
The fundamental TLS problem is that there are a ton of Certificate Authorities and all of them can create certificates for your site and give them to people. When talking about this, I think it's useful to talk about two different sorts of mis-issuance of these improper, unapproved certificates.
The first sort of mis-issuance is when a certificate authority's systems (computer and otherwise) are fooled or spoofed into issuing an otherwise normal certificate that the CA should not have approved. This issuance process goes through the CA's normal procedures, or something close to them. As a result, the issued certificate is subject to all of the CA's usual logging, OCSP processes (for better or worse), and so on. Crucially, this means that such certificates will normally appear in Certificate Transparency logs if the CA publishes to CT logs at all (and CAs are increasingly doing so).
The second sort of mis-issuance is when the CA is subverted or coerced into signing certificates that don't go through its normal procedures. This is what happened with DigiNotar, for example; DigiNotar was compromised, and as a result of that compromise some unknown number of certificates were improperly signed by DigiNotar. This sort of mis-issuance is much more severe than the first sort, because it's a far deeper breach and generally means that far less is known about the resulting certificates.
My impression is that so far, the first sort of mis-issuance seems much more common than the second sort. In a way this is not surprising; identity verification is tricky (whether manual or automated) and is clearly subject to a whole lot of failure modes.
The corollary to this is that mandatory Certificate Transparency logging can likely do a lot to reduce the impact and speed up the time to detection of most mis-issued certificates. While it can't do much about the second sort of mis-issuance, it can pretty reliably work against the first sort, and those are the dominant sort (at least so far). An attacker who wants to get a mis-issued certificate that isn't published to CT logs must not merely break a CA's verification systems but also somehow compromise their backend systems enough to subvert part of the regular certificate issuance processing. This is not quite a full compromise of the CA's security, but it's a lot closer to it than merely finding a way around the CA's identity verification processes (eg).
There are several ways to misread specifications
In an ideal world, everyone reads specifications extremely carefully and arrives at exactly the same results. In the real world, generally specifications are read casually, with people skimming things, not putting together disparate bits, and so on. If the specification is not clear (and perhaps even if it is), one of the results of this is misinterpretations. However, it's my view that not all misinterpretations are the same, and these differences affect both how fast the misreading propagates and how likely it is to be noticed (both of which matter since ultimately specifications are defined by their implementations).
I see at least three ways for misreadings to happen. First, you can accept more than the specification wants; if a field is specified as ASCII, you can also accept UTF-8 in the field. Second, you can accept less than the specification wants; if a field is specified as UTF-8, you can only accept ASCII. Finally, you can have a different interpretation of some aspect; the specification might have a 'hostname' field where it's intended to accept ASCII or UTF-8, but you accept ASCII or IDNA ASCII and decode the IDNA ASCII to UTF-8.
In a sense, the important thing about all of these misreadings is that they're only partially incompatible with the real specification. In all of my examples, a pure-ASCII thing can be successfully interchanged between your misreading and a fully correct implementation. It's only when one side or the other goes outside of the shared base that problems ensue, and that's obviously part of how far a misreading can propagate before it's noticed.
There are also misreadings where you are outright wrong, for example the specification says 'text' and you assume this means UTF-16 instead of ASCII. But these sort of misreadings are going to be detected almost immediately unless the field (or option or whatever) is almost never used, because you'll be completely incompatible with everyone else instead of only partially incompatible.
My belief is that the misreading that will propagate the farthest and be noticed the slowest is a more expansive reading, because there's no way to notice this until someone eventually generates an improper thing and it's accepted by some implementations but not all. If those accepting implementations are popular enough, the specification will probably expand through the weight of enough people accepting this new usage.
(The corollary to this is that if a specification has conformance tests, you should include tests for things that should not be accepted. If your specification says 'field X is ASCII only' or 'field X cannot be HTML', include test vectors where field X is UTF-8 or contains HTML or whatever.)
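As a toy illustration, a conformance suite for a hypothetical ASCII-only field X should include vectors that a too-expansive implementation would wrongly accept (the validator here is made up for the example):

```python
def validate_field_x(value: str) -> bool:
    """Hypothetical conformance check: the spec says field X is ASCII only."""
    return all(ord(c) < 128 for c in value)

# Positive vectors: things the specification says must be accepted.
assert validate_field_x("hostname.example")
# Negative vectors: things that must NOT be accepted. Without these, an
# implementation that also accepts UTF-8 passes the test suite anyway.
assert not validate_field_x("h\u00f6stname.example")
```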
Misreadings that are more restrictive or different can flourish only when the things that they don't accept are missing, rare, or hard to notice the mistake in. This goes some way to explain why implementation differences are often found in the dark corners or obscure parts of a specification, in that there isn't enough usage for people to notice misreadings and other mistakes.
Why I plan to pick a relatively high-end desktop CPU for my next PC
My general reflex when it comes to any number of PC components, CPUs included, is to avoid the (very) highest end parts. This is for the well known reason that the top of the product range is where nearly everyone raises prices abruptly. If you want something close to the very best, the vendor will charge you for the privilege and the result is not that great a price to performance ratio. Going down a step or three can give you most of the benefits for much less cost. This is certainly the approach that I took with the CPU for my current machine, where picking an i5-2500 over an i7-2600 was about 2/3rds of the cost for much of the performance.
Well, you know what, this time around I'm not going to do that for the new PC I'm slowly planning. The lesser reason is that there is now much more of a (potential) performance difference in Intel's current desktop CPUs; an i5-8400 is clocked clearly lower than an i7-8700, on top of not having hyperthreading and having 3 MB less L3 cache. But the larger reason is that I'm no longer convinced that economizing here makes long term sense with how I've treated my PCs so far. Specifically, I seem to get at least five years out of each one and I don't upgrade it over that time.
I think that buying a cost-effective CPU makes a lot of sense if you're going to later upgrade it to another cost-effective CPU when the performance difference becomes meaningful to you. But if you're not, buying a cost-effective CPU that meaningfully underperforms a higher-end one means that your machine's relative performance slides sooner and faster. Buying a meaningfully faster CPU now keeps your machine from becoming functionally obsolete for longer, and if you amortize the extra cost over the (long) lifetime of your machine, it may not come out to all that much extra.
(It's my current view that CPU performance still matters to me, so I will notice differences here.)
To some degree this contradicts what I said in my thinking about whether I'd upgrade my next PC partway through its life, where I was open to a mid-life CPU upgrade. There are two answers here. One of them is that I don't really want to go through the hassle of a CPU upgrade, both in figuring out when to do it and then the actual physical process.
The other answer is that this is all rationalization and justification. In reality, I've become tired of doing the sensible, economical thing and settling for a 'not as good as it could reasonably be' system. Buying an objectively overpriced CPU is something that I can afford to do and it will make me irrationally happier with the resulting PC (and the five year amortized cost is not all that much).
HD usage can be limited by things other than cost per TB
I was recently reading WDC: No SSD/HDD Crossover (via), which reports Western Digital data that says that HDs will continue to have a significant price per TB advantage over SSDs for at least the next decade (the quoted figure is a 10:1 advantage). I'm perfectly prepared to believe this (I have no idea myself), but at the same time I don't think it's necessarily very relevant. The simple way to put it is that a great deal of storage is not bulk storage.
If you're doing bulk storage, then certainly the cost per TB matters a lot and HDs will likely continue to have the advantage. But if you're not, there are a number of other concerns that will probably clip the wings of HDs long before then. Two classical concerns are maintaining enough IOPS per TB, and the time it takes to restore your data redundancy after a disk is lost (whether through a RAID resynchronization or some other mechanism).
Larger and larger HDs might come with an increase in IOPS per disk, but history is fairly strongly against that; genuine sustainable IOPS per disk has been basically flat for HDs for years. This means that as your HDs grow bigger, IOPS per TB drops; the same amount of IOPS per disk is spread among more TB per disk. If you feel you need reasonably responsive random IO, this can easily mean that your usable TB per disk is basically capped. This is the situation that we're in with our fileservers, where we deliberately used 2 TB HDs instead of something larger in order to maintain a certain level of IOPS per TB.
(This IOPS limit is different from a situation where HDs simply can't provide enough IOPS to meet your needs.)
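The arithmetic here is simple but worth making explicit (the 100 IOPS figure is a common rule of thumb for 7200 RPM drives, not a measurement of any particular disk):

```python
def iops_per_tb(disk_iops, disk_tb):
    """IOPS available per TB of data stored on a single drive."""
    return disk_iops / disk_tb

# Roughly 100 sustained random IOPS per HD, regardless of capacity:
assert iops_per_tb(100, 2) == 50.0   # a 2 TB disk
assert iops_per_tb(100, 12) < 10     # a 12 TB disk: same IOPS, 6x the data
```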
The time required to restore full data redundancy after a disk failure goes up as you put more and more data on a single disk. If you lose a giant disk, you get to copy a giant disk's worth of data, and the bigger your disks are the longer this takes. At a certain point many people decide that they can't afford such long rebuild times, and so they have to cap the usable TB per disk. Alternately they have to build in more redundancy, which requires more disks and results in higher costs per usable TB of space (not raw TB of space).
(This has already happened once; as disks got larger, people moved away from RAID-5 in favour of RAID-6 in large part because of rebuild times and the resulting exposure if you lost a drive in a RAID-5 array.)
If your usable TB per disk is capped in this way, the only thing that larger, cheaper per TB HDs do for you is perhaps drive down the price of smaller right-sized disks (in our case, this would be 2 TB disks). Unfortunately, disks seem to have a floor price and as disk capacity increases, what you get for this floor price in a decent quality drive seems quite likely to go over the maximum TB that you can use. More to the point, the cost of SSDs with the same capacity is going to keep coming down toward where they're affordable enough. This is the real-world SSD inflection point for many environments; not the point where the price per TB of SSDs reaches that of HDs, but the point where you might as well use SSDs instead of the HDs that you can use, or at least you can afford to do so.