A CPU's TDP is a misleading headline number
The AMD Ryzen 1800X in my work machine and the Intel Core i7-8700K in my home machine are both 95 watt TDP processors. Before I started measuring things with the actual hardware, I would have confidently guessed that they would have almost the same thermal load and power draw, and that the impact of a 95W TDP CPU over a 65W TDP CPU would be clearly obvious (you can see traces of this in my earlier entry on my hardware plans). Since it's commonly said that AMD CPUs run hotter than Intel ones, I'd expect the Ryzen to be somewhat higher than the Intel, but how much difference would I really expect from two CPUs with the same TDP?
Then I actually measured the power draws of the two machines, both at idle and under various different sorts of load. The result is not even close; the Intel is clearly using less power even after accounting for the 10 watts of extra power the AMD's Radeon RX 550 graphics card draws when it's lit up. It's ahead at idle, and it's also ahead under full load when the CPU should be at maximum power draw. Two processors that I would have expected to be fundamentally the same at full CPU usage are roughly 8% different in measured power draw; at idle they're even further apart on a proportional basis.
(Another way that TDP is misleading to the innocent is that it's not actually a measure of CPU power draw, it's a measure of CPU heat generation; see this informative reddit comment. Generally I'd expect the two to be strongly correlated (that heat has to come from somewhere), but it's possible that something that I don't understand is going on.)
Intellectually, I may have known that a processor's rated TDP was merely a measure of how much heat it could generate at maximum and didn't predict either its power draw when idle or its power draw under load. But in practice I thought that TDP was roughly TDP, and every 95 watt TDP (or 65 watt TDP) processor would be about the same as every other one. My experience with these two machines has usefully smacked me in the face with how this is very much not so. In practice, TDP apparently tells you how big a heatsink you need to be safe and that's it.
(There are all sorts of odd things about the relative power draws of the Ryzen and the Intel under various different sorts of CPU load, but that's going to be for another entry. My capsule summary is that modern CPUs are clearly weird and unpredictable beasts, and AMD and Intel must be designing their power-related internals fairly differently.)
PS: TDP also doesn't necessarily predict your actual observed CPU temperature under various conditions. Some of the difference will be due to BIOS decisions about fan control; for example, my Ryzen work machine appears to be more aggressive about speeding up the CPU fan, and possibly as a result it seems to report lower CPU temperatures under high load and power draw.
(Really, modern PCs are weird beasts. I'm not sure you can do more than putting in good cooling and hoping for the best.)
For the first time, my home PC has no expansion cards
When I started out with PCs, you needed a bunch of expansion cards to make them do anything particularly useful. In the era of my first home PC, almost all I used on the motherboard was the CPU and the memory; graphics, sound, Ethernet (if applicable to you), and even a good disk controller were add-on cards. As a result, selecting a motherboard often involved carefully counting how many slots you got and what types they were, to make sure you had enough for what you needed to add.
(Yes, in my first PC I was determined enough to use SCSI instead of IDE. It ran BSDi, and that was one of the recommendations for well supported hardware that would work nicely.)
Bit by bit, that's changed. In the early 00s, things started moving on to the motherboard, starting (I believe) with basic sound (although that didn't always work out for Linux people like me; as late as 2011 I was having to use a separate sound card to get things working). When decent SATA appeared on motherboards it stopped being worth having a separate disk controller card, and eventually the motherboard makers started including not just Ethernet but even decent Ethernet chipsets. Still, in my 2011 home machine I turned to a separate graphics card for various reasons.
With my new home machine, I've taken the final step on this path. Since I'm using the Intel onboard graphics, I no longer need even a separate graphics card and now have absolutely no cards in the machine; everything is on the motherboard. It's sometimes an odd feeling to look at the back of my case and see all of the case's slot covers still in place.
(My new work machine still needs a graphics card and that somehow feels much more normal and proper, especially as I've also added an Ethernet card to it so that I have a second Ethernet port for sysadmin reasons.)
I think one of the reasons that having no expansion cards feels odd to me is that for a long time having an all-onboard machine was a sign that you'd bought a closed box prebuilt PC from a vendor like Dell or HP (and were stuck with whatever options they'd bundled into the box). These prebuilt PCs have historically not been a great choice for people who wanted to run Linux, especially picky people like me who want unusual things, and I've had the folkloric impression that they were somewhat cheaply put together and not up to the quality standards of a (more expensive) machine you'd select yourself.
As a side note, I do wonder about the business side of how all of this came about. Integrating sound and Ethernet and so on on motherboards isn't completely free (if nothing else, the extra physical connectors cost something), so the motherboard vendors had to have a motivation. Perhaps it was just the cut-throat competition that pushed them to offering more things on the board in order to make themselves more attractive.
(I also wonder what will be the next thing to become pervasive on motherboards. Wireless networking is one possibility, since it's already on higher end motherboards, and perhaps Bluetooth. But it also feels like we're hitting the limits of what can be pulled onto motherboards or added.)
A learning experience with iOS's fingerprint recognition
I have both an iPhone and an iPad, both of which have fingerprint based unlocking, which I use. I interact with the iPhone sufficiently often that I generally unlock it multiple times a day, but for various reasons I use the iPad much less frequently and can even go for a couple of days before I dig it out and poke at it.
It's been winter around here for the past while, and Toronto's winter is dry. These days that dryness is hard on my fingers, especially the fingers of my right hand (I'm right handed, which may contribute to this); my fingertips get chapped and cracked and generally a bit beaten up despite some effort to take care of them by slathering moisturizer on and so on.
(The problem with using moisturizer, especially on your fingertips, is that I generally want to do something with my hands and don't want to get moisturizer all over what I'll be typing on or holding or whatever.)
Over the course of this winter, I gradually noticed that my iPad was getting harder and harder to unlock. I'd have to wiggle my right thumb around to get it to like it, and sometimes it just wouldn't and I'd wind up typing my unlock password. If I remembered to try my left thumb, often that would work, and my iPhone had no problems at all; I'd tap it and pretty much it'd always unlock. For most of the winter, when this happened I'd wipe the sensor clean on the iPad and mutter to myself and just live with it. It had to be a little glitch on the iPad, right? But every so often I'd stare at my roughed-up and increasingly hard to make out right thumb fingerprint and wonder.
When I couldn't unlock the iPad recently, I gave in to frustration and tried something just to see if it would help: I enrolled my right thumb's fingerprint again (as a new fingerprint). The difference was night and day. Suddenly the iPad was unlocking just like my iPhone, like it was supposed to and as I remembered it doing in the beginning; tap the sensor and it unlocked instantly without fuss or problems.
My best guess is the obvious guess; not only does the iOS fingerprint system have some margin for error, but it updates its recognition model over time. Because I unlocked my iPhone often enough, its recognition model could follow along as my right thumb's fingerprint got more and more roughed up over the course of the winter. However I didn't unlock my iPad often enough for these updates to kick in (or they couldn't or didn't move the model fast enough), so as the model and my fingerprint drifted further and further apart it got harder and harder to get it to match up with my cracked-skin fingerprint. Re-enrolling my thumb again added a new recognition model that worked on the current, beaten up state.
(This time around I've actually named that fingerprint, so I can easily remove it later. I may try the experiment of removing it in the summer when my right thumb's fingerprint is all recovered and has good skin condition again. In theory the original enrollment should be good enough at that point.)
Next winter I'm going to try to use my iPad more often or at least unlock it more regularly. Probably I'll aim to unlock it a couple of times every day, even if I'm not going to do anything more than tell it to check for various sorts of updates.
(Or I could start reading books on it. I did recently get pulled into reading a great SF novella on it, which was a pretty good experience, and I certainly have more books I could read there.)
Wrestling with metrics to get meaningful, useful ones
I'm currently working on hacking together something to show us useful information about the most active NFS filesystems on a client (what I called nfsiotop in yesterday's entry). Linux has copious per-mount statistics and the program that I started from already read them all, so a great deal of what I've been doing has been wrestling with the raw data available to come up with useful metrics and figure out good ways of displaying them. This is a common experience; I have some version of it almost every time I wind up trying to boil a flood of raw data down to some useful summaries of it.
The first part of this wrestling is just figuring out what pieces of the raw data are even useful in practice. Looking at the actual data on live systems always produces a certain amount of surprises; for example, one promising looking field turned out to be zero on all of our systems. Others can just be too noisy or not quite mean what you understood them to mean, or not behave the way you thought they were going to when the system is under load or otherwise in an interesting state. One common thing to discover is that in practice, certain detailed breakdowns in the raw data aren't interesting and you actually want much more aggregated versions (then you get to figure out how to aggregate in useful ways that still keep things meaningful). In the specific case of Linux NFS filesystem statistics, you could present various data separately for each different NFS operation, but you don't really want to; you probably don't care about, for example, how many of one particular NFS operation a second were done on the filesystem. At the same time you might care about some broad categories, since different NFS operations have different impacts on the server.
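As a sketch of the sort of aggregation involved, here is one way to collapse per-operation counts into broad categories. The category groupings and the helper function are my own illustrative guesses, not anything from an authoritative taxonomy:

```python
# Collapse per-NFS-operation counts (of the kind you can parse out of
# /proc/self/mountstats on Linux) into a few broad categories that
# differ in their impact on the server. The groupings here are
# illustrative, not authoritative.
CATEGORIES = {
    "read": {"READ", "READDIR", "READDIRPLUS", "READLINK"},
    "write": {"WRITE", "COMMIT", "CREATE", "REMOVE", "RENAME", "SETATTR"},
    "metadata": {"GETATTR", "ACCESS", "LOOKUP", "FSSTAT", "FSINFO"},
}

def aggregate_ops(op_counts):
    """Sum a {operation name: count} dict into broad categories."""
    totals = {cat: 0 for cat in CATEGORIES}
    totals["other"] = 0
    for op, count in op_counts.items():
        for cat, members in CATEGORIES.items():
            if op in members:
                totals[cat] += count
                break
        else:
            # Operations we haven't categorized get lumped together.
            totals["other"] += count
    return totals
```

For example, `aggregate_ops({"READ": 10, "GETATTR": 120, "WRITE": 4})` folds the 120 GETATTRs into the "metadata" total, which is usually all you want to see at a glance.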
The second part of this wrestling is figuring out if I can use some tempting piece of raw data in a useful and meaningful way, and the mirror image of this: whether there is some way to torture the raw data that I have so that it creates a useful metric that I really want. There are a great many metrics you can calculate from raw statistics, but a lot of them don't necessarily mean anything much or can be misleading. It's tempting to believe that a particular calculation you've come up with means something useful, especially if it seems to correlate with load or some other interesting state, but it isn't necessarily so. I've found it all too easy to have my desire for a particular useful metric wind up blinding me to the flaws in what I'm calculating; I want to believe that I've come up with a clever trick to give me something I want, even if I haven't.
(I'm very aware of this since years ago I wound up being quite annoyed that Linux's iostat was confidently presenting a metric that was very desirable but couldn't actually be calculated accurately from the available information (see here). I don't want to do that to myself in my own tools; if I print out a metric, I want it to be meaningful, useful, and not misleading.)
For a concrete example of this, let's talk about a hypothetical 'utilization' metric for NFS mounts, by analogy to the utilization stat for disks, where 100% utilization of a NFS mount would mean that there was always at least one outstanding NFS operation during the particular time period. Utilization is nice because it tells you more about how busy something is than a raw operation count does. Is 100 operations a second busy or nothing? It depends on how fast the server responds and how many operations you issue in parallel and so on.
The current Linux kernel NFS client statistics don't directly expose enough data to generate this number. But they do expose the total cumulative time spent waiting for the server to reply to each request (you have to sum it up from each separate NFS operation, but it's there). Is it meaningful to compare this total time to the time period and compute, say, a ratio or a percentage? On the one hand, if the total cumulative time is less than the time period, your utilization has to be under 100%; if you spent only half a second waiting for all operations issued over a second, then at least half of the time there had to be nothing outstanding. On the other hand, a high cumulative time doesn't necessarily mean high utilization, because you can easily have multiple outstanding requests that the server processes in parallel.
Let's call the ratio of cumulative time to elapsed time the 'saturation'. This metric does mean something, but it may not be useful and it may be misleading. How do we want to present it, if we present it at all? As a percentage clamped to 100%? As a percentage that can go above 100%? As a raw ratio? Is it mostly useful if it's below 100%, because then it's clearly signaling that we can't possibly have 100% utilization, or is it meaningful to see how much over 100% it goes? I don't currently have answers for any of these questions.
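The calculation itself is trivial; the whole question is what the result means. A minimal sketch (the function name and the choice to return an unclamped ratio are mine, and the cumulative time is assumed to already be summed across all NFS operation types):

```python
def nfs_saturation(cum_rpc_seconds, interval_seconds):
    """Ratio of cumulative time spent waiting on the NFS server to
    elapsed wall-clock time over the same interval.

    Below 1.0 this bounds utilization from above: if we waited only
    0.5s in total over a 1s interval, there was nothing outstanding
    for at least half the interval. Above 1.0 it only says requests
    overlapped; it can't distinguish 2 parallel requests from 20.
    """
    return cum_rpc_seconds / interval_seconds
```

Whether to print this clamped, unclamped, or not at all is exactly the open question; the code can't answer it.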
All of this is typical of the sort of wrestling with metrics that I wind up doing. I work out some metrics, I fiddle around with printing them in various ways, I try to see if they tell me things that look useful when I know that various sorts of load or stress are happening, and then I try to convince myself that they mean something and I'm not fooling myself.
PS: After you've convinced yourself that a metric means something (and what it means, and why), do write it all down in a comment in the code to capture the logic before it falls out of your head. And by 'you' I mean 'me'.
Linux is good at exposing the truth of how motherboards are wired
One of the things I've learned over time, sometimes the hard way, is that Linux (and other open source operating systems) are brutally honest about how various things on motherboards are actually hooked up. As a result, they are a good way of exposing any, well, let us call them 'convenient inaccuracies' in how motherboard manuals present things. The major source of inaccuracies that I've tended to run across has been SATA port numbering, and on servers we've also had Ethernet port numbering issues.
(Non-servers tend not to have issues with Ethernet port numbering because they only have at most one. Servers can have multiple ones, sometimes split between multiple chipsets for extra fun.)
Typical motherboards present a nice clear, coherent picture of their SATA port numbering and how it winds up in physical ports on the motherboard. Take, for example, the Asus Prime X370-Pro, a Ryzen motherboard that I happen to have some recent experience with. The manual for this motherboard, the board itself, and the board's BIOS, will all tell you that it has eight SATA ports, numbered 1 through 8. Each set of ports uses a dual connector and those connectors are in a row, with 1-2 on the bottom running up through 7-8 at the top.
(As usual, the manual doesn't tell you whether the top port or the bottom port in a dual connector is the lower numbered one. It turns out to be the top one. I don't count this as an inaccuracy as everything agrees on it once you can actually check.)
Linux will tell you that this is not accurate. From the bottom up, the ports actually run 1-2, 5-6, 3-4, 7-8; that is, the middle pairs of ports have been flipped (but not the two ports within a pair; the lower numbered one is still on the top connector). This shows up in Linux's /dev/sd* enumeration, the underlying kernel names, and Linux SCSI host names, and all of them are consistent with this reversed numbering. I assume that any open source OS would show the same results, since they're all likely looking directly at what the hardware tells them and ignoring any BIOS tables that might attempt to name various ports differently.
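One low-tech way to see the kernel's ordering is through the ATA link numbers embedded in /dev/disk/by-path names. Since the real output depends entirely on your hardware, here's the idea run on canned sample data (the device names and PCI path are invented):

```shell
# Sort by-path style entries by their ATA link number (the field
# after '-ata-') to recover the kernel's idea of the port order.
# On a real system you'd feed this 'ls -l /dev/disk/by-path' output.
printf '%s\n' \
    'sda pci-0000:00:17.0-ata-1' \
    'sdb pci-0000:00:17.0-ata-3' \
    'sdc pci-0000:00:17.0-ata-2' |
    sort -t- -k4,4n
```

Cross-checking that ordering against the labels printed on the motherboard is how the flipped middle pairs show themselves.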
(I don't know if the BIOS exposes its port naming in any OS-visible tables, but it seems at least plausible that the BIOS does. Certainly it seems likely to cause confusion in Windows users if the OS calls the devices one thing and the BIOS calls them another, and BIOS vendors are usually pretty motivated to not confuse Windows users. The motherboard's DMI/SMBIOS data does appear to have some information about the SATA ports, although I don't know if DMI contains enough detail to match up specific SATA ports with their DMI names.)
I have to assume that motherboard makers have good reasons for such weird port numbering issues. Since I have very little knowledge here, all I can do is guess and speculate, and the obvious speculation is wire routing issues that make it easier to flip some things around. Why only the middle two sets of ports would be flipped is a mystery, though.
(This is not the first time I've had to figure out the motherboard SATA port numbering; I think that was one of the issues here, for example, although there is no guarantee that the BIOS mapping matches the mapping on the physical motherboard and in the manual.)
Some questions I have about DDR4 RAM speed and latency in underclocked memory
Suppose, not hypothetically, that you're putting together an Intel Core i7 based machine, specifically an i7-8700, and you're not planning to overclock. All Coffee Lake CPUs have an officially supported maximum memory rate of 2666 MHz (regardless of how many DIMMs or what sort of DIMM they are, unlike Ryzens), so normally you'd just buy some suitable DDR4 2666 MHz modules. However, suppose that the place you'd be ordering from is out of stock on the 2666 MHz CL15 modules you'd normally get, but has faster ones, say 3000 MHz CL15, for essentially the same price (and these modules are on the motherboard's qualified memory list).
At this point I have a bunch of questions, because I don't know what you can do if you use these higher speed DDR4-3000 CL15 DIMMs in a system. I can think of a number of cases that might be true:
- The DIMMs operate as DDR4-2666 CL15 memory. Their faster speed does nothing for you now, although with a future CPU and perhaps a future motherboard they would speed up.
(Alternately, perhaps underclocking the DIMMs has some advantage, maybe slightly better reliability or slightly lower power and heat.)
- The DIMMs can run at 2666 MHz but at a lower latency, say CL14, since DDR4-3000 CL15 has an absolute time latency of 10.00 ns and 2666 MHz CL14 is over that at 10.5 ns (if I'm doing the math right). This might require activating an XMP profile in the BIOS, or it might happen automatically if what matters to this stuff is the absolute time involved, not the nominal CLs. However, according to the Wikipedia entry on CAS latency, synchronous DRAM cares about the clock cycles involved, so CL15 might really be CL15 even when you're underclocking your memory. DDR4 is synchronous DRAM.
- The DIMMs can run reliably at memory speeds faster than 2666 MHz, perhaps all the way up to their rated 3000 MHz; this doesn't count as CPU overclocking and is fine on the non-overclockable i7-8700.
(One possibility is that any faster than 2666 MHz memory listed on the motherboard vendor's qualified memory list is qualified at its full speed and can be run reliably at that speed, even on ordinary non-overclockable i7 CPUs. That would be nice, but I'm not sure I believe the PC world is that nice.)
- The system can be 'overclocked' to run the DIMMs faster than 2666 MHz (but perhaps not all the way to the rated 3000 MHz), even on an i7-8700. However this is actual overclocking of the overall system (despite being within the DIMMs' speed rating), is not necessarily stable, and the usual caveats apply.
- You need an overclockable CPU such as an i7-8700K in order to run memory any faster than the officially supported 2666 MHz. You might still be able to run DDR4-3000 CL15 at 2666 MHz CL14 instead of CL15 on a non-overclockable CPU, since the memory frequency is not going up, the memory is just responding faster.
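The latency arithmetic behind the second option above is easy to check. The formula is just the CL cycle count divided by the I/O clock frequency, where the I/O clock is half the nominal MT/s transfer rate:

```python
def cas_latency_ns(transfer_rate, cl):
    # DDR4 transfers twice per clock cycle, so a 'DDR4-3000' module
    # has a 1500 MHz I/O clock. Absolute CAS latency in ns is the
    # CL cycle count divided by that clock frequency in GHz.
    clock_ghz = (transfer_rate / 2) / 1000
    return cl / clock_ghz

print(cas_latency_ns(3000, 15))  # 10.0 ns
print(cas_latency_ns(2666, 14))  # ~10.5 ns
```

Which confirms that DDR4-3000 CL15 comes in at exactly 10 ns, just under 2666 MHz CL14's roughly 10.5 ns.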
Modern DIMMs apparently generally come with XMP profile(s) (see also the wikichip writeup) that let suitable BIOSes more or less automatically run them at their official rated speed, instead of the official JEDEC DDR4 standard speeds. Interestingly, based on the Wikipedia JEDEC table, even DDR4-2666 CL15 is not JEDEC standard; the fastest DDR4-2666 CL the table lists is CL17. This may mean that turning on an XMP profile is required merely to get 2666 MHz CL15 even with plain standard DDR4-2666 CL15 DIMMs. That would be weird, but PCs are full of weird things. One interesting potential consequence of this could be that if you have DDR4-3000 CL15 DIMMs, you can't easily run them at 2666 MHz CL15 instead of 2666 MHz CL17, because the moment you turn on XMP they'll go all the way up to their rated 3000 MHz CL15.
(I learn something new every time I write an entry like this.)
PS: People say that memory speed isn't that important, but I'm not sure I completely believe them and anyway, if I wind up with DIMMs rated for more than 2666 MHz I'd like to know what they're good for (even if the answer is 'nothing except being available now instead of later'). And if one can reliably get somewhat lower latency and faster memory for peanuts, well, it's at least a bit tempting.
Meltdown and Spectre have made this a bad time to get a new x86 CPU
Despite reasonably solid plans, I still don't have a new home machine and in fact I probably won't get one for some time (even with my recent scare). Instead I'm likely to prolong the life of my current machine from 2011 at least a year longer than I was expecting. By far the largest reason for my delay is that it's currently a bad time to get a new x86 CPU, due to Meltdown and Spectre and the general class of security attacks that they've created. More specifically, for me it's due to the uncertainty about effective future CPU performance they've created.
All current x86 CPUs are vulnerable to at least some of the known Spectre attacks, and all current Intel CPUs are vulnerable to Meltdown (AMDs are believed not vulnerable to current attacks). Mitigating the current attacks costs performance, sometimes significant amounts of it, sometimes perhaps less. In addition there seems very likely to be additional speculative execution attacks discovered in the future (some may already have been found) that will require their own additional workarounds, with their own performance penalties. In short, things are only going to get worse for current CPUs.
There are at least two options for what happens from here and I don't think we know which one it's going to be. The first option is that there will be good mitigations that are easy to roll into new CPUs almost immediately. Within a CPU refresh iteration or two, new CPUs could be much better at dealing with speculative execution attacks, with clearly cheaper mitigations required from software.
(This seems especially likely to happen with Intel CPUs and Meltdown, given that AMDs sidestep it entirely.)
The second option is that we're not going to get real CPU fixes for these issues for at least one major CPU generation, because small tweaks and changes won't be enough to do more than make things hurt a bit less. Discovering all the problems takes time; redesigning various bits of speculative execution hardware takes more time. In the Intel world, we might not get this until the end of 2018 with Ice Lake, or even later with Tiger Lake. This is especially possible if the first round of hardware mitigations turn out to be not enough, perhaps because people keep coming up with new attack variants that need new hardware mitigations.
If CPUs will get good mitigations in the next generation of product announcements, buying a CPU now gives you basically a lemon; soon you'll be able to get CPUs with meaningful effective performance increases because they won't need as many expensive mitigations. If CPUs won't get good mitigations until, say, the third quarter of 2019, we're probably pretty much in the usual situation with CPU performance increases; if you wait a few years, you always get more (for some workloads). If the timeline is somewhere in the middle, I don't know; presumably it depends on how much you need the performance you can get with a new current CPU and system over what you have now.
(This also depends on what system lifetime you expect. If you live on the bleeding edge and discard systems after a year or two anyway, your calculations are a lot different than someone who's aiming for a five or six year lifetime.)
However, I have to admit that part of my reaction is emotional. I just don't want to buy a product that I know is flawed, and all current CPUs are flawed (in theory Intel more than AMD, but in practice AMD Ryzens and Linux are a bad combination). Rationally perhaps I should just go ahead and buy my planned machine now and just live with any performance impact (if I care, I can turn the mitigations off reasonably safely). But the mere idea of giving Intel money in this situation irritates me.
(Maybe for once I'll do a sensible, rational thing, especially with what may be a slowly dying home machine, but don't hold your breath.)
It feels good to have a fallback option for home computing
Earlier this evening, I had a close call with my now increasingly aged home machine; the fan noise went to 'very loud' and the CPU temperature started climbing. After I hastily shut it down and blew out all of the dust, I was able to determine that the actual problem was a dying case fan, which has a simple workaround (just take the side of the case off). Until I worked this out, though, I was facing the prospect that my home machine might be effectively dead for a while. Unlike last time, this was a lot less distressing and nerve wracking, because these days I have a fallback option for home computing.
My PC remains my sole actual machine at home (I still haven't done anything about replacing it for reasons that deserve an entry of their own), but over the past while I've made two important changes that give me an additional option. The first change was that I have a smartphone that can provide Internet access when my main Internet connection is out. The second is that this fall I got a tablet (from the company you'd expect), and more than that, when I got it I was smart enough to talk myself into getting the keyboard for it. The tablet lets me browse around on the Internet, and with the keyboard and a SSH client I can actually do meaningful remote work. It's not anywhere near as nice as my actual desktop for various reasons, but I can get by, and knowing that I wasn't helpless in the face of a dead desktop was a real relief when I was facing the prospect that that's what I had.
(Unfortunately my Internet goes out when my home PC does, because it's where I do DSL PPPoE. In theory I could probably reconfigure my 'DSL modem' to do it, because it's really a DSL router that I'm having act as a bridge, but it looks like I can't do this from the tablet for no readily apparent reason. It may be an inconvenient security feature that my DSL router refuses to let itself be reconfigured over the wireless interface.)
The tablet is not as nice as my work laptop and if I was going to be without my home desktop for any significant length of time I would definitely be taking the laptop home with me, but it is good enough to be okay to good over the short term. Also, being confined to the tablet probably would have the useful side effect of encouraging me to get off the Internet for once.
PS: As an experiment, I've written much of this entry from my tablet to demonstrate to myself that I could and that it wasn't too irritating (and to work out how to do it). It is irritating and limiting enough that I'm not going to do it unless I have to.
PPS: The two worst things about the keyboard are that it has no physical escape key and that I don't think I can remap the mostly useless 'caps lock' key to be a Control key. At least my SSH client uses Cmd + ` as Escape, which is not too far from actual Esc. As far as the feel goes, it's okay but it's not up to a real keyboard.
(This is not the entry that I planned to write today, but sometimes life intervenes and suddenly this issue is on my mind.)
Sidebar: What I'll likely do with my home PC
I don't like running with the side of my case off for various reasons, so I want to replace the dying case fan. My motherboard supports PWM case fans so that's what I'm planning to get (the current fan is not hooked up to the motherboard at all, just directly wired to power). Someone I know online has basically persuaded me to probably replace the CPU fan too, although that's more annoying and also more expensive if I get a good CPU cooler. I could get a basic 120 mm case fan and a basic LGA1155 CPU cooler at one of the local hardware stores, but if I want good ones (from, say, Noctua), I'll have to order them online and wait a bit.
On the one hand, a good new fan and CPU cooler should help prolong the life of my home machine, since the current ones are more than five years old. On the other hand, that will tacitly encourage me to continue sitting on my hands about replacing the whole machine, partly because it won't feel as urgent and partly because I will get the irrational urge to keep using the current machine to get my money's worth from the new parts.
(Probably I will throw money at the problem in irritation and get, say, the Noctua NF-S12A or NF-P12 120 mm case fan and a Noctua L12S CPU cooler.)
Some consumer SSDs are moving to a 4k 'advance format' physical block size
Earlier this month I wrote an entry about consumer SSD nominal physical block sizes, because I'd noticed that almost all of the recent SSDs we had advertised a 512 byte physical block size (the exceptions were Intel 'DC' SSDs). In that entry, I speculated that consumer SSD vendors might have settled on just advertising them as 512n devices and we'd see this on future SSDs too, since the advertised 'physical block size' on SSDs is relatively arbitrary anyways.
Every so often I write a blog entry that becomes, well, let us phrase it as 'overtaken by events'. Such is the case with that entry. Here, let me show you:
$ lsblk -o NAME,TRAN,MODEL,PHY-SEC --nodeps /dev/sdf /dev/sdg
NAME TRAN MODEL            PHY-SEC
sdf  sas  Crucial_CT2050MX     512
sdg  sas  CT2000MX500SSD1     4096
The first drive is a 512n 2 TB Crucial MX300. We bought a number of them in the fall for a project, but then Crucial took them out of production in favour of the new Crucial MX500 series. The second drive is a 2TB Crucial MX500 from a set of them that we just started buying to fill out our drive needs for the project. Unlike the MX300s, this MX500 advertises a 4096 byte physical block size and therefore demonstrates quite vividly that the thesis of my earlier entry is very false.
(I have some 750 GB Crucial MX300s and they also advertise 512n physical block sizes, which led to a ZFS pool setup mistake. Fixing this mistake is now clearly pretty important, since if one of my MX300s dies I will probably have to replace it with an MX500.)
My thesis isn't just false because different vendors have made different decisions; this example is stronger than that. These are both drives from Crucial, and successive models at that; Crucial is replacing the MX300 series with the MX500 series in the same consumer market segment. So I already have a case where a vendor has changed the advertised physical block size between successive models of essentially the same product. It seems very likely that Crucial doesn't see the advertised physical block size as a big issue; I suspect that it's primarily set based on whatever the flash controller being used works best with or finds most convenient.
(By today, very little host software probably cares about 512n versus 4k drives. Advanced format drives have been around long enough that most things are probably aligning to 4k and issuing 4k IOs by default. ZFS is an unusual and somewhat unfortunate exception.)
I had been hoping that we could assume 512n SSDs were here to stay because it would make various things more convenient in a ZFS world. That is now demonstrably wrong, which means that once again forcing all ZFS pools to be compatible with 4k physical block size drives is very important if you ever expect to replace drives (and you should, as SSDs can die too).
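(For ZFS, 'compatible with 4k drives' concretely means making sure every vdev has an ashift of at least 12, since ashift is fixed when a vdev is created and can't be raised later. A sketch of what this looks like with a modern OpenZFS, using hypothetical pool and device names:

	# Force 4 KiB sector handling at creation time, regardless of what
	# the drives currently advertise; ashift=12 means 2^12 = 4096 bytes.
	zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb

	# Check what a pool actually got:
	zpool get ashift tank

The exact spelling varies by platform and ZFS version (some older systems need sysctls or sd.conf overrides instead of -o ashift), so treat this as the general shape rather than a universal recipe.)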
PS: It's possible that not all MX500s advertise a 4k physical block size; it might depend on capacity. We only have one size of MX500s right now so I can't tell.
Access control security requires the ability to do revocation
I recently read Guidelines for future hypertext systems (via). Among other issues, I was sad but not surprised to see that it was suggesting an idea for access control that is perpetually tempting to technical people. I'll quote it:
All byte spans are available to any user with a proper address. However, they may be encrypted, and access control can be performed via the distribution of keys for decrypting the content at particular permanent addresses.
This is in practice a terrible and non-workable idea, because practical access control requires the ability to revoke access, not just to grant it. When the only obstacle preventing people from accessing a thing is a secret or two, people's access can only move in one direction; once someone learns the secret, they have perpetual access to the thing. With no ability to selectively revoke access, at best you can revoke everyone's access by destroying the thing itself.
(If the thing itself is effectively perpetual too, you have a real long term problem. Any future leak of the secret allows future people to access your thing, so to keep your thing secure you must keep your secret secure in perpetuity. We have proven to be terrible at this; at best we can totally destroy the secret, which of course removes our own access to the thing too.)
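The structural problem can be made concrete with a toy sketch. This is deliberately not real cryptography (the XOR 'cipher' is a stand-in for any actual encryption scheme, and all the names are made up for illustration); the point is that revocation has nowhere to act, because the ciphertext is permanently public and the key, once handed out, is out of your control:

```python
# Toy illustration (NOT real crypto): access control purely via key
# distribution has no revocation step. XOR stands in for any cipher.

SECRET_KEY = 0x5A

def xor_crypt(data: bytes, key: int) -> bytes:
    # XOR with the same key both encrypts and decrypts.
    return bytes(b ^ key for b in data)

# The content sits encrypted at a permanent, publicly readable address.
published = xor_crypt(b"the protected content", SECRET_KEY)

# Granting access: hand Alice the key. She keeps her own copy.
alices_copy_of_key = SECRET_KEY

# "Revoking" access: all we can do is stop listing Alice as authorized.
authorized_users = set()  # Alice removed, as far as our records go

# But our records don't matter: Alice still holds the key and the
# ciphertext is still at its permanent address, so she can decrypt
# it forever, 'revoked' or not.
assert xor_crypt(published, alices_copy_of_key) == b"the protected content"
```

Nothing in the scheme gives the owner a lever to pull after the key has left their hands, which is exactly the complaint above: access can be granted but never selectively withdrawn.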
Access control through encryption keys has a mathematical simplicity that appeals to people, and sometimes they are tempted to wave away the resulting practical problems with answers like 'well, just don't lose control of the keys' (or even 'don't trust anyone you shouldn't have', which has the useful virtue of being obviously laughable). These people have forgotten that security is not math, security is people, and so a practical security system must cope with what actually happens in the real world. Sooner or later something always goes wrong, and when it does we need to be able to fix it without blowing up the world.
(In the real world we have seen various forms of access control systems without revocation fail repeatedly. Early NFS is one example.)