Wandering Thoughts

2018-02-19

Some consumer SSDs are moving to a 4k 'advanced format' physical block size

Earlier this month I wrote an entry about consumer SSD nominal physical block sizes, because I'd noticed that almost all of our recent SSDs advertised a 512 byte physical block size (the exceptions were Intel 'DC' SSDs). In that entry, I speculated that consumer SSD vendors might have settled on just advertising them as 512n devices and we'd see this on future SSDs too, since the advertised 'physical block size' on SSDs is relatively arbitrary anyway.

Every so often I write a blog entry that becomes, well, let us phrase it as 'overtaken by events'. Such is the case with that entry. Here, let me show you:

$ lsblk -o NAME,TRAN,MODEL,PHY-SEC --nodeps /dev/sdf /dev/sdg
NAME TRAN   MODEL            PHY-SEC
sdf  sas    Crucial_CT2050MX     512
sdg  sas    CT2000MX500SSD1     4096

The first drive is a 512n 2 TB Crucial MX300. We bought a number of them in the fall for a project, but then Crucial took them out of production in favour of the new Crucial MX500 series. The second drive is a 2 TB Crucial MX500 from a set of them that we just started buying to fill out our drive needs for the project. Unlike the MX300s, this MX500 advertises a 4096 byte physical block size and therefore demonstrates quite vividly that the thesis of my earlier entry is false.

(I have some 750 GB Crucial MX300s and they also advertise 512n physical block sizes, which led to a ZFS pool setup mistake. Fixing this mistake is now clearly pretty important, since if one of my MX300s dies I will probably have to replace it with an MX500.)

My thesis isn't just false because different vendors have made different decisions; this example is stronger than that. These are both drives from Crucial, and successive models at that; Crucial is replacing the MX300 series with the MX500 series in the same consumer market segment. So I already have a case where a vendor has changed the reported physical block size between two successive models of what is essentially the same product. It seems very likely that Crucial doesn't see the advertised physical block size as a big issue; I suspect that it's primarily set based on whatever the flash controller being used works best with or finds most convenient.

(By today, very little host software probably cares about 512n versus 4k drives. Advanced format drives have been around long enough that most things are probably aligning to 4k and issuing 4k IOs by default. ZFS is an unusual and somewhat unfortunate exception.)

I had been hoping that we could assume 512n SSDs were here to stay because it would make various things more convenient in a ZFS world. That is now demonstrably wrong, which means that once again forcing all ZFS pools to be compatible with 4k physical block size drives is very important if you ever expect to replace drives (and you should, as SSDs can die too).
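If you're using ZFS on Linux, the usual way to force this compatibility is an explicit ashift when you create the pool, and again when you later attach or replace devices. A minimal sketch, with made-up pool and device names:

$ zpool create -o ashift=12 tank mirror /dev/sdf /dev/sdg
$ zpool replace -o ashift=12 tank /dev/sdf /dev/sdh

An ashift of 12 means 2^12 byte (4 Kbyte) blocks, so the pool works unchanged on both 512n and 4k drives.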

PS: It's possible that not all MX500s advertise a 4k physical block size; it might depend on capacity. We only have one size of MX500s right now so I can't tell.

SSDsAnd4KSectorsII written at 00:34:40

2018-02-11

Access control security requires the ability to do revocation

I recently read Guidelines for future hypertext systems (via). Among other issues, I was sad but not surprised to see that it was suggesting an idea for access control that is perpetually tempting to technical people. I'll quote it:

All byte spans are available to any user with a proper address. However, they may be encrypted, and access control can be performed via the distribution of keys for decrypting the content at particular permanent addresses.

This is in practice a terrible and non-workable idea, because practical access control requires the ability to revoke access, not just to grant it. When the only obstacle preventing people from accessing a thing is a secret or two, people's access can only move in one direction; once someone learns the secret, they have perpetual access to the thing. With no ability to selectively revoke access, at best you can revoke everyone's access by destroying the thing itself.

(If the thing itself is effectively perpetual too, you have a real long term problem. Any future leak of the secret allows future people to access your thing, so to keep your thing secure you must keep your secret secure in perpetuity. We have proven to be terrible at this; at best we can totally destroy the secret, which of course removes our own access to the thing too.)

Access control through encryption keys has a mathematical simplicity that appeals to people, and sometimes they are tempted to wave away the resulting practical problems with answers like 'well, just don't lose control of the keys' (or even 'don't trust anyone you shouldn't have', which has the useful virtue of being obviously laughable). These people have forgotten that security is not math, security is people, and so a practical security system must cope with what actually happens in the real world. Sooner or later something always goes wrong, and when it does we need to be able to fix it without blowing up the world.

(In the real world we have seen various forms of access control systems without revocation fail repeatedly. Early NFS is one example.)

SecurityRequiresRevocation written at 02:21:43

2018-02-07

Consumer SSDs and their nominal physical block sizes, now and in the future

Recently, D. Ebdrup left a comment here asking a question about the behavior of ZFS on 'advanced format' disk drives with 4 Kbyte physical block (or sector) sizes. I figured I'd have no problems answering it, since at this point I should have any number of ZFS pools on such drives. Then I started actually looking for them.

For years, I've been strongly advocating that people should set up ZFS pools as if they were on 4k sector advanced format drives in order to future-proof their pools against the day when the only drives you can get are 4k sector ones. I set up my initial Linux ZFS pools this way, for example. But when I created a ZFS pool on SSDs on my home machine a while back (using SSDs that are now about a year old), I was surprised to discover that my SSDs were claiming to be 512 byte physical sector devices.

You can probably guess where this is going. Most of our SSDs are consumer SSDs, and it seems very hard to find one that claims to have a 4k physical sector size. Intel's SSDs usually claim a 4k sector size (although this is switchable on some models, with official support these days), but Crucial, SanDisk, Samsung 850s, SPCC, one Intel SSD, and old OWC SSDs all claim to have 512 byte physical sectors (the sole exception is an old Crucial 'CT120M50'). It turns out that we do have 4k advanced format disks, but they're hard drives, not SSDs. Even the NVMe SSD in my Dell XPS 13 laptop claims to be a 512 byte physical sector device (it says it's an SK hynix NVMe drive, although who knows what modifications Dell has made to it).
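If you want to check what your own drives claim, the Linux kernel exposes both sizes in sysfs, and smartctl reports them too. A quick sketch (device names are placeholders):

$ cat /sys/block/sda/queue/physical_block_size
$ cat /sys/block/sda/queue/logical_block_size
$ smartctl -i /dev/sda | grep 'Sector Size'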

For years I've been expecting that at some point in the future SSDs would report themselves as advanced format drives, even if current ones didn't do so. But at this point neither advanced format drives nor SSDs are new things, yet SSD makers are still having consumer SSDs report in as '512n' drives despite the fact that any number of (consumer) HDs have been advanced format drives for years (I'm not sure if there are any non-AF consumer HDs left). Given this, I now wonder if SSDs are ever going to switch over to claiming to be advanced format drives or if the SSD makers have concluded that keeping them reporting as 512n drives is easier in various ways.

(Given my ZFS pool setup mistake plus that I still haven't fixed it, I would be perfectly happy if this was the case.)

PS: This matters somewhat for ZFS because a few things in ZFS work better in pools that are set up assuming 512n drives (ie, with ashift=9). If you're going to be able to use 512n drives for the lifetime of your pool, you might as well take advantage of this. But I don't know if that's a really safe assumption to make, particularly if you have long-lived ZFS pools the way I may have.

(For my still entirely theoretical new home machine, this also depends on what sort of midlife upgrades I'll do.)
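For what it's worth, you can check what ashift an existing pool was actually created with; zdb will print the ashift of each vdev. A sketch, where 'tank' is a placeholder pool name and the output line is illustrative (on ZFS on Linux this generally needs the pool to be in the cachefile):

$ zdb -C tank | grep ashift
            ashift: 9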

SSDsAnd4KSectors written at 00:51:30

2018-02-02

Some practical tradeoffs involved in using HTTPS instead of HTTP

Today I tweeted:

I'm a little bit disappointed that the ASUS UEFI BIOS fetches BIOS updates over HTTP instead of HTTPS (I'm sure they're signed, though). But thinking more about it, it's probably a sensible decision on ASUS's part; there are a lot of potential issues with HTTPS in the field.

Let's start with the background. I hadn't been paying attention to BIOSes for current hardware until recently, and it turns out that modern UEFI BIOSes are startlingly smart. In particular, they're smart enough to fetch their own BIOS updates over the Internet; all you have to do is tell them how to connect to the network (DHCP, static IP address, etc). For various reasons I prefer to update the BIOS this way, and when I did such an update today, I was able to confirm that the entire update process used only HTTP. At first I was reflexively disappointed, but the more I thought about the practical side, the more I felt that Asus made the right choice.

The problem with fetching things over HTTPS is that it exposes a number of issues. These include:

  • HTTPS interception middleware that requires you to load their root certificate. Should the BIOS try to handle this?

  • Certificate validation in general, which has been called "the most dangerous code in the world". This has unusual challenges in a BIOS environment (consider the issue of knowing what time and date it is) and may be more or less impossible to do really well as a result.

    (Let's ignore certificate revocation, seeing as everyone does.)

  • Long term changes in the set of CA roots that you want to use. BIOSes may live for many years and be updated infrequently, so Asus might find itself with a very limited selection of CAs (and hosting providers) that were accepted by half-decade or decade old BIOSes that they still wanted to support updates from.

    (Asus can update the CA roots as part of a BIOS update, but not everyone does BIOS updates very often.)

  • Similarly, long-term evolution in the set of TLS ciphers that are supported and acceptable. We're seeing a version of this as (Open)SSH evolves and drops support for old key exchange methods that are the only methods implemented by some old clients. TLS ciphers have sometimes turned over fairly drastically as weaknesses were found.

    (More generally, TLS has sometimes needed changes to deal with attacks.)

Right now you also have the issue of a transition away from older versions of TLS towards TLS 1.3. If you're looking forward five or ten years (which is a not unreasonable lifetime for some hardware), you might expect that the future world will be basically TLS 1.3 or even TLS 1.4 only by the end of that time. Obviously a BIOS you ship today can't support TLS 1.3 because the protocol isn't finalized yet.

Including a TLS client also opens up more attack surface against the BIOS, although this feels like a pretty obscure issue to me (it'd be an interesting way to get some code running behind an organization's firewall, but people don't run BIOS updates all that often).

Since people can usually download the BIOS updates by hand from an OS, Asus could have decided to accept the potential problems and failures. But this would likely create a worse user experience where in-BIOS update attempts failed for frustrating reasons, and I can't blame Asus for deciding that they didn't want to deal with the extra user experience risks. If the BIOS updates themselves are signed (and I assume that they are in these circumstances), the extra security risks of using HTTP instead of HTTPS are modest and Asus gets to avoid a whole other set of risks.

(Part of those risks are reputational ones, where people get to create security conference presentations about 'Asus fumbles TLS certificate verification' or the like. If nothing else, many fewer people will try to break BIOS update signature verification than will poke your TLS stack to see if they can MITM it or find other issues.)
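For illustration, verifying a detached signature on a downloaded artifact is mechanically simple. A sketch with openssl and entirely hypothetical file names (I don't know how Asus actually signs its updates):

$ openssl dgst -sha256 -verify asus-pubkey.pem -signature update.sig update.cap
Verified OK

The hard part isn't this step; it's shipping the right public key in the BIOS and keeping the private key secure, neither of which needs TLS.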

PS: There is one situation where not being able to update your BIOS without an OS is a problem, namely when you need to apply a BIOS update to be able to install your OS in the first place (and you don't have a second machine). But this is hopefully pretty rare.

BIOSViaHTTPSensible written at 23:41:58

2018-01-15

Meltdown and the temptation of switching to Ryzen for my new home machine

Back in November, I put together a parts list for my still hypothetical new home Linux machine. At the time I picked an Intel CPU because Intel is still the top in single-core performance, especially when you throw in TDP; the i7-8700 is clearly superior to the Ryzen 7 1700, which is the last (or first) 65W TDP Ryzen. Then two things happened. The first is that my new office workstation turned out to be Ryzen-based, and it appears to work fine, runs cool (actually cooler than my current machines), and seems quiet from limited testing. The second is Meltdown and to a lesser extent Spectre.

Mitigating Meltdown on Intel CPUs costs a variable and potentially significant amount of performance, depending on what your system is doing; a CPU bound program is only slightly affected, but something that interacts with the OS a lot has a problem. AMD CPUs are unaffected by Meltdown. AMD Zen-based CPUs, including Ryzens, are also partly immune to the branch predictor version of Spectre (from here) and so don't take a performance hit from mitigations for them.

(Currently, current Intel CPUs also cause heartburn for the retpoline Spectre mitigation, because they'll speculate through return instructions. This will apparently be changed in a microcode update, which will likely cost some performance.)
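On sufficiently recent Linux kernels (4.15 and later), you can see what a given CPU is exposed to and which mitigations are active through sysfs. A sketch, with illustrative output for a Ryzen; the exact wording varies with kernel version:

$ grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/meltdown:Not affected
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full AMD retpoline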

Almost the entire reason I was selecting an Intel CPU over a Ryzen was the better single-core performance; with more cores, everyone agrees that Ryzens are ahead on workloads that parallelize well. But it seems likely that Meltdown will throw away at least part of that advantage on at least some of the workloads that I care about, and anyway things like Firefox are becoming increasingly multi-threaded (although not for a while for me). There still are areas where Intel CPUs are superior to Ryzens, but then Ryzens have advantages themselves, such as supporting ECC (at least to some degree).

All of that is fine and rational, but if I'm being honest I have to admit that it's not the only reason. Another reason is that I plain don't like Intel's behavior. For years, Intel has taken advantage of lack of real competition to do things like not offer ECC in desktop CPUs or limit desktop CPUs to only four cores (it's remarkable how the moment AMD came along with real competition, Intel was able to crank that up to six cores and may go higher in the next generation). Meltdown provides a convenient reason or at least justification to spit in Intel's eye.

With all of that said, I don't know if I'm actually going to go through with this idea. A hypothetical Ryzen build is somewhat more expensive and somewhat more irritating than an Intel one, since it needs a graphics card and has more RAM restrictions, and it's at least possible that Intel will soon come out with new CPUs that do better in the face of Meltdown and Spectre (and have more cores). For the moment I'm probably just going to sit on my hands (again) and see how I like my new work desktop (when I turn the new machine into my work desktop).

(My home machine hasn't started exploding yet, so the path of least resistance and least effort is to do nothing. I'm very good at doing nothing.)

MeltdownAMDRyzenTemptation written at 01:58:53

2018-01-12

Open source software licenses matter

I gave in to the temptation of some bait from a co-worker on Twitter, and it turns out that the conversation has made me want to say a couple of things. Today's entry is about the more serious one.

There's a not uncommon attitude among (some) people that other people are too religiously attached to various open source licenses and should get over it. So what if you don't like the specific license some project opted for? It doesn't really matter; don't be picky and contribute to the project anyway, rather than pointlessly duplicating work under another license you like better.

(This attitude isn't unique to any particular cultural camp. You can find these people anywhere, including on both sides of the GPL/Linux vs BSD general split with each side saying that the other should give up its silly insistence on the GPL or the BSD license.)

This attitude is both entitled and quite wrong.

Open source licenses are different from each other, in both philosophies and communities. People who choose to work with one or another are deliberately choosing to do somewhat different types of work (generally in their free time). To tell these people that these things don't matter and that they should live with different ones is to dictate to them that they should be doing the type of work you want them to be doing, not the type of work they've chosen to do.

This is bogus and it's entitlement speaking. We are 'obviously correct' about what matters, our opinions are better, and other people should change to follow what we think is important, not what they've chosen. It's my considered opinion that the most appropriate reply to a serious expression of this attitude involves some fingers, because really, what other reply is there to someone who believes they get to tell you what sort of work you'll do in your free time and what should matter to you? This is not something you debate.

The choice of open source license matters because it changes what sort of work people are doing when they work on a project. Some people don't care exactly what sort of work they're doing, perhaps as long as it's open source enough (and that's fine). Other people do care, and these people are not wrong.

As a corollary, laughing at people who are 'so silly' as to care about which open source license they work with is, well, let me just call it not a good look and leave it at that.

(Note that this is not the same as the people who think all projects should obviously be using their favorite license and any projects that aren't have made a tragic mistake. These people are not right either, of course.)

SoftwareLicensesMatter written at 01:57:42

2017-12-11

Some things about booting with UEFI that are different from MBR booting

If you don't dig into it, a PC that boots with UEFI seems basically the same as one that uses BIOS MBR booting, even if you have multiple OSes installed (for example, Linux and Windows 10). In either case, with Linux you boot into a GRUB boot menu with entries for Linux kernels and also Windows, and you can go on to boot either. However, under the hood this is an illusion and there are some important differences, as I learned in a recent UEFI adventure.

In BIOS MBR booting, there's a single bootloader per disk (loaded from the MBR). You only ever boot this bootloader; if it goes on to boot an entire alternate OS, it's often doing tricky magic to make that OS think it was booted from the MBR. If you call up the BIOS boot menu, what it offers you is a choice of which disk to load the MBR bootloader from. When you install a bootloader on a disk, for example when your Linux distribution's installer sets up GRUB, it overwrites any previous bootloader present; in order to keep booting other things, they have to be in the configuration for your new bootloader. Since there's only one bootloader on a disk, loss or corruption of this bootloader is fatal for booting from the disk, even if you have an alternate OS there.

In UEFI booting, there isn't a single bootloader per disk the way there is with MBR booting. Instead, the UEFI firmware itself may have multiple boot entries; if you installed multiple OSes, it almost certainly does (with one entry per OS). The UEFI boot manager tries these boot entries in whatever order it's been set to, passing control to the first one that successfully loads. This UEFI bootloader can then do whatever it wants to; in GRUB's case, it will normally display its boot menu and then go on to boot the default entry. If you call up the UEFI firmware boot menu, what you see is these UEFI boot entries, probably augmented with any additional disks that have an EFI system partition with an EFI/BOOT/BOOTX64.EFI file on them (this is the default UEFI bootloader name for 64-bit x86 systems). This may reveal UEFI boot entries that you didn't realize were (still) there, such as a UEFI Windows boot entry or a lingering Linux one.

(If you have multiple fixed disks with EFI system partitions, I believe that you can have UEFI boot entries that refer to different disks. So in a mirrored system disk setup, in theory you could have an UEFI boot entry for the EFI system partition on each system disk.)

The possibility of multiple UEFI boot entries means that your machine can boot an alternate OS that has a UEFI boot entry even if your normal primary (UEFI) bootloader is damaged, for example if it has a corrupted or missing configuration file. Under some situations your machine may transparently fall back to such an additional UEFI boot entry, which can be pretty puzzling if you're used to the normal BIOS MBR situation where either your normal bootloader comes up or the BIOS reports 'cannot boot from this disk'. It's also possible to have two UEFI boot entries for the same OS, one of which works and one of which doesn't (or, for a non-hypothetical example, one which only works when Secure Boot is off because it uses an unsigned UEFI bootloader).

A UEFI bootloader that wants to boot an alternate OS has more options than a BIOS MBR bootloader does. Often the simplest way is to use UEFI firmware services to load the UEFI bootloader for the other OS and transfer control to it. For instance, in GRUB:

chainloader /EFI/Microsoft/Boot/bootmgfw.efi

This is starting exactly the same Windows UEFI bootloader that my Windows UEFI boot entry uses. I'm not sure that Windows notices any difference between being booted directly from its UEFI boot entry and being chainloaded this way. However, such chainloading doesn't require that there still be a UEFI boot entry for the alternate OS, just that the UEFI bootloader .EFI file still be present and working. Similarly, you can have UEFI boot entries for alternate OSes that aren't present in your GRUB menu; the two systems are normally decoupled from each other.

(You could have a UEFI bootloader that read all the UEFI boot entries and added menu entries for any additional ones, but I don't believe that GRUB does this. You could also have a grub.cfg menu builder that used efibootmgr to automatically discover such additional entries.)
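On Linux, efibootmgr is the usual way to inspect (and modify) these UEFI boot entries. A sketch of what that looks like, with trimmed and illustrative output (the device paths are abbreviated here):

# efibootmgr -v
BootCurrent: 0000
BootOrder: 0000,0001
Boot0000* ubuntu                HD(1,GPT,...)/File(\EFI\ubuntu\shimx64.efi)
Boot0001* Windows Boot Manager  HD(1,GPT,...)/File(\EFI\Microsoft\Boot\bootmgfw.efi)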

A UEFI bootloader is not obliged to have a boot menu or support booting alternate OSes (or even alternate installs of its own OS), because in theory that's what additional UEFI boot entries are for. The Windows 10 UEFI bootloader normally boots straight into Windows, for example. Linux UEFI bootloaders will usually have an option for a boot menu, though, because in Linux you typically want to have more than one kernel as an option (if only so you can fall back to the previous kernel if a new one has problems).

(In theory you could probably implement multiple kernels as multiple UEFI boot entries, but this gets complicated, there's only so many of them (I believe five), and apparently UEFI firmware is often happier if you change its NVRAM variables as little as possible.)

Sidebar: UEFI multi-OS food fights

In the BIOS MBR world, installing multiple OSes could result in each new OS overwriting the MBR bootloader with its own bootloader, possibly locking you out of the other OSes. In the UEFI world there's no single bootloader any more, so you can't directly get this sort of food fight; each OS should normally only modify its own UEFI boot entry and not touch other ones (although if you run out of empty ones, who knows what will happen). However, UEFI does have the idea of a user-modifiable order for these boot entries, so an OS (new or existing) can decide that its UEFI boot entry should of course go at the front of that list, so it's the default thing that gets booted by the machine.

I suspect that newly installed OSes will almost always try to put themselves in as the first and thus default UEFI boot entry. Existing OSes may or may not do this routinely, but I wouldn't be surprised if they did it when you tell them to check for boot problems and repair anything they find. Probably this is a feature.

UEFIBootThings written at 22:22:18

2017-12-03

Some notes and considerations on SSH host key verification

Suppose, not entirely hypothetically, that you want to verify the SSH host keys of a server and that you're doing so with code that's reasonably under your control (instead of relying on, say, OpenSSH's ssh program). Then there are a number of things that you're going to want to think about because of how the SSH protocol works and how it interacts with security decisions.

The first thing to know is that you can only verify one type of host key in a single connection. As covered in RFC 4253 section 7.1, the client (you) and the server (the remote end) send each other a list of supported host key algorithms, and then the two of you pick one of the supported algorithms and verify the server's key in that algorithm. If you know multiple types of host keys for a server and you want to verify that the server knows all of them, you need to verify each type of key in a separate connection.

In theory, the client controls the preference order of the SSH host key types; you can say that you prefer ed25519 keys to RSA keys and the server should send its ed25519 key instead of its RSA key. In practice, a server can get away with sending you any type of host key that you said you'd accept, even if it's not your preference, because a server is allowed to claim that it doesn't have your preferred sort of host key (but good servers should be obedient to your wishes, because that's what the protocol requires). As a result, if you're verifying host keys you have a security decision to make: are you willing to accept any type of host key you have on file, or if you have your preferred type of host key on file, do you insist that the server present that type of key?

To be concrete, suppose that you have ed25519 and RSA keys for a server, you prefer ed25519 keys, and when you try to verify the server it offers you its RSA key instead of its ed25519 key. You could reject this on the grounds that either the server does not have the ed25519 key it should or that it's not following the protocol specification, or you could accept it because the server has a SSH host key that you have on file for it.

(As far as I can tell, OpenSSH's ssh command behaves the second way; it'll accept an RSA key even if you also have an ed25519 key for the server in your known_hosts.)

If you pick the first approach, you want to configure your SSH connection to the server to only accept a single key type, that being the best key type you have on file for the server. If you pick the second approach, you'll want to list all key types you have, in preference order (I prefer ed25519 to RSA and skip (EC)DSA keys entirely, while the current OpenSSH ssh_config manpage prefers ECDSA to ed25519 to RSA).
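With OpenSSH, both approaches are straightforward to express on the client side. A sketch (the host and user names are placeholders): HostKeyAlgorithms restricts what key types you'll accept in a single connection, and ssh-keyscan lets you fetch each key type over its own connection:

$ ssh -o HostKeyAlgorithms=ssh-ed25519 user@server
$ ssh-keyscan -t ed25519 server
$ ssh-keyscan -t rsa server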

Under normal circumstances, the server will present only a single host key to be verified (and it certainly can only present a single type of key). This means that if you reject the initial host key the server presents, you will never be called on to verify another type of host key. If the server presents an ed25519 key and you reject it, you'll never get asked to verify an RSA key; the connection just fails. If you wanted to fall back to checking the RSA key in this case, you would have to make a second connection (during which you would only ask for RSA keys). In other words, if the server presents a key it must be correct. With straightforward code, your condition is not 'the server passes if it can eventually present any key that you know', your condition is 'the server passes if the first and only key it presents is one you know'.

PS: If you want to match the behavior of OpenSSH's ssh command, I think you're going to need to do some experimentation with how it actually behaves in various situations. I'm sure that I don't fully understand it myself. Also, you don't necessarily want to imitate ssh here; it doesn't necessarily make the most secure choices. For instance, ssh will happily accept a known_hosts file where a server has multiple keys of a given type, and pass the server if it presents a key that matches any one of them.

Sidebar: How a server might not know some of its host keys

The short version is re-provisioning servers. If you generate or record a server's host key of a given type, you need to also make sure that the server is (re-)provisioned with that key when it gets set up. If you miss a key type, you'll wind up with the server generating and presenting a new key of that type. This has happened to us every so often; for example, we missed properly re-provisioning ed25519 keys on Ubuntu 14.04 machines for a while.

SSHHostKeyVerificationNotes written at 23:38:02

2017-11-30

The cost of memory access across a NUMA machine can (probably) matter

We recently had an interesting performance issue reported to us by a researcher here. We have a number of compute machines, none of them terribly recent; some of them are general access and some of them can be booked for exclusive usage. The researcher had a single-core job (I believe using R) that used 50 GB or more of RAM. They first did some computing on a general-access compute server with Xeon E5-2680s and 96 GB of RAM, then booked one of our other servers with Xeon X6550s and 256 GB of RAM to do more work on (possibly work that consumed significantly more RAM). Unfortunately they discovered that the server they'd booked was massively slower for their job, despite having much more memory.

We don't know for sure what was going on, but our leading theory is NUMA memory access effects because the two servers have significantly different NUMA memory hierarchies. In fact they are the two example servers from my entry on getting NUMA information from Linux. The general access server had two sockets for 48 GB of RAM per socket, while the bookable compute server with 256 GB of RAM had eight sockets and so only 32 GB of RAM per socket. To add to the pain, the high-memory server also appears to have a higher relative cost for access to the memory of almost all of the other sockets. So on the 256 GB machine, memory access was likely going to other NUMA nodes significantly more frequently and then being slower to boot.
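You can see this sort of NUMA layout directly with numactl. A sketch with trimmed, illustrative output for a two-socket machine; the eight-socket server's distance table is correspondingly larger, with higher numbers off the diagonal:

$ numactl --hardware
available: 2 nodes (0-1)
node 0 size: 48244 MB
node 1 size: 48379 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10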

Having said that, I just investigated and there's another difference; the 96 GB machine has DDR3 1600 MHz RAM, while the 256 GB machine has DDR3 RAM at 1333 MHz (yes, they're old machines). This may well have contributed to any RAM-related slowdown and makes me glad that I checked; I don't usually even consider RAM module speeds, but if we think there's a RAM-related performance issue it's another thing to consider.

I found the whole experience to be interesting because it pointed out a blind spot in my usual thinking. Before the issue came up, I just assumed that a machine with more memory and more CPUs would be better, and if it wasn't better it would be because of CPU issues (here they're apparently generally comparable). That NUMA layout (and perhaps RAM speed) made the 'big' machine substantially worse was a surprise. I'm going to have to remember this for the future.

PS: The good news is that we had another two-socket E5-2680 machine with 256 GB that the researcher could use, and I believe they're happy with its performance. And with 128 GB of RAM per socket, they can fit even quite large R processes into a single socket's memory.

NUMAMemoryCanMatter written at 00:07:52

2017-11-18

AMD Ryzens, their memory speed peculiarities, and ECC

Intel's Core i7 CPUs, such as the one I'm planning to use in my next PC, have a nice simple memory support story; there's no ECC support and regardless of how many DDR4 DIMMs you use, they'll run at an officially supported maximum rate of 2666 MHz. Unfortunately AMD Ryzen memory support is nowhere near that simple and some of its complexities create hassles. Since I've recently been putting together a Ryzen system configuration for reasons beyond the scope of this entry, I want to write down what I've learned.

To start out with, Ryzens (and apparently everything made from AMD's Zen microarchitecture) have restrictions on what memory speeds they can achieve with various types and numbers of DIMMs. Current Ryzens support four DIMM slots and, from the charts, if you have two DIMMs, single rank DIMMs give you a maximum memory speed of 2666 MHz and double rank DIMMs a maximum of 2400 MHz, while with four DIMMs the single rank maximum is 2133 MHz and the double rank maximum is 1866 MHz. This is without overclocking; if you overclock, apparently you can significantly improve the four DIMM cases.

(I couldn't dig up a clear answer about maximum memory speeds for single-channel mode, but I believe they're probably the same and it's just that your bandwidth drops. Most people install DIMMs in pairs these days.)

Many DDR4 DIMMs appear to be double rank, although it's often hard to tell; memory vendors generally aren't clear about this. Kingston is a noteworthy exception, as they let you search their DDR4 modules based on rank (among other things, and note that you need unbuffered DDR4 DIMMs for Ryzen). Where single rank DDR4 DIMMs are available, they appear to only go up to 8 GB sizes; in 16 GB, you only have double rank DIMMs. This means that if you want 32 GB of RAM with a Ryzen, your maximum memory speed is 2400 MHz using two 16 GB DDR4-2400 DIMMs. Using single rank 8 GB DIMMs is actually worse, since four DIMMs push you down to a maximum of 2133 MHz.
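For DIMMs you already have installed, dmidecode will usually report both the rank and the speed. A sketch with trimmed, illustrative output (some BIOSes leave the rank field empty):

# dmidecode -t memory | grep -E 'Size|Speed|Rank'
	Size: 16384 MB
	Speed: 2400 MHz
	Rank: 2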

Before I started writing this entry, I was going to say that you could only get ECC memory for Ryzens in DIMMs of at most 8 GB. More extensive research shows that this is not the case; both Kingston and Crucial have 16 GB ECC DDR4-2400 UDIMMs that their websites consider compatible with some Ryzen motherboards such as the ASRock X370 Taichi (taken from here, which tested Ryzen ECC with 8 GB Crucial single rank ECC UDIMMs) or the Asus PRIME X370-PRO. However, the available ECC UDIMMs appear to have conservative timings and thus, I believe, slightly higher latency than other DDR4 UDIMMs (eg, CL17 versus CL15 for more performance-focused non-ECC DDR4-2400 UDIMMs). Whether this will make a practical difference for most uses is an open question, especially for a non-gaming workstation.

(A casual check suggests that ECC DDR4-2400 also appears to be clearly higher priced than non-ECC DDR4-2400, which is already annoyingly expensive.)

PS: I believe that much of this applies to AMD Threadripper and even Epyc CPUs and systems, because they're all built around the same microarchitecture. Threadrippers are quad channel instead of dual channel, so you get twice as many DIMMs before you run into these limitations, but that still means that if you want fast memory you're stuck with at most 64 GB of RAM in a Threadripper system. 64 GB is probably lots for a desktop, but we're looking at Threadripper based compute servers and our users like memory.

RyzenMemorySpeedAndECC written at 02:58:51
