2017-05-05
The temptation of a Ryzen-based machine for my next office workstation
My office workstation is the same hardware build as my current home machine, which means that it's also more than five years old by now. I was not necessarily expecting to replace it soon, but this week things have started to happen to it. First there was an unexpected system lockup and reboot, and then today my CPU started reporting thermal throttling, with the always fun kernel messages of:
CPU1: Core temperature above threshold, cpu clock throttled
CPU3: Package temperature above threshold, cpu clock throttled
CPU2: Package temperature above threshold, cpu clock throttled
CPU0: Package temperature above threshold, cpu clock throttled
CPU1: Package temperature above threshold, cpu clock throttled
(All of my system's fans are working; I checked. Some sources suggest that the first step here is to take the CPU and heatsink apart, clean off the current thermal paste, and re-paste it. I may try to do this on Monday if we have thermal paste around at work, but the timing is terrible as I'm about to go on vacation.)
For obvious reasons this has pushed me into thinking more actively about replacement hardware for my office machine, and when I start thinking about that, I have to admit that building an AMD Ryzen based machine feels like an attractive idea despite what I said about Ryzen for my likely next home machine. There are two or three sensible reasons for this (one of them is only a potential reason) and one emotional one.
The first reason is that I generally do more multi-core things on my office machine than on my home machine; I run VMs (and might run more if it had less impact on things) and I compile software more often for various fuzzy reasons (and some of the time I care more about how fast this happens, for example if I'm bisecting some problem in an open-source project like Firefox). In theory this makes single core CPU performance less important and many well-performing cores more useful, especially in the future as more things become multi-core enabled (for example, the Go developers are working on concurrent compilation of single files).
The complicated and only potential reason is that work is more price sensitive about things like CPU costs than I am for my home machine. I started out thinking that the Intel i7-7700K was more expensive than the Ryzen 7 models, but this turns out to be wrong; at current Canadian prices, the i7-7700K and the Ryzen 1700X are about the same price (the Ryzen 1800X is clearly more and the Ryzen 1700 is only a bit cheaper). However these are still relatively expensive CPUs, so I might well get forced down to, say, something in the range of the i5-7600K and the Ryzen 5 1600X. At this level people seem to think that the Ryzen 5 is the relatively clear winner; you don't lose as much on single-core performance and you pick up a significant edge in multi-core work.
The third reason is the possibility of ECC support. At least some AMD Ryzen motherboards do seem to actually support this in practice and if I more or less get it for free, I'll definitely take it. It's only a 'nice to have' thing, though; I wouldn't give up anything substantive to get it, even (or especially) on my office machine.
The emotional reason is that I want the plucky underdog AMD to make good, and I want to support them. I don't particularly like Intel's domination and various things it leads to (such as their ECC non-support) and I would be perfectly happy to be part of giving them a real challenge for once. If a Ryzen based system is competitive with an Intel one, I'm somewhat irrationally biased in favour of the AMD option.
(For example, going with the AMD option would require a graphics card and I haven't looked at the relative level of motherboard features that I'd probably wind up with. My emotional 'I would like AMD' reaction has pushed those pragmatic issues out to the periphery. For that matter, apparently there are memory speed issues with AMD Ryzens and 32 GB of RAM, and memory bandwidth may matter to at least some of what I do.)
Digging into BSD's choice of Unix group for new directories and files
I have to eat some humble pie here. In comments on my entry on an
interesting chmod failure, Greg A. Woods pointed out that FreeBSD's
behavior of creating everything
inside a directory with the group of the directory is actually
traditional BSD behavior (it dates all the way back to the 1980s),
not some odd new invention by FreeBSD. As traditional behavior it
makes sense that it's explicitly allowed by the standards, but
I've also come to think that it makes sense in context and in
general. To see this, we need some background about the problem
facing BSD.
In the beginning, two things were true in Unix: there was no mkdir()
system call, and processes
could only be in one group at a time. With processes being in only
one group, the choice of the group for a newly created filesystem
object was easy; it was your current group. This was felt to be
sufficiently obvious behavior that the V7 creat(2) manpage
doesn't even mention it.
(The actual behavior is implemented in the kernel in maknode()
in iget.c.)
Now things get interesting. 4.1c BSD seems to be where mkdir(2) is
introduced and where creat() stops being a system call and becomes
an option to open(2). It's also where processes can be in multiple
groups for the first time. The 4.1c BSD open(2) manpage is silent
about the group of newly created files, while the mkdir(2) manpage
specifically claims that new directories will have your effective
group (ie, the V7 behavior). This is actually wrong. In both mkdir()
in sys_directory.c and maknode() in ufs_syscalls.c, the group of the
newly created object is set to the group of the parent directory.
Then finally in the 4.2 BSD mkdir(2) manpage the group of the new
directory is correctly documented (the 4.2 BSD open(2) manpage
continues to say nothing about this). So BSD's traditional behavior
was introduced at the same time as processes being in multiple groups,
and we can guess that it was introduced as part of that change.
When your process can only be in a single group, as in V7, it makes perfect sense to create new filesystem objects with that as their group. It's basically the same case as making new filesystem objects be owned by you; just as they get your UID, they also get your GID. When your process can be in multiple groups, things get less clear. A filesystem object can only be in one group, so which of your several groups should a new filesystem object be owned by, and how can you most conveniently change that choice?
One option is to have some notion of a 'primary group' and then
provide ways to shuffle around which of your groups is the primary
group. One problem with this is that it's awkward and error-prone
to work in different areas of the filesystem where you want your
new files and directories to be in different groups; every time you
cd around, you may have to remember to change your primary group.
If you move into a collaborative directory, better shift (in your
shell) to that group; cd back to $HOME, or simply want to write
a new file in $HOME, and you'd better remember to change back.
Another option is the BSD choice of inheriting the group from
context. By far the most common case is that you want your new files
and directories to be created in the 'context', ie the group, of
the surrounding directory. If you're working in $HOME, this is
your primary login group; if you're working in a collaborative area,
this is the group being used for collaboration. Arguably it's a
feature that you don't even have to be in that group (if directory
permissions allow you to make new files). Since you can chgrp
directories that you own, this option also gives you a relatively
easy and persistent way to change which group is chosen for any
particular area.
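
As a rough illustration of the difference between the two rules, here is a toy model in Python. It is only a sketch and not how any kernel implements this; the Proc class, the function names, and the GID numbers are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    """Hypothetical stand-in for a process's group credentials."""
    egid: int            # the process's current (effective) group
    groups: frozenset    # every group the process is in


def v7_style_gid(proc: Proc, parent_dir_gid: int) -> int:
    """V7-style rule: a new object gets the creator's current group."""
    return proc.egid


def bsd_style_gid(proc: Proc, parent_dir_gid: int) -> int:
    """BSD-style rule: a new object gets the parent directory's group,
    whether or not the creator is a member of that group."""
    return parent_dir_gid


if __name__ == "__main__":
    me = Proc(egid=1000, groups=frozenset({1000, 2000}))  # made-up GIDs
    project_dir_gid = 2000                                # made-up shared group
    print("V7 rule: ", v7_style_gid(me, project_dir_gid))
    print("BSD rule:", bsd_style_gid(me, project_dir_gid))
```

Under the V7-style rule the answer depends only on the process, so moving between areas means changing the process's group; under the BSD-style rule the answer depends only on the directory, so the choice stays with the area.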
If you fully embrace the idea of Unix processes being in multiple groups, not just having one primary group and then some number of secondary groups, then the BSD choice makes a lot of sense. And for all of its faults, BSD tended to relatively fully embrace its changes (not totally, perhaps partly because it had backwards compatibility issues to consider). While it leads to some odd issues, such as the one I ran into, pretty much any choice here is going to have some oddities. It's also probably the more usable choice in general if you expect much collaboration between different people (well, different Unix logins), partly because it mostly doesn't require people to remember to do things.
(I know that on our systems, a lot of directories intended for collaborative work tend to end up being setgid specifically to get this behavior.)
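
If you want to see the setgid directory behavior for yourself, the following Python sketch makes a scratch directory, marks it setgid, creates a file in it, and reports both groups. I'm assuming a system where setgid on a directory makes new files inherit the directory's group (on BSD-style systems they inherit it regardless). The function name, the 'shared' directory name, and the optional group_gid parameter are mine, purely for illustration.

```python
import os
import stat
import tempfile


def gid_of_new_file_in_setgid_dir(group_gid=None):
    """Return (directory gid, new file gid) for a file created inside a
    setgid directory. If group_gid is given, the directory is first
    chgrp'd to it (you generally must be a member of that group)."""
    with tempfile.TemporaryDirectory() as top:
        shared = os.path.join(top, "shared")  # hypothetical collaborative directory
        os.mkdir(shared)
        if group_gid is not None:
            os.chown(shared, -1, group_gid)   # -1 leaves the owner unchanged
        # Add the setgid bit on top of whatever permissions mkdir gave us.
        mode = stat.S_IMODE(os.stat(shared).st_mode)
        os.chmod(shared, mode | stat.S_ISGID)
        path = os.path.join(shared, "newfile")
        with open(path, "w"):
            pass
        return os.stat(shared).st_gid, os.stat(path).st_gid


if __name__ == "__main__":
    dir_gid, file_gid = gid_of_new_file_in_setgid_dir()
    print("directory gid:", dir_gid, "new file gid:", file_gid)
```

Run with no argument, the directory normally has your current group anyway, so the result is unsurprising. To see the inheritance actually matter, pass the GID of some other group that you're a member of.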