An instruction oddity in the ppc64 (PowerPC 64-bit) architecture
Over on the Fediverse, I reported my discovery of a ppc64 oddity:
TIL that the ppc64 (PowerPC 64-bit) architecture overloads 'or r1,r1,r1' (and the same using all r6 or r2) to change the (hardware) priority of your thread. This came up in a Go code generation issue, and Raymond Chen mentioned it in passing in 2018.
As Raymond Chen notes, 'or rd, ra, ra' has the effect of 'move ra to rd'. Moving a register to itself is a NOP, but several Power versions (the Go code's comment says Power8, 9, and 10) overload this particular form of NOP (and some others) to signal that the CPU should change the priority of your hardware thread; in the specific case of 'or r1, r1, r1', it drops you to low priority. That leaves us with the mystery of why a compiler would emit such an instruction instead of the official NOP (per Raymond Chen, this is 'ori r0, r0, 0').
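To make the encoding concrete, here is a minimal Go sketch that builds the 32-bit instruction word for 'or ra, rs, rb' and checks whether it is one of the priority-changing forms mentioned above. The helper names (encodeOr, priorityHint) are my own invention for illustration. The level names for r6 and r2 are my understanding of the Power ISA rather than something from the Go issue, and the table deliberately covers only the three registers discussed here.

```go
package main

import "fmt"

// encodeOr returns the 32-bit encoding of the X-form 'or ra, rs, rb'
// instruction: primary opcode 31, extended opcode 444, Rc bit clear.
func encodeOr(ra, rs, rb uint32) uint32 {
	return 31<<26 | rs<<21 | ra<<16 | rb<<11 | 444<<1
}

// priorityHint is a hypothetical helper. It reports whether insn is an
// 'or rx, rx, rx' that doubles as a thread priority hint, and which level
// it requests. Only r1, r6 and r2 are listed; real CPUs define more.
func priorityHint(insn uint32) (string, bool) {
	const opMask = 0xFC0007FF // primary opcode, extended opcode, Rc bit
	if insn&opMask != 31<<26|444<<1 {
		return "", false // not a plain 'or'
	}
	rs := insn >> 21 & 31
	ra := insn >> 16 & 31
	rb := insn >> 11 & 31
	if rs != ra || ra != rb {
		return "", false // an ordinary 'or', not a self-move
	}
	switch ra {
	case 1:
		return "low", true
	case 6:
		return "medium low", true // assumption: per the Power ISA
	case 2:
		return "medium (normal)", true // assumption: per the Power ISA
	}
	return "", false
}

func main() {
	insn := encodeOr(1, 1, 1) // 'or r1, r1, r1'
	level, ok := priorityHint(insn)
	fmt.Printf("%#x hint=%v level=%q\n", insn, ok, level)
}
```

The official NOP, 'ori r0, r0, 0', uses a different primary opcode entirely (its encoding is 0x60000000), so it never matches this check.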
The answer is kind of interesting and shows how intricate things can get in modern code. Go, like a lot of modern languages, wants to support stack tracebacks from within its compiled code, without the aid of an external debugger. In order to do that, the Go runtime needs to be able to unwind the stack. Unwinding the stack is an intricate process on modern CPUs, and you can't necessarily do it past arbitrary code. Go has a special annotation for 'you can't unwind past here', which is automatically applied when the Go toolchain detects that some code (including assembly code) is manipulating the stack pointer in a way that it doesn't understand:
SPWRITE indicates a function that writes an arbitrary value to SP (any write other than adding or subtracting a constant amount).
As covered in the specific ppc64 diff in the change that introduced this issue, Go wanted to artificially mark a particular runtime function this way (see CL 425396 and Go issue #54332 for more). To do this it needed to touch the stack pointer in a harmless way, which would trigger the toolchain's weirdness detector. On ppc64, the stack pointer is in r1. So the obvious and natural thing to do is to move r1 to itself, which encodes as 'or r1, r1, r1', and which then triggers this special architectural behavior of lowering the priority of that hardware thread. Oops.
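As a rough illustration of why a self-move of r1 trips the detector, here is a toy Go model of the SPWRITE rule quoted above. This is not the Go toolchain's actual logic. The insn type, its fields, and isSPWrite are all hypothetical, and the model recognizes only 'addi' as the way of adding or subtracting a constant.

```go
package main

import "fmt"

// insn is a hypothetical, heavily simplified instruction record.
type insn struct {
	op       string // mnemonic, such as "addi" or "or"
	dst, src int    // destination and first source register numbers
	hasImm   bool   // true when the second operand is a constant
}

const sp = 1 // r1 is the stack pointer on ppc64

// isSPWrite reports whether in writes an "arbitrary" value to SP under the
// toy rule: any write to r1 other than adding or subtracting a constant.
func isSPWrite(in insn) bool {
	if in.dst != sp {
		return false // does not write SP at all
	}
	if in.op == "addi" && in.src == sp && in.hasImm {
		return false // ordinary frame adjustment: SP = SP + constant
	}
	return true // anything else counts, including a self-move
}

func main() {
	adjust := insn{op: "addi", dst: 1, src: 1, hasImm: true} // addi r1, r1, -32
	selfMove := insn{op: "or", dst: 1, src: 1}               // or r1, r1, r1
	fmt.Println(isSPWrite(adjust), isSPWrite(selfMove))
}
```

Under this rule the self-move leaves SP unchanged but still counts as an arbitrary write, which is exactly the property the runtime change wanted.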
(The fix changes this to another operation that is apparently harmless due to how the Go ABI works on ppc64. Based on the ppc64 architecture section of the Go internal ABI, Go seems to define r0 as always zero.)
I don't know why PowerPC decided to make r1 (the stack pointer) the register used to signal lowering hardware thread priority, instead of some other register. It's possible r1 was chosen specifically because very few people were expected to write an or-NOP using the stack pointer instead of some other register.
(The whole issue is a useful reminder that modern architectures can have some odd corners and weird cases.)