The CPU architectural question of what is a (reserved) NOP

January 29, 2023

I recently wrote about an instruction oddity in the PowerPC 64-bit architecture, where a number of 'or' instructions with no effects were reused to signal hardware thread priority to the CPU. This came up when Go used one of those instructions for its own purposes and so accidentally lowered the priority of the hardware thread. One of the reactions I've seen has been a suggestion that people should consider all unofficial NOPs (ie, NOPs other than the officially documented ones) to be reserved by the architecture. However, this raises a practical philosophical question, namely what's considered a NOP.

In the old days, CPU architectures might define an explicit NOP instruction that was specially recognized by the CPU, such as the 6502's NOP. Modern CPUs generally don't have a specific NOP instruction in this way; instead, the architecture has a significant number of instructions that have no effects (for various reasons, including the regularity of instruction sets) and one or a few of those instructions are blessed as the official NOP and may be specially treated by CPUs. The PowerPC 64-bit official NOP is 'or r1, r1, 0', for example (which theoretically OR's register r1 with 0 and puts the result back into r1).

Update: I made a mistake here; the official NOP uses register r0, not r1, so 'or r0, r0, 0', sometimes written 'ori 0, 0, 0'.
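
To make the encoding concrete, here is a minimal Python sketch of how an 'ori' instruction is packed into a 32-bit word. It assumes the usual D-form layout (a six-bit primary opcode of 24, then the source register, the destination register, and a 16-bit unsigned immediate); the function name encode_ori is made up for illustration.

```python
def encode_ori(ra: int, rs: int, ui: int) -> int:
    """Pack 'ori rA, rS, UI' into a 32-bit PowerPC instruction word.

    Assumed field layout (D-form): opcode(6) | RS(5) | RA(5) | UI(16),
    with primary opcode 24 for ori.
    """
    if not (0 <= ra < 32 and 0 <= rs < 32):
        raise ValueError("register numbers must be in 0..31")
    if not (0 <= ui < (1 << 16)):
        raise ValueError("immediate must fit in 16 unsigned bits")
    return (24 << 26) | (rs << 21) | (ra << 16) | ui


# The official NOP, 'ori 0, 0, 0': only the opcode bits are set.
official_nop = encode_ori(0, 0, 0)
print(f"{official_nop:#010x}")  # 0x60000000
```

Every other 'ori rX, rX, 0' differs from the official NOP only in its two register fields, which is part of why there are so many no-effect instructions sitting right next to the blessed one.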

So if you say that all unofficial NOPs are reserved and should be avoided, you have to define what exactly a 'NOP' is in your architecture. One aggressive definition you could adopt is that any instruction that always has no effects is a NOP; this would make quite a lot of instructions NOPs and thus unofficial NOPs. This gives the architecture maximum freedom for the future but also means that all code generation for your architecture needs to carefully avoid generating an instruction with no effects, even if one naturally falls out of the structure of that program's code generation (which could be a simple JIT engine).
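
As a rough illustration of what the aggressive definition asks of a code generator, here is a sketch of a screening step in Python. It uses a toy instruction model rather than real PowerPC: the Insn type, the mnemonics, and the emit function are all hypothetical, and the toy ISA is assumed to have no condition flags or other side effects.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    op: str       # mnemonic in a toy ISA, not real PowerPC
    dst: int      # destination register number
    src: int      # first source register number
    operand: int  # second source register, or the immediate for "-i" forms


def has_no_effect(insn: Insn) -> bool:
    """True if the instruction leaves all state unchanged in the toy ISA."""
    if insn.op in ("or", "and"):
        # x | x == x and x & x == x, so writing the result back changes nothing.
        return insn.dst == insn.src == insn.operand
    if insn.op in ("ori", "xori", "addi"):
        # OR, XOR, or ADD with an immediate of 0 gives back the input unchanged.
        return insn.dst == insn.src and insn.operand == 0
    return False


def emit(insn: Insn, out: list) -> None:
    """Append insn to out, refusing anything that would be an unofficial NOP."""
    if has_no_effect(insn):
        raise ValueError(f"refusing to emit no-effect instruction: {insn}")
    out.append(insn)
```

A real code generator would more likely drop the instruction or substitute the official NOP than raise an error, and the list of cases in has_no_effect would have to grow with every instruction family in the architecture, which is exactly the burden described above.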

Alternately, you could say that (only) all variants of your standard NOP are reserved; for PowerPC 64-bit, this could be all or instructions that match the pattern of either 'or rX, rX, rX' or 'or rX, rX, 0' (let's assume the immediate is always the third argument). This leaves the future CPU designer with fewer no-effect operations they can use to signal things to the CPU, but makes the life of code generators simpler because there are fewer instructions they have to screen out as special exceptions. If you wanted to, you could include some other related types of instructions as well, for example to say that 'xor rX, rX, 0' is also a reserved unofficial NOP.
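
The narrower rule is easy to check on raw instruction words. The following Python sketch assumes the standard PowerPC encodings: the immediate form is 'ori' (primary opcode 24), and the register form is 'or' (primary opcode 31 with extended opcode 444 and the record bit clear). The function name matches_nop_pattern is invented for this example.

```python
def matches_nop_pattern(word: int) -> bool:
    """True if a 32-bit word is 'or rX, rX, rX' or 'ori rX, rX, 0'."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F  # source register field
    ra = (word >> 16) & 0x1F  # destination register field
    if opcode == 24:
        # ori rA, rS, UI: same register on both sides, immediate of zero.
        return rs == ra and (word & 0xFFFF) == 0
    if opcode == 31:
        # X-form: extended opcode 444 is 'or'; Rc set would update CR0.
        rb = (word >> 11) & 0x1F
        xo = (word >> 1) & 0x3FF
        rc = word & 1
        return xo == 444 and rc == 0 and rs == ra == rb
    return False


print(matches_nop_pattern(0x7C210B78))  # 'or 1, 1, 1' -> True
print(matches_nop_pattern(0x7C832378))  # 'or 3, 4, 4' (a register move) -> False
```

Note that the official NOP itself (0x60000000) also matches, so a code generator using a check like this would need to exempt it explicitly. The whole test is two masked comparisons, which is what makes this definition cheap for code generators to honor.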

A CPU architecture can pick whichever answer it wants to here, but I hope I've convinced my readers that there's more than one answer here (and that there are tradeoffs).

PS: Another way to put this is that when an architecture makes some number of otherwise valid instructions into 'unofficial NOPs' that you must avoid, it's reducing the regularity of the architecture in practice. We know that the less regular the architecture is, the more annoying it can be to generate code for.


Comments on this page:

By Sam James at 2023-01-29 23:06:17:

Before I read the article, I was expecting this to be about Control-flow Enforcement Technology (CET), which (ab)uses NOPs in this way too.

Very interesting, thanks!

I am reminded of Raymond Chen writing about Windows' use of a four-byte NOP. It's not the architecture's official (1-byte) NOP, but it allows patching in a branch instruction without any threads having a PC in the middle of a series of 1-byte NOPs.

Actually, PPC canonical nop is ori 0,0,0.

By cks at 2023-01-30 10:50:08:

Oops, you're right. I've added an update correcting my mistake.

(For other people reading this, PowerPC assembly mnemonics sometimes seem to leave out saying that something is a register instead of an immediate value if the instruction only accepts a register as that argument. So 'ori 0, 0, 0' is 'ori r0, r0, 0' because ori only accepts registers as the first two arguments, and writing '0' implicitly means 'register 0' instead of 'literal 0'.)

Written on 29 January 2023.
