A little puzzle with printf() and C argument passing

December 28, 2020

In The Easy Ones – Three Bugs Hiding in the Open, Bruce Dawson gave us a little C puzzle in passing:

The variable arguments in printf formatting means that it is easy to get type mismatches. The practical results vary considerably:

  1. printf("0x%08lx", p); // Printing a pointer as an int – truncation or worse on 64-bit
  2. printf("%d, %f", f, i); // Swapping float and int – could print nonsense, or might actually work (!)
  3. printf("%s %d", i, s); // Swapping the order of string and int – will probably crash

[...] (aside: understanding why #2 often prints the desired result is a good ABI puzzle)

I had to think about this for a bit, and then I realized why and how it can work (and why similar integer versus float argument confusion can also work for other functions, even ones with fixed argument lists). What it comes down to is that in some ABIs, arguments are passed in registers (at least early arguments, before you run out of registers), and floating point arguments are passed in different registers than integers (and pointers). This is true even for functions that take variable arguments and will walk through them using stdarg macros (or at least it can be, depending on the ABI).
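For readers who haven't used the stdarg macros, here is a minimal sketch of a function that walks its variable arguments the way printf() does. The sum_mixed() helper and its 'i'/'f' description string are made up for illustration and are not part of any library. This version is used correctly, so it is well defined; the puzzle is about what happens underneath when the description and the real arguments disagree.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * sum_mixed() is a hypothetical helper, not a real library function.
 * `kinds` describes the variable arguments: 'i' means an int and 'f'
 * means a double; any other character is ignored. As with printf(),
 * the description must match what the caller really passed, because
 * a mismatch is undefined behavior in C.
 */
double sum_mixed(const char *kinds, ...)
{
    va_list ap;
    double total = 0.0;

    assert(kinds != NULL);
    va_start(ap, kinds);
    for (const char *k = kinds; *k != '\0'; k++) {
        if (*k == 'i')
            total += va_arg(ap, int);     /* next integer argument */
        else if (*k == 'f')
            total += va_arg(ap, double);  /* next floating point argument */
    }
    va_end(ap);
    return total;
}
```

A call such as sum_mixed("if", 2, 1.5) asks for an int and then a double, which is what was passed. On an ABI with split register sets, each va_arg() ends up drawing from the save area for the matching class of register.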

Because floating point and non-floating-point arguments are passed in different sets of registers, what matters isn't the overall order of the arguments but the order within each class (floating point or non-fp). So here, regardless of where '%f' is in the printf format, it always causes printf() to fetch the first floating point argument, which can never be confused with an integer argument. Similarly, the first '%d' causes printf() to look for the second non-fp argument, regardless of where that argument was in the argument order; it could come after several floating point arguments and still work.

(The '%d' makes printf() look for the second non-fp argument because the first one was the format string. In an ABI that passed pointers in a separate place than integers, it would still work out, since now the first '%d' would be looking for the first integer argument.)
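To make the caller's side of this concrete, here is a toy model of it in C. It is only a sketch under simplifying assumptions: the names (struct regfile, load_args(), same_registers()) and the limit of eight registers per class are invented for illustration, and real ABIs have more rules than this. Each argument goes into the next free register of its own class, so two calls that differ only in how the classes are interleaved load the registers identically.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REGS 8   /* arbitrary for this model; real ABIs set their own limits */

enum arg_class { CLASS_GP, CLASS_FP };

struct arg {
    enum arg_class cls;
    long   gp;   /* value when cls == CLASS_GP (integers and pointers) */
    double fp;   /* value when cls == CLASS_FP */
};

struct regfile {
    long   gp[MAX_REGS];
    double fp[MAX_REGS];
    size_t ngp, nfp;     /* how many registers of each class are in use */
};

/* Caller side: put each argument in the next free register of its class. */
void load_args(struct regfile *rf, const struct arg *args, size_t n)
{
    rf->ngp = 0;
    rf->nfp = 0;
    for (size_t k = 0; k < n; k++) {
        if (args[k].cls == CLASS_GP) {
            assert(rf->ngp < MAX_REGS);
            rf->gp[rf->ngp++] = args[k].gp;
        } else {
            assert(rf->nfp < MAX_REGS);
            rf->fp[rf->nfp++] = args[k].fp;
        }
    }
}

/* Return 1 if both register files hold the same values in the same slots. */
int same_registers(const struct regfile *a, const struct regfile *b)
{
    if (a->ngp != b->ngp || a->nfp != b->nfp)
        return 0;
    for (size_t k = 0; k < a->ngp; k++)
        if (a->gp[k] != b->gp[k])
            return 0;
    for (size_t k = 0; k < a->nfp; k++)
        if (a->fp[k] != b->fp[k])
            return 0;
    return 1;
}
```

Loading (format, int, double) and (format, double, int) through load_args() gives register files that same_registers() reports as equal. In both cases the format string is in gp[0], the integer is in gp[1] and the double is in fp[0], which is why printf() can't tell the two calls apart.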

Using the excellent services of godbolt.org, we can see this in action on 64-bit x86 in a very small example, compiled at a decent optimization level to get clear, minimal assembly code. The floating point argument is passed in xmm0, while the format string and the integer argument are passed in edi and esi respectively (these are the 32-bit names of rdi and rsi, the first two integer argument registers). The code also sets eax; in the System V x86-64 ABI, a call to a function with variable arguments must put in al an upper bound on the number of vector registers used for the arguments. A similar thing happens on 64-bit ARM v8 (aka Aarch64), as we can also see on godbolt with the same example on Aarch64.

(Based on this page, the Aarch64 x0 and w1 are in the same set of registers. Apparently d0 is a 64-bit version of the first floating point register, from here [pdf]. I wound up looking up all of this to be sure I understood what was going on in the Aarch64 call, so I might as well write it down here.)

Since pointers and integers are normally passed in the same set of registers (at least on 64-bit x86 and Aarch64), we can also see why the third example is very likely to fail. Since the same set of registers is used for both argument types, it's possible to use an integer argument as a pointer argument, with a segmentation fault as the likely result. Similarly, we can predict that 'printf("%s %f", f, s);' might well work.
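These predictions can be written down as a small checker. This is a rough model under the assumptions of this entry (integers and pointers share one set of registers, floating point values use another), not a statement about any particular ABI, and lines_up() and its one-letter type codes are invented for illustration. The format string itself is left out, since it is always the first general-purpose argument.

```c
#include <assert.h>
#include <stddef.h>

/* Type codes in this model: 'i' = integer, 'p' = pointer, 'f' = floating point. */

static int is_fp(char t)
{
    return t == 'f';
}

/*
 * Return the type code of the n-th (0-based) argument in `passed` that
 * lives in the wanted register class, or '\0' if there is no such argument.
 */
static char nth_in_class(const char *passed, int want_fp, size_t n)
{
    for (; *passed != '\0'; passed++) {
        if (is_fp(*passed) == want_fp) {
            if (n == 0)
                return *passed;
            n--;
        }
    }
    return '\0';
}

/*
 * `wanted` lists the types the format asks for, in order; `passed` lists
 * the types the caller actually passed, in order. Return 1 if every
 * conversion finds an argument of the type it expects in its register
 * class, 0 otherwise.
 */
int lines_up(const char *wanted, const char *passed)
{
    size_t seen_gp = 0, seen_fp = 0;

    assert(wanted != NULL && passed != NULL);
    for (; *wanted != '\0'; wanted++) {
        int fp = is_fp(*wanted);
        size_t n = fp ? seen_fp++ : seen_gp++;
        if (nth_in_class(passed, fp, n) != *wanted)
            return 0;
    }
    return 1;
}
```

In this model, lines_up("if", "fi") is 1 (the second example from the quote), lines_up("pi", "ip") is 0 (the third example, where '%s' picks up the integer), and lines_up("pf", "fp") is 1 (the '%s %f' case).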

PS: This confusion can happen in any language that follows the C ABI on a platform with this sort of split usage of registers (although many languages may prevent this sort of argument type confusion). Not all languages follow the C ABI for their own function calls; famously, Go currently passes all arguments on the stack (as of Go 1.15 and soon Go 1.16).

Written on 28 December 2020.