A little puzzle with printf() and C argument passing
In The Easy Ones – Three Bugs Hiding in the Open, Bruce Dawson gave us a little C puzzle in passing:
The variable arguments in printf formatting means that it is easy to get type mismatches. The practical results vary considerably:
- printf("0x%08lx", p); // Printing a pointer as an int – truncation or worse on 64-bit
- printf("%d, %f", f, i); // Swapping float and int – could print nonsense, or might actually work (!)
- printf("%s %d", i, s); // Swapping the order of string and int – will probably crash
[...] (aside: understanding why #2 often prints the desired result is a good ABI puzzle)
I had to think about this for a bit, and then I realized why and how it can work (and why similar integer versus float argument confusion can also work for other functions, even ones with fixed argument lists). What it comes down to is that in some ABIs, arguments are passed in registers (at least early arguments, before you run out of registers), and floating point arguments are passed in different registers than integers (and pointers). This is true even for functions that take variable arguments and will walk through them using stdarg macros (or at least it can be, depending on the ABI).
Because floating point and non-floating-point arguments are passed in different sets of registers, what matters isn't the total order of arguments but the order of the arguments within each set, floating point or non-fp. So here, regardless of where '%f' is in the printf format, it always causes printf() to get the first floating point argument, which can never be confused with an integer argument. Similarly, the first '%d' causes printf() to look for the second non-fp argument, regardless of where it was in the argument order; it could be at the end of several floating point arguments and still work.
(The '%d' makes printf() look for the second non-fp argument because the first one was the format string. In an ABI that passed pointers in a separate place than integers, it would still work out, since now the first '%d' would be looking for the first integer argument.)
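To make this concrete, here is a toy model in C of an ABI with two separate sets of argument registers. It is only a sketch under simplifying assumptions: every name in it (struct regs, pass_int(), pass_ptr(), pass_fp(), toy_printf()) is hypothetical, it handles only '%d', '%s' and '%f', and it tags each integer-class register so that it can report a mismatch instead of misbehaving the way a real printf() might.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model of an ABI that keeps integer-class arguments (integers and
   pointers) and floating point arguments in separate register sets.
   Every name here is hypothetical; this is not how printf() is built. */
enum kind { K_INT, K_PTR };

struct gp_slot {            /* one integer-class register, tagged so the  */
    enum kind kind;         /* model can report a mismatch rather than    */
    long i;                 /* treating an integer as a pointer           */
    const char *p;
};

struct regs {
    struct gp_slot gp[6];   /* integer-class argument registers */
    double fp[8];           /* floating point argument registers */
    int n_gp, n_fp;         /* how many of each the caller loaded */
};

/* Caller side: each argument goes to the next free register of its class. */
static void pass_int(struct regs *r, long v)
{
    assert(r->n_gp < 6);
    r->gp[r->n_gp++] = (struct gp_slot){ K_INT, v, NULL };
}

static void pass_ptr(struct regs *r, const char *p)
{
    assert(r->n_gp < 6);
    r->gp[r->n_gp++] = (struct gp_slot){ K_PTR, 0, p };
}

static void pass_fp(struct regs *r, double v)
{
    assert(r->n_fp < 8);
    r->fp[r->n_fp++] = v;
}

/* Callee side: the format string is the first integer-class argument;
   each conversion then takes the next register of the class it expects. */
static void toy_printf(char *out, size_t cap, const struct regs *r)
{
    assert(cap > 0 && r->n_gp > 0 && r->gp[0].kind == K_PTR);
    const char *fmt = r->gp[0].p;
    int gi = 1, fi = 0;     /* gi starts at 1: register 0 held the format */
    size_t len = 0;

    out[0] = '\0';
    for (; *fmt != '\0' && len + 1 < cap; fmt++) {
        char *dst = out + len;
        size_t room = cap - len;

        if (*fmt != '%' || fmt[1] == '\0') {
            snprintf(dst, room, "%c", *fmt);        /* literal character */
            len = strlen(out);
            continue;
        }
        fmt++;                                      /* conversion letter */
        if (*fmt == 'f') {
            if (fi < r->n_fp)
                snprintf(dst, room, "%.1f", r->fp[fi]);
            else
                snprintf(dst, room, "<garbage>");   /* register never loaded */
            fi++;
        } else {
            const struct gp_slot *s = (gi < r->n_gp) ? &r->gp[gi] : NULL;
            gi++;
            if (s != NULL && *fmt == 'd' && s->kind == K_INT)
                snprintf(dst, room, "%ld", s->i);
            else if (s != NULL && *fmt == 's' && s->kind == K_PTR)
                snprintf(dst, room, "%s", s->p);
            else
                snprintf(dst, room, "<mismatch>");  /* wrong thing in register */
        }
        len = strlen(out);
    }
}
```

In this model, loading the registers as a call like printf("%d, %f", f, i) would (format pointer, then the floating point value, then the integer) still lets toy_printf() find the integer for '%d' and the floating point value for '%f', because each conversion only looks at its own set of registers. Loading them as for printf("%s %d", i, s) produces a mismatch on both conversions, since the integer and the string pointer sit in the same set in the wrong order.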
Using the excellent services of godbolt.org, we can see this in action on 64-bit x86 in a very small example (kept small and compiled at a decent optimization level to get clear, minimal assembly code). The floating point argument is passed in xmm0, while the format string and the integer argument are passed in edi and esi respectively (I don't know what eax is doing, but it probably has something to do with the ABI). A similar thing happens on 64-bit ARMv8 (aka AArch64), as we can also see on godbolt with the same example on AArch64.
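The exact source used on godbolt isn't reproduced here, so the following is only a guess at the sort of very small example that shows this; show() is a hypothetical name. The register notes in the comments assume the x86-64 System V calling convention, and the call itself has its arguments in the correct order so that it is well defined C.

```c
#include <stdio.h>

/* Hypothetical helper, not the exact source behind the godbolt links. */
int show(int i, double f)
{
    /* The puzzle's variant, printf("%d, %f\n", f, i), is undefined
       behavior in C. On this ABI it would nevertheless load the same
       registers as the call below, because f and i go to different
       register sets no matter which order they are written in. */

    /* Assuming x86-64 System V: the format string pointer goes in the
       first integer-class register (rdi), i in the second (esi), and f
       in the first floating point register (xmm0). For a variadic call
       the caller also sets al to an upper bound on the number of vector
       registers used, which is 1 here. */
    return printf("%d, %f\n", i, f);
}
```

Compiling this with optimization and looking at the assembly should show one write each to the integer-class and floating point argument registers. The al convention for variadic calls is most likely the explanation for a write to eax just before the call to printf().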
(Based on this page, the AArch64 x0 and w1 are in the same set of registers. Apparently d0 is a 64-bit version of the first floating point register, from here [pdf]. I wound up looking up all of this to be sure I understood what was going on in the AArch64 call, so I might as well write it down here.)
Since pointers and integers are normally passed in the same set of registers (at least on 64-bit x86 and AArch64), we can also see why the third example is very likely to fail. Since the same set of registers is used for both argument types, it's possible to use an integer argument as a pointer argument, with a segmentation fault as the likely result. Similarly, we can predict that 'printf("%s %f", f, s);' might well work.
PS: This confusion can happen in any language that follows the C ABI on a platform with this sort of split usage of registers (although many languages may prevent this sort of argument type confusion). Not all languages follow the C ABI; famously, Go currently passes all arguments on the stack (as of Go 1.15 and soon Go 1.16).