My understanding of modern C undefined behavior and its effects
Back in the old days, it was famously said that using undefined behavior in your C program gave the compiler license to delete all of your files if it felt like it. When we heard that, we laughed, nodded sagely, and went cheerfully on our way, because of course no actual compiler was ever going to react to undefined behavior in that way and everyone knew it. (The closest real compilers ever came to that was how early versions of GCC reacted to #pragma.)
This left a whole generation of programmers with the attitude that C's large collection of undefined and implementation-defined behavior was no big deal. Different CPUs or compilers might behave differently, but the overall result would be fundamentally sane and often even predictable in advance (given knowledge of CPU behavior).
In the modern world, as John Regehr has taught me, this is both wrong and dangerous. Modern compilers do not delete your files or launch ICBMs when they encounter undefined behavior, because that would still be very stupid. Instead they do something much more dangerous: modern compilers will assume that undefined behavior can't happen. This knowledge that certain things can't happen is then used in optimization. For example, the compiler may deduce things about variable values, those deductions then get fed into dead code elimination, and pretty soon the compiler is removing a security check because it knows the check can 'never' trigger (in proper code).
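To make this concrete, here is a minimal sketch of the kind of check that can quietly evaporate. The function names are hypothetical ones of my own, and whether any particular compiler actually performs the simplification depends on its version and optimization settings. The first function tries to detect signed overflow after the fact. Since signed overflow is undefined behavior, the compiler is allowed to assume it never happens, and under that assumption `a + b < a` is just `b < 0`. The second function checks before doing any arithmetic, so it never relies on undefined behavior.

```c
#include <limits.h>
#include <stdbool.h>

/* Risky (hypothetical example): relies on a + b wrapping around when it
 * overflows. Signed overflow is undefined behavior, so a compiler may
 * assume it cannot happen and reduce this test to "b < 0". */
bool sum_overflows_risky(int a, int b)
{
    return a + b < a;
}

/* Checked: decides whether a + b would overflow without ever performing
 * an overflowing operation, so every path is well defined. */
bool sum_overflows_checked(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow here */
    if (b < 0)
        return a < INT_MIN - b;   /* INT_MIN - b cannot overflow here */
    return false;                 /* adding zero never overflows */
}
```

With the checked version, `sum_overflows_checked(INT_MAX, 1)` is true and `sum_overflows_checked(1, 2)` is false regardless of optimization level. With the risky version, the answer for overflowing inputs is whatever the compiler happens to produce.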
(That led to a cute Linux kernel security vulnerability, by the way.)
The practical upshot is that it is now basically impossible to reason about how a chunk of code will behave in the face of undefined behavior, and in any case that behavior changes as compilers and their optimizations change. To even start requires a thorough understanding of modern compiler optimizations and a ruthlessly objective skeptic's eye, so that you can see what the code actually says, not what you think it does. Only then are you in a position to start following the implications of, say, dereferencing a structure pointer as part of local variable initialization before you explicitly check said pointer to see if it's NULL.
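Here is a minimal sketch of that last pattern. The `struct device` type and both function names are made up for illustration and are not taken from any real code base. In the risky version the dereference comes first, so the compiler may conclude that `dev` cannot be NULL (otherwise the program has already hit undefined behavior) and is then free to drop the check that follows. Whether a given compiler does so depends on its version and flags.

```c
#include <stddef.h>

/* Hypothetical structure, purely for illustration. */
struct device {
    int flags;
};

/* Risky: dev is dereferenced during local variable initialization,
 * before the NULL check. If dev is NULL this is already undefined
 * behavior, so a compiler may treat the check as dead code. */
int get_flags_risky(struct device *dev)
{
    int flags = dev->flags;

    if (dev == NULL)
        return -1;
    return flags;
}

/* Checked: the NULL test comes before any dereference, so the
 * compiler has no grounds to remove it. */
int get_flags_checked(struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->flags;
}
```

Both functions behave identically for a valid pointer. Only `get_flags_checked(NULL)` is guaranteed to return -1.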
Or in short, modern C compilers do terrifying things with undefined behavior.
PS: I recommend you read John Regehr's blog. It's hair-raising.
(This was inspired by C J Silverio pointing to this HN comment.)