An example of optimizing C in the face of undefined behavior

September 22, 2013

In my entry about my understanding of modern C undefined behavior I alluded to how a modern C compiler turned a small change in variable initialization into a security vulnerability in the Linux kernel. Today I feel like showing you roughly the code involved and explaining it directly, rather than alluding to it.

We'll start with perfectly functional code:

int func(struct foo *arg)
{
    struct bar *p;

    if (arg == NULL)
        return -EINVAL;
    p = arg->f_p;

[.....]

This works right. Then someone comes along and fiddles with the code a bit:

int func(struct foo *arg)
{
    struct bar *p = arg->f_p;

    if (arg == NULL)
        return -EINVAL;

[....]

In the Linux kernel, dereferencing a NULL pointer does not necessarily fault immediately (at the time, userland could map memory at address 0), so this code is (sort of) okay as written; if arg is NULL we will read some bit of low memory, never use it, and then return -EINVAL. Then the optimizing compiler shows up.

The compiler knows that no correct C program can ever dereference a null pointer; to do so is undefined behavior. Since the code dereferences arg before checking it for NULL, the compiler is entitled to assume that arg is never NULL by the time it reaches the check. This makes the if's expression into a constant value (it must always be false) and the whole if can then be eliminated as dead code. Now this function will continue on even if arg is NULL and in the process use whatever it fished out of memory as p (this enables the actual exploit).
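To make the transformation concrete, here is a minimal sketch that writes out by hand what the compiler is allowed to turn the modified function into. The struct layouts, the `return p->value` body, and the names func_as_written, func_as_optimized and demo are assumptions made up for illustration (the real function body is elided above), and an actual compiler may or may not perform this optimization depending on version and flags.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the types in the example; the real
 * definitions are not shown in the entry. */
struct bar { int value; };
struct foo { struct bar *f_p; };

/* The modified source: the load from arg happens before the check. */
int func_as_written(struct foo *arg)
{
    struct bar *p = arg->f_p;   /* undefined behavior if arg is NULL */

    if (arg == NULL)            /* compiler may treat this as always false */
        return -EINVAL;
    return p->value;            /* hypothetical stand-in for the elided body */
}

/* What the compiler is entitled to produce from the function above,
 * written out by hand: the NULL check is gone entirely. */
int func_as_optimized(struct foo *arg)
{
    struct bar *p = arg->f_p;
    return p->value;
}

/* The two versions agree on every input that C gives a meaning to
 * (every non-NULL arg), which is why the compiler may swap one for
 * the other. Passing NULL to either is undefined, so it is not done here. */
int demo(void)
{
    struct bar b = { 42 };
    struct foo f = { &b };

    assert(func_as_written(&f) == 42);
    assert(func_as_optimized(&f) == 42);
    return 0;
}
```

The point of demo() is that no well-defined call can tell the two functions apart; the difference only shows up for a NULL arg, which is exactly the case the C standard declines to define.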

This is of course bad code (since you should never dereference potentially NULL pointers). But it would not naturally be fatal; it took a C compiler optimizing code to make it that.
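For contrast, here is one hedged sketch of how the 'initialize p at its declaration' style could be kept without ever dereferencing a possibly-NULL arg. As before, the struct layouts, the extra check on p, the `return p->value` body, and the names func_guarded and check_func_guarded are all made up for illustration; this is not the fix that was actually applied to the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the types in the example. */
struct bar { int value; };
struct foo { struct bar *f_p; };

int func_guarded(struct foo *arg)
{
    /* arg is tested before it is dereferenced, so nothing here gives
     * the compiler grounds to assume arg is non-NULL. */
    struct bar *p = (arg != NULL) ? arg->f_p : NULL;

    if (arg == NULL)
        return -EINVAL;
    if (p == NULL)              /* hypothetical extra check for this sketch */
        return -EINVAL;
    return p->value;            /* hypothetical stand-in for the elided body */
}

/* Small usage check: every call here is well defined, including NULL. */
int check_func_guarded(void)
{
    struct bar b = { 7 };
    struct foo f = { &b };
    struct foo empty = { NULL };

    assert(func_guarded(NULL) == -EINVAL);
    assert(func_guarded(&empty) == -EINVAL);
    assert(func_guarded(&f) == 7);
    return 0;
}
```

Simply putting the NULL check first, as in the original version of the function, is the more conventional way to get the same guarantee.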

(The exact code is used as an example here if you want to read it. It is somewhat more intricate than my version but not much so.)


Comments on this page:

By Ben Hutchings at 2013-09-22 19:41:21:

It used to be possible for Linux userland to specifically map memory at address 0, so that this wouldn't segfault. In fact this is necessary for DOS and Win16 emulation. However, that is now disabled by default since it made it easy to exploit kernel bugs that involve using null pointers. The vm.mmap_min_addr sysctl controls this.

Written on 22 September 2013.
Last modified: Sun Sep 22 01:04:31 2013