== What can go wrong if your compiler is not thread aware

Courtesy of [[Pete Zaitcev http://zaitcev.livejournal.com/107185.html]], here's a great example of what happens when optimizing compilers aren't thread aware. Start with code of the form:

> struct a b;
>
> b.pos = *ppos;
> ret = foo(&b, ..., b.pos);

Modern versions of gcc 4 on x86 will optimize the function call into:

> ret = foo(&b, ..., ~~*ppos~~);

(This is a less stupid optimization than it looks; _ppos_ is a function parameter and I believe it's in a register, so it may well be faster to perform an indirect load from it than a computed indirect load off the stack pointer.)

What goes wrong in a multi-threaded environment (in this case, the Linux kernel) is that the value of ((*ppos)) can change between the store into _b.pos_ and the function call, and that the _foo_ function expects the two values to be the same. Of course, the authors of the function that this appeared in didn't think that they needed to do any locking, because after all they only dereference ((*ppos)) once.

(To deal with one possible code nitpick, the Linux kernel makes liberal use of implicit atomic reads and writes. This probably makes purists cringe, but the odds of a major CPU architecture ever not having large atomic reads and writes are pretty small by now.)

I don't think we can blame either the compiler or the programmers for this. What's really happening is that we've been tripped up because we have different implicit assumptions about the code than the compiler does. And a good part of the reason that these sorts of assumptions stay implicit is that we don't have good tools for making them explicit in non-annoying ways. (So while we can blithely talk about a 'thread aware compiler', I'm not sure we know what one should actually look like.)
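
To make the failure concrete, here's a small standalone C sketch of the same pattern; the names ((foo)), ((racy)), and ((careful)) are mine, not the actual kernel code. The second function shows one way to pin the value down: do exactly one read through a volatile-qualified pointer and reuse the result, which is roughly what the Linux kernel's ((ACCESS_ONCE))/((READ_ONCE)) macros boil down to.

> struct a {
>     long pos;
> };
>
> /* foo() assumes b->pos and pos are the same value; these names are
>  * illustrative stand-ins, not the functions from the kernel code. */
> static int foo(struct a *b, long pos)
> {
>     return b->pos == pos;
> }
>
> /* The problem pattern: if another thread can change *ppos, gcc is
>  * free to re-read it for the second argument instead of reusing
>  * b.pos, so foo() may see two different values. */
> int racy(long *ppos)
> {
>     struct a b;
>
>     b.pos = *ppos;
>     return foo(&b, b.pos);
> }
>
> /* One way to force a single read: load *ppos once through a
>  * volatile-qualified pointer and use that result everywhere. */
> int careful(long *ppos)
> {
>     struct a b;
>     long pos = *(volatile long *)ppos;    /* exactly one load of *ppos */
>
>     b.pos = pos;
>     return foo(&b, pos);                  /* both uses see the same value */
> }

The volatile cast doesn't make anything atomic or add memory barriers; it just tells the compiler it can't fold, duplicate, or re-order that particular load, which is all that's needed to keep the two uses of the value consistent here.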