Another issue with C's volatile
I think that part of the issue with volatile is that it can be
used for at least three different things:
- stuff that can change behind the compiler's back (e.g. flags set in
a signal handler), where mostly you need the compiler to not cache
values loaded from volatile variables.
- hardware registers, where reads and writes have side effects, so
you need the compiler to do all loads and stores, even apparently
pointless ones (as John Mashey and MIPS found out in the early days of
optimizing compilers being used on Unix kernels).
- shared-state stuff across multiple processors, which needs memory
barriers.
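
The first case is the easiest one to show in a small amount of code.
What follows is only a minimal sketch; the names got_signal, on_signal
and wait_for_flag are made up for illustration, and it uses SIGINT only
because that signal is in standard C.

```c
#include <assert.h>
#include <signal.h>

/* Set asynchronously by the handler. volatile stops the compiler from
   keeping a cached copy in a register across the polling loop. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical helper: installs the handler, raises the signal, then
   polls the flag. Returns how many times the loop body ran. */
int wait_for_flag(void)
{
    int polls = 0;
    void (*prev)(int) = signal(SIGINT, on_signal);

    assert(prev != SIG_ERR);
    raise(SIGINT);          /* does not return until the handler has run */
    while (!got_signal)     /* must reload got_signal every time around */
        polls++;
    signal(SIGINT, prev);   /* put the previous disposition back */
    return polls;
}
```

Without the volatile, an optimizing compiler would be within its rights
to load got_signal once and never look at it again. Note that nothing
here needs a memory barrier.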
The second case is still the one usually mentioned in discussions of
volatile, generally without any mention that these days you need
memory and IO barriers when dealing with hardware. (You need IO barriers
because the CPU memory controller isn't the only thing doing weird
reordering and merging tricks.)
The compiler can at least theoretically issue sufficiently strong
memory barriers to handle shared-state stuff. However, IO barriers are
device specific (for example, posted writes are only guaranteed to have
reached a PCI device once you read something back from that device), so
on modern hardware the compiler has no chance of getting the hardware
register case completely correct.
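
As a rough sketch of what the PCI read-back idiom looks like in code,
here is a write followed by a read from the same device. The register
layout (struct fake_dev_regs) and the function name are invented for
illustration and do not describe any real device; in a real driver the
pointer would come from mapping the device's registers, and you would
probably need CPU-level barriers on top of this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block; the layout is made up. */
struct fake_dev_regs {
    volatile uint32_t control;
    volatile uint32_t status;
};

/* Write a control value, then read something back from the device.
   volatile only makes the compiler emit both accesses, in this order.
   On a real PCI device it is the read itself that forces the earlier
   posted write to have arrived; the compiler cannot know to add it. */
uint32_t write_and_flush(struct fake_dev_regs *regs, uint32_t value)
{
    assert(regs != NULL);
    regs->control = value;
    return regs->status;    /* apparently pointless, but required */
}
```

The point is that the flushing read is something the programmer has to
know to write, because it depends on what sort of device is on the
other end.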
In practical terms, this means that the compiler might as well not
even bother trying to generate memory barriers; they're not needed in
one case (signal handler flags) and not sufficient in another (hardware
registers). Omitting them entirely and documenting this (which I will
fault gcc on) at least makes sure that no one accidentally relies on
them and gets burned.
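
If the compiler isn't going to put barriers around volatiles, code that
wants them has to ask for them explicitly. One way to do that, assuming
a compiler that supports C11's <stdatomic.h>, is sketched below; the
names payload, ready, publish and try_consume are made up for the
example, and it uses no volatile at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static int payload;         /* ordinary data, deliberately not volatile */
static atomic_bool ready;   /* static storage, so it starts out false */

/* Writer side: store the data, then set the flag with release
   ordering so the payload store cannot be reordered after it. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: the acquire load pairs with the release store above.
   Returns false if nothing has been published yet. */
bool try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

Here the barriers are visible in the source and are exactly as strong
as the programmer asked for, instead of being something that volatile
may or may not be quietly doing for you.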
(The counter-argument is that shared-state stuff is going to be
increasingly common due to SMP becoming more and more common, so the
compiler should support it directly to help people out. I don't believe
that, because I don't think shared-state concurrency is usable in
general; people who try to program in it will routinely blow their feet
off whether or not the C compiler puts memory barriers around their
volatiles.)