2007-01-07
Another issue with C's volatile
I think that part of the issue with volatile is that it can be
used for at least three different things:
- stuff that can change behind the compiler's back (eg flags set in
a signal handler), where mostly you need the compiler to not cache
values loaded from volatile variables.
- hardware registers, where reads and writes have side effects, so you need the compiler to do all loads, even apparently pointless ones (as John Mashey and MIPS found out in the early ages of optimizing compilers being used on Unix kernels).
- shared-state stuff across multiple processors, which needs memory barriers.
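As a minimal sketch of the first case, here is a flag set in a signal handler. The names got_signal, on_signal and wait_for_flag are made up for illustration, and the code raises the signal at itself so that it can run on its own.

```c
#include <signal.h>

/* Hypothetical flag name. volatile sig_atomic_t is the type that standard C
   has long allowed a signal handler to write to. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT at ourselves, and spins until the
   flag is seen. Returns the flag value, or -1 if signal() failed. */
int wait_for_flag(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);
    while (!got_signal)
        ;   /* volatile forces a fresh load on every pass */
    return (int)got_signal;
}
```

Without the volatile, the compiler would be within its rights to load got_signal once and never look at it again. Nothing here involves memory barriers; this is purely the compiler-level use.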
The second case is still the one usually mentioned in discussions of
volatile, generally without any mention that these days you need
memory and IO barriers when dealing with hardware. (You need IO barriers
because the CPU memory controller isn't the only thing doing weird
reordering and merging tricks.)
The compiler can at least theoretically issue sufficiently strong memory barriers to handle shared-state stuff. However, IO barriers are device specific (for example, PCI devices only guarantee that writes have posted once you read something from that device), so on modern hardware the compiler has no chance of getting the second case (hardware registers) completely correct.
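The PCI read-back rule is the sort of thing a driver has to do by hand. Here is a rough sketch, assuming a made-up register layout (struct fake_dev_regs, its fields, and write_control_flushed are all hypothetical), with ordinary memory standing in for the device so that it runs anywhere:

```c
#include <stdint.h>

/* Hypothetical register block; a real device defines its own layout. */
struct fake_dev_regs {
    volatile uint32_t control;
    volatile uint32_t status;
};

/* Write a control value, then read something back from the same device.
   On real PCI hardware the read is what forces earlier posted writes out
   to the device. Here the "device" is plain memory, so the read only
   illustrates the shape of the code. */
uint32_t write_control_flushed(struct fake_dev_regs *regs, uint32_t value)
{
    regs->control = value;
    return regs->status;    /* the read-back; volatile keeps it from being dropped */
}
```

The volatile only guarantees that the compiler emits both accesses in this order. A real driver would still need whatever mapping and barrier primitives its platform provides, which is exactly the part the compiler cannot know about.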
In practical terms, this means that the compiler might as well not even bother trying to generate memory barriers; they're not needed in one case and not sufficient in another. Omitting them entirely and documenting this (which I will fault gcc on) at least makes sure that no one accidentally relies on them and gets burned.
(The counter-argument is that shared-state stuff is going to be
increasingly common due to SMP becoming more and more common, so the
compiler should support it directly to help people out. I don't believe
that, because I don't think shared-state concurrency is usable in
general; people who try to program in it will routinely blow their feet
off whether or not the C compiler puts memory barriers around their
volatiles.)
2007-01-05
Henry Spencer on C's volatile
Henry Spencer sent me a reply to my previous entry
on volatile that corrects some of my (mis)information, and I am
putting it here with his permission:
I fear you've miscalled this one. ANSI C pre-dates aggressive memory controllers in consumer CPUs, but it had extensive participation by people in the supercomputer world as well, and such things were common there long ago.
The key issue you've missed is that ANSI C doesn't really acknowledge the existence of a "compiler". It's phrased almost completely in terms of an "implementation", which takes in source code and spits out execution results. It is the implementation's responsibility to ensure that accesses to `volatile' variables happen in the order specified. If the implementation consists of a compiler plus a CPU, it is the compiler's job to make the CPU do the right thing, however difficult that may be.
There is some leeway here, because some aspects of the precise definition of the "right thing" are up to the implementation... but ANSI C does require that they be documented.
Compiler implementers -- except on supercomputers -- are notoriously casual about these issues, but it is technically their responsibility to establish exact rules and then force the hardware to live up to them.
It is unfortunately true that memory barriers etc. are a complicated subject, and `volatile' is too simplistic (and has too many different uses) to capture all of what people want done.
Henry
I suspect that the gcc people would counter-argue from two possible positions:
- because threading and memory semantics are not defined in ANSI C,
no strictly conforming ANSI C program can have shared state issues
that require memory barriers, and so the compiler is not required to
implement them for volatile accesses; the compiler-level actions are good enough.
- almost all code (mis)uses volatile in ways that do not require memory barriers and which would slow down significantly if memory barriers were introduced. Thus, regardless of the letter of ANSI C, volatile accesses will not introduce memory barriers.
Especially if the first is correct, this reduces gcc's sins to at most failing to document what volatile does and doesn't do (and I'm not convinced that they're even failing at that; they do have a small section about it in the info docs, which might be good enough in that it doesn't say that they do do memory barriers et al).
2007-01-03
The problem with C's volatile qualifier
The real problem with C's volatile (well, one of them) is that
volatile only affects the compiler; it doesn't normally make the
compiler do anything to affect the CPU's interaction with memory. On
modern hardware, merely affecting the compiler is not enough; the
CPU/memory interface has its own optimizations, which must be dealt with
too for shared-state code to work right.
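For contrast, here is roughly what asking for the CPU-level ordering explicitly looks like. This is a sketch, assuming a compiler with C11's <stdatomic.h> (which is newer than this entry). The names payload, ready, publish and try_consume are invented for the example, and it is not a description of what any compiler does for volatile.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: a payload plus a flag saying it is ready. */
static int payload;
static atomic_bool ready;   /* static storage, so it starts out false */

/* Producer side: write the data, then set the flag with release ordering
   so the payload store cannot be reordered after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer side: load the flag with acquire ordering. If it reads true,
   the payload written before the matching release store is visible. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

In real use publish() and try_consume() would run on different processors. The point is that the ordering is requested per access, which is something a plain volatile qualifier has no way to express.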
From the perspective of history this makes a lot of sense, because
ANSI C mostly predates CPU memory controllers that can do significant
reordering. Back in those days, it made sense to only worry about
the compiler's optimization model; you could assume that loads and
stores would be seen by memory (and IO devices) in the same order that
they appeared in the assembly code, which would be the same order
they appeared in C unless the compiler decided to optimize things. So
volatile could be specified (to the extent that it was specified) as
more or less 'turn off compiler optimizations for this', and it all
worked out.
Unfortunately, modern CPUs have blown vast holes in this assumption
and in the process reduced volatile to being useful for little
more than forcing spinloops to always load values, eg:
while (!ready) ;
(Of course a really smart compiler could work out that there is no point
to this loop if it doesn't reload ready every time through.)
I suspect that no one has tried to make compilers generate explicit memory barriers because it is a rather complicated field and it's not obvious what sort of barriers are needed when. (The Linux kernel source has an enlightening and rather large discussion of the whole issue in Documentation/memory-barriers.txt.)
(The other problem with generating explicit memory barriers for access
to volatile variables is that it would instantly hose all of the
code using volatile for something other than shared-state variables,
which is probably most of it. Compiler authors are generally opposed to
changes with significant and unnecessary performance impacts for the
majority of the code their compiler compiles.)
(I can't imagine that this is a unique observation, but it bubbled up in my mind recently and I feel like writing it down explicitly, if only to cement it in my own understanding.)