Chris's Wiki :: blog/programming/CAsAbstractMachine Comments
https://utcc.utoronto.ca/~cks/space/blog/programming/CAsAbstractMachine?atomcomments
2023-02-04T21:07:39Z
Recent comments in Chris's Wiki :: blog/programming/CAsAbstractMachine.
By Flatfinger on /blog/programming/CAsAbstractMachine
<div class="wikitext"><p>In many cases, it may be useful to treat a program as running on an abstract machine whose semantics aren't as precise as the underlying hardware, but are still much tighter than "Anything can happen" UB. One major limitation of the Standard's abstract machine is that it has no sensible way of treating a function like `test2()` below.</p>
<pre>
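unsigned char arr[65537];

unsigned test(unsigned x)
{
    unsigned i = 1;

    /* No side effects in the loop. It can only exit if the low 16 bits
       of some power of 3 equal x, which is impossible when x > 65535. */
    while ((i & 0xFFFF) != x)
        i *= 3;

    if (x < 65536)
        arr[x] = 1;
    return i;
}

/* The caller discards the return value, so the loop's result is unused. */
void test2(unsigned x)
{
    test(x);
}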
</pre>
<p>Here, when `test()` is invoked from `test2()`, no iteration of the loop would perform any action that was observably sequenced before the following code, and it would thus be useful to postpone execution of the loop indefinitely (which would, of course, yield observable behavior equivalent to simply omitting the execution of the loop).</p>
<p>Unfortunately, the C Standard's Abstract Machine model requires that any situation where optimizations might yield behavior inconsistent with precise sequential program execution must be classified as Undefined Behavior. Under the abstraction model processed by clang, the code would invoke UB any time `x` exceeds 65535, and thus the store to `arr[x]` may be performed unconditionally.</p>
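<p>To make the contrast concrete, here is a hand-written sketch of the two outcomes described above. Neither function is actual compiler output: `test2_loop_omitted()` and `test2_check_removed()` are hypothetical names chosen for illustration, and each simply models one way `test2()` could end up behaving.</p>

```c
unsigned char arr[65537];

/* Hypothetical model A: the side-effect-free loop is deferred forever
   (that is, omitted), but the bounds check written in the source is kept. */
void test2_loop_omitted(unsigned x)
{
    if (x < 65536)
        arr[x] = 1;
}

/* Hypothetical model B: the loop is assumed to terminate, which implies
   x <= 65535, so the bounds check is treated as always true and dropped. */
void test2_check_removed(unsigned x)
{
    arr[x] = 1;
}
```

<p>The two models differ for `x == 65536`, an index that is still inside `arr`: model A leaves `arr[65536]` alone, while model B writes to it, even though sequential execution of the original source would never reach the store for that value.</p>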
</div>
2023-02-04T21:07:39Z
By Verisimilitude (http://verisimilitudes.net) on /blog/programming/CAsAbstractMachine
<div class="wikitext"><p>What some people refuse to understand is that true portability requires formal semantics, but low-level languages needn't contort themselves to achieve it. Ada is a language simultaneously lower-level and higher-level than the C language, because it avoids unnecessarily specifying irrelevant details and has many dedicated ways to specify those same details when relevant, whereas the C language specifies irrelevant details and relies on implicit corner cases to permit certain behaviour whenever wanted. This reliance on implicit corner cases also results in the traditional scattering of documentation everywhere.</p>
<p>All of this could've been solved with foresight, but then it wouldn't be the C language. The clever compilers are needed to work around the gross inefficiency of the language. People point at TCC, but next to no one uses it in any serious way.</p>
</div>
2023-02-02T21:37:24Z
By moshev on /blog/programming/CAsAbstractMachine
<div class="wikitext"><p>I have long thought that C compilers ought to have a standardised "system semantics" mode where "undefined behaviour" means "whatever the system (CPU and OS if any) does". That already exists de facto as various flags for GCC, Clang, MSVC and ICC: in general, any compiler that aggressively optimises based on undefined behaviour has flags to turn those optimisations off. C is currently used mainly for two goals, low-level system programming and high-performance computing. It would be a boon to the former to standardise a mode where the language behaves like the system you're compiling for.</p>
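<p>A small sketch of what one such flag changes, assuming GCC or Clang and their `-fwrapv` option (this particular flag and the function name `increment_wrapped()` are illustrative choices, not something named above):</p>

```c
/* Returns nonzero if x + 1 wrapped around to a smaller value.
 *
 * In standard C, x + 1 is undefined behaviour when x == INT_MAX, so an
 * optimiser is allowed to fold the whole comparison to 0.
 * When built with -fwrapv (GCC/Clang), signed overflow is defined to
 * wrap in two's complement, so the INT_MAX case returns 1 instead.
 */
int increment_wrapped(int x)
{
    return x + 1 < x;
}
```

<p>For every argument other than `INT_MAX` the result is 0 with or without the flag; only the overflowing case depends on how the compiler is told to treat undefined behaviour.</p>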
</div>
2023-02-02T14:21:48Z