== What performance anomalies mean

I'm currently engaged in a slow-moving effort to improve the performance of Python's _os.walk()_ filesystem walking function (I have a [[long-standing interest ../python/SlowOsWalk]] in it and recently [[wrote about ../unix/DirectoryLinkCounts]] optimizing filesystem walking on Unix). As part of the work I've been benchmarking several variants of my code, and I stumbled over a performance anomaly where the normal _os.walk()_ is often faster than what should be a slightly optimized version of it.

It's awfully easy to dismiss minor performance anomalies, especially if they happen in the early stages of optimizing something (in what you already expect to be only a minor improvement at best). But to do so is to miss what they mean:

> ~~Performance anomalies are a sign that you don't understand something
> important about what's really going on in your system~~.

You might ask why this matters, especially if your full optimizations are still faster than the original code. We don't optimize code by making random changes and seeing if they speed the code up. Instead, we have a mental model of what is making the code slow and thus what can be done to make it faster. A performance anomaly means that some part of this mental model is wrong; either we don't actually understand why the existing code is slow, or we don't understand something about the runtime environment that makes our new code slower than we think it ought to be.

(I am going to assume that you have a clear explanation for why your new code ought to be faster, such as 'it does only one _stat()_ instead of two'.)

Understanding performance anomalies is especially important in modern high level languages because those languages do so much under the surface, hidden behind their high levels of abstraction.
But you can't do deep optimizations without understanding what's happening down in the depths of your runtime environment; performance is one of the things that always leaks through abstractions. So a mysterious performance anomaly is a sign that you don't understand what's behind the abstractions as well as you thought you did.

(PS: this doesn't mean that I understand why my little _os.walk()_ optimization turned out to be often slower. I haven't had the time to look into it, especially because I expect it to be difficult to track down.)
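
As a rough illustration of the kind of benchmark where such an anomaly can surface, here is a minimal sketch that times the standard _os.walk()_ against a variant on the same directory tree. It is not the variant discussed in this entry: `scandir_files`, `build_tree`, and `compare` are hypothetical names invented for the example, and the throwaway tree is an assumption chosen only so the script runs anywhere.

```python
import os
import tempfile
import timeit


def walk_files(top):
    """Baseline: collect file paths with the standard os.walk()."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return found


def scandir_files(top):
    """Hypothetical variant: an explicit stack over os.scandir().

    Like os.walk() with its defaults, it does not descend into
    symlinked directories and does not report them as files.
    """
    found = []
    stack = [top]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir():
                    if not entry.is_symlink():
                        stack.append(entry.path)
                else:
                    found.append(entry.path)
    return found


def build_tree(root, depth=3, fanout=3, files_per_dir=5):
    """Create a small throwaway directory tree to walk (illustrative only)."""
    for i in range(files_per_dir):
        with open(os.path.join(root, f"file{i}.txt"), "w"):
            pass
    if depth > 0:
        for i in range(fanout):
            sub = os.path.join(root, f"dir{i}")
            os.mkdir(sub)
            build_tree(sub, depth - 1, fanout, files_per_dir)


def compare(top, repeat=5, number=3):
    """Return the best (minimum) total time per function over several runs."""
    results = {}
    for func in (walk_files, scandir_files):
        timings = timeit.repeat(lambda: func(top), repeat=repeat, number=number)
        results[func.__name__] = min(timings)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        build_tree(top)
        # Both walkers must agree before their timings mean anything.
        assert sorted(walk_files(top)) == sorted(scandir_files(top))
        for name, best in compare(top).items():
            print(f"{name}: {best:.4f}s")
```

Taking the minimum over repeated runs is one common way to reduce noise from the rest of the system. If the supposedly faster variant still loses consistently under those conditions, that is the sort of anomaly worth explaining rather than dismissing.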