How dynamic language code gets optimized

April 28, 2014

Yesterday I mentioned in passing that code that used Python dictionaries in predictable struct-like ways could probably be optimized just as similar uses of Python class instances is (in JIT environments like eg PyPy). This may raise some people's eyebrows; after all, aren't Python dictionaries too complex and flexible to get optimized?

It's my understanding that this is based on a misunderstanding of how modern dynamic languages are optimized. To simplify, the core of optimizing dynamic languages like Javascript, Python, and so on is the fact that most actual programs are not very dynamic. Instead in practice they're mostly static programs with mostly static types and so on. Optimizing a dynamic language program mostly consists of finding the static code that exists inside it and directly implementing that. In other words, dynamic languages are optimized by recognizing behavior. In theory it doesn't matter where this behavior happens or what language-level constructs it uses; all that matters is that you can reliably spot the behavior and optimize it.

(Sometimes this behavior is recognized dynamically, as the code executes, and sometimes it's recognized statically as the code is inspected. And of course there are schemes that do both.)

But there's still two reasons to care about what language level features and data structures get used for something, because all of this optimization is a heuristic. First, you care about how likely it is that your attempts to find behavior will work and this is related to how people actually use the language. If most people use classes as structures instead of dictionaries, then it's must more likely that your structure-recognition heuristics will succeed if you look at class usage. Time spent carefully tracking dictionary usage to spot structure usage patterns may generally be wasted time.

Second, even when you think you recognize a pattern of behavior you may turn out to be wrong; after behaving 'right' for a while the code may then turn around and do something that invalidates the pattern. This can waste some or all of your effort in optimizing for the pattern, so you also care about how often this happens with any particular sort of object. Again this will be related to how people use the language; if most people use dictionaries flexibly, you're much more likely to have false positives on temporary 'structure like' behavior with a particular dictionary and section of code that's invalidated later.

Lack of freedom in the language for how something is used certainly helps optimization, in part because it means that you don't even have to consider or worry about certain possibilities. Python's __slots__ is an example of this. But it's generally not an essential prerequisite.

Written on 28 April 2014.
« Thoughts about Python classes as structures and optimization
Static sites are stable sites »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Mon Apr 28 00:28:17 2014
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.