== Thoughts about Python classes as structures and optimization

I recently watched yet another video of a talk on getting good performance out of Python. One of the things it talked about was the standard issue of 'dictionary abuse', in this case in the context of creating structures. If you want a collection of data, the equivalent of a C _struct_, things that speed up Python will do much better if you say what you mean by representing it as a class:

.pn prewrap on

> class AStruct(object):
>     def __init__(self, a, b, c):
>         self.a = a
>         self.b = b
>         self.c = c

Even though Python is a dynamic language and _AStruct_ instances could in theory be rearranged in many ways, in practice they generally aren't, and when they aren't we know a lot of ways to speed them up and make them use minimal amounts of memory. If you instead just throw the data into a dictionary, much less optimization is (currently) done.

(I suspect that many of these dynamic language optimizations could be applied to dictionary usage as well; it's just that people are hoping to avoid it for various reasons.)

My problem with this is that even small bits of extra typing tempt me into unwise ways to reduce it. In [[this early example EmulatingStructsInPython]] I both skipped having an ((__init__)) function, directly assigning attributes on new instances instead, and wrote a generic function to do the assignment ([[this StructsWithDefaults]] has a better version). This is all well and good in ordinary CPython, but now I have to wonder how far one can go before the various optimizers and JIT engines will throw up their hands and give up on clever things.

(I suspect that the straightforward ((__init__)) version is easiest for optimizers to handle, partly because it's a common pattern that attributes aren't added to an instance after ((__init__)) finishes.)

It's tempting to ask for standard library support for simple structures in the form of something that makes them easy to declare.
You could do something like '_AStruct = structs.create('a', 'b', 'c')_' and then everything would work as expected (and optimizers would have a good hook to latch on to). Unfortunately such a function is hard to create today in Python, especially in a form that optimizers like PyPy are likely to recognize and accelerate. This is probably too petty and limited a wish.

PS: of course the simplest and easiest to optimize version today is a class that has only a ((__slots__)) and no ((__init__)). PyPy et al are guaranteed that no other attributes will ever be set on instances, so they can pack things as densely as they want.
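
As a rough illustration of the ((__slots__)) approach from the PS, here is a minimal sketch that reuses the _AStruct_ name and the _a_, _b_, _c_ fields from the earlier example. The _extra_allowed_ flag is purely illustrative. It shows only the language-level behaviour; how much any particular optimizer gains from this is not something the sketch demonstrates.

```python
class AStruct(object):
    # Only these attribute names may be set on instances. With no
    # per-instance __dict__, there is nowhere to store anything else.
    __slots__ = ('a', 'b', 'c')


# No __init__, so fields are assigned directly on a fresh instance.
s = AStruct()
s.a = 1
s.b = 2
s.c = 3

# Setting a name that is not listed in __slots__ is rejected.
try:
    s.d = 4
    extra_allowed = True
except AttributeError:
    extra_allowed = False
```

One consequence of skipping ((__init__)) is that a slot that was never assigned raises _AttributeError_ when read, so every field has to be set before it is used.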