Wandering Thoughts archives

2018-03-22

Why seeing what current attributes a Python object has is hard

Back when I wrote some notes on __slots__ and class hierarchies, I said in passing that there was no simple way to see what attributes an object currently has (I was sort of talking about objects that use __slots__, but it's actually more general). Today, for reasons beyond the scope of this entry, I feel like talking about why things work out this way.

To see where things get tricky, I'll start out by talking about where they're simple. If what we have is some basic struct object and we want to see what fields it has, the most straightforward approach is to look at its __dict__. We can get the same result indirectly by taking the dir() of the object and subtracting the dir() of its class:

>>> class A:
...   def __init__(self):
...      self.a = 10
...      self.b = 20
... 
>>> a = A()
>>> set(dir(a)) - set(dir(a.__class__))
{'b', 'a'}

(This falls out of the definition of dir(), but note that this only works on simple objects that don't do a variety of things.)

The first problem is that neither version of this approach works for instances of classes that use __slots__. Such objects have no __dict__, and if you look at dir() it will tell you that they have no attributes of their own:

>>> class B:
...   __slots__ = ('a', 'b')
...   def __init__(self):
...      self.a = 10
...
>>> b = B()
>>> set(dir(b)) - set(dir(b.__class__))
set()

This follows straightforwardly from how __slots__ are defined, particularly this bit:

  • __slots__ are implemented at the class level by creating descriptors (Implementing Descriptors) for each variable name. [...]

Descriptors are attributes on the class, not on instances of the class, although they create behavior in those instances. As we can see in dir(), the class itself has a and b attributes:

>>> B.a
<member 'a' of 'B' objects>

(In CPython, these are member_descriptor objects.)

For an instance of a __slots__ using class, we still have a somewhat workable definition of what attributes it has. For each __slots__ attribute, an instance has the attribute if hasattr() is true for it, which means that you can access it. Here our b instance of B has an a attribute but doesn't have a b attribute. You can at least write code that mechanically checks this, although it's a bit harder than it looks.

(One part is that you need the union of __slots__ on all base classes.)

However, we've now arrived at the tricky bit. Suppose that we have a general property on a class under the name par. When should we say that instances of this class have a par attribute? In one sense, instances never will, because at the mechanical level par will always be a class attribute and will never appear in an instance __dict__. In another sense, we could reasonably say that instances have a par attribute when hasattr() is true for it, ie when accessing inst.par won't raise AttributeError; this is the same definition as we used for __slots__ attributes. Or we might want to be more general and say that an attribute only 'exists' for our purposes when accessing it doesn't raise any errors, not just AttributeError (after all, this is when we can use the attribute). But what if this property actually computes the value for par on the fly from somewhere, in effect turning an attribute into a method; do we say that par is still an attribute of the instance, even though it doesn't really act like an attribute any more?

Python has a lot of ways to attach sophisticated behavior to instances of classes that's triggered when you try to access an attribute in some way. Once we have such sophisticated behavior in action, there's no clear or universal definition of when an instance 'has' an attribute and it becomes a matter of interpretation and opinion. This is one deep cause of why there's no simple way to see what attributes an object currently has; once we get past the simple cases, it's not even clear what the question means.

(Even if we come up with a meaning for ourselves, classes that define __getattr__ or __getattribute__ make it basically impossible to see what attribute names we want to check, as the dir() documentation gently notes. There are many complications here.)

Sidebar: The pragmatic answer

The pragmatic answer is that if it's sensible to ask this question about an object at all, we can get pretty much the answer we want by looking at the object's __dict__ (if it has one), then adding the merged __slots__ names for which hasattr() reports true.

That this answer blows up on things like proxy objects suggests that perhaps it's not a question we should be asking in the first place, at least not outside of limited and specialized situations.

(In other words, it's possible to get entirely too entranced with the theory of Python and neglect its practical applications. I'm guilty of this from time to time.)

python/KnowingObjectAttrsHard written at 00:14:03; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.