Attribute Access and Descriptors¶
We’ve covered the syntax for accessing attributes, and we’ve covered how to create new classes in Python. We talked about one special method, __init__. Now we’re going to cover several special methods that all have to do with attribute access.
The Four Things You Do With Attributes¶
It’s important to remember the four things you can do with attributes. These are the same as the things you can do with any namespace, as attributes form a namespace for objects.
Declare a new attribute (and give it an initial value.)
Access the attribute, retrieving its value.
Assign to the attribute, giving it a new value.
Delete the attribute, removing it from the namespace.
As with variables, you declare an attribute by assigning to it for the first time, so we will treat these two cases as just “assignment”.
When we go about messing with how Python handles attributes, we need to think about what it is we are trying to accomplish and why. Here are some common scenarios I encounter.
Default Attributes¶
Sometimes I want certain default attributes to exist whether or not I assign to them in __init__. While I can use class attributes to handle this, I can also customize the attribute access pattern.
Calculated Attributes¶
Some attributes may be calculated from the object. For instance, in the case of complex numbers, the magnitude of the number is calculated as the square root of the sum of the squares of the real and imaginary parts. I don’t want to calculate this for every complex number, but I do want it to appear as an attribute. I can modify how the attribute is accessed such that, instead of looking in the __dict__, I can calculate it on the fly.
Note that I don’t recommend this sort of behavior. It is surprising to see that an attribute access has caused a function to be called. It’s weird to get exceptions from them.
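As a rough sketch of the calculated-attribute idea, here is one way the magnitude example could look using property (covered later in these notes). The Complex class and its attribute names are hypothetical and exist only for this illustration.

```python
import math


class Complex:
    """Hypothetical complex-number class, for illustration only."""

    def __init__(self, real, imag):
        self.real = real
        self.imag = imag

    @property
    def magnitude(self):
        # Computed on every access; nothing is stored in the instance __dict__.
        return math.hypot(self.real, self.imag)


z = Complex(3, 4)
print(z.magnitude)               # 5.0
print('magnitude' in vars(z))    # False
```

The surprise the note above warns about is visible here: z.magnitude looks like plain data, but it runs a function each time.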
Cached Attribute Values¶
Sometimes we want to cache values after we have accessed them, and we can store them as an attribute. I’ll cover how to do this in different ways, and give my recommendations.
Indefinite Attributes¶
Sometimes you don’t know what attributes your object will have, or there is virtually an unlimited number of possibilities. In this case, you simply can’t assign all the possible attributes, so you need to calculate them instead.
Remembering Attribute Manipulation¶
Another special case arises in things like SQLAlchemy. SQLAlchemy allows you to create objects that represent rows in database tables. Attributes are the columns of that row. Assigning to an object’s attribute signals your intention to update a column in that row. However, SQLAlchemy is written such that you can accumulate all the changes you’d like to make in a transaction before flushing them out to the database.
Limiting Attribute Values¶
We might want to limit what values we can assign to certain attributes. Usually we’re interested in only values of a specific type, but sometimes we might want to enforce limits on the values. For instance, it makes no sense to have circles with negative radii, so we might want to limit the values of radius to numbers greater than or equal to 0.
__getattribute__(self, name)¶
The “granddaddy” of all the attribute access special methods is __getattribute__. The object class, the base class of all base classes for all time ever, defines its own __getattribute__ that implements all of the magical powers I describe in this video.
I have never written this method for any class that I have ever defined in all my years of Python programming. So I’ll tell you what it does and the one or two cases where you might be tempted to use it, and why you should use something else instead.
This method is called for every attribute access. If the attribute is in the namespace, or a descriptor, or handled by the attribute access methods, this is still called. (Note, by every I really mean everything you can think of right now. There are special attributes that will not be accessed through this method, for everyone’s sanity.)
If you were to write this method, you would either need to reimplement the instance lookup, the descriptors, and everything else yourself, or delegate to object.__getattribute__(self, name) for the cases you don’t want to handle. Alternatively, you can raise an AttributeError. Python will then call __getattr__ if the class defines it, and otherwise the AttributeError propagates to the caller.
Why would you do this? Well, you might want to have very special rules for when to create a new attribute, what values to assign to attributes, how to calculate an attribute on the fly, or what to do when you delete an attribute. However, each of these cases is handled by the methods below exactly the way you imagine it should be.
The only case where I think this might be useful is if you want some sort of super-descriptor scenario, but even then, I’d encourage you to find a way to make descriptors work. When you include metaclasses in your arsenal, you’ll find that mucking with this is truly unnecessary.
__dir__(self)¶
This returns the list of attributes when the object is passed as an argument to dir(). I never override this, and I don’t think you should either.
__getattr__(self, name)¶
This returns a value given a name, but it is only called when the normal lookup fails, that is, when the name is not found in the instance’s __dict__ or on the class.
The neat thing about this is that you don’t even need the name passed in to be stored in the __dict__ of the object. You can just make things up as needed. Here’s an example:
class Attributor:
    def __getattr__(self, name):
        return name
a = Attributor()
a.foo # -> 'foo'
a.foo = 6
a.foo # -> 6, because 'foo' is in self.__dict__
It’s been a long time since I’ve found a good use for this. Descriptors just do this better.
The only case where this might be beneficial is in the indefinite number of attributes scenario.
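One way the indefinite-attributes scenario can look in practice is a proxy that forwards unknown names to another object. This is only a sketch under my own assumptions: LoggingProxy, _target, and calls are hypothetical names, not something from a library.

```python
class LoggingProxy:
    """Hypothetical proxy: forwards unknown attributes to a wrapped object."""

    def __init__(self, target):
        self._target = target
        self.calls = []

    def __getattr__(self, name):
        # Guard private names so a missing _target cannot recurse forever.
        if name.startswith('_'):
            raise AttributeError(name)
        # Only names that normal lookup could not find end up here.
        self.calls.append(name)
        return getattr(self._target, name)


p = LoggingProxy([3, 1, 2])
p.sort()            # forwarded to list.sort
n = p.count(1)      # forwarded to list.count
print(n)            # 1
print(p.calls)      # ['sort', 'count']
```

Note that p.calls and p._target never reach __getattr__, because they are found in the instance’s __dict__ first.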
__setattr__(self, name, value)¶
This special method will be called when you try to assign to any attribute. And by “any”, I really mean “any”.
If you want to proceed with the normal assignment behavior of attributes, you must call super().__setattr__(name, value) or just object.__setattr__(self, name, value). You may also assign directly to the __dict__ attribute of the object.
In this example, I have a Vector3 class that allows you to set the x, y, and z components by attribute assignment.
class Vector3:
    def __init__(self):
        self.vector = [0, 0, 0]
    def __setattr__(self, name, value):
        if name == 'x':
            self.vector[0] = value
        elif name == 'y':
            self.vector[1] = value
        elif name == 'z':
            self.vector[2] = value
        else:
            object.__setattr__(self, name, value)
Like __getattr__, it’s been a long time since I’ve seen a use for this. Descriptors do it better.
The only case where this might be beneficial is in the indefinite number of attributes scenario.
Remember that every attribute assignment will go through this function, so plan accordingly!
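The Vector3 example only handles assignment. As a hedged variation, the sketch below pairs __setattr__ with __getattr__ so the components can also be read back. The _index mapping is my own addition for illustration.

```python
class Vector3:
    # Hypothetical mapping from component name to list position.
    _index = {'x': 0, 'y': 1, 'z': 2}

    def __init__(self):
        # This assignment also goes through __setattr__ and takes the else branch.
        self.vector = [0, 0, 0]

    def __setattr__(self, name, value):
        if name in self._index:
            self.vector[self._index[name]] = value
        else:
            super().__setattr__(name, value)

    def __getattr__(self, name):
        # Only called when normal lookup fails, which is the case for x, y, z.
        idx = self._index.get(name)
        if idx is None:
            raise AttributeError(name)
        return self.vector[idx]


v = Vector3()
v.x = 5
print(v.vector)          # [5, 0, 0]
print(v.x)               # 5
print('x' in vars(v))    # False
```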
__delattr__(self, name)¶
This is parallel to __setattr__, except that it handles attribute deletion. I don’t think I’ve ever written this in my entire career.
It may be useful if deleting an attribute is meaningful. I must not have a good imagination, because I can’t think of any situation where it would be.
Attribute Dictionary¶
Given the three methods above, you might get the idea that you can write an Attribute Dictionary, a class that behaves like a dictionary but also allows you to access the values via the attribute syntax. This has been done multiple times.
I don’t encourage it, however, for a number of reasons, not the least of which is you can have keys in the dictionary that are not valid attribute names, and you might have collisions between actual attributes and dictionary elements.
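For the curious, here is a minimal sketch of such an attribute dictionary, along with the two problems just mentioned. AttrDict is a hypothetical name; this is an illustration of the idea, not a recommendation.

```python
class AttrDict(dict):
    """Hypothetical attribute dictionary built from the three methods above."""

    def __getattr__(self, name):
        # Reached only when normal lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None


d = AttrDict()
d.color = 'red'
print(d['color'])          # red

d['first name'] = 'Ada'    # a valid key, but not a valid attribute name

d.keys = 'oops'            # stored as the key 'keys'...
print(d['keys'])           # oops
print(callable(d.keys))    # True: the real dict.keys method still wins
```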
Descriptors¶
At this point, I get to introduce one of the truly unique and scary ideas in Python: Descriptors.
The confusing part about descriptors is understanding exactly when the special methods get called.
In short, if the descriptor is an attribute of an instance (stored on the instance itself, not on its class), then it is not treated as a descriptor, and the attribute access does nothing special.
But there is one important exception: classes which have attributes that are descriptors, when accessed directly through the class, have the descriptor special methods invoked. We call classes with attributes that are descriptors owner classes.
Let me try to simplify this all. Suppose we have the following:
A descriptor called desc that defines the special methods __get__, __set__, or __delete__.
A class called Owner that has an attribute called desc that is a descriptor.
An instance of class Owner called inst that does not have an attribute called desc assigned at the instance level. (Meaning, you never called inst.__dict__['desc'] = .... If you tried inst.desc = ..., that would invoke the descriptor’s __set__, provided it defines one.)
These are the only four cases where the descriptor methods get called:
If you call the method directly: desc.__get__(...), desc.__set__(...) or desc.__delete__(...).
When you access, assign, or delete desc through inst: inst.desc, inst.desc = x, del inst.desc.
When you access desc through the class Owner: Owner.desc. Deleting or assigning desc through the class Owner does not invoke the special methods.
Something to do with super that we’ll cover in inheritance and really not that important because it hardly ever becomes an issue.
If you’re thoroughly confused, don’t feel bad. This is confusing. The way I remember it is as follows:
Descriptors that are just variables are not special.
Instances with descriptor attributes defined at that level are not special. It is just like descriptors that are variables.
Classes with descriptor attributes are special, both for the class and instances of that class.
Example¶
Here’s some code that will help clarify things.
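What follows is a minimal sketch of my own, using the Desc, Owner, and inst names from the setup above. The log entry written into the instance’s __dict__ is a made-up device so we can see which method ran.

```python
class Desc:
    def __get__(self, instance, owner=None):
        return ('get', instance, owner)

    def __set__(self, instance, value):
        instance.__dict__['log'] = ('set', value)

    def __delete__(self, instance):
        instance.__dict__['log'] = ('delete',)


class Owner:
    desc = Desc()


inst = Owner()

kind, instance, owner = inst.desc                # access through the instance
print(kind, instance is inst, owner is Owner)    # get True True

kind, instance, owner = Owner.desc               # access through the class
print(kind, instance is None, owner is Owner)    # get True True

inst.desc = 5                                    # calls Desc.__set__
print(inst.log)                                  # ('set', 5)
del inst.desc                                    # calls Desc.__delete__
print(inst.log)                                  # ('delete',)

# A descriptor stored on an instance is just an ordinary value.
other = Owner()
plain = Desc()
other.__dict__['plain'] = plain
print(other.plain is plain)                      # True

# Assigning through the class replaces the descriptor; __set__ is not called.
Owner.desc = 42
print(inst.desc)                                 # 42
```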
Data vs. Non-data Descriptors¶
You may hear people talk about “data” or “non-data” descriptors. This is rather easy to explain:
Data descriptors have either or both of __set__ and __delete__ defined.
Non-data descriptors don’t have either defined.
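Note that you can have a descriptor that doesn’t have __get__ defined, but I have never seen a use for this. The net effect of such a descriptor is that it only intercepts assignment or deletion. If you tried to access it through an instance, Python would fall back to the normal lookup: you get the value in the instance’s __dict__ if there is one, and otherwise the descriptor object itself.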
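The practical reason the distinction matters is lookup precedence: a data descriptor on the class wins over an entry in the instance’s __dict__, while a non-data descriptor loses to it. The sketch below demonstrates this. The class names are hypothetical, and the instance __dict__ is written to directly only to set up the collision.

```python
class NonData:
    def __get__(self, instance, owner=None):
        return 'from non-data descriptor'


class Data:
    def __get__(self, instance, owner=None):
        return 'from data descriptor'

    def __set__(self, instance, value):
        raise AttributeError('read-only')


class Owner:
    nd = NonData()
    d = Data()


inst = Owner()
# Write straight into the instance namespace, bypassing __set__.
inst.__dict__['nd'] = 'from instance'
inst.__dict__['d'] = 'from instance'

print(inst.nd)   # from instance
print(inst.d)    # from data descriptor
```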
__get__(self, instance, owner=None)¶
This special method is called whenever the descriptor is accessed using one of the four methods above. Note that instance may be None, depending on how the descriptor was reached.
If the descriptor was accessed directly through the class, i.e., desc_class.desc, then instance is None and owner is desc_class.
If the descriptor was accessed through an instance of the class, i.e., instance.desc, then instance is that instance and owner is desc_class.
You should return the value that should be the value of the attribute for this descriptor. Note that you are not given the name that was used to find the attribute, so unless you stored it previously, you won’t have that available. (We’ll talk about __set_name__ later.)
This is actually a pretty big problem to solve and it causes a bit of a headache. See, each instance of a class with a descriptor for an attribute is using the same descriptor for attribute access. This means you need to know in this code which attributes to look at in the instance to calculate the value of this attribute access lookup.
However, with a bit of imagination, you can come up with good solutions. And Python 3.6 gave us __set_name__, which will help.
This function should either return a value (remember that no return statement means return None) or raise AttributeError if the attribute shouldn’t exist.
Typically, especially in the case of caching attribute values, you’ll want to store the value you calculated in the instance of the class so that you don’t have to recalculate it. This means that I typically see the following pattern for this method:
class MyDescriptor:
    def __get__(self, instance, owner=None):
        if instance is None: # We're being accessed through the class
            return self
        # We're being accessed through an instance
        cache = instance.__dict__.setdefault('_cached', {})
        try:
            return cache[self.name]
        except KeyError:
            pass
        value = ...
        cache[self.name] = value
        return value
In order to use the descriptor, we need to do something like this:
class MyClass:
    foo = MyDescriptor()
    foo.name = 'foo'
Using the descriptor looks like this:
MyClass.foo # -> __get__(self, None, MyClass)
a = MyClass()
a.foo # -> __get__(self, a, MyClass)
When should we use descriptors? Pretty much anytime we want to override the default behavior of attribute access. There may already be a descriptor that does what you want (we’ll look at property, classmethod and staticmethod in this video), so I’d typically use one of those, especially property. Rarely do I ever write an entirely new descriptor.
__set__(self, instance, value)¶
This is called when you assign to a descriptor attribute through an instance, or when you call the method directly. As I mentioned earlier, assigning through the class does not invoke it.
Again, we run into the same problem we have with __get__ and names of attributes. Unless you’ve recorded the name of the attribute, you won’t know what name the attribute was accessed through.
Typically, we use __set__ to modify the value before storing it, especially if we want to make sure that the value is of an acceptable type or value. However, we might also want to store the value in something other than the attribute’s namespace under the same name.
Here’s a typical pattern I might see for a __set__ method:
class Desc:
    def __set__(self, instance, value):
        instance.__dict__['_'+self.name] = int(value)
And the ways it might get invoked:
class Owner:
    foo = Desc()
    foo.name = 'foo'
instance = Owner()
instance.foo = 7.0 # instance._foo <- 7
instance.foo = "5" # instance._foo <- 5
# Owner.foo = "5" would not call __set__; it would replace the descriptor.
As with __get__, I don’t tend to write my own descriptors. property actually does everything I need.
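To tie __set__ back to the “limiting attribute values” scenario, here is a small sketch of a descriptor that rejects negative numbers. NonNegative and Circle are hypothetical names, and the attribute name is passed in by hand, as in the examples above.

```python
class NonNegative:
    """Hypothetical data descriptor that rejects negative values."""

    def __init__(self, name):
        self.name = name  # passed explicitly; see __set_name__ for an alternative

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        try:
            return instance.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name) from None

    def __set__(self, instance, value):
        if value < 0:
            raise ValueError(f'{self.name} must be >= 0, got {value!r}')
        # Safe to reuse the same name: a data descriptor takes precedence
        # over the instance __dict__ on lookup.
        instance.__dict__[self.name] = value


class Circle:
    radius = NonNegative('radius')

    def __init__(self, radius):
        self.radius = radius  # goes through NonNegative.__set__


c = Circle(2)
print(c.radius)   # 2
try:
    c.radius = -1
except ValueError as exc:
    print('rejected:', exc)
print(c.radius)   # 2
```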
__delete__(self, instance)¶
This method, not to be confused with __del__, which is invoked when the object is garbage collected, is invoked when a descriptor is deleted under one of the four special ways mentioned above.
I don’t have a whole lot to say about this, other than you probably want to use property instead.
__set_name__(self, owner, name)¶
This isn’t really a descriptor method, but it goes along closely with them. When Python creates the class (in type(name, bases, dict)), it searches the dict, the namespace of the class, looking for values with this method. If it finds one, it calls value.__set_name__(new_class, name), where name is the name of the value in that namespace.
It’s quite convenient, especially when you think about how hard it is to get the name of the attribute.
If we didn’t have this, we would have to set the name explicitly, as I did in the examples above. We could also pass it as a parameter when creating the descriptor, or use some other workaround. __set_name__ greatly simplifies that process.
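Here is a hedged sketch of the caching pattern from earlier, this time letting __set_name__ record the attribute name. Cached and Report are hypothetical names. The standard library’s functools.cached_property (Python 3.8+) works along similar lines, though this is not its actual implementation.

```python
class Cached:
    """Hypothetical caching descriptor; a sketch, not a library API."""

    def __init__(self, func):
        self.func = func

    def __set_name__(self, owner, name):
        # Called once, when the owner class is created.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # This is a non-data descriptor, so the instance __dict__ entry
        # shadows it on every later access.
        instance.__dict__[self.name] = value
        return value


class Report:
    calls = 0

    @Cached
    def total(self):
        Report.calls += 1
        return sum(range(10))


r = Report()
print(r.total, r.total)    # 45 45
print(Report.calls)        # 1
print(Report.total.name)   # total
```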
Built-In Descriptors¶
There are three built-in descriptors that I will mention here. All three are decorators, and serve a very special purpose.
@staticmethod¶
Sometimes you want a method on a class that doesn’t rely at all on the class attributes or instance attributes. In this case, staticmethod provides a convenient decorator:
class MyClass:
    @staticmethod
    def sum(a, b):
        return a+b
MyClass.sum(1,2) # 3
a = MyClass()
a.sum(1,5) # 6
@classmethod¶
Parallel to @staticmethod is this decorator, which ensures that the first parameter is always the class, even when it is invoked from an instance of the class.
class MyClass:
    @classmethod
    def what_class_am_i(cls):
        return cls
MyClass.what_class_am_i() # MyClass
a = MyClass()
a.what_class_am_i() # MyClass
This is convenient because without it, a method invoked through the class gets nothing passed in automatically, so you would have to pass something in as the first parameter yourself.
I use this especially when I am creating a singleton-style class. Why not just use the class as the singleton? You’ll see things like this a lot with libraries like Flask and CherryPy.
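One detail worth seeing in code is that cls is the class the method was looked up on, which matters with subclasses. The Base and Child classes below are hypothetical and only illustrate that behavior.

```python
class Base:
    @classmethod
    def create(cls):
        # cls is whichever class (or subclass) the call went through.
        return cls()


class Child(Base):
    pass


print(type(Base.create()).__name__)      # Base
print(type(Child.create()).__name__)     # Child
print(type(Child().create()).__name__)   # Child
```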
property¶
This descriptor is arguably one of the most useful descriptors ever invented. Indeed, it can be said that descriptors were invented specifically to make property possible.
99.9% of the time, when you want to modify how attributes are accessed, assigned, or deleted, property has you covered.
Let’s look at how it might be used:
class MyClass:
    @property
    def a(self):
        return self.b*6
    @a.setter
    def a(self, value):
        self.b = value/6
    @a.deleter
    def a(self):
        del self.b
MyClass.a # the property object itself
i = MyClass()
i.a # AttributeError - no b!
i.a = 1
i.b # 0.16666...
i.a # 1.0
del i.a
i.b # AttributeError
i.a # AttributeError -- no b!
Note that assigning to or deleting MyClass.a will obliterate the descriptor, removing it completely.
Keep in mind that creating setters or deleters for the property is entirely optional.
It is really fun to figure out how property is implemented. Keep in mind the following:
After the setter and deleter decorators are applied, Python will assign the results to the a variable in the class namespace. How does property not overwrite itself?
How does the property know to do the right thing for MyClass.a = ... and del MyClass.a?
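If you would like to check your answers, here is one possible pure-Python sketch. It is a simplified stand-in, not the real built-in, and Property is a hypothetical name. The key idea is that setter and deleter return a new descriptor carrying all of the functions collected so far.

```python
class Property:
    """Simplified, hypothetical stand-in for the built-in property."""

    def __init__(self, fget=None, fset=None, fdel=None):
        self.fget, self.fset, self.fdel = fget, fset, fdel

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError('unreadable attribute')
        return self.fget(instance)

    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)

    def setter(self, fset):
        # Returns a NEW descriptor that keeps the existing getter and deleter,
        # so rebinding the name in the class body loses nothing.
        return type(self)(self.fget, fset, self.fdel)

    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel)


class MyClass:
    @Property
    def a(self):
        return self.b * 6

    @a.setter
    def a(self, value):
        self.b = value / 6


i = MyClass()
i.a = 12
print(i.b)   # 2.0
print(i.a)   # 12.0
```

As for the second question, the descriptor never gets a say: MyClass.a = ... and del MyClass.a act on the class namespace directly, which is why they remove the property.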
__slots__¶
Before I let you go, I want to mention __slots__. In the class suite, if you define a variable called __slots__, then Python won’t create a __dict__ for instances of the class. Instead, it reserves a spot for each of the attribute names listed in __slots__.
If you try to assign to an attribute that isn’t listed, an AttributeError will be raised.
__slots__ creates a special descriptor for each attribute. That means you can’t use class attributes with the same names as defaults: the names collide, and Python raises a ValueError when it creates the class.
There are some more caveats and warnings and I encourage you to read the Python documentation on it if you think this might be useful.
Speaking of which, when is this useful? If you’re creating tons and tons of instances and you are worried about the overhead of each object having its own __dict__, this is a way you can reduce the creation time and memory overhead of creating new instances of the class. That’s about all it is useful for.
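A short sketch of the behavior described above, using a hypothetical Point class:

```python
class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(hasattr(p, '__dict__'))    # False

try:
    p.z = 3                      # 'z' is not listed in __slots__
except AttributeError as exc:
    print('no slot for z:', exc)

# Each slot is backed by a descriptor stored on the class.
slot = Point.__dict__['x']
print(hasattr(slot, '__get__'), hasattr(slot, '__set__'))   # True True
```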