Attribute Access and Descriptors

We’ve covered the syntax for accessing attributes, and we’ve covered how to create new classes in Python. We talked about one special method – __init__. We’re going to cover several special methods that all have to do with attribute access.

The Four Things You Do With Attributes

It’s important to remember the 4 things you can do with attributes. These are the same as the things you can do with any namespace, as attributes form a namespace for objects.

  1. Declare a new attribute (and give it an initial value.)

  2. Access the attribute, retrieving its value.

  3. Assign to the attribute, giving it a new value.

  4. Delete the attribute, removing it from the namespace.

As with variables, you declare an attribute by assigning to it for the first time, so we will treat these two cases as just “assignment”.

When we go about messing with how Python handles attributes, we need to think about what it is we are trying to accomplish and why. Here are some common scenarios I encounter.

Default Attributes

Sometimes I want certain default attributes to exist whether or not I assign to them in __init__. While I can use class attributes to handle this, I can also customize the attribute access pattern.

Calculated Attributes

Some attributes may be calculated from the object. For instance, in the case of complex numbers, the magnitude of the number is calculated as the square root of the sum of the squares of the real and imaginary parts. I don’t want to calculate this for every complex number, but I do want it to appear as an attribute. I can modify how the attribute is accessed such that instead of looking in the __dict__ I can calculate it on the fly.

Note that I don’t recommend this sort of behavior. It is surprising to see that an attribute access has caused a function to be called. It’s weird to get exceptions from them.
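To make the idea concrete, here is a minimal sketch of a calculated attribute. It uses property to do the calculation, and the Complex class is hypothetical, made up purely for this illustration.

```python
import math


class Complex:
    """Hypothetical class, for illustration only."""

    def __init__(self, real, imag):
        self.real = real
        self.imag = imag

    @property
    def magnitude(self):
        # Calculated on every access; never stored in the instance __dict__.
        return math.hypot(self.real, self.imag)


c = Complex(3, 4)
print(c.magnitude)                 # 5.0
print('magnitude' in c.__dict__)   # False
```

Notice that c.magnitude looks like plain attribute access even though a function runs behind it, which is exactly the surprise I warned about.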

Cached Attribute Values

Sometimes we want to cache values after we have accessed them, and we can store them as an attribute. I’ll cover how to do this in different ways, and give my recommendations.
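As one possible sketch of caching, the standard library offers functools.cached_property (Python 3.8 and later). The Report class and its calls counter are assumptions that exist only to show that the calculation runs once.

```python
from functools import cached_property


class Report:
    """Hypothetical class; `calls` only exists to show the caching."""

    def __init__(self, numbers):
        self.numbers = numbers
        self.calls = 0

    @cached_property
    def total(self):
        self.calls += 1  # count how often the calculation really runs
        return sum(self.numbers)


r = Report([1, 2, 3])
first = r.total    # calculated, then stored in r.__dict__['total']
second = r.total   # served from r.__dict__, no second calculation
print(first, second, r.calls)  # 6 6 1
```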

Indefinite Attributes

Sometimes you don’t know what attributes your object will have, or there is virtually an unlimited number of possibilities. In this case, you simply can’t assign all the possible attributes, so you need to calculate them instead.

Remembering Attribute Manipulation

Another special case arises in things like SQLAlchemy. SQLAlchemy allows you to create objects that represent rows in database tables. Attributes are the columns of that row. Assigning to an object’s attribute signals your intention to update a column in that row. However, SQLAlchemy is written such that you can accumulate all the changes you’d like to make in a transaction before flushing them out to the database.
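The general idea can be sketched in a few lines. This is not how SQLAlchemy is implemented; Row, _dirty, and flush are hypothetical names for a toy object that remembers which attributes were assigned since the last flush.

```python
class Row:
    """Toy record that remembers which attributes were assigned."""

    def __init__(self, **columns):
        # Bypass our own __setattr__ while setting up the bookkeeping,
        # so the initial values are not counted as changes.
        object.__setattr__(self, '_dirty', set())
        for name, value in columns.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        self._dirty.add(name)                  # remember the change
        object.__setattr__(self, name, value)  # then assign normally

    def flush(self):
        # Return the accumulated changes and forget them.
        changes = {name: getattr(self, name) for name in sorted(self._dirty)}
        self._dirty.clear()
        return changes


row = Row(id=1, name='old')
row.name = 'new'
print(row.flush())  # {'name': 'new'}
print(row.flush())  # {}
```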

Limiting Attribute Values

We might want to limit what values we can assign to certain attributes. Usually we’re interested in only values of a specific type but sometimes we might want to enforce limits to the values. For instance, it makes no sense to have circles with negative radii, so we might want to limit the values of radius to numbers greater than or equal to 0.
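A minimal sketch of the radius case, using property. The Circle class and the choice of ValueError are my assumptions for the example.

```python
class Circle:
    """Hypothetical class that refuses negative radii."""

    def __init__(self, radius):
        self.radius = radius  # goes through the setter below

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("radius must be greater than or equal to 0")
        self._radius = value


c = Circle(2)
c.radius = 3      # fine
try:
    c.radius = -1
except ValueError as exc:
    print("rejected:", exc)
print(c.radius)   # 3
```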

__getattribute__(self, name)

The “granddaddy” of all the attribute access special methods is __getattribute__. The object base class of all base classes for all time ever defines its own __getattribute__ that implements all of the magical powers I describe in this video.

I have never written this method for any class that I have ever defined in all my years of Python programming. So I’ll tell you what it does and the one or two cases where you might be tempted to use it, and why you should use something else instead.

This method is called for every attribute access. If the attribute is in the namespace, or a descriptor, or handled by the attribute access methods, this is still called. (Note, by every I really mean everything you can think of right now. There are special attributes that will not be accessed through this method, for everyone’s sanity.)

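If you were to write this method, you would be responsible for everything the default does: checking the namespace, invoking descriptors, and so on. Either you write that code yourself, or you delegate to the default with super().__getattribute__(name) or object.__getattribute__(self, name). Alternatively, you can raise an AttributeError, in which case Python falls back to __getattr__ if the class defines one; otherwise the error propagates to the caller.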

Why would you do this? Well, you might want to have very special rules for when to create a new attribute, what values to assign to attributes, or how to calculate an attribute on the fly, or what to do when you delete an attribute. However, each of these cases is handled by the methods below exactly the way you imagine it should be.

The only case where I think this might be useful is if you want some sort of super-descriptor scenario, but even then, I’d encourage you to find a way to make descriptors work. When you include metaclasses in your arsenal, you’ll find that mucking with this is truly unnecessary.

__dir__(self)

This returns the list of attributes when the object is passed as an argument to dir(). I never override this, and I don’t think you should either.

__getattr__(self, name)

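This returns a value given a name, but it is only called when the normal lookup fails: the name is not in the instance’s __dict__, and it is not found on the class or its base classes either.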

The neat thing about this is that you don’t even need the name passed in to be stored in the __dict__ of the object. You can just make things up as needed. Here’s an example:

class Attributor:
    def __getattr__(self, name):
        return name

a = Attributor()
a.foo # -> 'foo'

a.foo = 6
a.foo # -> 6, because 'foo' is in self.__dict__

It’s been a long time since I’ve found a good use for this. Descriptors just do this better.

The only case where this might be beneficial is in the indefinite number of attributes scenario.

__setattr__(self, name, value)

This special method will be called when you try to assign to any attribute. And by “any”, I really mean “any”.

If you want to proceed with the normal assignment behavior of attributes, you must call super or just object.__setattr__(self, name, value). You may also assign directly to the __dict__ attribute of the object.

In this example, I have a Vector3 class that allows you to set the x, y, and z components by attribute assignment.

class Vector3:
    def __init__(self):
        # This assignment also goes through __setattr__ (the else branch).
        self.vector = [0, 0, 0]

    def __setattr__(self, name, value):
        if name == 'x':
            self.vector[0] = value
        elif name == 'y':
            self.vector[1] = value
        elif name == 'z':
            self.vector[2] = value
        else:
            # Anything else gets the normal assignment behavior.
            object.__setattr__(self, name, value)

Like __getattr__, it’s been a long time since I’ve seen a use for this. Descriptors do it better.

The only case where this might be beneficial is in the indefinite number of attributes scenario.

Remember that every attribute assignment will go through this function, so plan accordingly!

__delattr__(self, name)

This is parallel to __setattr__ except for attribute deletion. I don’t think I’ve ever written this in my entire career.

It may be useful if deleting an attribute is meaningful. I must not have a good imagination because I can’t think of a situation where it would be.

Attribute Dictionary

Given the three methods above, you might get the idea that you can write an Attribute Dictionary, a class that behaves like a dictionary but also allows you to access the values via the attribute syntax. This has been done multiple times.

I don’t encourage it, however, for a number of reasons, not the least of which is you can have keys in the dictionary that are not valid attribute names, and you might have collisions between actual attributes and dictionary elements.
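Both problems show up quickly in a minimal sketch. AttrDict here is a hypothetical, deliberately naive version, not any particular published implementation.

```python
class AttrDict(dict):
    """Naive attribute dictionary, for illustration only."""

    def __getattr__(self, name):
        # Only called when the normal lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


d = AttrDict(color='red')
print(d.color)            # red

d['first name'] = 'Ada'   # not a valid attribute name: d.first name is a syntax error
d['keys'] = 'oops'        # collides with the real dict.keys method
print(callable(d.keys))   # True, the method wins over the dictionary element
print(d['keys'])          # oops
```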

Descriptors

At this point, I get to introduce one of the truly unique and scary ideas in Python: Descriptors.

The confusing part about descriptors is understanding exactly when the special methods get called.

In short, if the descriptor is an attribute of the instance itself rather than of the class, then it is not treated as a descriptor, and the attribute access does nothing special.

But there is one important exception: classes which have attributes that are descriptors, when accessed directly through the class, have the descriptor special methods invoked. We call classes with attributes that are descriptors owner classes.

Let me try to simplify this all. Suppose we have the following:

  • A descriptor called desc that defines the special methods __get__, __set__, or __delete__.

  • A class called Owner that has an attribute called desc that is a descriptor.

  • An instance of class Owner called inst but does not have an attribute called desc assigned at the instance level. (Meaning, you never called inst.__dict__['desc'] = .... If you tried inst.desc = ... that would invoke the special methods.)

These are the only four cases where the descriptor methods get called:

  • If you call the method directly: desc.__get__(...), desc.__set__(...) or desc.__delete__(...).

  • When you access, assign, or delete desc through inst: inst.desc, inst.desc = x, del inst.desc.

  • When you access desc through the class Owner: Owner.desc. Deleting or assigning desc through the class Owner does not invoke the special methods.

  • Something to do with super that we’ll cover in inheritance and really not that important because it hardly ever becomes an issue.

If you’re thoroughly confused, don’t feel bad. This is confusing. The way I remember it is as follows:

  • Descriptors that are just variables are not special.

  • Instances with descriptor attributes defined at that level are not special. It is just like descriptors that are variables.

  • Classes with descriptor attributes are special, both for the class and instances of that class.

Example

Here’s some code that will help clarify things.
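What follows is a minimal sketch of the cases listed above. Desc, Owner, and Plain are hypothetical classes, and the calls list exists only to record which special methods actually ran.

```python
calls = []  # hypothetical log of which descriptor methods were invoked


class Desc:
    def __get__(self, instance, owner=None):
        calls.append('__get__')
        return 42

    def __set__(self, instance, value):
        calls.append('__set__')

    def __delete__(self, instance):
        calls.append('__delete__')


class Owner:
    desc = Desc()  # descriptor as a class attribute: special


inst = Owner()
inst.desc        # invokes __get__
inst.desc = 1    # invokes __set__
del inst.desc    # invokes __delete__
Owner.desc       # invokes __get__ as well
print(calls)     # ['__get__', '__set__', '__delete__', '__get__']


class Plain:
    pass


p = Plain()
p.__dict__['desc'] = Desc()      # descriptor at the instance level: not special
print(isinstance(p.desc, Desc))  # True, __get__ was not called

Owner.desc = 5                   # replaces the descriptor, __set__ is not called
print(len(calls))                # 4
```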

Data vs. Non-data Descriptors

You may hear people talk about “data” or “non-data” descriptors. This is rather easy to explain:

  • Data Descriptors have either or both __set__ and __delete__ defined.

  • Non-data descriptors don’t have either defined.

Note that you can have a descriptor that doesn’t have __get__ defined, but I have never seen a use for this. The net effect of such a descriptor is that only assignment and deletion are customized. If you tried to access it, you would get whatever is stored under that name in the instance’s __dict__, or the descriptor object itself if nothing is stored there.
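The data versus non-data distinction matters in practice because it decides who wins when the instance’s __dict__ has an entry with the same name: a data descriptor takes precedence over the instance, a non-data descriptor does not. A minimal sketch with hypothetical class names:

```python
class NonData:
    def __get__(self, instance, owner=None):
        return 'from descriptor'


class Data(NonData):
    # Defining __set__ is what makes this a data descriptor.
    def __set__(self, instance, value):
        raise AttributeError('read-only')


class Owner:
    non_data = NonData()
    data = Data()


inst = Owner()
# Write straight into the namespace, bypassing any descriptor.
inst.__dict__['non_data'] = 'from instance'
inst.__dict__['data'] = 'from instance'

print(inst.non_data)  # from instance
print(inst.data)      # from descriptor
```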

__get__(self, instance, owner=None)

This special method is called whenever the descriptor is accessed using one of the four methods above. Note that instance may be None.

  • If the descriptor was accessed directly through the class, i.e., desc_class.desc, then instance is None and owner is desc_class.

  • If the descriptor was accessed through an instance of the class, i.e., instance.desc, then instance is that instance and owner is desc_class.

You should return the value that should be the value of the attribute for this descriptor. Note that you are not given the name that was used to find the attribute, so unless you stored it previously, you won’t have that available. (We’ll talk about __set_name__ later.)

This is actually a pretty big problem to solve and it causes a bit of a headache. See, each instance of a class with a descriptor for an attribute is using the same descriptor for attribute access. This means you need to know in this code which attributes to look at in the instance to calculate the value of this attribute access lookup.

However, with a bit of imagination, you can come up with good solutions. And Python 3.6 gave us __set_name__, which will help.

This function should either return a value (remember that no return statement means return None) or raise AttributeError if the attribute shouldn’t exist.

Typically, especially in the case of caching attribute values, you’ll want to store the value you calculated in the instance of the class so that you don’t have to recalculate it again. This means that I typically see the following pattern for this method:

class MyDescriptor:
    def __get__(self, instance, owner=None):
        if instance is None: # We're being accessed through the class
            return self

        # We're being accessed through an instance.
        # This assumes the instance has a dict called _cached.
        try:
            return instance._cached[self.name]
        except KeyError:
            pass

        value = ... # calculate the value here
        instance._cached[self.name] = value
        return value

In order to use the descriptor, we need to do something like this:

class MyClass:

    foo = MyDescriptor()
    foo.name = 'foo'

Using the descriptor looks like this:

MyClass.foo # -> __get__(self, None, MyClass)
a = MyClass()
a.foo # -> __get__(self, a, MyClass)

When should we use descriptors? Pretty much anytime we want to override the default behavior of attribute access. There may already be a descriptor that does what you want (we’ll look at property, classmethod and staticmethod in this video), so I’d typically use one of those, especially property. Rarely do I ever write an entirely new descriptor.

__set__(self, instance, value)

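This is called when you try to assign to a descriptor through an instance, or when you call the method directly, as I mentioned earlier. (Remember that assigning through the class replaces the descriptor instead.)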

Again, we run into the same problem we have with __get__ and names of attributes. Unless you’ve recorded the name of the attribute, you won’t know what name the attribute was accessed through.

Typically, we use __set__ to modify the value before storing it, especially if we want to make sure that the value is of an acceptable type or value. However, we might also want to store the value in something other than the attribute’s namespace under the same name.

Here’s a typical pattern I might see for a __set__ method:

class Desc:
    def __set__(self, instance, value):
        instance.__dict__['_'+self.name] = int(value)

And the ways it might get invoked:

class Owner:

    foo = Desc()
    foo.name = 'foo'

instance = Owner()
instance.foo = 7.0 # instance._foo <- 7
instance.foo = "5" # instance._foo <- 5
# Owner.foo = "5" would NOT call __set__; it would replace the descriptor.

As for __get__, I don’t tend to write my own descriptors. property actually does everything I need.

__delete__(self, instance)

This method, not to be confused with __del__, which is invoked when the object is garbage collected, is invoked when a descriptor is deleted under one of the four special ways mentioned above.

I don’t have a whole lot to say about this, other than you probably want to use property instead.

__set_name__(self, owner, name)

This isn’t really a descriptor method, but it goes along closely with them. When Python creates the class (in type(name, bases, dict)), it searches the dict, the namespace of the class, looking for values that have this method. For each one it finds, it calls value.__set_name__(new_class, name), where name is the name of the value in that namespace. It’s quite convenient, especially when you think about how hard it is to get the name of the attribute.

If we didn’t have this, we would have to explicitly set the name, as I did in the examples above. We could use descriptors, or parameters to new instances of objects or whatnot. This greatly simplifies that process.
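As a minimal sketch, here is a descriptor that records its own name through __set_name__ instead of having it set by hand. IntField and Owner are hypothetical names, and storing the value under an underscore-prefixed key is just one possible convention.

```python
class IntField:
    """Hypothetical descriptor that stores its value as an int."""

    def __set_name__(self, owner, name):
        # Called once, when the owner class is created.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:  # accessed through the class
            return self
        return instance.__dict__['_' + self.name]

    def __set__(self, instance, value):
        instance.__dict__['_' + self.name] = int(value)


class Owner:
    foo = IntField()  # no need for foo.name = 'foo'


print(Owner.foo.name)   # foo
inst = Owner()
inst.foo = "5"
print(inst.foo)         # 5
print(inst.__dict__)    # {'_foo': 5}
```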

Built-In Descriptors

There are three built-in descriptors that I will mention here. All three are used as decorators, and each serves a very specific purpose.

@staticmethod

Sometimes you want a method on a class that doesn’t rely at all on the class attributes or instance attributes. In this case, staticmethod provides a convenient decorator:

class MyClass:

    @staticmethod
    def sum(a, b): return a+b

MyClass.sum(1,2) # 3
a = MyClass()
a.sum(1,5) # 6

@classmethod

Parallel to @staticmethod is this decorator, which ensures that the first parameter is always the class, even when it is invoked from an instance of the class.

class MyClass:

    @classmethod
    def what_class_am_i(cls):
        return cls

MyClass.what_class_am_i()
a = MyClass()
a.what_class_am_i()

This is convenient because without the decorator, calling the method through the class means you have to pass in something yourself as the first parameter.

I use this especially when I am creating a singleton-style class. Why not just use the class as the singleton? You’ll see things like this a lot with libraries like Flask and CherryPy.

property

This descriptor is arguably one of the most useful descriptors ever invented. Indeed, it can be said that descriptors were invented specifically to make property possible.

99.9% of the time, when you want to modify how attributes are accessed, assigned, or deleted, property has you covered.

Let’s look at how it might be used:

class MyClass:

    @property
    def a(self):
        return self.b*6

    @a.setter
    def a(self, value):
        self.b = value/6

    @a.deleter
    def a(self):
        del self.b

MyClass.a # -> the property object itself
i = MyClass()
i.a # AttributeError -- no b!
i.a = 1
i.b # 0.16666...
i.a # 1.0
del i.a
i.b # AttributeError
i.a # AttributeError -- no b!

Note that assigning to or deleting MyClass.a will obliterate the descriptor, removing it completely.

Keep in mind that creating setters or deleters for the property is entirely optional.

It is really fun to figure out how property is implemented. Keep in mind the following:

  • After the setter and deleter decorators are applied, Python will assign the results to the a variable in the class namespace. How does property not overwrite itself?

  • How does the property know to do the right thing for MyClass.a = ... and del MyClass.a?

__slots__

Before I let you go, I want to mention __slots__. In the class suite, if you define a variable called __slots__, then Python won’t create a __dict__ for instances of the class. Instead, it reserves a spot in each instance for each of the attribute names listed in __slots__.

If you try to assign an attribute that isn’t listed in __slots__, an AttributeError will be raised.

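__slots__ creates a special descriptor on the class for each attribute. That means you can’t use class attributes with the same names as defaults: the names would collide with the descriptors, and Python raises a ValueError when it creates the class.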
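A minimal sketch of these effects, using a hypothetical Point class:

```python
class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(hasattr(p, '__dict__'))  # False, instances have no __dict__

try:
    p.z = 3                    # 'z' is not listed in __slots__
except AttributeError as exc:
    print('AttributeError:', exc)

# Each slot is exposed on the class as a descriptor.
print(hasattr(Point.x, '__get__'), hasattr(Point.x, '__set__'))  # True True
```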

There are some more caveats and warnings and I encourage you to read the Python documentation on it if you think this might be useful.

Speaking of which, when is this useful? If you’re creating tons and tons of instances and you are worried about the overhead of each object having its own __dict__, this is a way you can reduce the creation time and memory overhead of creating new instances of the class. That’s about all it is useful for.