Object Inheritance

Python, like many object-oriented programming languages, allows you to create new types based on existing ones. There are a number of reasons why you might want to do this. In this video lecture, I’ll cover how it works, why you might want to do it, and give you some warnings and cautions.

Why Inheritance

If you spend any time creating classes, you’ll quickly note some patterns that you want to make use of.

  • Many of your objects will have very similar, if not identical, methods.

  • Many of your objects will share the same attributes, with the exact same behavior for the attributes.

  • Sometimes you want to create a new class by just extending, that is, adding new things to an existing class, but you don’t want to touch the existing class.

  • Occasionally you want to modify a method or an attribute for the new class.

All of these things are handled by inheritance.

Terminology

The fundamental idea of inheritance is that you can create classes that inherit from or derive from another class. This means that unless overridden, the attributes and methods of that class will appear as-is in the derived class.

We call the class that is inherited from the base, parent, or super class. I prefer “super” because that is the name of a special function in Python that allows you to access the superclass’s methods.

The derived classes are called derived, child, or sub classes.

Note that you can inherit from a class that itself inherits from another class, so this relationship can be many levels deep. That is, a single class can have a super class that itself has a super class, which in turn has a super class.

You can also have a subclass that has multiple superclasses. We’ll talk about Method Resolution Order (MRO) later, which will explain how Python figures out which class to look at next to determine where an attribute or method lives.

Ultimately, every class has the object class as its final super class. We don’t talk a lot about object because there isn’t much to do with it. It defines some of the most important special methods (like __getattribute__), so we generally don’t want to mess with it.

Function

The core element of inheritance is the following:

  • Subclasses get all of the methods and attributes of the superclass.

  • Subclasses can specify entirely new attributes or methods. This does not affect the superclass.

  • Subclasses can override attributes or methods from the superclass. Even methods defined in the superclass will see these overriding attributes and methods when they are called on an instance of the subclass.

  • Even if you override a method, you can still access that method through the super() function.
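
The four points above can be seen in one small sketch. The class names Animal and Dog and their methods are made up purely for illustration.

```python
class Animal:
    sound = "..."

    def speak(self):
        # Looks up self.sound, so a subclass override is seen here too.
        return self.sound

    def describe(self):
        return "I say " + self.speak()


class Dog(Animal):
    sound = "Woof"          # overrides an attribute

    def fetch(self):        # entirely new method; Animal is unaffected
        return "fetching"

    def describe(self):     # overrides a method, but still reaches the original
        return super().describe() + "!"


d = Dog()
d.speak()                 # -> "Woof", inherited method sees the override
d.fetch()                 # -> "fetching"
d.describe()              # -> "I say Woof!"
hasattr(Animal, "fetch")  # -> False
```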

Alternatives

Inheritance might not be the right solution. There are a number of alternatives you may want to consider.

  • Rather than inheriting from a class, can you just have an instance of the class as an attribute? This is useful if you don’t really share many attributes with the inherited class.

  • Maybe a shared method or attribute is really a global function or variable.

  • Rather than relying on superclasses to derive information about taxonomy (how you organize your objects), consider using an attribute that holds the taxonomy. Typically, things fall into multiple categories, and it’s not always clear which categories your users will be interested in. For example: a man is an animal, and an accountant is a man, but some accountants are female, which makes them women, so not all men are men? This sort of nonsense can be resolved by recording that the object has the species “Homo sapiens”, the job “Accountant”, and the sex “Female”, rather than relying on the class structure.
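
Here is a rough sketch of the first and third alternatives. Engine, Car, and Person are hypothetical classes invented for this example, not part of any library.

```python
class Engine:
    def start(self):
        return "engine started"


class Car:
    # Composition: a Car has an Engine rather than being one.
    def __init__(self):
        self.engine = Engine()

    def start(self):
        return self.engine.start()


class Person:
    # Taxonomy stored as plain attributes instead of a class hierarchy.
    def __init__(self, species, job, sex):
        self.species = species
        self.job = job
        self.sex = sex


car = Car()
car.start()  # -> "engine started"

p = Person("Homo sapiens", "Accountant", "Female")
p.job        # -> "Accountant"
```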

How to Create Derived Types

First, when you declare a new class, you can list one or more base types in parentheses after the class name. This is the syntax:

class NAME(bases): SUITE

The bases can be one or more types, separated by commas.

This is equivalent to creating a new type with the type function:

NAME = type('NAME', bases, dict)

(The dict, as you recall, is the namespace that results from the SUITE being executed exactly once.)
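
As a small sketch of that equivalence, the two classes below behave the same way for our purposes. Point, Point2, and origin are names made up for this example, and "roughly the same" is deliberate: details such as __qualname__ differ between the two.

```python
class Point:
    dims = 2

    def origin(self):
        return (0,) * self.dims


def origin(self):
    return (0,) * self.dims

# Roughly the same class, built by calling type() directly.
Point2 = type("Point2", (object,), {"dims": 2, "origin": origin})

Point().origin()   # -> (0, 0)
Point2().origin()  # -> (0, 0)
Point2.__bases__   # -> (<class 'object'>,)
```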

The bases appear as the __bases__ attribute in the class, in the order they were specified.

At the time of the class creation, Python figures out and assigns the Method Resolution Order (MRO). We’ll talk more about that later.

The way that I think of it is that just like an instance of a class falls back to the class in determining what an attribute or method is, a subclass falls back to its superclass if it doesn’t directly specify an attribute or method. Let’s walk through a simple example.

class A: a = "a"

class B(A): b = "b"

class C(B): c = "c"

c = C()
c.c # -> "c", from the C class.
c.b # -> "b", from C's superclass B.
c.a # -> "a", from B's superclass A.
c.d = "d"
c.d # -> "d", from the c instance.

And as you might expect, you can override attributes and methods just by specifying your own attribute at the appropriate class. Here’s an example of that.

class A: name = "A"
class B(A): name = "B"
class C(B): name = "C"

a = A(); a.name # -> "A" from A.name
b = B(); b.name # -> "B" from B.name
c = C(); c.name # -> "C" from C.name
del C.name
c.name # -> "B" from C's B.name

In my head, I have an image of a stack of attribute namespaces. Python will look through the stack from the top, and then return the first attribute it sees.

(The instance is on top, with its class right below it. Below the class are its super classes, in MRO.)
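
To make the stack picture concrete, here is a deliberately simplified model of the lookup. The function simple_lookup is my own illustration, not how Python is actually implemented: the real lookup also deals with descriptors, __getattr__, and __slots__.

```python
def simple_lookup(obj, name):
    # Simplified model only: real attribute lookup also handles
    # descriptors, __getattr__, and __slots__.
    if name in vars(obj):               # the instance is on top
        return vars(obj)[name]
    for klass in type(obj).__mro__:     # then the class and its superclasses
        if name in vars(klass):
            return vars(klass)[name]
    raise AttributeError(name)


class A: a = "a"
class B(A): b = "b"

b = B()
b.c = "c"
simple_lookup(b, "c")  # -> "c", from the instance
simple_lookup(b, "b")  # -> "b", from B
simple_lookup(b, "a")  # -> "a", from A
```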

Method Resolution Order

There is a bit of a problem, however, when you consider that types may depend on other types, and those types may depend on the same types.

If we were to draw shapes on how types can inherit or derive from each other, the following shapes are possible:

  • Spreads out into a tree. No two classes derive from the same classes.

  • A tree but with diamonds. Some of the classes derive from the same classes.

Of note, circular paths are not possible, for example A -> B -> C -> A. This is because you can’t name a class as a superclass until it has actually been created, and you can’t normally change its inheritance after it has been created.
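
A quick sketch of why the first half of a cycle can never be written down. First and Second are placeholder names for this example.

```python
try:
    class First(Second):   # Second has not been defined yet
        pass
except NameError as exc:
    message = str(exc)

message  # -> "name 'Second' is not defined"
```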

The question arises: which classes do we look at first, and in what order? This is the problem of Method Resolution Order, abbreviated MRO. In short, we are turning an arbitrary graph that follows the rules above into a straight line that visits all of the nodes on the graph in a particular order.

Python uses the C3 MRO algorithm, which isn’t very easy to describe but gives more-or-less obvious results. The general rules it tries to satisfy are:

  1. Visit classes in the order they appear in the bases. For example, if your bases are (A, B, C), it will visit A before B, and B before C.

  2. If two classes inherit from the same superclass, visit both of them before visiting that superclass. This is a little more complicated to describe, but it works as follows. Say you have A, which inherits from B and C, which both inherit from D. Rather than visiting A, then B, then D, and then C, it will visit A, then B, then C, and then D. That’s because it visits a superclass only after it has visited all of its subclasses.

  3. If two classes inherit from the same superclasses but in a different order, this is an error. You cannot define such a class. A simple example of this is that you have A, which derives from B and C. However, B derives from D and then E, while C derives from E and then D. Since it would violate the first rule no matter whether it visits D then E or the other way around, this is illegal. (It raises a TypeError.)

You can check out the method resolution order by looking at the __mro__ attribute of the class.

Let’s look at some examples.

This shows that it will visit subclasses before their superclasses.

>>> class D: pass

>>> class B(D): pass

>>> class C(D): pass

>>> class A(B, C): pass

>>> A.__mro__
(<class '__main__.A'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.D'>, <class 'object'>)

This example will raise a TypeError with a message about consistent MRO.

>>> class D: pass

>>> class E: pass

>>> class B(D,E): pass

>>> class C(E,D): pass

>>> class A(B,C): pass

Traceback (most recent call last):
  File "<pyshell#57>", line 1, in <module>
    class A(B,C): pass
TypeError: Cannot create a consistent method resolution
order (MRO) for bases D, E

The key takeaway from this is that Python chooses a really good MRO that shouldn’t surprise anyone. The one requirement is that you choose a consistent ordering of superclasses if you use multiple inheritance.

Some people don’t like to use multiple inheritance, but I don’t see a reason why you shouldn’t use it. If it fits, it fits.

object

All classes derive from the object type. This is always included and contains some rather useful methods.
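
A quick way to see this for yourself, using a throwaway class I’m calling Plain:

```python
class Plain:
    pass

Plain.__bases__                       # -> (<class 'object'>,)
Plain.__mro__[-1] is object           # -> True
"__getattribute__" in vars(object)    # -> True, defined on object itself
"__getattribute__" in vars(Plain)     # -> False, Plain just inherits it
```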

isinstance

isinstance will walk the bases to find any base that matches the type you pass in. So in essence, you’re asking, “Is this object of type X or any type that derives from X?”
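
A short sketch of what that looks like in practice. Base, Child, and Other are placeholder classes for this example.

```python
class Base: pass
class Child(Base): pass
class Other: pass

c = Child()
isinstance(c, Child)          # -> True
isinstance(c, Base)           # -> True, Child derives from Base
isinstance(c, object)         # -> True, everything derives from object
isinstance(c, Other)          # -> False
isinstance(c, (Other, Base))  # -> True, a tuple means "any of these"
type(c) is Base               # -> False, an exact type check ignores inheritance
```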

super

Sometimes you override a method, but you want to fall back to the superclass’s method in specific circumstances. For example, you might want to convert the value to a float or something before calling the superclass’s method.

super() does exactly that for you.

class B:
    def foo(self, param):
        return param

class A(B):
    def foo(self, param):
        param = float(param)
        return super().foo(param)

How does super() work? It’s a bit magical, but it looks up the next class in the MRO that has the foo method, and calls that.
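
One thing worth seeing in action is that “next in the MRO” means the MRO of the instance’s class, not of the class where the method was written. This sketch reuses the diamond shape from the MRO section; the who method is made up for illustration.

```python
class D:
    def who(self):
        return ["D"]

class B(D):
    def who(self):
        return ["B"] + super().who()

class C(D):
    def who(self):
        return ["C"] + super().who()

class A(B, C):
    def who(self):
        return ["A"] + super().who()

A().who()  # -> ["A", "B", "C", "D"]
B().who()  # -> ["B", "D"]
```

Notice that in A().who(), the super() call inside B lands on C rather than D, because C comes after B in A.__mro__.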

Before we had a fully functional super(), we had to remember what the base classes were and call the methods directly, or rely on super(class, instance) to figure it out for us. Nowadays things are much nicer than they were!
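
For comparison, here is roughly what those two older styles look like. Both still work in Python 3; Base, Old, and Older are names invented for this sketch.

```python
class Base:
    def foo(self, param):
        return param

class Old(Base):
    def foo(self, param):
        # Explicit two-argument form: the class first, then the instance.
        return super(Old, self).foo(float(param))

class Older(Base):
    def foo(self, param):
        # Calling the base class method directly, bypassing the MRO.
        return Base.foo(self, float(param))

Old().foo("1.5")   # -> 1.5
Older().foo("2")   # -> 2.0
```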

Typically, you’ll be using super() with the __init__ method. That’s because you almost always need to call the __init__ of the superclasses to make sure everything is initialized properly.
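
A minimal sketch of that pattern, with made-up Base and Child classes:

```python
class Base:
    def __init__(self, name):
        self.name = name

class Child(Base):
    def __init__(self, name, age):
        super().__init__(name)   # let Base set up its own attributes first
        self.age = age

c = Child("Ann", 30)
c.name  # -> "Ann"
c.age   # -> 30
```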

You might also call super() for the special methods so that it falls back to previous implementations. For instance, if I overrode __getattr__, I might want to fall back to the superclass’s version.

class B:

    def __getattr__(self, name):
        if name == 'b': return 'b'
        else: raise AttributeError("No such attribute {!r}".format(name))


class A(B):

    def __getattr__(self, name):
        if name == 'a': return 'a'
        else: return super().__getattr__(name)


A().a # -> 'a'
A().b # -> 'b'
A().c # -> AttributeError

super() and Descriptors

There is a special edge case that shouldn’t be too surprising when you invoke super() in relation to descriptors.

Suppose we have Owner, a class with a descriptor in the desc attribute. If you have Owner as a superclass and super() is looking for a method with the name desc, then it will invoke __get__ on the descriptor. However, when it invokes __get__, the class parameter will not be Owner – it will be the class of the instance.

If you don’t understand what that means, it’s probably not important. When you actually run into this, super() will do the least surprising thing.
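
If you’d like to see it anyway, here is a small sketch. Desc is a hypothetical descriptor that simply reports the name of the class it was handed.

```python
class Desc:
    def __get__(self, instance, owner):
        # Report which class was passed in as the owner.
        return owner.__name__

class Owner:
    desc = Desc()

class Sub(Owner):
    def via_super(self):
        # The descriptor lives on Owner, but it is handed Sub.
        return super().desc

Owner().desc       # -> "Owner"
Sub().via_super()  # -> "Sub"
```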

Law of Demeter

There is a useful rule of thumb that I have used, and it has greatly simplified how my code reads. A problem arises when you have six or seven things in an attribute access or write, for instance a.b.thing.name.title.language. This can get really confusing, because the person reading and writing the code needs to know what an a.b is, what an a.b.thing is, what an a.b.thing.name is, and so on.

In order to simplify the code, as well as to simplify the interface, the Law of Demeter was introduced and I find it is quite useful.

The Law of Demeter, sometimes called “the principle of least knowledge”, says that you should never dig that deep into an attribute chain. In terms of Python, it says that inside a class, you should not access the attributes of instances of other classes. You may only call their methods.

Let’s look at a concrete example:

class MyClass:
    def __init__(self):
        self.a = SomeOtherClass()

    def method(self, p1, p2):
        self.a # yes
        self.a.some_method() # yes
        self.a.some_attribute # no

        p1 # yes
        p1.some_method() # yes
        p1.some_attribute # no

        r1 = p2.some_method() # yes
        r1.some_method() # no
        r1.some_attribute # no

The general rule is “use only one dot”, except for expressions starting with self, where you can use two dots, the first of which belongs to self.

The benefit of the Law of Demeter is that there are fewer bugs, because the interfaces between objects are simpler. However, it also tends to increase the number of methods, and the more methods there are, the more likely it is that one of them has a bug. So it’s a tradeoff.
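
To show that tradeoff, here is a sketch where the extra method is Customer.pay. Wallet, Customer, and Cashier are hypothetical classes made up for this example.

```python
class Wallet:
    def __init__(self, balance):
        self._balance = balance

    def pay(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount
        return amount


class Customer:
    def __init__(self, wallet):
        self._wallet = wallet

    def pay(self, amount):
        # The customer delegates; callers never touch the wallet.
        return self._wallet.pay(amount)


class Cashier:
    def charge(self, customer, amount):
        # One dot: customer.pay(), not customer._wallet._balance -= amount
        return customer.pay(amount)


Cashier().charge(Customer(Wallet(50)), 20)  # -> 20
```

The cashier no longer needs to know that a customer has a wallet at all, at the cost of one more method on Customer.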

In practice, I try to keep the Law of Demeter strictly, but I always find some cases where violating it a little bit reduces the complexity of the code. Specifically, in Python attributes aren’t hidden and can sometimes be reliably used.

I like to document the attributes and then change them only with care, if need be, finding every place they are used and updating it. Or I just avoid the change altogether!